C# Regex Tutorial: What Is A C# Regular Expression

This C# Regex tutorial explains what a regular expression is in C#, its syntax, the Regex class methods, and how to use these methods with the help of examples.

A regular expression in C# is used for matching a particular character pattern. Regular expressions are useful whenever you need to find a repeating pattern, validate data, or check data formatting.

A regex is used to find out whether a string contains or matches a given character pattern. A regex is primarily a character sequence that denotes a pattern.

A pattern can be anything from numbers and characters to a combination of both. Regex is widely used for validating, parsing, and matching strings, for example, checking whether a string matches a currency, phone number, or date format.

Table of Contents:

Regex Class In C#
Conclusion

Regex Class In C#

The Regex class is used in C# to perform regex operations. It contains several methods for different regex-related tasks.

It can be used to parse large text for a particular character sequence, with methods that perform a match, replace matched text, or split the input.

The Regex class is present inside the System.Text.RegularExpressions namespace. Its constructor accepts the pattern, in the form of a string, as a parameter.

C# Regex Methods

IsMatch

The simplest and most useful method in the Regex class is the IsMatch method. This method has different overloads for performing matching based on different parameters.

The simplest one is IsMatch(string text). It is called on a Regex object and checks whether the given text contains a match for the pattern that was passed to the constructor.

The second overload is IsMatch(string text, int position). This method returns a Boolean value and takes two parameters, a string and an integer. The string is the text to search, and the integer is the position in that text at which the search starts.

Thus, this method looks for a match only from the given position onwards and ignores any match that begins before it.

The third overload, IsMatch(String text, String pattern), is a static method that accepts two parameters and returns a Boolean value. The first parameter is the text in which the user needs to find a pattern, and the second parameter provides the pattern to search for in that text.
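To make the difference between the overloads concrete, here is a minimal sketch. The class name IsMatchDemo and the helper ContainsHelloFrom are hypothetical names chosen for this illustration; only the IsMatch calls themselves come from the Regex class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IsMatchDemo
{
    // Hypothetical helper: true when "Hello" occurs in the input
    // at or after the given start position.
    public static bool ContainsHelloFrom(string input, int startAt)
    {
        Regex reg = new Regex("Hello");
        return reg.IsMatch(input, startAt);
    }

    public static void Main()
    {
        Regex reg = new Regex("Hello");

        // Instance overload: the whole input is searched.
        Console.WriteLine(reg.IsMatch("Hello World"));            // True

        // Start position 1: the search begins at "ello World",
        // so the leading "Hello" can no longer be found.
        Console.WriteLine(ContainsHelloFrom("Hello World", 1));   // False

        // Static overload: input first, pattern second.
        Console.WriteLine(Regex.IsMatch("Hello World", "World")); // True
    }
}
```

The start position is mostly useful when you have already processed the first part of a string and only want to search the rest of it.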

Replace(String text, String replacementText)

The Replace method accepts two parameters and returns a string value. The first parameter is the input text to search, and the second one is the replacement text for every match of the regex.

The method works by finding each match of the pattern in the given text and replacing it with the replacement text provided by the user. The method signature is public string Replace(string text, string replacementText).
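As a small illustration of Replace with an actual pattern rather than plain text, the sketch below masks every run of digits in a sentence. The class name ReplaceDemo, the helper MaskDigits, and the sample sentence are assumptions made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Hypothetical helper: replaces every run of digits with "#".
    public static string MaskDigits(string input)
    {
        Regex digits = new Regex(@"\d+");
        return digits.Replace(input, "#");
    }

    public static void Main()
    {
        // Both "123" and "4" match \d+ and are replaced.
        Console.WriteLine(MaskDigits("Order 123 shipped in 4 days"));
        // Prints: Order # shipped in # days
    }
}
```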

public string[] Split(string text)

The Split method from the Regex class accepts a string input as a parameter and returns an array of substrings. The parameter passed to the method is the string that needs to be split.

The method looks for matches of the pattern in the string and splits the string at each match, so every match acts as a breaking point. The method then returns an array containing all the substrings.
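The sketch below shows the instance Split method with a pattern that a plain string split cannot express easily: a comma surrounded by optional white space. The class name SplitDemo, the helper SplitOnCommas, and the sample input are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: splits on a comma plus any white space around it.
    public static string[] SplitOnCommas(string input)
    {
        Regex separator = new Regex(@"\s*,\s*");
        return separator.Split(input);
    }

    public static void Main()
    {
        string[] parts = SplitOnCommas("red, green ,blue");

        // The separators ", " and " ," are both matched and dropped.
        Console.WriteLine(string.Join("|", parts)); // red|green|blue
    }
}
```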

Usage Of Regex C# Methods

Let’s have a look at the usage of these methods by writing a simple program.

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main(string[] args)
    {
        string patternText = "Hello";
        Regex reg = new Regex(patternText);

        // IsMatch(string input)
        Console.WriteLine(reg.IsMatch("Hello World"));

        // IsMatch(string input, int index)
        Console.WriteLine(reg.IsMatch("Hello", 0));

        // Static IsMatch(string input, string pattern)
        Console.WriteLine(Regex.IsMatch("Hello World", patternText));

        // Replace(string input, string replacement)
        Console.WriteLine(reg.Replace("Hello World", "Replace"));

        // Static Split(string input, string pattern)
        string[] arr = Regex.Split("Hello_World_Today", "_");
        foreach (string subStr in arr)
        {
            Console.WriteLine("{0}", subStr);
        }
    }
}

The output of the above program:

True
True
True
Replace World
Hello
World
Today

The explanation for the above code:

At the start of the program, we created a Regex object from the pattern that is matched against the subsequent input strings. We used a plain text pattern to keep things simple in the beginning, but if you are comfortable you can start using regular expression patterns. (We will discuss regular expression patterns in detail as we move forward in this tutorial.)

Then, we called IsMatch on that object with an input string. If the input contains the pattern, the method returns true, otherwise it returns false.

The next method we used is IsMatch(string input, int index). This method accepts two parameters, and here we provide the input string and the index from where the match has to start. In this example, we start the matching from the beginning of the input string, i.e. index 0.

Then we demonstrated the use of the static IsMatch(string input, string pattern). Here we provided the input string and the pattern text to look for in it. If the pattern is present, the method returns true (as in our case), else it returns false.

Another method that we discussed is Replace. This method is quite useful in programs where you want to make changes to the input data or change the format of the existing data.

Here we provide two parameters: the first one is the input string and the second one is the string that replaces every match. This method also uses the pattern defined in the Regex object that we created earlier.

Another important method that we used is Split. This method is used to split the given string based on a recurring pattern. Here, we have provided the string “Hello_World_Today”.

Let’s say we want to remove the underscores from the given string and get the substrings. For this, we pass the input string and then the pattern that we need to use as a splitting point. The method returns an array, and we can use a simple loop like foreach to retrieve all the strings.

Regular Expression Syntax

Regex offers several kinds of syntax, such as special characters, quantifiers, character classes, etc., that can be used to match a certain pattern in a given input.

In this part of the tutorial, we will dive deeper into the syntax offered by regex and try to solve some real-life scenarios using it. Before we proceed, make sure that you have a basic idea of regex and the different methods available in the Regex class.

Special Characters

Special characters in a regex give special meanings to parts of a pattern. We will now look at some of the widely used special characters and their meanings in regex.

Special characters	Meaning
^	This is one of the most widely used symbols. It denotes the start of the input, so the pattern after it must match at the beginning of the input text.
$	This sign is used for matching at the end of the string. The pattern before this symbol must match at the end of the input text.
. (dot)	Dot matches any single character except a newline.
\n	This matches a newline character.
\d and \D	Lower case ‘d’ is used to match a digit character and upper case ‘D’ is used to match non-digit characters.
\s and \S	Lower case ‘s’ is used to match white-space characters and upper case ‘S’ is used to match non-white-space characters.
\w and \W	Lower case ‘w’ is used to match word characters (alphanumeric and underscore) and upper case ‘W’ is used to match non-word characters.
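The special characters in the table can be tried out with a few one-line checks. This is only a sketch: the class name SpecialCharDemo, the wrapper Check, and all sample inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpecialCharDemo
{
    // Thin hypothetical wrapper so each call reads as "input, pattern".
    public static bool Check(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        Console.WriteLine(Check("42 apples", @"^\d"));    // True: starts with a digit
        Console.WriteLine(Check("apples 42", @"^\d"));    // False: starts with a letter
        Console.WriteLine(Check("apples 42", @"\d$"));    // True: ends with a digit
        Console.WriteLine(Check("cat", @"c.t"));          // True: the dot stands for 'a'
        Console.WriteLine(Check("ct", @"c.t"));           // False: the dot needs one character
        Console.WriteLine(Check("no_spaces", @"\s"));     // False: no white space present
        Console.WriteLine(Check("no_spaces", @"\W"));     // False: underscore is a word character
        Console.WriteLine(Check("two words", @"\w\s\w")); // True: matches "o w"
    }
}
```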

Quantifier Syntax

Quantifier syntax is used to count or quantify the matching criteria, for example, if you want to check whether a string contains a letter one or more times. Let’s have a look at some of the commonly used quantifiers in regular expressions.

Quantifier Syntax	Meaning
*	Matches the preceding character zero or more times.
+	Matches the preceding character one or more times.
{n}	Matches the preceding character exactly n times, where n is the number inside the curly braces.
{n,}	Matches the preceding character at least n times.
{n,m}	Matches the preceding character at least n times and at most m times.
?	Makes the preceding character optional, i.e. it matches zero or one time.
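Each quantifier from the table can be checked against a short input, as in the sketch below. The class name QuantifierDemo, the wrapper Check, and the sample strings are assumptions for this example; the patterns are anchored with ^ and $ so that the whole input has to fit.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Thin hypothetical wrapper around the static IsMatch overload.
    public static bool Check(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        Console.WriteLine(Check("ac", @"^ab*c$"));        // True: zero 'b' is allowed
        Console.WriteLine(Check("abbbc", @"^ab*c$"));     // True: several 'b' are allowed
        Console.WriteLine(Check("ac", @"^ab+c$"));        // False: at least one 'b' is needed
        Console.WriteLine(Check("color", @"^colou?r$"));  // True: 'u' is optional
        Console.WriteLine(Check("colour", @"^colou?r$")); // True
        Console.WriteLine(Check("123", @"^\d{3}$"));      // True: exactly three digits
        Console.WriteLine(Check("1234", @"^\d{3}$"));     // False: four digits
        Console.WriteLine(Check("1", @"^\d{2,}$"));       // False: fewer than two digits
        Console.WriteLine(Check("123", @"^\d{2,4}$"));    // True: between two and four digits
        Console.WriteLine(Check("12345", @"^\d{2,4}$"));  // False: more than four digits
    }
}
```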

Character Class

A character class is also known as a character set. It tells the regex engine to match a single character out of several allowed characters. A character class matches only one character, and the order of the characters enclosed inside the set doesn’t matter.

Character Class	Meaning
[ range ]	The square brackets are used to match any one character in a range. For example, we can match any character from “a” to “z” by enclosing the range inside the brackets, like [a-z]. We can also match a digit from “1” to “9” by writing [1-9].
[^ range]	This denotes a negated character class. It matches any character that is not in the range given inside the brackets.
\	The backslash is used to match special characters that otherwise have their own regex meaning. Placing it before such a character matches the character in its literal form.
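The following sketch tries out a range, a negated class, and the backslash escape. The class name CharClassDemo, the wrapper Check, and the sample inputs are hypothetical. Regex.Escape is a method of the Regex class that adds the backslashes for you.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Thin hypothetical wrapper around the static IsMatch overload.
    public static bool Check(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        Console.WriteLine(Check("hello", @"^[a-z]+$"));  // True: only lower case letters
        Console.WriteLine(Check("Hello", @"^[a-z]+$"));  // False: 'H' is outside a-z
        Console.WriteLine(Check("abc", @"^[^0-9]+$"));   // True: no digit anywhere
        Console.WriteLine(Check("abc1", @"^[^0-9]+$"));  // False: '1' is excluded

        // An unescaped dot matches any character, an escaped dot only a dot.
        Console.WriteLine(Check("3x5", @"3.5"));         // True
        Console.WriteLine(Check("3x5", @"3\.5"));        // False
        Console.WriteLine(Check("3.5", @"3\.5"));        // True

        // Regex.Escape produces the escaped form of a literal text.
        Console.WriteLine(Regex.Escape("3.5"));          // 3\.5
    }
}
```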

Grouping

Round brackets or parentheses can be used to group a part of the regular expression together. This allows the user to apply a quantifier to the whole group or to list alternatives inside it.

Grouping	Meaning
( group expression )	The round brackets are used for grouping an expression.
|	The | operator separates alternatives and is often used inside round brackets, for example (a|b).
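Groups are useful for more than quantifiers: the text matched by each group can also be read back through the Match object. The sketch below assumes a made-up “year-month” input format; the class name GroupingDemo and the helper SplitYearMonth are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupingDemo
{
    // Hypothetical helper: returns { year, month } for inputs like "2024-05",
    // or an empty array when the input does not have that shape.
    public static string[] SplitYearMonth(string input)
    {
        Match m = Regex.Match(input, @"^(\d{4})-(\d{2})$");
        if (!m.Success)
        {
            return new string[0];
        }
        // Groups[0] is the whole match; numbered groups start at 1.
        return new[] { m.Groups[1].Value, m.Groups[2].Value };
    }

    public static void Main()
    {
        // A quantifier applied to a group repeats the whole group.
        Console.WriteLine(Regex.IsMatch("ababab", @"^(ab)+$"));     // True
        Console.WriteLine(Regex.IsMatch("aba", @"^(ab)+$"));        // False

        // Alternation inside a group.
        Console.WriteLine(Regex.IsMatch("dogs", @"^(cat|dog)s?$")); // True
        Console.WriteLine(Regex.IsMatch("bird", @"^(cat|dog)s?$")); // False

        string[] parts = SplitYearMonth("2024-05");
        Console.WriteLine(parts[0]); // 2024
        Console.WriteLine(parts[1]); // 05
    }
}
```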

C# Regular Expression Examples

In the previous section, we learned about the regular expression symbols. In this section, we will look in detail at the usage of different symbols and the combinations in which they can be used to match different expressions.

We will discuss some of the most widely encountered real-life scenarios that you may face as a developer while working on an application or writing a simple program that takes user input.

Regular Expression example with real-life scenarios

Let’s learn more about regular expressions using some real-life examples.

Scenario 1: Validate if the input string is composed of exactly 6 alphabetic characters, in upper or lower case.

One of the most common scenarios for regular expressions is finding and matching a given word. For example, let’s say we want an alphabetic string from the user, and that input should be exactly 6 characters long.

To validate that, we can use a simple regular expression. Let’s write a program to understand how the regular expression is written and used.

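using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main(string[] args)
    {
        // Exactly six letters, upper or lower case, and nothing else
        string patternText = @"^[a-zA-Z]{6}$";
        Regex reg = new Regex(patternText);

        // When the pattern matches
        Console.WriteLine(reg.IsMatch("Helios"));

        // When the pattern doesn't match
        Console.WriteLine(reg.IsMatch("Helo"));
    }
}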

Output

True
False

Explanation

In this example, we are trying to validate an input string to check whether it contains exactly six alphabetic characters. The characters can be in both lower and upper case, so we need to take that into account as well.

So, here we defined a regular expression pattern in the variable “patternText” and then passed it to the Regex object. The next lines of code are pretty simple: we used the IsMatch method to compare the regular expression and the input string.

Let’s now have a look at the regular expression that we have devised. The expression (^[a-zA-Z]{6}$) is made up of 4 different parts: “^”, “[a-zA-Z]”, “{6}” and “$”. The second part denotes the characters that are allowed to match, “a-z” for lower case and “A-Z” for upper case letters.

The first part, “^”, ensures that the match starts at the beginning of the string with the pattern defined in the second part, i.e. lower and upper case letters.

The curly braces in the third part determine how many characters must match the defined pattern, i.e. exactly 6 in this case, and the “$” symbol makes sure that the string ends right after them.

^[a-zA-Z]{6}$
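As an alternative way to handle both cases, the Regex class also accepts RegexOptions.IgnoreCase, so the character class can list only one case. The sketch below is one possible variation, not the approach used in the program above; the class name SixLetterCheck and the helper IsSixLetters are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SixLetterCheck
{
    // Hypothetical helper: true when the input is exactly six letters.
    // IgnoreCase lets the lower case range [a-z] accept upper case letters too.
    public static bool IsSixLetters(string input)
    {
        return Regex.IsMatch(input, "^[a-z]{6}$", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(IsSixLetters("Helios")); // True
        Console.WriteLine(IsSixLetters("HELIOS")); // True
        Console.WriteLine(IsSixLetters("Helo"));   // False: only four letters
        Console.WriteLine(IsSixLetters("Heli0s")); // False: contains a digit
    }
}
```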

Scenario 2: Use a regular expression to validate that a sentence starts with the word “Super” followed by white space.

Let’s assume we are reading some user input and need to make sure that the user always starts their sentence with a particular word, number, or letter. This can be achieved quite easily by using a simple regular expression.

Let’s look at a sample program and then discuss in detail how to write this expression.

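using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main(string[] args)
    {
        // "Super" at the start of the input, followed by white space
        string patternText = @"^Super\s";
        Regex reg = new Regex(patternText);

        // When the pattern matches
        Console.WriteLine(reg.IsMatch("Super man"));

        // When the pattern doesn't match
        Console.WriteLine(reg.IsMatch("Superhero"));
    }
}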

Output

True
False

Explanation

In this example as well, we used a code setup similar to the first one. The regular expression pattern in this scenario needs to match sentences that start with the word “Super”.

^Super

As we want to match from the start of the input, we begin with the “^” symbol and then give the pattern that we want to match, in this case “Super”. The pattern that we created, “^Super”, matches every input that begins with those letters, even “Superman” or “Supernatural”, but we want only the word “Super”.

This means there should be white space after the word to mark the end of the word and the start of another word. To do that, we add the symbol “\s” to the pattern, making our final pattern:

^Super\s

Scenario 3: Use a regular expression to find valid file names with an image file type extension.

Another important real-life scenario that developers often face is the validation of file types. Let’s say we have an upload button in the UI that accepts only image file type extensions.

We need to validate the uploaded file and inform the user if the file is in the wrong format. This can be easily achieved by using a regular expression.

Given below is a simple program to check this.

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main(string[] args)
    {
        // File name, a literal dot, then one of the image extensions at the end
        string patternText = @"(\w+)\.(jpg|png|jpeg|gif)$";
        Regex reg = new Regex(patternText);

        // When the pattern matches
        Console.WriteLine(reg.IsMatch("abc.jpg"));
        Console.WriteLine(reg.IsMatch("ab_c.gif"));
        Console.WriteLine(reg.IsMatch("abc123.png"));

        // When the pattern doesn't match
        Console.WriteLine(reg.IsMatch(".jpg"));
        Console.WriteLine(reg.IsMatch("ask.jpegj"));
    }
}

Output

True
True
True
False
False

Explanation

Here we need to match a file name. A valid file name is composed of three parts (name of file + . + file extension). We need to create a regular expression to match all three parts. Let’s start by matching the first part, i.e. the name of the file. For this example, we assume that a file name contains only alphanumeric characters and underscores.

As discussed earlier, the symbol to denote that is “\w”. Also, the file name can be one or more characters long, so we will use the symbol “+”. Combine them and we get the expression for the first part.

(\w+)

The parentheses set this part apart as a group. The next part is the dot symbol. As the dot symbol has its own meaning in a regex, we put a backslash before it to give it a literal meaning. Combine both and we have the first two parts of the regex covered.

(\w+)\.

Now, for the third and final part, we can directly list the required file extensions separated by the “|” (OR) symbol and enclose them in parentheses. A “$” sign at the end makes sure that the extension is at the end of the string. Now, let’s combine them to get the final regular expression.

(\w+)\.(jpg|png|jpeg|gif)$

Now, if we use this in the program, we can see that it returns true for the correct formats and false for the invalid ones. Note that this pattern has no “^” at the start, so it only checks how the input ends. An input such as “my file.jpg” still matches because “file.jpg” satisfies the pattern. Add “^” at the beginning if the whole input must be a valid file name.
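Because the name and the extension are enclosed in parentheses, they can also be read back after a successful match. The sketch below is a variation under two assumptions that are not part of the program above: the pattern is additionally anchored with “^”, and RegexOptions.IgnoreCase is used so that upper case extensions are accepted. The class name ImageFileName and the helper TryGetImageParts are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImageFileName
{
    // Assumed stricter variant of the pattern: anchored at both ends.
    private static readonly Regex ImagePattern =
        new Regex(@"^(\w+)\.(jpg|png|jpeg|gif)$", RegexOptions.IgnoreCase);

    // Hypothetical helper: reports whether the file name is valid and,
    // if so, returns the captured name and extension.
    public static bool TryGetImageParts(string fileName, out string name, out string extension)
    {
        Match m = ImagePattern.Match(fileName);
        name = m.Success ? m.Groups[1].Value : "";
        extension = m.Success ? m.Groups[2].Value : "";
        return m.Success;
    }

    public static void Main()
    {
        string name;
        string extension;

        if (TryGetImageParts("photo_01.PNG", out name, out extension))
        {
            Console.WriteLine(name);      // photo_01
            Console.WriteLine(extension); // PNG
        }

        // The space is not a word character, so the anchored pattern rejects this.
        Console.WriteLine(TryGetImageParts("my file.jpg", out name, out extension)); // False
    }
}
```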

Scenario 4: Use a regular expression to validate a website address format

Let’s assume we have a web form that accepts a web address or domain address. We want the user to enter the correct web/domain address while filling in the form. For determining if the user has entered a correct web address, a regular expression can be quite useful.

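using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main(string[] args)
    {
        // The dots are escaped (\.) so that they match a literal '.'
        // instead of any character.
        string patternText = @"^www\.[a-zA-Z0-9]{3,20}\.(com|in|org|co\.in|net|dev)$";
        Regex reg = new Regex(patternText);

        // When the pattern matches
        Console.WriteLine(reg.IsMatch("www.selenium.dev"));

        // When the pattern doesn't match
        Console.WriteLine(reg.IsMatch("ww.alsjk9874561230.movie.dont"));
    }
}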

Output

True
False

Explanation

Here, we want to match a valid domain name using a regular expression. For this scenario, we assume a valid address starts with “www” followed by a dot (.), then the name of the website, then another dot (.), and at the end a domain extension.

So, similar to the previous scenario, we will try to match it part by part. Let’s first match the “www.” part. As “www.” is fixed, we use the starting symbol followed by the exact characters to match. The dot is escaped with a backslash because an unescaped dot would match any character.

“^www\.”

Then we will start working on the second part. The second part of the web address can be any alphanumeric name. So, here we will use the square brackets of a character class to define the range that needs to be matched. Adding the second part to the first part gives us:

“^www\.[a-zA-Z0-9]{3,20}”

Here we have also added curly braces to define the minimum and maximum character length for the website name. We have given a minimum of 3 and a maximum of 20. You can give any minimum or maximum length you want.

Now, having covered the first and second parts of the web address, we are left with just the last part, i.e. the domain extension. It’s quite similar to what we did in the last scenario: we put an escaped dot and then match the domain extensions directly by using OR and enclosing every valid domain extension inside the parentheses.

Thus, if we add all these together, we have a complete regular expression to match a web address in this format.

^www\.[a-zA-Z0-9]{3,20}\.(com|in|org|co\.in|net|dev)$
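Because an unescaped dot in a regex matches any character, it is worth comparing a web address pattern written with “.” against one written with “\.”. The sketch below does that; the class name DotEscapeDemo, the helper names, and the odd-looking sample input are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotEscapeDemo
{
    // Dots left unescaped: each one matches any single character.
    private const string Loose =
        @"^www.[a-zA-Z0-9]{3,20}.(com|in|org|co\.in|net|dev)$";

    // Dots escaped: each one matches only a literal '.'.
    private const string Strict =
        @"^www\.[a-zA-Z0-9]{3,20}\.(com|in|org|co\.in|net|dev)$";

    public static bool MatchesLoose(string input)
    {
        return Regex.IsMatch(input, Loose);
    }

    public static bool MatchesStrict(string input)
    {
        return Regex.IsMatch(input, Strict);
    }

    public static void Main()
    {
        // A well-formed address passes both patterns.
        Console.WriteLine(MatchesLoose("www.selenium.dev"));   // True
        Console.WriteLine(MatchesStrict("www.selenium.dev"));  // True

        // 'X' and '-' stand where the dots should be.
        Console.WriteLine(MatchesLoose("wwwXselenium-dev"));   // True
        Console.WriteLine(MatchesStrict("wwwXselenium-dev"));  // False

        // Two-part extensions still work with the escaped form.
        Console.WriteLine(MatchesStrict("www.example.co.in")); // True
    }
}
```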

Scenario 5: Use Regular expression to validate an email id format

Let’s assume that we have a sign-in form on our webpage which asks the users to enter their email address. For obvious reasons, we will not want our form to proceed further with invalid email addresses. To validate whether the email address entered by the user is correct or not we can use a regular expression.

Given below is a simple program to validate an email address.

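using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main(string[] args)
    {
        // Identifier, '@', domain name, a literal dot, then the extension.
        // '$' ensures nothing follows the extension.
        string patternText = @"^[a-zA-Z0-9\._-]{5,25}@[a-z]{2,12}\.(com|org|co\.in|net)$";
        Regex reg = new Regex(patternText);

        // When the pattern matches
        Console.WriteLine(reg.IsMatch("software_test123@gmail.com"));
        Console.WriteLine(reg.IsMatch("Special.Char@yahoo.co.in"));

        // When the pattern doesn't match
        Console.WriteLine(reg.IsMatch("ww.alsjk9874561230.mo@vie.dont"));
    }
}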

Output

True
True
False

Explanation

A valid email address contains letters, numerals, and some special characters like dot (.), dash (-), and underscore (_), followed by the “@” symbol, which is then followed by the domain name and domain extension.

Thus, we can divide the email address into four parts, i.e. the email identifier, the “@” symbol, the domain name, and finally the domain extension.

Let’s start by writing a regular expression for the first part. It can be alphanumeric with some special characters. Assume that the identifier ranges from 5 to 25 characters in length. Similar to how we wrote it earlier (in the website scenario), we can come up with the following expression.

^[a-zA-Z0-9\._-]{5,25}

Now, moving to the second part. It’s comparatively easy as we only have to match one symbol, i.e. “@”. We add it without any dot around it, because an unescaped dot would match one extra character of any kind. Adding it to the above expression gives us:

^[a-zA-Z0-9\._-]{5,25}@

Moving to the third part, we assume that the domain name is always a series of lower case letters. If you want, you can also include numerals or upper case letters, but for this scenario we will go with lower case letters.

If we add the expression for lower case letters with a length ranging from 2 to 12 characters, then we will have the following expression.

^[a-zA-Z0-9\._-]{5,25}@[a-z]{2,12}

Now, we are just left with the expression for the domain extension. Similar to the fourth scenario, we will handle some specific domain extensions, preceded by an escaped dot. If you want, you can add more of them by enclosing them inside the parentheses and separating them with a “|” symbol.

Consolidating this expression with the previous one, and adding “$” so that nothing may follow the extension, gives us our final expression for email validation.

^[a-zA-Z0-9\._-]{5,25}@[a-z]{2,12}\.(com|org|co\.in|net)$
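A validation pattern like this is usually wrapped in a small helper so that it can be reused across a form. The sketch below assumes a strict variant of the pattern, with the literal dot escaped, no extra characters around “@”, and a “$” anchor at the end. The class name EmailCheck and the method IsValidEmail are hypothetical, and the length limits and extension list are just the ones chosen for this scenario, not a general rule for email addresses.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailCheck
{
    // Built once and reused for every call.
    private static readonly Regex EmailPattern =
        new Regex(@"^[a-zA-Z0-9._-]{5,25}@[a-z]{2,12}\.(com|org|co\.in|net)$");

    // Hypothetical helper: empty input is treated as invalid.
    public static bool IsValidEmail(string input)
    {
        return !string.IsNullOrEmpty(input) && EmailPattern.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("software_test123@gmail.com"));     // True
        Console.WriteLine(IsValidEmail("Special.Char@yahoo.co.in"));       // True
        Console.WriteLine(IsValidEmail("ww.alsjk9874561230.mo@vie.dont")); // False: unknown extension
        Console.WriteLine(IsValidEmail("software_test123#gmail.com"));     // False: no '@'
        Console.WriteLine(IsValidEmail(""));                               // False: empty input
    }
}
```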

Conclusion

In this tutorial, we learned what a regular expression is, along with the syntax and symbols that are used to construct one. A regular expression allows the user to match a string against a given pattern.

This is quite helpful in situations that demand quick validation of input. For example, when a user enters an email address or phone number, regex can be used to quickly validate the format and inform the user if the format is wrong.

We also learned to tackle different scenarios that come up in a variety of applications. We looked at the step-by-step process of writing expressions for matching words, letters, website addresses, email addresses, and even file types and extensions.

These scenarios are quite useful for validating user input without writing numerous lines of code, which helps save time and reduce complexity. The examples are meant to guide you in creating your own regular expressions and handling other scenarios.

A regex can be simple, like using letters or numerals to match a given series of characters, or complex, using a combination of special characters, quantifiers, character classes, etc. to validate complex formats or to look for a specific pattern in the character series.

In a nutshell, a regular expression is quite a powerful tool for a programmer and helps reduce the amount of code required to accomplish a data matching or validation task.
