Java Regex Tutorial With Regular Expression Examples

This Java Regex tutorial explains what a regular expression in Java is, why we need it, and how to use it, with the help of regular expression examples:

A regular expression in Java, abbreviated as “regex”, is an expression that defines a search pattern for strings.

The search pattern can be a simple character or a substring or it may be a complex string or expression that defines a particular pattern to be searched in the string.

Further, the pattern may match the string once or several times.



Regular Expression: Why We Need It

A regular expression is mainly used to search for a pattern in a string. Why do we search for a pattern in a string? We might want to find a particular pattern in a string and then manipulate or edit it.

Applications often need to find and manipulate such patterns, and regex makes searching for them much easier.

Now, given a pattern to search for, how exactly does the regex work?

When we analyze and alter text using a regex, we say that ‘we have applied the regex to the string or text’. The pattern is applied to the text from left to right, and the source string is matched against it.

For example, consider the string “ababababab” and assume that the regex ‘aba’ is defined. Applying the regex from left to right, it matches the string at two places only: “aba_aba___”.

Once a source character has been used in a match, it cannot be reused. So after the first match ‘aba’ is found, its third character ‘a’ is not reused as the start of another match.
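The left-to-right, non-overlapping behaviour described above can be sketched as follows. The class name NonOverlappingMatches and the helper matchStarts are made up for this illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonOverlappingMatches {
    // Collects the start index of every match that find() reports, scanning left to right.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // "aba" is reported at index 0 and index 4 only; the occurrences
        // starting at 2 and 6 overlap characters already used by a match.
        System.out.println(matchStarts("aba", "ababababab")); // prints [0, 4]
    }
}
```

Each call to find() resumes the search just after the previous match, which is why the overlapping occurrences are skipped.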

java.util.regex

The Java language has no dedicated syntax for regular expressions. Instead, regex support comes from the standard library package “java.util.regex”, which we import to work with regular expressions.

The package java.util.regex provides one interface and three classes as shown below:


Pattern Class: The Pattern class represents the compiled regex. It does not have any public constructors, but it provides static compile() methods that return Pattern objects and can be used to create a pattern.

Matcher Class: A Matcher object matches the regex pattern against the input string. Like the Pattern class, this class does not provide any public constructors. A Matcher object is obtained by calling the matcher() method on a Pattern object.

PatternSyntaxException: This class defines an unchecked exception. A PatternSyntaxException is thrown to indicate a syntax error in a regex pattern.

MatchResult Interface: The MatchResult interface represents the result of a match operation.
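As a small sketch of how PatternSyntaxException shows up in practice, the following program tries to compile a well-formed and a malformed regex. The class name SyntaxCheck and the helper isValidRegex are assumptions made for this example.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxCheck {
    // Returns true if the regex compiles, false if Pattern.compile rejects it.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // getDescription() and getIndex() report what went wrong and roughly where.
            System.out.println(e.getDescription() + " near index " + e.getIndex());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("[a-z]+")); // a well-formed character class
        System.out.println(isValidRegex("[a-z"));   // an unclosed character class
    }
}
```

Because the exception is unchecked, the try/catch is optional; it is mostly useful when the regex comes from user input or configuration.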

Java Regex Example

Let’s implement a simple example of regex in Java. In the below program we have a simple string as a pattern and then we match it to a string. The output prints the start and end position in the string where the pattern is found.

import java.util.regex.Matcher; 
import java.util.regex.Pattern; 
  
public class Main 
{ 
    public static void main(String args[]) 
    { 
        //define a pattern to be searched 
        Pattern pattern = Pattern.compile("Help."); 
  
        // Search above pattern in "softwareTestingHelp.com" 
        Matcher m = pattern.matcher("softwareTestingHelp.com"); 
  
        // print the start and end position of the pattern found 
        while (m.find()) 
            System.out.println("Pattern found from position " + m.start() + 
                               " to " + (m.end()-1)); 
    } 
}

Output:

Pattern found from position 15 to 19

Regex Matcher In Java

The Matcher class implements the MatchResult interface. A Matcher acts as the regex engine: it performs the match operations on a character sequence by interpreting a Pattern.

Given below are the commonly used methods of the Matcher class. It has more methods, but only the important ones are listed here.

| No | Method | Description |
| --- | --- | --- |
| 1 | boolean matches() | Checks whether the entire input sequence matches the pattern. |
| 2 | Pattern pattern() | Returns the pattern that the matcher interprets. |
| 3 | boolean find() | Finds the next subsequence of the input that matches the pattern. |
| 4 | boolean find(int start) | Same as find(), but resets the matcher and starts searching at the given index. |
| 5 | String group() | Returns the input subsequence matched by the previous match. |
| 6 | String group(String name) | Returns the input subsequence captured by the group with the specified name during the previous match operation. |
| 7 | int start() | Returns the start index of the previous match. |
| 8 | int end() | Returns the index just after the last character of the previous match. |
| 9 | int groupCount() | Returns the number of capturing groups in the matcher's pattern. |
| 10 | String replaceAll(String replacement) | Replaces every subsequence of the input sequence that matches the pattern with the given replacement string. |
| 11 | String replaceFirst(String replacement) | Replaces the first matching subsequence of the input sequence with the given replacement string. |
| 12 | String toString() | Returns the string representation of the current matcher. |

Regular Expression Implementation Example

Let’s see an example of the usage of some of these methods.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherDemo {
    public static void main(String[] args) {
        String inputString = "She sells sea shells on the sea shore with shells";
        System.out.println("input string: " + inputString);

        // obtain a Pattern object
        Pattern pattern = Pattern.compile("shells");

        // obtain a Matcher object for the input string
        Matcher matcher = pattern.matcher(inputString);

        // use replaceFirst method to replace only the first occurrence of the pattern
        String firstReplaced = matcher.replaceFirst("pearls");
        System.out.println("replaceFirst method:" + firstReplaced);

        // use replaceAll method to replace all occurrences of the pattern
        // (the matcher still works on the original input string)
        String allReplaced = matcher.replaceAll("pearls");
        System.out.println("replaceAll method:" + allReplaced);
    }
}

Output:

input string: She sells sea shells on the sea shore with shells
replaceFirst method:She sells sea pearls on the sea shore with shells
replaceAll method:She sells sea pearls on the sea shore with pearls
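The program above exercises replaceFirst() and replaceAll(). As a rough sketch of some of the other Matcher methods (group(), group(String name), start(), end() and groupCount()), consider the following. The date pattern, the class name GroupDemo and the helper firstYear are all invented for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Hypothetical pattern for this sketch: a date written as year-month-day,
    // with one named capturing group per part.
    static final Pattern DATE =
            Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");

    // Returns the text captured by the group named "year" in the first match, or null if none.
    static String firstYear(String input) {
        Matcher matcher = DATE.matcher(input);
        return matcher.find() ? matcher.group("year") : null;
    }

    public static void main(String[] args) {
        Matcher matcher = DATE.matcher("Released on 2020-03-17, patched on 2021-11-02");

        // groupCount() counts the capturing groups in the pattern, not the matches found
        System.out.println("Capturing groups in pattern: " + matcher.groupCount());

        while (matcher.find()) {
            // group() is the whole match; end() is the index just past its last character
            System.out.println(matcher.group()
                    + " [" + matcher.start() + ", " + matcher.end() + ")"
                    + " month=" + matcher.group("month"));
        }
    }
}
```

Note that end() is exclusive, which is why the first program in this tutorial prints m.end()-1 as the last matched position.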

Regex Pattern Class In Java

The Pattern class defines the pattern for the regex engine, which can then be matched against the input string.

The following table shows the commonly used methods provided by the Pattern class.

| No | Method | Description |
| --- | --- | --- |
| 1 | static Pattern compile(String regex) | Compiles the regex and returns its compiled representation. |
| 2 | static Pattern compile(String regex, int flags) | Compiles the given regex using the specified flags and returns the pattern. |
| 3 | Matcher matcher(CharSequence input) | Creates a matcher that will match the given input against this pattern. |
| 4 | static boolean matches(String regex, CharSequence input) | Compiles the given regex and checks whether the entire input matches it. |
| 5 | int flags() | Returns the match flags that were specified when the pattern was compiled. |
| 6 | String[] split(CharSequence input) | Splits the input around matches of the pattern. |
| 7 | String[] split(CharSequence input, int limit) | Splits the input around matches of the pattern; the limit controls how many times the pattern is applied and therefore the length of the resulting array. |
| 8 | String pattern() | Returns the regular expression from which the pattern was compiled. |
| 9 | static String quote(String s) | Returns a literal pattern String for the given String. |
| 10 | String toString() | Returns the string representation of the pattern. |

The below example uses some of the above methods of Pattern class.

import java.util.regex.*; 
  
public class Main { 
    public static void main(String[] args) 
    { 
        // define a REGEX String 
        String REGEX = "Test"; 
  
        // string to be searched for given pattern 
        String actualString 
            = "Welcome to SoftwareTestingHelp portal"; 
  
        // generate a pattern for given regex using compile method 
        Pattern pattern = Pattern.compile(REGEX); 
  
        // set limit to 2 
        int limit = 2; 
  
        // use split method to split the string 
        String[] array 
            = pattern.split(actualString, limit); 
  
        // print the generated array 
        for (int i = 0; i < array.length; i++) { 
            System.out.println("array[" + i 
                               + "]=" + array[i]); 
        } 
    } 
}

Output:

array[0]=Welcome to Software
array[1]=ingHelp portal

In the above program, we use the compile method to generate a pattern. Then we split the input string around the match of this pattern, with a limit of 2 so that at most two pieces are produced, and read the result into an array. Finally, we display the array that was generated as a result of splitting the input string.
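Two other Pattern methods, compile(String regex, int flags) and quote(String s), can be sketched together as below. The class name PatternOptions and the helper containsIgnoreCase are assumptions for this example; Pattern.CASE_INSENSITIVE is one of the flag constants defined by the Pattern class.

```java
import java.util.regex.Pattern;

class PatternOptions {
    // Case-insensitive search for a literal word anywhere in the input.
    static boolean containsIgnoreCase(String word, String input) {
        // quote() makes every character of 'word' literal, so "." or "+" lose their special meaning
        Pattern pattern = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("help.com", "SoftwareTestingHelp.com")); // true
        // false: because of quote(), the dot only matches a real dot, not 'X'
        System.out.println(containsIgnoreCase("help.com", "SoftwareTestingHelpXcom"));
    }
}
```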

Regex String Matches Method

We have seen the String.contains() method in our string tutorials. This method returns true or false depending on whether the string contains the specified sequence of characters.

Similarly, the “matches()” method checks whether the string matches a regular expression. It returns true only if the entire string matches the specified regex; otherwise it returns false.

The general syntax of the matches () method:

public boolean matches (String regex)

If the regex specified is not valid, then the “PatternSyntaxException” is thrown.

Let’s implement a program to demonstrate the usage of the matches () method.

public class MatchesExample{
   public static void main(String args[]){
        String str = "Java Series Tutorials";

        System.out.println("Input String: " + str);

        //use matches () method to check if particular regex matches to the given input
       System.out.print("Regex: (.*)Java(.*) matches string? " );
       System.out.println(str.matches("(.*)Java(.*)"));

       System.out.print("Regex: (.*)Series(.*) matches string? " );
       System.out.println(str.matches("(.*)Series(.*)"));
       
       System.out.print("Regex: (.*)String(.*) matches string? " );
       System.out.println(str.matches("(.*)String(.*)"));

       System.out.print("Regex: (.*)Tutorials matches string? " );
       System.out.println(str.matches("(.*)Tutorials"));
   }
}

Output:

Input String: Java Series Tutorials
Regex: (.*)Java(.*) matches string? true
Regex: (.*)Series(.*) matches string? true
Regex: (.*)String(.*) matches string? false
Regex: (.*)Tutorials matches string? true
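The (.*) wrappers in the program above are needed because matches() tests the whole string. A minimal sketch of the difference between a whole-string match and a search, using hypothetical names WholeVersusPartial, matchesWhole and foundAnywhere:

```java
import java.util.regex.Pattern;

class WholeVersusPartial {
    static final Pattern SERIES = Pattern.compile("Series");

    // true only if the entire input is exactly "Series"
    static boolean matchesWhole(String input) {
        return SERIES.matcher(input).matches();
    }

    // true if "Series" occurs anywhere in the input
    static boolean foundAnywhere(String input) {
        return SERIES.matcher(input).find();
    }

    public static void main(String[] args) {
        String str = "Java Series Tutorials";
        System.out.println(str.matches("Series")); // false: the regex must cover the whole string
        System.out.println(matchesWhole(str));     // false, for the same reason
        System.out.println(foundAnywhere(str));    // true: find() searches for a matching substring
    }
}
```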

Regular expressions in Java use many special characters and metacharacters, as well as character classes for pattern matching. In this section, we provide tables of the character classes, quantifiers, and metacharacters that can be used with regex.

Regex Character Classes

| No | Character class | Description |
| --- | --- | --- |
| 1 | [pqr] | p, q or r |
| 2 | [^pqr] | Negation: any character other than p, q, or r |
| 3 | [a-zA-Z] | Range: a through z or A through Z, inclusive |
| 4 | [a-d[m-p]] | Union: a through d, or m through p: [a-dm-p] |
| 5 | [a-z&&[def]] | Intersection: d, e, or f |
| 6 | [a-z&&[^bc]] | Subtraction: a through z, except for b and c: [ad-z] |
| 7 | [a-z&&[^m-p]] | Subtraction: a through z, and not m through p: [a-lq-z] |
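A quick way to check the union, intersection and subtraction classes from the table is to test single characters against them. This is only a sketch; the class name CharacterClassDemo and the helper inClass are invented for the purpose.

```java
import java.util.regex.Pattern;

class CharacterClassDemo {
    // Tests a one-character string against a character class.
    static boolean inClass(String characterClass, char c) {
        return Pattern.matches(characterClass, String.valueOf(c));
    }

    public static void main(String[] args) {
        System.out.println(inClass("[^pqr]", 'p'));        // false: negation excludes p
        System.out.println(inClass("[a-d[m-p]]", 'n'));    // true: union covers m through p
        System.out.println(inClass("[a-z&&[def]]", 'a'));  // false: intersection leaves only d, e, f
        System.out.println(inClass("[a-z&&[^bc]]", 'b'));  // false: b is subtracted
        System.out.println(inClass("[a-z&&[^m-p]]", 'q')); // true: q is outside m through p
    }
}
```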

Regex Quantifiers

Quantifiers specify how many times the preceding character or group must occur for the regex to match.

The following table shows the common regex quantifiers used in Java.

| No | Regex quantifier | Description |
| --- | --- | --- |
| 1 | x? | x appears once or not at all |
| 2 | x+ | x appears one or more times |
| 3 | x* | x occurs zero or more times |
| 4 | x{n} | x occurs exactly n times |
| 5 | x{n,} | x occurs n or more times |
| 6 | x{y,z} | x occurs at least y times but not more than z times |
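The quantifiers can be tried out with a few whole-string matches, as in this sketch. The class name QuantifierDemo and the helper fullMatch are hypothetical; the inputs are chosen only to show the boundaries of each quantifier.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // true if the whole input matches the regex
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("colou?r", "color"));  // true: u appears zero times
        System.out.println(fullMatch("colou?r", "colour")); // true: u appears once
        System.out.println(fullMatch("a+", ""));            // false: + needs at least one a
        System.out.println(fullMatch("a*", ""));            // true: * accepts zero occurrences
        System.out.println(fullMatch("a{3}", "aaa"));       // true: exactly three
        System.out.println(fullMatch("a{2,}", "a"));        // false: fewer than two
        System.out.println(fullMatch("a{2,4}", "aaaa"));    // true: the upper bound is inclusive
        System.out.println(fullMatch("a{2,4}", "aaaaa"));   // false: five is more than four
    }
}
```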

Regex Meta Characters

The metacharacters in regex work as shorthand codes. These codes cover whitespace and non-whitespace characters, digits, word characters, and word boundaries.

The following table lists the regex Meta characters.

| No | Meta character | Description |
| --- | --- | --- |
| 1 | . | Any character (may or may not match line terminators) |
| 2 | \d | Any digit, [0-9] |
| 3 | \D | Any non-digit, [^0-9] |
| 4 | \s | Any whitespace character, [ \t\n\x0B\f\r] |
| 5 | \S | Any non-whitespace character, [^\s] |
| 6 | \w | Any word character, [a-zA-Z_0-9] |
| 7 | \W | Any non-word character, [^\w] |
| 8 | \b | A word boundary |
| 9 | \B | A non-word boundary |

Given below is a Java program that uses the above special characters in the Regex.

import java.util.regex.Pattern;

public class RegexExample {
    public static void main(String[] args) {
        // true only if the whole input is exactly "Jim" (matching is case-sensitive)
        System.out.println("Jim (jim):" + Pattern.matches("Jim", "jim"));

        // true if the input string is Peter or peter
        System.out.println("[Pp]eter (Peter):" +
            Pattern.matches("[Pp]eter", "Peter"));

        // true if the input contains abc anywhere
        System.out.println(".*abc.* (pqabcqp):" +
            Pattern.matches(".*abc.*", "pqabcqp"));

        // true if the input doesn't start with a digit
        System.out.println("^[^\\d].* (abc123):" +
            Pattern.matches("^[^\\d].*", "abc123"));

        // true if the input consists of exactly three letters
        System.out.println("[a-zA-Z][a-zA-Z][a-zA-Z] (aQz):" +
            Pattern.matches("[a-zA-Z][a-zA-Z][a-zA-Z]", "aQz"));

        // false: the input has four characters and contains digits
        System.out.println("[a-zA-Z][a-zA-Z][a-zA-Z] (a10z):" +
            Pattern.matches("[a-zA-Z][a-zA-Z][a-zA-Z]", "a10z"));

        // true if the input consists of zero or more non-digits
        System.out.println("\\D* (abcde):" +
            Pattern.matches("\\D*", "abcde"));

        // true only if the input is exactly "This": ^ marks the start of the line, $ the end
        System.out.println("^This$ (This is Java):" +
            Pattern.matches("^This$", "This is Java"));
        System.out.println("^This$ (This):" +
            Pattern.matches("^This$", "This"));
        System.out.println("^This$ (Is This Java?):" +
            Pattern.matches("^This$", "Is This Java?"));
    }
}

Output:

Jim (jim):false
[Pp]eter (Peter):true
.*abc.* (pqabcqp):true
^[^\d].* (abc123):true
[a-zA-Z][a-zA-Z][a-zA-Z] (aQz):true
[a-zA-Z][a-zA-Z][a-zA-Z] (a10z):false
\D* (abcde):true
^This$ (This is Java):false
^This$ (This):true
^This$ (Is This Java?):false

In the above program, we have provided various regexes that are matched with the input string. Readers are advised to read the comments in the program for each regex to better understand the concept.
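Pattern.matches() always tests the whole input, so it cannot show the shorthand codes picking pieces out of a longer text. The following sketch uses find() for that instead; the class name ShorthandDemo, the helper findAll and the sample sentence are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShorthandDemo {
    // Returns every non-overlapping match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Order 66 shipped 3 catalogs to cat lovers";

        System.out.println(findAll("\\d+", text));       // [66, 3]: runs of digits
        System.out.println(findAll("cat", text));        // [cat, cat]: also matches inside "catalogs"
        System.out.println(findAll("\\bcat\\b", text));  // [cat]: only the standalone word
        System.out.println(findAll("\\w+", "a_b c-d"));  // [a_b, c, d]: underscore is a word character
    }
}
```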

Regex Logical or (|) Operator

We can use the logical or operator (|) in a regex to offer a choice between alternatives: the regex matches if the operand on either side of | matches. The alternatives can be single characters or whole strings. For example, to match both of the words ‘test’ and ‘Test’, we join them with the or operator as Test|test.

Let’s see the following example to understand this operator.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexOR {
    public static void main(String[] args) {
        // Regex string to search for patterns Test or test
        String regex = "(Test|test)";
        // Compiles the pattern and obtains the matcher object from input string.
        Pattern pattern = Pattern.compile(regex);
        String input = "Software Testing Help";
        Matcher matcher = pattern.matcher(input);

        // print every match
        while (matcher.find()) {
            System.out.format("Text \"%s\" found at %d to %d.%n",
                matcher.group(), matcher.start(), matcher.end());
        }
       //define another input string and obtain the matcher object
        input = "SoftwaretestingHelp";
        matcher = pattern.matcher(input);
        // Print every match
        while (matcher.find()) {
            System.out.format("Text \"%s\" found at %d to %d.%n",
                matcher.group(), matcher.start(), matcher.end());
        }
    }
}

Output:

Text "Test" found at 9 to 13.
Text "test" found at 8 to 12.

In this program, we have provided the regex “(Test|test)”. Then first we give the input string as “Software Testing Help” and match the pattern. We see that the match is found and the position is printed.

Next, we give the input string as “SoftwaretestingHelp”. This time also the match is found. This is because the regex has used or operator and hence the pattern on either side of | operator is matched with the string.
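The parentheses around Test|test matter once more text follows the alternatives, because | applies to everything on either side of it. A small sketch of this, with the hypothetical class name AlternationDemo and helper matchesWhole:

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // true if the whole input matches the regex
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Without parentheses the alternatives are "Test" and "testing"
        System.out.println(matchesWhole("Test|testing", "Testing"));   // false

        // With a group the alternatives are "Test" and "test", each followed by "ing"
        System.out.println(matchesWhole("(Test|test)ing", "Testing")); // true

        // A character class is a compact equivalent when only one character differs
        System.out.println(matchesWhole("[Tt]esting", "testing"));     // true
    }
}
```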

Email Validation Using Regex

We can also validate an email id (address) with regex. The program below uses the String.matches() method, which gives the same result as java.util.regex.Pattern.matches(): it matches the given email id against the regex and returns true if the email is valid.

The following program demonstrates the validation of email using regex.

public class EmailDemo {
   static boolean isValidemail(String email) {
      String regex = "^[\\w-_\\.+]*[\\w-_\\.]\\@([\\w]+\\.)+[\\w]+[\\w]$";    //regex to validate email.
      return email.matches(regex);                  //match email id with regex and return the value
   }
   public static void main(String[] args) {
      String email = "ssthva@gmail.com";
      System.out.println("The Email ID is: " + email);
      System.out.println("Email ID valid? " + isValidemail(email));
      
      email = "@sth@gmail.com";
      System.out.println("The Email ID is: " + email);
      System.out.println("Email ID valid? " + isValidemail(email));
   }
}

Output:

The Email ID is: ssthva@gmail.com
Email ID valid? true
The Email ID is: @sth@gmail.com
Email ID valid? false

As we can see from the above output, the first email id is valid. The second id directly starts with @, and hence regex does not validate it. Hence it is an invalid id.
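String.matches() compiles the regex on every call. When many addresses have to be checked, one possible variation is to compile the pattern once and reuse it, as sketched below. The regex is the same one used in the program above; the class name EmailValidator, the constant EMAIL, the method name isValidEmail and the null check are assumptions added for this example.

```java
import java.util.regex.Pattern;

class EmailValidator {
    // Same regex as in the program above, compiled once and reused for every call.
    private static final Pattern EMAIL =
            Pattern.compile("^[\\w-_\\.+]*[\\w-_\\.]\\@([\\w]+\\.)+[\\w]+[\\w]$");

    // Returns true if the whole string matches the email regex; null is treated as invalid.
    static boolean isValidEmail(String email) {
        return email != null && EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidEmail("ssthva@gmail.com")); // true
        System.out.println(isValidEmail("@sth@gmail.com"));   // false: nothing before the first @
    }
}
```

Like the original program, this only checks the general shape of an address; it does not confirm that the address exists.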

Frequently Asked Questions

Q #1) What is a Regular Expression?

Answer: A Regular Expression, commonly called regex, is a pattern or a sequence of characters (normal, special, or metacharacters) that is used to search or validate an input string.

Q #2) What is the significance of the Matcher class for a regular expression in Java?

Answer: The matcher class (java.util.regex.Matcher) acts as a regex engine. It performs the matching operations by interpreting the Pattern.

Q #3) What is a Pattern in Java?

Answer: The package java.util.regex provides a Pattern class that is used to compile a regex into a pattern, which is the compiled representation of the regex. This pattern is then used to search or validate strings by matching them against it.

Q #4) What is \b in a regular expression?

Answer: \b is an anchor that matches a position called a word boundary, while \B matches a position that is not a word boundary. Similarly, the start of the line is denoted by a caret (^) and the end of the line by a dollar ($) sign.

Q #5) Is Pattern thread-safe in Java?

Answer: Yes. Instances of the Pattern class are immutable and safe for use by multiple concurrent threads. But the matcher class instances are not thread-safe.
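In practice this usually means sharing one compiled Pattern and creating a fresh Matcher wherever it is needed. A minimal sketch, assuming a hypothetical class SharedPatternDemo and helper countNumbers, and using a parallel stream as one way in which the calls may end up on several threads:

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SharedPatternDemo {
    // One immutable Pattern, safe to share between threads.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Each call creates its own Matcher, so no Matcher is ever shared.
    static int countNumbers(String input) {
        Matcher matcher = DIGITS.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("a1b22c333", "no digits", "4 5 6 7");

        // A parallel stream may run countNumbers on several threads at once.
        int total = inputs.parallelStream()
                .mapToInt(SharedPatternDemo::countNumbers)
                .sum();
        System.out.println(total); // 7 (3 + 0 + 4)
    }
}
```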

Conclusion

In this tutorial, we have discussed Regular Expressions in Java. The regular expression that is also known as ‘regex’ is used to validate the input string in Java. Java provides the ‘java.util.regex’ package that provides classes like Pattern, Matcher, etc. that help to define and match the pattern with the input string.

We have also seen various special character classes and Metacharacters that we can use in the regex that give shorthand codes for pattern matching. We also explored email validation using regex.
