Learn about MySQL SUBSTRING and SUBSTRING_INDEX functions and its usage with multiple examples:
MySQL Substring function is used to extract a substring or part string against the input string. As the name suggests the Substring function operates on a string input and returns a smaller substring against the options specified.
We will also learn about another variant of the SUBSTRING called SUBSTRING_INDEX. This function accepts character or String value as the delimiter and executes a search against the given delimiter value.
=> Click here for the complete MySQL tutorial series
We will also look at some examples to understand how these functions can be used while querying data as part of the SELECT queries.
Table of Contents:
MySQL SUBSTRING
Syntax:
// format 1 SELECT SUBSTRING('{String Value}', {start_index}, {count}); // format 2 SELECT SUBSTRING('{String Value}'FROM {start_index} for {count});
There are 2 ways in which we can use the SUBSTRING function.
Let’s understand the syntax before we dive into the examples.
- {String Value}: This is the actual String value or the column name containing the String to be worked on.
- {start_index}: start_index is the index from left or right where the Substring would be started. The values can be positive or negative depending on whether we want to extract String from the left side or right side, respectively. This will become more clear with the examples.
- {count}: This field is optional and specifies the count of characters from the start_index field for which the Substring needs to be extracted. If not mentioned, then the entire String starting from the matching start_index would be returned.
The format2 is a variation of format1 except that it’s more readable using keywords FROM and FOR. So here, FROM is used before mentioning the start_index value, and FOR is used before mentioning the count.
Let’s see some examples to understand the MySQL SUBSTRING function.
Let’s take String Input value as Software Testing Help
The String Indexes will look like below table
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
S | o | f | t | w | a | r | e | T | e | s | t | i | n | g | H | e | l | p |
MySQL SUBSTRING Example
Without Count
Suppose we want to extract the Help text from the above String – H is having an index of 18
Let’s see the query below:
SELECT SUBSTRING('Software Testing Help', 18) as extracted_string; SELECT SUBSTRING('Software Testing Help' FROM 18) as extracted_string; // Output Help
It’s important to note here that we have not mentioned any value for the count, but we are still getting the correct String – this is because when the count is omitted, the entire String starting from the given index is returned.
In this case, if we mention the start Index, for example, for ‘T’ (Index 10)
SELECT SUBSTRING('Software Testing Help', 10) as extracted_string; //Output Testing Help
As you can see above, the output contains the entire text ‘Testing Help’.
With Count
Now, let’s look at an example where we will mention the count of the characters to be returned.
Suppose we just want to extract the ‘Testing’ word from the String ‘Software Testing Help’. Then we can mention count as – 7 (which is the length of the word ‘Testing)
Let’s see the query below:
SELECT SUBSTRING('Software Testing Help', 10, 7) as extracted_string; SELECT SUBSTRING('Software Testing Help' FROM 10 FOR 7) as extracted_string; //Output Testing
Here we have described both the starting index and count of characters as part of the query to get the required result.
With Negative Index Values
So far, for the examples we have seen, we have specified the start_index as a positive value.
We can also specify the start_index as negative, which simply means that instead of counting from left, the MySQL engine will try to count the start Index from the end of the String.
Let’s see an example
For our String ‘Software Testing Help’, let’s add a row below to identify the right indexes
L | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
S | o | f | t | w | a | r | e | T | e | s | t | i | n | g | H | e | l | p | |||
R | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 |
So, the bottom row here represents the right indexes. Now suppose we want to extract the word ‘Testing’ using the MySQL SUBSTRING function with the right index.
The right index for ‘T’ in Testing is 12 (and the left index is 10)
SELECT SUBSTRING('Software Testing Help', -12, 7) as extracted_string; SELECT SUBSTRING('Software Testing Help' FROM -12 FOR 7) as extracted_string; //Output Testing
As you can see in the example above, mentioning the right indexes as a negative value will still yield the same result.
Index Out of Bounds
Now, try some negative scenarios
- When start_index specified is not existing: i.e. the index specified is out of bounds, then the output would be empty.
SELECT SUBSTRING('Software Testing Help', 30, 2) as extracted_string; //output Empty String
- When the count specified is negative or more than the total length of the string, then the output would just be the rest of the String (starting the start_index specified).
SELECT SUBSTRING('Software Testing Help', 5, 100) as extracted_string; //output ware Testing Help
You can see above, here the count was mentioned as 100, but the String is just 21 characters long.
- Let’s try one more scenario with the count mentioned as negative.
SELECT SUBSTRING('Software Testing Help', 5, -20) as extracted_string; //output Empty String
Here also, there will be no error and there will be just an empty string that would be the output.
MySQL SUBSTRING_INDEX
This is another variant of the MySQL SUBSTRING function.
Here, instead of mentioning the actual value of start_index, we can simply mention the character and the index and the number of times the delimiter needs to be searched before returning the output.
An important point to note here is the SUBSTRING_INDEX
MySQL SUBSTRING INDEX function:
SUBSTRING_INDEX({StringValue/Column},delimiter,n)
- {StringValue/Column}: Actual String input or the column name in the SELECT query.
- Delimiter: It’s the character or the substring that needs to be looked for, to extract the remaining String as output. This value is case sensitive, so it should be specified as it appears in the actual String.
- n: Count of delimiter – i.e. the no of times the delimiter is to be searched before returning the String.
Let’s understand this with the help of some examples.
We will use the same string ‘Software Testing Help’ as our Input String for all the examples that we are going to discuss in the next section below.
Searching For A Character
While searching for character ‘T’ the output would be ‘Software ‘. Notice that the extra space after the word Software will be included as well.
SELECT SUBSTRING_INDEX('Software Testing Help', 'T', 1) as extracted_string; //Output Software
Search for a character that is appearing more than once. Here, it is – ‘e’
SELECT SUBSTRING_INDEX('Software Testing Help', 'e', 1) as extracted_string; //Output Softwar
In this example: since we have specified the count as 1, the output would be String till the first occurrence of ‘e’ is reached.
Change the count to 2 for the same character ‘e’.
SELECT SUBSTRING_INDEX('Software Testing Help', 'e', 2) as extracted_string; //Output Software T
Now, we can see, Substring up to the second occurrence of character ‘e’ is returned.
Take another example to understand that the SUBSTRING INDEX is case sensitive. Let’s try with the letter ‘t’ instead of ‘T’
SELECT SUBSTRING_INDEX('Software Testing Help', 't', 1) as extracted_string; //Output Sof
Searching For A String
We can use the SUBSTRING INDEX function for searching a string instead of a character. Let’s search for the word ‘Testing’.
SELECT SUBSTRING_INDEX('Software Testing Help', 'Testing', 1) as extracted_string; //Output Software
Searching For A Non-Existing String
Using SUBSTRING INDEX with a String that’s not existing.
SELECT SUBSTRING_INDEX('Software Testing Help', 'abc', 1) as extracted_string; //Output Software Testing Help
The output might be a little unexpected here. But since the MySQL Engine can not find the required match, it returns the entire string as the output.
Searching For Character Or String In Reverse Order
Similar to the SUBSTRING function, the SUBSTRING INDEX function also allows having the count mentioned as negative numbers. This means, instead of traversing from left to right, the MySQL Engine will try to find a match from right to left.
Understand this with the help of some examples.
SELECT SUBSTRING_INDEX('Software Testing Help', 'T', -1) as extracted_string; //Output esting Help
In the above example, you can see the delimiter is ‘T’ while the count is -1. So the output would be Substring from the right until the first match for the letter ‘T’ is obtained.
Applications Of MySQL SUBSTRING Function
We generally use the MySQL String functions while querying data from 1 or multiple tables. It can be used to split values in a column, or just to display the essential values as per the need.
Example: Assume that we have an employee details table consisting of id, name, and address.
And the address column is of format – {HouseNo},{StreetName}
Below is the Script to create a table and insert dummy data:
CREATE TABLE employee_details (name VARCHAR(100), address VARCHAR(100), age INT); INSERT INTO employee_details values ('Akash Malhotra', '22,Defence Colony', 24), ('Steve Wilson','12/4,Wilson Street',28), ('Monica Singh','190,Smith Lane',32);
Use the MySQL SUBSTRING function to have the house number extracted from the address column as shown below.
SELECT name, SUBSTRING_INDEX(address,',',1) as house_no, address, age from employee_details
Output:
name | house_no | address | age |
---|---|---|---|
Akash Malhotra | 22 | 22,Defence Colony | 24 |
Steve Wilson | 12/4 | 12/4,Wilson Street | 28 |
Monica Singh | 190 | 190,Smith Lane | 32 |
In the above query, you can see that we have used the SUBSTRING_INDEX function to obtain the House-Number from the address column and we have used ‘,’ as the delimiter.
Similar to this, there can be many applications where we would like to extract the String to fetch some meaningful information from a larger String Input or the column value.
Frequently Asked Questions
Q #1) Which MySQL function returns the position of the first occurrence of a SUBSTRING in the string?
Answer: MySQL provides 2 functions to return or extract a SUBSTRING out of a given String or column value.
- SUBSTRING
Example:
SELECT SUBSTRING('Test Input',1,4) as extracted_string; // Output Test
- SUBSTRING INDEX
Example:
SELECT SUBSTRING_INDEX('Test Input','t',1) as extracted_string; //Output Tes
Q #2) What is SUBSTRING_INDEX in MySQL?
Answer: SUBSTRING_INDEX function is a variant of the SUBSTRING function. While in the normal SUBSTRING function you have to mention the index as a number in the Input String, for the SUBSTRING_INDEX function, you can mention either a character or a smaller substring to be searched for.
Suggested reading =>> MySQL COUNT Tutorial
Conclusion
In this tutorial, we have learned about MySQL SUBSTRING and SUBSTRING_INDEX functions. Both the functions extract a Substring against the given input string but work in a slightly different way.
SUBSTRING expects the index as the actual numbers, while the SUBSTRING_INDEX can have the delimiter specified as a Character or a String value, and then the MySQL Engine can extract against the required matching String or Character.
Both these functions are largely used where a column having multiple pieces of information can be extracted into sub-columns using String extraction logic based on appropriate requirements.
In SUBSTRING_INDEX, how would you handle an N parameter that varied by row for a given column? (ex. 3 commas in row1, 1 comma in row2, 2 commas in row3, etc.) ?
I imagine that I’d need a RECURSIVE CTE, but not sure how to set that up in conjunction with the SUBSTRING_INDEX. I assume that I’d keep the N parameter as a variable in the function, but have you done this before?