SQL Server administration and T-SQL development, Web Programming with ASP.NET, HTML5 and Javascript, Windows Phone 8 app development, SAP Smartforms and ABAP Programming, Windows 7, Visual Studio and MS Office software
Development resources, articles, tutorials, code samples, tools and downloads for SAP HANA and ABAP, HANA Database, SQLScript, SAP UI5, Screen Personas, Web Dynpro, Workflow

SAP HANA SQLScript Split String Function

For SAP HANA database SQLScript developers it is possible to create a split string function using regular expressions function substr_regexpr() and series data function series_generate_integer(). In this HANA SQL tutorial, I want to show how SQLScript programmers can use these two HANA functions with Concat() function for string split tasks required in many database developments.

Let's assume that you have a string variable consists of a list of string values seperated with comma ',' as follows:

declare pString nvarchar(5000);
pString := 'Yoda,Mace Windu,Anakin Skywalker,Luke,Me';
Code

If you are familiar with regular expressions from your previous development efforts, you will be happy to learn that you can use SUBSTR_REGEXPR function to identify specific patterns that will help HANA database SQL developer to split delimeter seperated string values into tabular format.

For example, if you look at the pString variable value, you can realize that we don't have delimeter character "," at the beginning and at the end of the variable value.
What happens if we add delimeter character to the beginning and to the end of the SQL string value.

concatenate delimeter character using SQL Concat function

Developers experienced with Regular Expressions can now easily see the trick here.
If we search for a specific pattern that starts with "," our delimeter character followed with a number of characters and ends with next "," delimeter, we can split given input string into pieces.

select
 SUBSTR_REGEXPR( '[,][^,]*[?=,]' IN CONCAT(CONCAT(',',pString),',') )
from dummy;
Code

You see we have extracted "Yoda" out of the pString SQLScript string variable. This is the first occurrence of the searched pattern.
What about extracting the second pattern in the variable, can we find it, too?

Let's execute following modified SQLScript SELECT statement

select  SUBSTR_REGEXPR( '[,][^,]*[?=,]' IN CONCAT(CONCAT(',',pString),',') OCCURRENCE 1) firstString,  SUBSTR_REGEXPR( '[,][^,]*[?=,]' IN CONCAT(CONCAT(',',pString),',') OCCURRENCE 2) secondString from dummy;
Code

Unfortunately, the results are not as what we exactly expected.
"Anakin Skywalker" the second SubStr_RegExpr() function output is not actually the second term in the string list variable pString

HANA database SubStr_RegExpr regular expression function

This is because, if we remove pattern ",*," for ",Yoda," the remaining string "Mace Windu,Anakin Skywalker,Luke,Me," is not lacking of beginning "," delimeter character.
So the next occurrence fetched for ",*," pattern is at ",Anakin Skywalker,"
Similarly, if we use above SQLScript regular expression query we will fetch the third occurrence for ",Me," by-passing "Luke"

Database developers can modify the regular expression pattern from '[,][^,]*[?=,]' to '([,])([^,]*)(?=,)' form.
Additionally using the GROUP clause in the SUBSTR_REGEXPR() function will enable the HANA database developer to choose only the names, that is the 2nd group in the expressions by excluding the "," commas in the matched pattern.

Let's execute the new version of the SQLScript code

select
 SUBSTR_REGEXPR( '([,])([^,]*)(?=,)' IN CONCAT(CONCAT(',',pString),',') OCCURRENCE 1 GROUP 2) firstString,
 SUBSTR_REGEXPR( '([,])([^,]*)(?=,)' IN CONCAT(CONCAT(',',pString),',') OCCURRENCE 2 GROUP 2) secondString,
 SUBSTR_REGEXPR( '([,])([^,]*)(?=,)' IN CONCAT(CONCAT(',',pString),',') OCCURRENCE 3 GROUP 2) thirdString,
 SUBSTR_REGEXPR( '([,])([^,]*)(?=,)' IN CONCAT(CONCAT(',',pString),',') OCCURRENCE 4 GROUP 2) fourthString
from dummy;
Code

As expected the output will be as seen in following screenhot from SAP HANA Studio

split string on SAP HANA database using regular expression

If the SQL developer does not know how many items will be after splitting string and if he wants to get the resultset in tabular format not in a pivot format, then we need a numbers table function.
And the easiest method to create numbers table on SAP HANA database is using SERIES_GENERATE_INTEGER series data function.
Let's see now how to split string using SQL on SAP HANA database with SUBSTR_REGEXPR() and SERIES_GENERATE_INTEGER() functions.

declare pString nvarchar(5000);
pString := 'Yoda,Mace Windu,Anakin Skywalker,Luke,Me';

select
 NumbersTable.Element_Number,
 SUBSTR_REGEXPR( '([,])([^,]*)(?=,)' IN CONCAT(CONCAT(',',pString),',') OCCURRENCE NumbersTable.Element_Number GROUP 2) splitted
from
 DUMMY as SplitString,
 SERIES_GENERATE_INTEGER(1, 0, 10 ) as NumbersTable;
Code

You see how easy it is string splitting using SQLScript on SAP HANA database for SQL developers.

Series_Generate_Integer() function created an array of integer numbers starting 1 to 10. And I used the column Element_Number in Occurrence clause so that I can match every occurrence of the regular expression within the given string variable. But of course the upper limit of the table function or Series_Generate_Integer() function determines the maximum split count. In this case 10.
So, if the database developer is not sure a number like 100 maybe more than enough. Those "?" or NULL values are for those occurrences that does not have a match in the given string value.

SQLScript split string on HANA database

Now another step maybe working with a database table which has a column of string data type like nvarchar(5000), etc where concatenated string values are stored.

Let's create a sample database table.

create column table Concatenated (
 id int generated by default as identity(start with 1 increment by 1) not null,
 ConcatenatedString nvarchar(5000)
);
insert into Concatenated (ConcatenatedString) values (N'A,B,C,D,E,F,G');
insert into Concatenated (ConcatenatedString) values (N'Yoda,Mace Windu,Anakin Skywalker,Luke,Me');
insert into Concatenated (ConcatenatedString) values (N'SQL,HANA,ABAP,SAP,,ASP.NET,AWS');
Code

And now query HANA database table using following SQL Split String SELECT statement

select *
from (
 select
  ST.ID, -- Source table id
  NT.Element_Number, -- Occurrence number within source table row
  SUBSTR_REGEXPR( '([,])([^,]*)(?=,)'
    IN CONCAT(CONCAT(',',st.ConcatenatedString),',')
    OCCURRENCE NT.Element_Number
    GROUP 2
  ) splitted -- string piece
 from
 Concatenated as ST, -- source table
 SERIES_GENERATE_INTEGER(1, 0, 10 ) as NT -- numbers table
) tbl
where splitted is not null
order by ID, Element_Number;
Code

Please note that since I used another SELECT statement and filtered NULL pieces that are caused of non-matched series data function values. The output of the SQL string split query is as seen in following screenshot.

SQL string split query code execution result



SAP HANA and ABAP

Install SAP Free
CRM Companies List
Web Based CRM Software


Copyright © 2004 - 2021 Eralper YILMAZ. All rights reserved.