Split String UDF Script Function using Python on Exasol Database
Exasol is a high-performance analytic database where SQL developers can create UDF scripts in Python, R, Java or Lua for problems that cannot be solved easily with plain SQL. In this Exasol SQL tutorial, I want to show how analytic database developers can create a Python UDF script that works as a string split function.
Exasol has a long list of built-in functions, but I could not find a string split function for SQL database developers among them.
Instead of building an old-style string split function, which finds the position of a delimiter such as the comma character and loops until all separator characters are processed, I chose to create a UDF script using Python.
Python strings have a split() method, which I want to use in my Exasol split functions built as UDF scripts.
Here is the sample split string UDF script for SQL developers running Exasol analytic database.
I use DbVisualizer to manage and develop code on Exasol database.
Please note that since we are developing our UDF script in Python, the indentation within the source code is significant.
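As a rough illustration of what the Python body of such a script can look like, here is a minimal sketch. It assumes Exasol's usual Python UDF entry point run(ctx) and its ctx.emit() call. The column name input_string and the FakeContext class are hypothetical; FakeContext exists only so the function can be tried outside the database.

```python
# Python body of a hypothetical comma-splitting UDF.
# In Exasol this part would sit inside a CREATE ... SCRIPT ... EMITS (...) AS statement;
# only the Python code is shown so that it can be run locally.

def run(ctx):
    # ctx.input_string is an assumed input column name, not taken from the original script
    if ctx.input_string is None:
        return  # emit nothing for NULL input
    for item in ctx.input_string.split(","):
        ctx.emit(item)  # one output row per comma-separated piece


class FakeContext:
    """Local stand-in for the database context object (testing aid only)."""

    def __init__(self, **columns):
        self.__dict__.update(columns)  # expose input columns as attributes
        self.rows = []

    def emit(self, *values):
        self.rows.append(values)  # collect emitted rows instead of sending them to the database


ctx = FakeContext(input_string="Exasol,Python,UDF")
run(ctx)
print(ctx.rows)
```

Note that the loop body is indented under the for statement; changing that indentation changes the meaning of the script or makes it fail to compile.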
Here is the output when I execute the above Python script to create a UDF script on the Exasol database.
Now I can see my split string UDF script in the object directory under Scripts > UDFs of the database schema named Kodyaz. Kodyaz is the schema to which I was connected when I executed the Python UDF script create command.
Exasol database developers can use the UDF Script for splitting string as follows.
The output of the above SQL SELECT statement is as follows.
If the UDF script emitted or returned more than one column, the SELECT statement could instead execute the UDF script inside a sub-select, as follows.
Our sample string split UDF script takes only one parameter, the string to be split, and always splits it at the predefined comma character ",".
Split a Sentence into a List of Words using UDF Script on Exasol Database
Exasol database developers can enhance the sample Python UDF script for string splitting by adding a second parameter that is used as the separator, or splitting, character.
Following is the UDF script SPLIT_BY developed in Python with 2 input string parameters: one for the string to be split and one for the splitting character.
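The two-parameter idea can be sketched in Python as below. This is only an approximation under assumptions: it relies on Exasol's run(ctx) and ctx.emit() conventions, the column names input_string and delimiter are hypothetical, and FakeContext is a local helper for trying the logic outside the database.

```python
# Python body of a hypothetical SPLIT_BY-style UDF with a caller-supplied separator.

def run(ctx):
    # ctx.input_string and ctx.delimiter are assumed column names
    if ctx.input_string is None:
        return  # emit nothing for NULL input
    if not ctx.delimiter:
        # str.split raises ValueError on an empty separator,
        # so the text is passed through unchanged in that case
        ctx.emit(ctx.input_string)
        return
    for item in ctx.input_string.split(ctx.delimiter):
        ctx.emit(item)  # one output row per piece


class FakeContext:
    """Local stand-in for the database context object (testing aid only)."""

    def __init__(self, **columns):
        self.__dict__.update(columns)  # expose input columns as attributes
        self.rows = []

    def emit(self, *values):
        self.rows.append(values)  # collect emitted rows


# Passing a single space as the separator splits a sentence into words
ctx = FakeContext(input_string="Exasol is an analytic database", delimiter=" ")
run(ctx)
words = [row[0] for row in ctx.rows]
print(words)
```

With an explicit separator, consecutive separators produce empty strings, so a sentence containing double spaces would yield empty rows unless they are filtered out.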
The above Split_By UDF script can be used to split a given sentence into a list of the words it contains by passing the space character as the split character.
Let's execute the Split_By Python UDF script on the Exasol database with sample data as follows.
I hope SQL database developers on Exasol analytic databases will find this Python UDF script useful as a string split function.
Of course, where performance matters, developers can compare the Python version with UDF scripts written in other languages such as Java, R and Lua.