Concatenate String Values on Amazon Redshift using ListAgg Function
In this Amazon Redshift data warehouse tutorial for SQL programmers, I want to share sample SQL queries showing how to concatenate the values of a table column using the SQL ListAgg function.
To concatenate string values on an Amazon Redshift database, SQL developers can use the ListAgg SQL aggregate function. It accepts an optional delimiter argument, such as a comma for producing a CSV list. It can also be combined with a Within Group (Order By ...) clause to sort the values within the result list, with a Group By clause to build one list per group, and with the Distinct keyword to remove repeating values.
In this SQL tutorial for Amazon Redshift database developers and administrators, I will provide sample SQL code for different usages of the ListAgg string aggregation function.
Assume that in our Amazon Redshift database we have created a SQL table using the following CREATE TABLE DDL statement:
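As a minimal sketch, the table could be defined as below. It assumes the table resembles the category table of the Amazon Redshift TICKIT sample database: the data types, the catid and catdesc columns, and the catgroup values in the sample rows are assumptions, while the catname values are the ones used throughout this tutorial.

```sql
-- Assumed definition, modeled on the TICKIT sample "category" table
CREATE TABLE category (
    catid    SMALLINT NOT NULL,
    catgroup VARCHAR(10),
    catname  VARCHAR(10),
    catdesc  VARCHAR(50)
);

-- Illustrative rows; the catid and catgroup values are assumptions
INSERT INTO category (catid, catgroup, catname) VALUES
    (1,  'Sports',   'MLB'),
    (2,  'Sports',   'NHL'),
    (3,  'Sports',   'NFL'),
    (4,  'Sports',   'NBA'),
    (5,  'Sports',   'MLS'),
    (6,  'Shows',    'Musicals'),
    (7,  'Shows',    'Plays'),
    (8,  'Shows',    'Opera'),
    (9,  'Concerts', 'Pop'),
    (10, 'Concerts', 'Jazz'),
    (11, 'Concerts', 'Classical');
```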
By executing a SQL SELECT query, we can display the data inserted into this category table to get a better understanding of the data.
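A simple query for this purpose could look like the following sketch, which returns every column of the category table without assuming anything about its exact layout.

```sql
-- Display all rows and columns of the sample table
SELECT *
FROM category;
```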
Now, using the ListAgg function, I want to show SQL developers how to concatenate the category names in the catname field into a single comma-separated string value.
Here is the SQL code with the ListAgg function which we will execute:
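A minimal sketch of such a query is given here. The delimiter ', ' (comma followed by a space) is chosen to match the result list shown in this tutorial, and the column alias category_names is a hypothetical name used only for illustration.

```sql
-- Concatenate all catname values into one delimited string
SELECT LISTAGG(catname, ', ') AS category_names
FROM category;
```

Without a sort clause, the order of the items inside the list is not guaranteed.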
As SQL developers will see, the category names in the catname column are aggregated and concatenated into a single string, separated by the comma character we provided as the second argument of the Amazon Redshift SQL ListAgg aggregate function.
Here is the returned result:
MLS, Plays, Pop, Opera, NFL, Musicals, NHL, NBA, Jazz, MLB, Classical
Let's order the list so that the items will be listed alphabetically.
The related SQL query that sorts the items in the returned list includes a WITHIN GROUP (ORDER BY ...) clause.
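As a sketch, the sorted version of the query could be written as follows. The alias category_names is again a hypothetical name.

```sql
-- Sort the concatenated items alphabetically inside the list
SELECT LISTAGG(catname, ', ') WITHIN GROUP (ORDER BY catname) AS category_names
FROM category;
```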
Now in the returned list, the category names are sorted in alphabetical order with the help of the "Within Group (Order By catname)" SQL clause.
In the following output, it is seen that the items are sorted by the Order By clause:
Classical, Jazz, MLB, MLS, Musicals, NBA, NFL, NHL, Opera, Plays, Pop
How can we concatenate these character data type column values so that they are placed in separate lists according to their category group values?
Each table row, and therefore each category, has a category group attribute stored in the "catgroup" field.
The following SQL query shows database programmers how to concatenate catname string values grouped by the category group attribute catgroup:
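A minimal sketch of the grouped query, using the catgroup and catname columns named above (the alias category_names is hypothetical):

```sql
-- Build one concatenated list of category names per category group
SELECT catgroup,
       LISTAGG(catname, ', ') AS category_names
FROM category
GROUP BY catgroup;
```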
Executing the SQL aggregate function ListAgg() with a Group By clause on our sample Amazon Redshift database gives the following result.
SQL database developers will now see a separate list of concatenated values for each category group value.
Of course, it is also possible to sort the aggregated items within each group alphabetically, as seen in the SQL query below:
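One way to combine grouping and sorting is sketched here. The outer ORDER BY catgroup is an optional addition of this sketch that sorts the groups themselves, and the alias category_names is hypothetical.

```sql
-- One alphabetically sorted list of category names per category group
SELECT catgroup,
       LISTAGG(catname, ', ') WITHIN GROUP (ORDER BY catname) AS category_names
FROM category
GROUP BY catgroup
ORDER BY catgroup;
```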
Now we have sorted and grouped concatenated column lists using the Redshift SQL aggregate function ListAgg, as in the SQL execution result below.
The return data type of the Redshift ListAgg SQL aggregate function is Varchar(Max).
The maximum size of the varchar(max) data type on Amazon Redshift is 65535 bytes, which is 64K-1. Since the limit is counted in bytes, strings containing multibyte characters hold fewer than 65535 characters.
So if the returned list of concatenated string values is longer than this limit, the ListAgg() SQL function will raise the following error message:
SQL Error [500310] [XX000]: [Amazon](500310) Invalid operation: Result size exceeds LISTAGG limit
To demonstrate this error, I want to execute a SQL Select query using the ListAgg() function, but this time on a Redshift database table with thousands of rows.
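A sketch of such a query is given below. The table name customer_addresses and the column name country are hypothetical, chosen only because the list in this example contains country names; any large table with a repeating string column would behave similarly.

```sql
-- Hypothetical table and column names; on a table with thousands of rows
-- the concatenated result can exceed the 65535 byte LISTAGG limit
SELECT LISTAGG(country, ', ') AS countries
FROM customer_addresses;
```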
The error indicates that the result size exceeds the maximum size allowed:
SQL Error [500310] [XX000]: [Amazon](500310) Invalid operation: Result size exceeds LISTAGG limit
Details:
-----------------------------------------------
error: Result size exceeds LISTAGG limit
code: 8001
context: LISTAGG limit: 65535
query: 3458881
location: _rds_bin_padb.1.0.19097_data_exec_113_2080259697_9ef..3_0.cpp:158
process: query0_236_3458881 [pid=16437]
-----------------------------------------------;
In this case, the returned list includes repeating values of country names.
If it is acceptable to include only the distinct values of the column used with the ListAgg SQL function, then SQL programmers can rewrite the ListAgg SQL query in the ListAgg(Distinct ...) format as follows:
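As a sketch, assuming a hypothetical table customer_addresses with a country column, the Distinct version could look like this:

```sql
-- Hypothetical table and column names; DISTINCT keeps each country only once,
-- which shortens the concatenated result
SELECT LISTAGG(DISTINCT country, ', ') AS countries
FROM customer_addresses;
```

This only helps when the list of distinct values itself fits within the ListAgg size limit.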
Of course, Redshift data warehouse SQL developers can also sort the country values within the result list:
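A minimal sketch combining Distinct with sorting, again using the hypothetical customer_addresses table and country column:

```sql
-- Hypothetical table and column names; distinct country names in alphabetical order
SELECT LISTAGG(DISTINCT country, ', ') WITHIN GROUP (ORDER BY country) AS countries
FROM customer_addresses;
```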
I hope SQL developers building data warehouse solutions on Amazon Redshift will find these ListAgg function code samples useful.