Prior to the release of SQL Server 2017, if you were wanting to perform any form of grouped concatenation we have the following approaches we can use:
- CLR
- Cursors
- For XML Path
- Quirky Update
- Recursive CTE
- Scalar UDF’s
Aaron Bertrand (B|T) has 2 very good posts Grouped Concatenation in SQL Server & Grouped Concatenation: Ordering and Removing Duplicates where he runs through what is grouped concatenation, undertakes some performance tests and shows you which of the above approaches are the most performant.
I recently reached out to Aaron around the specific use of the For XML Path as I was having issues with performance of a nasty query and after some discussions unfortunately there were no other options available due to the version of SQL Server being used. That led me to thinking maybe I will extend on Aaron’s 2 posts mentioned above and have a look in SQL Server 2017 at the new string function “STRING_AGG”.
The Definition of STRING_AGG on Microsoft Docs is “Concatenates values of string expressions and places separator values between them. The separator is not added at the end of string”
With this in mind on my SQL Server 2017 demo environment I took my pesky For XML Path statement and modified this to use STRING_AGG and undertake some performance tests to compare the 2 statements.
-- For XML Path statement to undertake a Grouped Concatenation SELECT '/' + EventSessions_Title As 'EventSessions' From dbo.SQLEvents As SE Left Join dbo.EventSessions As ESE On ESE.SQLEvents_ID = SE.SQLEvents_Id Where ESE.SQLEvents_ID = 1 For XML Path ('') -- STRING_AGG statement to undertake a Grouped Concatenation Select '/' + STRING_AGG (EventSessions_Title ,'/') Within Group (Order By EventSessions_Title Asc) As 'EventSessions' From dbo.SQLEvents As SE Left Join dbo.EventSessions As ESE On ESE.SQLEvents_ID = SE.SQLEvents_Id Where ESE.SQLEvents_ID = 1
Both executions of the statements above return us the same results. So that is the main thing. The thing to remember with using STRING_AGG vs For XML Path you have more control in TSQL to use this function to get the results.
With both returning the same results, following on from Aaron’s posts we need to look at the performance between the 2. Using SentryOne’s PlanExplorer for the testing made things very simple. Executing the 2 statements separately in plan explorer gives us not only the execution plan but also the statistical information allowing us to compare the 2.
From the results there is very little difference in performance between both statements. Prior to running each statement the cache is cleared ensuring that both statements are being executed cold.
The next thing I looked at was the Execution plans and are pretty much identical except for a change in 1 operator.
For our “For XML Path” we see the UDX operator being used between the Sort and Nested Loop operators.
For our “STRING_AGG” we see it now using the Stream Aggregate operator being used in between the Sort and Nested Loop operators.
Even though the performance between the 2 is no negligible in my environment and during my tests, moving forward I will be using STRING_AGG over For XML Path because of the flexibility it gives. Like everything in SQL Server, ensure you test and compare results before applying any change to your important systems.
Leave a Reply