SQL SUM function to Calculate Sum of Top N Rows
This SQL tutorial shows how SQL Sum() function is used with OVER clause and ROWS clause to calculate sum of a table column for a number of data rows like TOP 10 rows, etc.
For SQL database developers SUM() aggregation function is one of the first SQL functions learnt during beginning level. Of course when SQL programmers improve their knowledge and know-how on database development, I expect they follow enhancements to aggregate functions like OVER clause and ROWS clausefor example.
Let's assume you have a SQL Server database table with a numeric field which you want to calculate sum its sum.
The simplest method is as follows
select
SUM(max_column_id_used) as total
from sys.tables
Of course you can add filtering criteria using WHERE clause to limit the rows that participate in SUM calculation.
Another example of calculating sum is to calculate the sum of a certain field for top 10 rows.
Let's create a sample database table which contains a numeric column titled Amount.
I'll populate the Amount column with random integer values which I use R script to generate random numbers on SQL Server.
Here is the SQL table DDL script
create table Orders (
OrderNo int identity(1,1),
Amount int
)
And following SQL script is used to populate table with random integer values between 1 and 100
declare @RandomNumber int
declare @i int = 100
while @i > 0
begin
exec Generate_Random_Number_using_R 0,100,@RandomNumber Output
insert into Orders (Amount) values (@RandomNumber)
set @i = @i - 1
end
Here is the sample table contents
To calculate the sum of Amount fields for top 10 rows in sample database table, following SQL Select statement can be used
select
sum(Amount) as Total
from (
select top 10 Amount from Orders
) as t
Of course, it is always safer to use an ORDER BY clause with TOP clause in a SELECT command.
Otherwise, SQL developer cannot be sure about the order of the return set
Of course, database programmers can also modify the above code into a SQL CTE expression as follows
;with cte as (
select top 10 Amount from Orders Orders order by OrderNo
)
select sum(Amount) as Total from cte
Both above SQL codes will produce the same results. And these two SQL Select statements are the standard solutions for a requirement to calculate sum of top N row columns in SQL
I will suggest an aother approach for SQL database developers to find the sum of top N rows using SUM() aggregate function with Over Clause using the Rows clause as follows
-- either below
select top 1
SUM(Amount) OVER (order by OrderNo rows between 0 following and 9 following)
from Orders
-- or following
select top 1
SUM(Amount) OVER (order by OrderNo rows between current row and 9 following)
from Orders
I experienced huge performance improvements with new solution (Sum aggregate function with Over and Rows clauses) when compared to classic SQL SUM solution especially when database tables with NonClustered index
As last note, it is always better to try new functions and enhancements to replace old solutions in SQL Server for database programmers if especially there is a requirement for performance improvements.