__AGGREGATE FUNCTIONS__Aggregate functions perform calculations on a set of data and return a scalar value. Following aggregate functions are available in SQL Server 2008,

**AVG**: Returns the average of the values in a data set

**CHECKSUM_AGG**: Returns the checksum of the values in a data set

**COUNT**: Returns the number of the values in a data set COUNT(*) returns the number of rows in a set. COUNT(Column Name) Returns number of rows which contain data in specified column. COUNT(DISTINCT ) returns number of unique not null values in data set

**COUNT_BIG**: it work same as "COUNT" but it returns Bigint data type, while "COUNT" return Int data type.

**GROUPING**: It returns 0 or 1, the value 1 indicate aggregate while 0 indicate detail result

**MAX**: Returns the highest value in a data set

**MIN**: Returns the lowest value in a data set

**SUM**: Returns the total of the values in the data set

**STDEV**: Returns the statistical standard deviation of all values in given data set

**STDEVP**: Returns the statistical standard deviation for the population for all values in data set

**VAR**: Returns the statistical variance of all values in the specified expression

**VARP**: Returns the statistical variance for the population for all values in the specified expression

__GROUP BY CLAUSE__The "Group By" clause is used in the query with Aggregate functions. When an Aggregate function is used in Select, all other expression in Select must be either Aggregate function or included in Group By clause. The "Group By" allow to find out subtotal of the aggregate data. For example, I have a table like given below

Table A : MonthlySalary

Now look at the following query, it will return Average, Minimum and Maximum salary of each employee

SELECT Employee, AVG(Salary) Average,

MIN(Salary) Minimum, MAX(Salary)Maximum

FROM MonthlySalary

GROUP BY Employee

__WITH ROLLUP OPERATOR__The ROLLUP operator is useful in generating quick result set that contain subtotals and total. This operators provides grand total and subtotal based on the column included in GROUP BY clause. On "MonthlySalary" table execute the given query that uses WITH ROLLUP Operator and check the output,

SELECT Employee, Month, SUM(Salary) AS Salary

FROM MonthlySalary

GROUP BY Employee, Month WITH ROLLUP

**OR**

SELECT Employee, Month, SUM(Salary) as Salary

FROM MonthlySalary

GROUP BY ROLLUP(Employee,Month)

The result set contain subtotal of the salary for all the employees, and the grand total.

__WITH CUBE OPERATOR__The CUBE operator should be used when the summary information needed for more then one column. It generates a result set that contain subtotal for all the column specified in GROUP BY clause. Look at the query and the result given below

SELECT Employee, Month, SUM(Salary) AS Salary

FROM MonthlySalary

GROUP BY Employee, Month WITH CUBE

**OR**

SELECT Employee, Month, SUM(Salary) as Salary

FROM MonthlySalary

GROUP BY CUBE(Employee,Month)

Don't miss the row number 3, 7 and 11 which shows subtotal by month as well, when using WITH CUBE operator. While using WITH ROLLUP operator this rows will not be displayed.

**Things to remember while using WITH ROLLUP and WITH CUBE operator**

- CHECKSUM_AGG aggregate function is not compatible with ROLLUP, CUBE and GROUPING SETS
- Avoid using WITH CUBE operator on large table where their are more then 3 columns in GROUP BY clause, for batter performance.

__ROLLUP() AND CUBE()__ROLLUP() and CUBE() are equivalent to WITH ROLLUP and WITH CUBE and use the same query plans. But note that the new ROLLUP and CUBE syntax is only available in compatibility level 100.

__GROUPING AGGREGATE FUNCTION__How can a NULL generated by the CUBE operation be distinguished from a NULL returned in the actual data? This can be done by using the GROUPING function. The GROUPING function returns 0 if the column value NULL came from the fact data, and 1 if the column value NULL is generated by the CUBE operation. It will be useful when we have to further use the result set generated, as it is tough to find out which data is summary data and for which column, so using GROUPING function we can easily find it out. Look at the query and the result given below

SELECT Employee, Month, SUM(Salary) AS Salary,

GROUPING(Employee)ForEmployee, GROUPING(Month)ForMonth

FROM MonthlySalary

GROUP BY Employee, Month WITH Cube

__GROUPING SETS__It allows us to specify which aggregations do we want in our result set.

**For Example:**

**1**- Following query will give us same output as we get with ROLLUP() or WITH ROLLUP operator

SELECT Employee, Month, SUM(Salary) as Salary

FROM MonthlySalary

GROUP BY GROUPING SETS((Employee,Month), (Employee), 0)

This query aggregates the Salary by Employee and Month, then it aggregate by Employee only, at last it compute the total for all Employees for all Months. The () syntax with no GROUP BY columns display the total.

**2**- Following query will give us the same output as we get with CUBE() or WITH CUBE operator

SELECT Employee, Month, SUM(Salary) as Salary

FROM MonthlySalary

GROUP BY GROUPING SETS((Employee,Month), (Employee), (Month), 0)

This query aggregates the Salary by Employee and Month, then it aggregate by Employee only, then it aggregate by Month only, at last it compute the total for all Employees for all Months.

Very informative article.

ReplyDeleteThank you..

ReplyDelete