In the world of databases, it's essential to learn advanced SQL techniques to make data work better for you. This article is your guide into the advanced side of SQL, where we'll introduce you to some powerful tools like Windows Functions, Common Table Expressions (CTEs), and Pivot/Unpivot operations.
Once you understand and use these tools, your SQL skills will level up, enabling you to work with data in more sophisticated ways. This means you can uncover valuable insights from complex data more easily. Join us on this journey as we uncover the hidden potential of SQL for smarter and more efficient data handling.
Many people are familiar with the GROUP BY function in SQL that does calculations on a table when you group the data. But only some know about window functions, which are special tools that calculate things on a bunch of rows and give you one result for each row. In this article, we'll dive into window functions in SQL Server.
Window functions calculate values for groups of rows, not the whole table. They perform calculations similar to aggregate functions but return results for each row within a group, instead of combining them into one result. This means each row keeps its unique value.
Types of Windows Functions:
1. Aggregate Window Functions: These functions work with many rows, like SUM(), MAX(), MIN(), AVG(), COUNT(), etc.
2. Ranking Window Functions: These functions assign ranks to rows in a table partition, such as RANK(), DENSE_RANK(), ROW_NUMBER(), NTILE(), etc.
3. Value Window Functions: These functions use a series of values nearby, like LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), etc.
Syntax:
The OVER clause is used with window functions to define how the data should be organized. It has two main functions:
It groups rows into different sets based on a specific column or condition (done using the PARTITION BY clause).
It sorts the rows within each group in a particular order (achieved using the ORDER BY clause).
Table - student_score
Aggregate Window Functions
SUM ()
Example -
In this example, we find the department-wise total sum of scores and show it in front of each student.
Output:
AVG()
Example -
Find the average score of students for each department.
Output:
MAX()
Example -
Find the maximum score of students for each department.
Output:
MIN()
Example -
Find the minimum score of students for each department.
Output:
COUNT()
Example -
Find a department-wise count of students and show it in front of each student.
Output:
Ranking Windows Function
RANK()
In SQL, the RANK window function is used to assign a rank to each row within a result set. The rank is determined based on the values in one or more columns specified in the ORDER BY clause. Rows with the same values receive the same rank, and the next rank is skipped.
For example, if two rows have the same values and both receive a rank of 2, the next row will receive a rank of 4, not 3.
Example -
Assign a rank to students for each department and order them by score in descending order.
Output:
DENSE_RANK()
The DENSE_RANK() function gives each row in a group a rank. If multiple rows have the same value, they will have the same rank. But the ranks of the next rows won't be skipped.
Example -
Assign a rank to students for each department and order them by score in descending order
Output:
ROW_NUMBER()
In SQL, the ROW_NUMBER() window function is used to assign a unique number to each row within a result set. It assigns a different number to each row although multiple rows have the same value.
Example -
Assign a unique value to each student in the alphabetically ascending order of their name.
Output:
NTILE()
In SQL Server, the NTILE() function is like a tool that helps organize your data into groups. These groups are made so that each group has about the same amount of data. It makes it simpler to study or sort your information into different parts.
Syntax:
Number Expression - The number of groups we want to form.
Example -
Here we use the NTILE() function to divide rows into three groups.
Output:
Value Window Functions
LAG()
We use the LAG() function to access previous row data with a defined offset value. We can use this window function to make a comparison with the previous row.
Syntax:
Scalar Expression — Column Name
Offset — This is an integer used to forgo behind the number of rows
Default — If there is no previous row, then it shows NULL default. Otherwise, we can show the value we want.
Example -
Write a SQL query that retrieves the sales and previous sales details for each employee, splitting the result set by year and then sorting each partition based on the 'country' column.
Output:
LEAD()
LEAD() Function is used when we compare individual row data with subsequent row data.
Scalar Expression - Column Name
Offset - This is an integer used to forgo a subsequent number of rows
Default - If there is no subsequent row then it shows NULL default. Otherwise, we can show the value we want.
Example -
Write an SQL query that splits the result set by year and sorts each partition by the country column to retrieve the sales and next sales details for each employee.
Output:
FIRST_VALUE()
FIRST_VALUE() window function returns the first value from the table.
Syntax:
Scalar Value - Column Name
Example -
Here in this example, we find the first sale of each country.
Output:
LAST_VALUE()
LAST_VALUE() window function returns the last value form the table.
Scalar Value - Column Name
Example -
Here in this example, we find the last sale of each country.
Output:
We can use SQL Server's Common Table Expressions (CTEs) to simplify complex joins and subqueries. CTEs are like temporary tables that are created when we run a query and used for that query's calculations. However, they are not stored anywhere, and once the query is done, the CTE is deleted, so the data is not saved. This helps us work with data in a single query, like SELECT, INSERT, UPDATE, DELETE, CREATE VIEW, or MERGE.
Syntax:
Steps to Write a CTE:
Start CTE by writing the WITH clause.
Specify CTE name.
After that specify a comma-separate list of columns that we are going to use.
Give Alias to the CTE after AS.
Define a SELECT statement whose result set populates the CTE.
It should be kept in mind while writing the CTE query definition; we cannot use the following clauses:
ORDER BY unless you also use as TOP clause
INTO
OPTION clause with query hints
FOR BROWSE
Table - student_score
Example -
Write a query to find students whose score is greater than the average score.
Output:
Sometimes, we have to make several CTE queries and combine them to view the outcomes. We can apply the idea of multiple CTEs in this situation. To make multiple CTE queries into one statement, we need to use the comma (",") operator. Remember, the CTE name comes before the comma operator to tell them apart when we have multiple CTEs.
Syntax:
Table - Product_Sales
Examples -
Write a SQL query for a Country whose Total sale is greater than the average sales of all countries.
Output:
Benefits Of Using CTEs
Improved readability: CTEs make your query more readable, especially for larger queries and those with complex logic.
Reuse of subqueries: If you have multiple parts of a query that use the same data or subquery, you can move it to a CTE and refer to it in multiple places.
Multiple CTEs: You can use multiple CTEs within a query by separating them with a comma.
Recursive queries: CTEs can be used to write recursive queries or hierarchical queries, which refer to previous rows in the same query.
Pivoting in SQL means changing the way your data is displayed. Instead of having rows of information, it arranges it into columns, which makes it neater and more organized. You can do this to make it simpler to work with or to show it differently. SQL has tools like `PIVOT` and `UNPIVOT` to help with this, and you can also use functions that summarize or rearrange your data for better analysis or presentation.
Table - Pivote_Demo
PIVOT
In SQL, the “PIVOT” function is used to transform rows of data into columns. The PIVOT function is commonly used with aggregate functions to summarize and analyze data in a more structured format, making it easier to work with and understand. It is especially useful when you want to turn data from a long format (rows) into a wide format (columns) for reporting or analysis purposes.
Syntax:
Example -
Create North and South as columns and show the sales from Pivot Demo Data.
Output:
UNPIVOT
In SQL, the UNPIVOT function is used to reverse the process of pivoting data. Essentially, it converts columns into rows, which can be particularly useful when you want to analyze or work with the data in a more detailed and structured way.
Syntax:
Example -
Here we unpivote the pivoted data in the previous example.
Output:
SQL is a versatile language beyond basic SELECT, INSERT, UPDATE, and DELETE operations. Windows Functions, CTEs, and Pivot/Unpivot operations are valuable tools that enable you to solve complex data analysis challenges more efficiently and with greater precision. By understanding and utilizing these techniques, you can streamline your SQL queries, extract meaningful insights from your data, and ultimately make better-informed decisions in your data analysis journey.