Recursive queries with SQL Server
The purpose of this article is to present recursion with Common Table Expression (CTE). For that we will use the following table:
Or in SQL:
CREATE TABLE [dbo].[Employee] (
[Id] INT IDENTITY (1, 1) NOT NULL,
[FirstName] NVARCHAR (100) NOT NULL,
[LastName] NVARCHAR (100) NOT NULL,
[ManagerId] INT NULL,
PRIMARY KEY CLUSTERED ([Id] ASC),
CONSTRAINT [FK_Manager] FOREIGN KEY ([ManagerId]) REFERENCES [dbo].[Employee] ([Id])
);
The goal is to obtain in a single query the complete hierarchy with the indentation:
Here's what MSDN is saying about CTE:
A common table expression (CTE) can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query. Unlike a derived table, a CTE can be self-referencing and can be referenced multiple times in the same query.
Here is an example of CTE:
-- Declare the CTE
WITH EmployeeCTE (Id, FullName)
AS
(
SELECT Id, FirstName + ' ' + LastName
FROM Employee
)
-- Use the CTE
SELECT * From EmployeeCTE
Let's see how we can use CTE to query the employee table.
#Get all results with recursion
SQL Server allows the use of recursive CTE. For this you need to create a query in 2 parts:
- The first returns the data to initiate recursion. In our case the initial request selects the employees who do not have a manager (the root of the hierarchy).
- The second selects new lines based on the already selected lines. In our case we use the relation between
Id
andManagerId
to select new lines.
Recursion stops when recursion no longer returns rows.
Here is the query to select all employees recursively:
WITH EmployeeHierarchy(Id, FirstName, LastName, ManagerId)
AS
(
-- Root level of the hierarchy
SELECT Id, FirstName, LastName, ManagerId
FROM Employee
WHERE ManagerId IS NULL
-- Combine result with the result of the recursion
UNION ALL
-- Load the next level of the hierarchy
SELECT e.Id, e.FirstName, e.LastName, e.ManagerId
FROM Employee e
INNER JOIN EmployeeHierarchy eh -- self-reference! (recursion)
ON e.ManagerId = eh.Id
)
SELECT * FROM EmployeeHierarchy
To illustrate how the query works, here are the results obtained after each recursion:
#Show indentation and full name
We now have the desired dataset but it misses the notion of hierarchy ("-" before the name) in the result. We can simply add the notion of depth by adding a new column to the CTE. The value of this column is incremented with each recursion:
WITH EmployeeHierarchy(Id, FirstName, LastName, ManagerId, level)
AS
(
SELECT Id, FirstName, LastName, ManagerId, 0
FROM Employee
WHERE ManagerId IS NULL
UNION ALL
SELECT e.Id, e.FirstName, e.LastName, e.ManagerId, eh.level + 1 -- compute the level
FROM Employee e
INNER JOIN EmployeeHierarchy eh ON e.ManagerId = eh.Id
)
SELECT Id, REPLICATE('--', level) + ' ' + FirstName + ' ' + LastName, ManagerId
FROM EmployeeHierarchy
#Sort the rows in the desired order
We have the notion of hierarchy, but there remains the problem of sorting. The columns available with this CTE do not correctly sort the data. The idea is to add a new column containing the full path as a string of a line in the hierarchy. We can sort on this column:
WITH EmployeeHierarchy(Id, FirstName, LastName, ManagerId, level, fullpath)
AS
(
SELECT Id, FirstName, LastName, ManagerId, 0, CAST(Id AS NVARCHAR(200))
FROM Employee
WHERE ManagerId IS NULL
UNION ALL
SELECT e.Id, e.FirstName, e.LastName, e.ManagerId, eh.level + 1, CAST(eh.fullpath + '/' + CAST(e.Id AS NVARCHAR(20)) AS NVARCHAR(200))
FROM Employee e
INNER JOIN EmployeeHierarchy eh ON e.ManagerId = eh.Id
)
SELECT REPLICATE('--', level) + ' ' + FirstName + ' ' + LastName, fullpath FROM EmployeeHierarchy
ORDER BY fullpath
CAST AS NVARCHAR(200)
allows you to have the same type for the fullpath
column throughout the recursion. Indeed the type of a column of a CTE must be constant. If the depth of the hierarchy is important, you must of course adjust the size of the fullpath
column to avoid truncation to 200 characters.
This time the result is the one expected:
CTEs are very practical and powerful in solving this kind of problem. Note that there are other ways to model this table to avoid getting out the heavy artillery. For example, SQL Server 2008 introduced the HierarchyId type, but this will be the subject of another article.
Do you have a question or a suggestion about this post? Contact me!