Graph Processing Tables is introduced in SQL Server 2017. I will use Human Resource data structure as hierarchical data in this post. Every company have employees with different titles and duties. I created the following hierarchy to use in this post.
You might need to select all data generated by East Supervisor teams or you might need to create different report for each zone. If your data structure does not setup for an hierarchical query like that, you might end up with hardcoded employee ids and probably many IN keywords in WHERE clause. Let me introduce you how to handle this by using Graph Tables in SQL Server 2017 or later.
Let's create the Graph tables first. In the following code, I create two Graph tables named HR and ReportsTo. HR is a Node table which holds all the entities (Employees). ReportsTo is a relation table and it holds hierarchy relation between employees.
DROP TABLE IF EXISTS HR, ReportsTo
CREATE TABLE HR (
id int primary key identity(1,1),
empid int,
name varchar(50)) as NODE
CREATE TABLE ReportsTo as EDGE
Next, I want to insert some employees into the entity table and create hierarchy between employees by creating ReportsTo relation. You can read about my Graph Database post if this doesn't look familiar to you.
INSERT into HR(empid,name) values
(1,'Ceo'),(2,'Cio'),(3,'HR Director'),(4,'Sales Director'),
(5,'East Director'),(6,'Central Director'),(7,'West Director'),
(8,'East Manager'), (9,'Central Manager'),
(10,'West Manager'),
(11,'East Supervisor'),
(12,'Central Supervisor'),(13,'West Supervisor'),
(14,'East Employee'),(15,'Central Employee'),(16,'West Employee')
INSERT into ReportsTo values ((Select $node_id from hr where empid = 1),
(Select $node_id from hr where empid = 2)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 1),
(Select $node_id from hr where empid = 3)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 1),
(Select $node_id from hr where empid = 4)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 4),
(Select $node_id from hr where empid = 5)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 4),
(Select $node_id from hr where empid = 6)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 4),
(Select $node_id from hr where empid = 7)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 5),
(Select $node_id from hr where empid = 8)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 6),
(Select $node_id from hr where empid = 9)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 7),
(Select $node_id from hr where empid = 10)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 8),
(Select $node_id from hr where empid = 11)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 9),
(Select $node_id from hr where empid = 12)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 10),
(Select $node_id from hr where empid = 13)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 11),
(Select $node_id from hr where empid = 14)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 12),
(Select $node_id from hr where empid = 15)
)
INSERT into ReportsTo values (
(Select $node_id from hr where empid = 13),
(Select $node_id from hr where empid = 16)
)
So far, you should be able to do all this code in SQL Server 2017. We have Graph Processing tables and some data. We are ready to explore hierarchicaldata by using graph tables! You need to use MATCH keyword to query data in Graph Processing tables. In the following example, I search who reports to East Director.
SELECT emp.Name as Mgr, emp2.Name as Emp
FROM Hr as emp, ReportsTo as r, Hr as emp2
WHERE MATCH(emp-(r)->emp2) and emp.empid = 5
For Match to work, you need minimum two entities and one relation. That's exactly what I have here. I am looking for employees who have reportsTo relation with other employees in this query. It returns the following result.
The > determines the direction of the query. By changing this arrow into other direction, you can use the same relation table to find who manages employee. For example, let's rewrite this query and look who manages East Manager
SELECT emp.Name as Emp, emp2.Name as Mgr
FROM Hr as emp, ReportsTo as r, Hr as emp2
WHERE MATCH(emp<-(r)-emp2) and emp.empid = 8
In this query, I changed the alias of the columns, changed the direction of the arrow and moved its location. This query returns the following result.
SQL Server 2019 & SHORTEST_PATH
So far, we can find who is reporting to who or who is managing to who. But what if we want to see the whole hierarchy between two employees. For example, show me the path between the CEO and West Manager. To do that, I am going to use a new feature of Graph Processing named SHORTEST_PATH. This syntax is introduced in SQL Server 2019 CTP 3.1 and it finds the shortest path between two entities. I am going to use it in this example to find the path between the CIO and West Manager.
SELECT * FROM
(
SELECT emp.Name,
STRING_AGG(emp2.Name, '->') WITHIN GROUP (GRAPH PATH) as Employee,
LAST_VALUE(emp2.Name) WITHIN GROUP (GRAPH PATH) as LastNode
FROM HR as emp,
ReportsTo FOR PATH as rpt,
HR FOR PATH as emp2
WHERE MATCH(SHORTEST_PATH(emp(-(rpt)->emp2)+)) and emp.id = 1
) as temp
WHERE temp.LastNode = 'Central Director'
A lot going on here. Inner query finds all the possible paths, outer query filters the results by looking at LAST_VALUE result. Here is the result of this query.
I use + sign at the end of the SHORTEST_PATH, '+' tells query to repeat the pattern until the end of path. Rather than using + sign and repeat the pattern until the end, you can specify the number of repeat pattern by using {1,n} format. In this way, you can list all employees between two levels. For example, Let try to find all employees from CIO to Director level. That is 3 levels.
SELECT emp.Name,
STRING_AGG(emp2.Name, '->') WITHIN GROUP (GRAPH PATH) as Employee,
LAST_VALUE(emp2.Name) WITHIN GROUP (GRAPH PATH) as LastNode
FROM HR as emp,
ReportsTo FOR PATH as rpt,
HR FOR PATH as emp2
WHERE MATCH(SHORTEST_PATH(emp(-(rpt)->emp2){1,3})) and emp.id = 1
This returns the following results.
Superb article and just what is expected out of a hierarchical graph query
ReplyDelete