Introduction
You can use hierarchical queries to travel along parent-child relationships in your data. For example, family trees, computer directory structures, and company organization charts.
This tutorial shows methods to build org charts using the following sample company:*/
select * from employees
Module 2
Connect By
Connect by is an Oracle-specific way to create data trees using SQL. It has two key clauses, start with and connect by.
Start With
You state which rows are the roots here. These are the rows that appear at the "top" of the tree.
In a company org chart this is the CEO. Here that's employee_id 100, Steven King. So you can begin the chart with him using:
start with manager_id = 100
But if you do this, you need to change your query when a new CEO replaces him!
It's better to go with a more generic method. The CEO has no manager they report to. So their manager_id is null. So you could identify them with:
start with manager_id is null
Connect By
You state the parent-child relationship here. This links the columns that store the parent and child values. You access values from the parent row using the keyword prior.
In a company each employee's "parent" is their manager. Thus you need to join the parent row's employee_id to the child's manager_id. So you connect the prior employee_id to the current manager_id, like so:
connect by prior employee_id = manager_id
Put this all together and you get the following query:*/
select * from employees
start with manager_id is null
connect by prior employee_id = manager_id;
Module 3
Try It!
Complete the following query to build a "reverse" org chart. Begin with employee 107 and go up the chain to Steven King.
select employee_id, first_name, last_name, manager_id
from employees
start with employee_id
connect by
This should give the following output:
EMPLOYEE_ID FIRST_NAME LAST_NAME MANAGER_ID
107 Diana Lorentz 103
103 Alexander Hunold 102
102 Lex De Haan 100
100 Steven King <null>*/
Module 4
Recursive With
Oracle Database 11.2 introduced another method for accessing trees: recursive subquery factoring. Aka recursive with.
This is the ANSI compliant way to build hierarchies in SQL. It's composed of two queries. A base query and a recursive one.
Base Query
You use this to define the root rows in your tree. This is like the start with clause in connect by. So to begin the chart with the CEO, use:*/
select employee_id, first_name, last_name, manager_id
from employees
where manager_id is null
Recursive Query
This maps to the connect by clause. Here you join the source table to the with clause on the columns storing the parent-child values.
For the company chart, you need to join each employee to their manager. This is the employee from the previous row. So you're linking the org_chart to the employees table.
This gives the following query:*/
select e.employee_id, e.first_name, e.last_name, e.manager_id
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
/*When you use recursive with, you must provide aliases for all the columns it returns. These go between the query name and "as":*/
with org_chart (
employee_id, first_name, last_name, manager_id
) as ( ...
/*To complete the query you need to union all the base and recursive queries together. Put this all together and you get:*/
with org_chart (
employee_id, first_name, last_name, manager_id
) as (
select employee_id, first_name, last_name, manager_id
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
)
select * from org_chart;
Module 5
Try It!
Complete the following query to build the reverse org chart from step three again. But this time using recursive with. It should start with employee_id 107 and go up the company to the CEO.*/
with org_chart (
employee_id, first_name, last_name, manager_id
) as (
select employee_id, first_name, last_name, manager_id
from employees
where
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id
from org_chart oc
join employees e
on
)
select * from org_chart;
/*This should give the following output:
EMPLOYEE_ID FIRST_NAME LAST_NAME MANAGER_ID
107 Diana Lorentz 103
103 Alexander Hunold 102
102 Lex De Haan 100
100 Steven King <null>*/
Module 6
Level
With the queries so far it's hard to tell how senior someone is in the company. Adding the generation they belong to makes this easier. This allows you to see who's at the same level in the hierarchy. How you do this depends on whether you're using connect by or recursive with.
Connect By
With connect by you can use the pseudo-column level. This returns the current depth in the tree, starting with 1 for the roots. Each new set of children increases the depth by one. So all a manager's reports are one level below them.
You can select this on its own to show the current row's depth in the tree:*/
select level, employee_id, first_name, last_name, manager_id
from employees
start with manager_id is null
connect by prior employee_id = manager_id;
/*This helps. But it's still tricky to tell how senior someone is in the company. Indenting their name based on their position makes this easier. Combining level with lpad makes this easy.
The syntax of lpad is:
lpad ( str1, N, str2 )
It adds the characters in str2 before those in str1 until the string is N characters long. So you can use level to place spaces before each employee's name. This makes it easy to see where people place in the organization:*/
select level, employee_id,
lpad ( ' ', level, ' ' ) || first_name || ' ' || last_name name, manager_id
from employees
start with manager_id is null
connect by prior employee_id = manager_id;
/*Recursive With
Recursive with doesn't have an in-built equivalent of level. You need to build your own. You can do this by selecting the value 1 in the base query. And incrementing it by one in the recursive part. For example:*/
with org_chart (
employee_id, first_name, last_name, manager_id, lvl
) as (
select employee_id, first_name, last_name, manager_id, 1 lvl
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id, oc.lvl + 1
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
)
select * from org_chart;
/*As with connect by, you can use this pad out values in the final select.*/
Module 7
Sorting Output: Connect By
When you build a hierarchical query, the database returns the rows in an order matching the tree structure.
Connect by returns rows in depth-first search order. If you use a regular order by you'll lose this sort.
But you can preserve the depth-first tree and sort rows with the same parent. You do this with the siblings clause of order by.
So to show a manager's reports after them, sorting employees with the same manager by hire date (first to last), you can do:*/
select level, employee_id, first_name, last_name, hire_date, manager_id
from employees
start with manager_id is null
connect by prior employee_id = manager_id
order siblings by hire_date;
/*Module 8
Sorting Output: Recursive With
Recursive with allows you to choose whether you want to traverse the tree using depth-first or breadth-first search. You define this in the search clause.
Depth-First Search
This starts at the root. Then picks one of its children. It then gets the child's child. And so on, down the tree accessing child nodes first. When it hits a leaf, it goes back up the tree until it finds an unvisited child.
So it always goes as far down the tree it can before accessing another row at the same level.
To use depth-first search, specify this in the search clause. The columns you sort by defines which order the database returns siblings. And the set clause defines a new column storing this sequence. It starts with 1 at for the first root. For each new row it increments by 1.
So to return employees in a depth first tree, sorting employees with the same manager by the date they were hired (first to last), use:
search depth first by hire_date set hire_seq
Which gives the following complete query:*/
with org_chart (
employee_id, first_name, last_name, hire_date, manager_id, lvl
) as (
select employee_id, first_name, last_name, hire_date, manager_id, 1 lvl
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.hire_date, e.manager_id, oc.lvl + 1
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) search depth first by hire_date set hire_seq
select * from org_chart
order by hire_seq;
/*Breadth-First Search
Instead of travelling down the tree, you can go across it. This is breadth-first search.
Again, this starts with the root. But it accesses all the rows at the same level before going down to any children. So if you have many roots, it visits all these first. Then all the first generation children before going down to the second generation. And so on.
The sorting columns define which order you access nodes at the same depth.Unlike depth-first search, successive rows at the same level may alternate between parent rows.
So the following returns all the employees at the same rank next to each other. It sorts these by their hire date, first to last:
search breadth first by hire_date set hire_seq
Which completes the query like so:*/
with org_chart (
employee_id, first_name, last_name, hire_date, manager_id, lvl
) as (
select employee_id, first_name, last_name, hire_date, manager_id, 1 lvl
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.hire_date, e.manager_id, oc.lvl + 1
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) search breadth first by hire_date set hire_seq
select * from org_chart
order by hire_seq;
/*Module 9
Try It!
Complete the following query to return employees in depth-first order. You should sort employees with the same manager by first_name:*/
select level, employee_id, first_name, last_name, hire_date, manager_id
from employees
start with manager_id is null
connect by prior employee_id = manager_id
order
/*This query should give the following output:
LEVEL EMPLOYEE_ID FIRST_NAME LAST_NAME HIRE_DATE MANAGER_ID
1 100 Steven King 17-JUN-2003 <null>
2 102 Lex De Haan 13-JAN-2001 100
3 103 Alexander Hunold 03-JAN-2006 102
4 104 Bruce Ernst 21-MAY-2007 103
4 105 David Austin 25-JUN-2005 103
4 107 Diana Lorentz 07-FEB-2007 103
4 106 Valli Pataballa 05-FEB-2006 103
2 101 Neena Kochhar 21-SEP-2005 100
3 108 Nancy Greenberg 17-AUG-2002 101
4 109 Daniel Faviet 16-AUG-2002 108
4 111 Ismael Sciarra 30-SEP-2005 108
4 110 John Chen 28-SEP-2005 108
4 112 Jose Manuel Urman 07-MAR-2006 108
4 113 Luis Popp 07-DEC-2007 108
Can you write this using recursive with?*/
Module 10
Detecting Loops
It's possible to store loops in your hierarchy. Usually this is a data error. But some structures may contain loops by design.
For example, the following sets the CEO's manager to be a lowly programmer:*/
update employees
set manager_id = 107
where employee_id = 100;
/*This leads to a circle in your data. So you could get stuck in an infinite loop. Luckily Oracle Database has cycle detection to stop this.
Connect By
If you try and build a hierarchy that contains a loop, connect by throws an ORA-1436 error:*/
select * from employees
start with employee_id = 100
connect by prior employee_id = manager_id;
/*You can avoid this using the nocycle keyword. This spots when the query returns to the same row. The database hides the repeated row and continues processing the tree.
To use it, place nocycle after connect by:*/
select * from employees
start with employee_id = 100
connect by nocycle prior employee_id = manager_id;
/*Recursive With
You control loop detection using the cycle clause of recursive with. Here you state which columns mark a loop. The database keeps track of the values it sees in these columns. If the current row's values for these appear in one of it's ancestors, you have a loop.
The syntax for this is:
cycle <columns> set <loop_column> to <loop_value> default <default_value>
The <loop_column> is a new one you define. When the database detects a cycle, it changes this from the default value to the <loop_value>
The org chart has a loop if you visit the same employee_id twice. So you want to check this column for cycles.
The snippet below does this and adds the column looped to your results. This starts with the value N. If the database finds a cycle, it sets it to Y:
cycle employee_id set looped to 'Y' default 'N'
You place this after your with clause, which gives the following complete query:*/
with org_chart (
employee_id, first_name, last_name, manager_id, department_id
) as (
select employee_id, first_name, last_name, manager_id, department_id
from employees
where employee_id = 100
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id, e.department_id
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) cycle employee_id set looped to 'Y' default 'N'
select * from org_chart;
/*Unlike connect by, this includes the rows you visit twice. So the CEO, Steven King, appears twice in the results. If you want to exclude these, filter them out in your final where clause using the loop column you defined.
Using recursive with you can choose any columns in your query to mark a "loop". This allows you to stop processing before you get back to the same row.
For example, you can cycle on department_id. The CEO's direct reports, Neena Kochhar and Lex De Haan, are in the same department as him. So if you cycle on department_id, like so:
cycle department_id set looped to 'Y' default 'N'
The database considers these two employees to form a cycle! So the hierarchy only contains three rows:*/
with org_chart (
employee_id, first_name, last_name, manager_id, department_id
) as (
select employee_id, first_name, last_name, manager_id, department_id
from employees
where employee_id = 100
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id, e.department_id
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) cycle department_id set looped to 'Y' default 'N'
select * from org_chart;
/*This can be useful to detect rows with two parents. Without cycle detection, you may access that row and all its children twice. The cycle clause allows you to avoid this. Connect by doesn't have an inbuilt way to do this.
Remove the loop before continuing the tutorial:*/
update employees
set manager_id = null
where employee_id = 100;
/*Module 11
Try It!
Complete the following query to cycle on job_id. Define a cycle column is_repeat which defaults to N. When accessing the same job_id, set it to Y*/
with org_chart (
employee_id, first_name, last_name, manager_id, job_id
) as (
select employee_id, first_name, last_name, manager_id , job_id
from employees
where employee_id = 102
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id, e.job_id
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) cycle
select * from org_chart;
/*This should return the following rows:
EMPLOYEE_ID FIRST_NAME LAST_NAME MANAGER_ID JOB_ID IS_REPEAT
102 Lex De Haan 100 AD_VP N
103 Alexander Hunold 102 IT_PROG N
104 Bruce Ernst 103 IT_PROG Y
105 David Austin 103 IT_PROG Y
106 Valli Pataballa 103 IT_PROG Y
107 Diana Lorentz 103 IT_PROG Y*/
Module 12
Displaying Tree Details: Connect By
Using level makes it possible to see how deep the current row is in the tree. But it can still be tricky to see how the rows relate to each other. Connect by has many options to help with this.
Connect_by_root
The operator connect_by_root returns the value of a column from the root row.
Sys_connect_by_path
It can be useful to see values from all the rows between the root and the current row. Sys_connect_by_path allows you to do this. It builds up a string, adding the value from the first argument for the current row to the end of the list. It separates these using the second argument.
Connect_by_isleaf
You can identify the leaf rows using connect_by_isleaf. This returns 1 if the current row is a leaf. Otherwise it returns 0.
Combining these allows you to display:
The last_name of the CEO (the root row) on every row
A comma separated list of the management chain from the current employee up to the CEO
The employees who aren't managers (the leaves)
The complete query for this is:*/
select employee_id, first_name, last_name, manager_id,
connect_by_root last_name,
sys_connect_by_path ( last_name, ', ') chart,
connect_by_isleaf is_leaf
from employees
start with manager_id is null
connect by prior employee_id = manager_id;
Module 13
Displaying Tree Details: Recursive With
Recursive with doesn't have the built-in options like connect by. But you can emulate them.
Values from the root (connect_by_root)
To return a value from the parent row, select it in the base query. And in the recursive part, return this column from the with clause name.
Showing the path from root to current row (sys_connect_by_path)
Again, start by selecting the value you want in the base query. In the recursive part, append the values you want to add with an appropriate separator.
Identifying the leaves (connect_by_isleaf)
Displaying leaf rows is more complex with recursive with. To find these, add the depth (level) to the tree. Then sort it using depth-first search.
Depth-first search always goes as far down the tree it can. After hitting a leaf, it goes back up to the next unvisited child. So you know a row is a leaf if the next row is at the same or higher depth in the tree. So you need to check if the next row's level is less than or equal to the current.
You can test for this using lead. This gets a value from the next row in the results. Assuming seq follows a depth-first order, the following returns LEAF if the row is a leaf. And null otherwise:*/
case
when lead ( lvl, 1, 1 ) over ( order by seq ) <= lvl then 'LEAF'
end is_leaf
--Putting these all together gives:
with org_chart (
employee_id, first_name, last_name, manager_id, root_emp, chart, lvl
) as (
select employee_id, first_name, last_name, manager_id,
last_name root_emp, last_name chart, 1 lvl
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id,
oc.root_emp, oc.chart || ', ' || e.last_name, oc.lvl+1
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) search depth first by employee_id set seq
select oc.*,
case
when lead ( lvl, 1, 1 ) over ( order by seq ) <= lvl then 'LEAF'
end is_leaf
from org_chart oc;
----------------------------
Hierarchy Expansion
-----------------------------
I have a table containing a simple hierarchy:
SQL> DESC test_table;
Name Null? Type
------ ------ --------------
A NUMBER(38)
B NUMBER(38)
With the following data:
SQL> SELECT * FROM test_table;
A B
----------- ----------
1 -1
2 1
3 1
4 2
5 2
6 4
7 4
8 5
9 5
10 3
11 3
11 rows selected.
I want a query that will give me each node and all its ancestors (parents). That is, I need to have as part of the output:
A B
----------- ----------
...
9 9
9 5
9 2
9 1
...
I need this because 9 relates to 9, 9 relates to 5 (from the original table), 9 indirectly relates to 2 (by 5), and so on. How can I achieve this in a query?
At first glance, this seemed really hard, but it is pretty easy in Oracle9i Database and even easier in Oracle Database 10g. We can get the entire hierarchy easily:
SQL> SELECT a
2 FROM test_table
3 CONNECT BY PRIOR b=a
4 /
That actually gets your column B in the desired output. Now we need to get the root node of each of the rows in this hierarchy. Using the somewhat new SYS_CONNECT_BY_PATH function (Oracle9i Database and later), we can get each root node. Using the following query, we can see what the SYS_CONNECT_BY_PATH function returns:
SQL> SELECT a,
2 SYS_CONNECT_BY_PATH (a,'.') scbp
3 FROM test_table
4 CONNECT BY PRIOR b=a
5 ORDER BY 2
6 /
A SCBP
---------- ---------
...
9 .9
5 .9.5
2 .9.5.2
1 .9.5.2.1
...
As you can see, we are starting to get what we need—the front of each value of SCBP is the root of the hierarchy, and column A is the tail. Now we work a little substring ( SUBSTR ) magic:
SQL> SELECT a,
2 TO_NUMBER(
3 SUBSTR(scbp,1,
4 INSTR(scbp,'.')-1)
5 ) b
6 FROM (
7 SELECT a,
8 LTRIM(
9 SYS_CONNECT_BY_PATH(a,'.'),
10 '.') ||'.' scbp
11 FROM test_table
12 CONNECT BY PRIOR b=a
13 )
14 ORDER BY 2
15 /
A B
----------- ----------
...
9 9
9 5
9 2
9 1
...
And there we go. You'll be happy to know that in Oracle Database 10g, this is even easier. There are a bunch of new functions for CONNECT BY queries, such as:
CONNECT_BY_ROOT —returns the root of the hierarchy for the current row; this greatly simplifies our query. (See below for an example).
CONNECT_BY_ISLEAF —is a flag to tell you if the current row has child rows.
CONNECT_BY_ISCYCLE —is a flag to tell you if the current row is the beginning of an infinite loop in your hierarchy. For example, if A is the parent of B, B is the parent of C, and C is the parent of A, you would have an infinite loop. You can use this flag to see which row or rows are the beginning of an infinite loop in your data.
NOCYCLE —lets the CONNECT BY query recognize that an infinite loop is occurring and stop without error (instead of returning a CONNECT BY loop error).
For this question, the first new function, CONNECT_BY_ROOT , is relevant. The following query does the work for us in Oracle Database 10g:
SQL> SELECT CONNECT_BY_ROOT a cbr,
2 a b
3 FROM test_table
4 CONNECT BY PRIOR b=a
5 ORDER BY 1
6 /
CBR B
----------- ----------
...
9 9
9 5
9 2
9 1
...
No comments:
Post a Comment