Total Pageviews

4506

Tutorial - Hierarchical Queries: Oracle Databases for Developers

Module 1

Introduction

You can use hierarchical queries to travel along parent-child relationships in your data. For example, family trees, computer directory structures, and company organization charts.


This tutorial shows methods to build org charts using the following sample company:*/


select * from employees

Module 2

Connect By

Connect by is an Oracle-specific way to create data trees using SQL. It has two key clauses, start with and connect by.


Start With

You state which rows are the roots here. These are the rows that appear at the "top" of the tree.


In a company org chart this is the CEO. Here that's employee_id 100, Steven King. So you can begin the chart with him using:


start with manager_id = 100

But if you do this, you need to change your query when a new CEO replaces him!


It's better to go with a more generic method. The CEO has no manager they report to. So their manager_id is null. So you could identify them with:


start with manager_id is null

Connect By

You state the parent-child relationship here. This links the columns that store the parent and child values. You access values from the parent row using the keyword prior.


In a company each employee's "parent" is their manager. Thus you need to join the parent row's employee_id to the child's manager_id. So you connect the prior employee_id to the current manager_id, like so:


connect by prior employee_id = manager_id

Put this all together and you get the following query:*/


select * from employees

start with manager_id is null

connect by prior employee_id = manager_id;

Module 3

Try It!

Complete the following query to build a "reverse" org chart. Begin with employee 107 and go up the chain to Steven King.


select employee_id, first_name, last_name, manager_id

from  employees

start with employee_id 

connect by 

This should give the following output:


EMPLOYEE_ID   FIRST_NAME   LAST_NAME   MANAGER_ID   

          107 Diana        Lorentz              103 

          103 Alexander    Hunold               102 

          102 Lex          De Haan              100 

          100 Steven       King              <null>*/

Module 4

Recursive With

Oracle Database 11.2 introduced another method for accessing trees: recursive subquery factoring. Aka recursive with.


This is the ANSI compliant way to build hierarchies in SQL. It's composed of two queries. A base query and a recursive one.


Base Query

You use this to define the root rows in your tree. This is like the start with clause in connect by. So to begin the chart with the CEO, use:*/


  select employee_id, first_name, last_name, manager_id 

  from   employees

  where  manager_id is null

Recursive Query

This maps to the connect by clause. Here you join the source table to the with clause on the columns storing the parent-child values.


For the company chart, you need to join each employee to their manager. This is the employee from the previous row. So you're linking the org_chart to the employees table.


This gives the following query:*/


  select e.employee_id, e.first_name, e.last_name, e.manager_id 

  from   org_chart oc

  join   employees e

  on     e.manager_id = oc.employee_id

/*When you use recursive with, you must provide aliases for all the columns it returns. These go between the query name and "as":*/


with org_chart (

  employee_id, first_name, last_name, manager_id

) as ( ...

/*To complete the query you need to union all the base and recursive queries together. Put this all together and you get:*/


with org_chart (

  employee_id, first_name, last_name, manager_id

) as (

  select employee_id, first_name, last_name, manager_id 

  from   employees

  where  manager_id is null

  union  all

  select e.employee_id, e.first_name, e.last_name, e.manager_id 

  from   org_chart oc

  join   employees e

  on     e.manager_id = oc.employee_id

)

  select * from org_chart;

Module 5

Try It!

Complete the following query to build the reverse org chart from step three again. But this time using recursive with. It should start with employee_id 107 and go up the company to the CEO.*/


with org_chart (

  employee_id, first_name, last_name, manager_id

) as (

  select employee_id, first_name, last_name, manager_id 

  from   employees

  where  

  union  all

  select e.employee_id, e.first_name, e.last_name, e.manager_id 

  from   org_chart oc

  join   employees e

  on     

)

  select * from org_chart;

/*This should give the following output:


EMPLOYEE_ID   FIRST_NAME   LAST_NAME   MANAGER_ID   

          107 Diana        Lorentz              103 

          103 Alexander    Hunold               102 

          102 Lex          De Haan              100 

          100 Steven       King              <null>*/ 

Module 6

Level

With the queries so far it's hard to tell how senior someone is in the company. Adding the generation they belong to makes this easier. This allows you to see who's at the same level in the hierarchy. How you do this depends on whether you're using connect by or recursive with.


Connect By

With connect by you can use the pseudo-column level. This returns the current depth in the tree, starting with 1 for the roots. Each new set of children increases the depth by one. So all a manager's reports are one level below them.


You can select this on its own to show the current row's depth in the tree:*/


select level, employee_id, first_name, last_name, manager_id 

from   employees

start  with manager_id is null

connect by prior employee_id = manager_id;

/*This helps. But it's still tricky to tell how senior someone is in the company. Indenting their name based on their position makes this easier. Combining level with lpad makes this easy.


The syntax of lpad is:


lpad ( str1, N, str2 )

It adds the characters in str2 before those in str1 until the string is N characters long. So you can use level to place spaces before each employee's name. This makes it easy to see where people place in the organization:*/


select level, employee_id, 

       lpad ( ' ', level, ' ' ) || first_name || ' ' || last_name name, manager_id 

from   employees

start  with manager_id is null

connect by prior employee_id = manager_id;

/*Recursive With

Recursive with doesn't have an in-built equivalent of level. You need to build your own. You can do this by selecting the value 1 in the base query. And incrementing it by one in the recursive part. For example:*/


with org_chart (

  employee_id, first_name, last_name, manager_id, lvl

) as (

  select employee_id, first_name, last_name, manager_id, 1 lvl

  from   employees

  where  manager_id is null

  union  all

  select e.employee_id, e.first_name, e.last_name, e.manager_id, oc.lvl + 1

  from   org_chart oc

  join   employees e

  on     e.manager_id = oc.employee_id

)

  select * from org_chart;

/*As with connect by, you can use this pad out values in the final select.*/


Module 7

Sorting Output: Connect By

When you build a hierarchical query, the database returns the rows in an order matching the tree structure.


Connect by returns rows in depth-first search order. If you use a regular order by you'll lose this sort.


But you can preserve the depth-first tree and sort rows with the same parent. You do this with the siblings clause of order by.


So to show a manager's reports after them, sorting employees with the same manager by hire date (first to last), you can do:*/


select level, employee_id, first_name, last_name, hire_date, manager_id 

from   employees

start  with manager_id is null

connect by prior employee_id = manager_id

order siblings by hire_date;

/*Module 8

Sorting Output: Recursive With

Recursive with allows you to choose whether you want to traverse the tree using depth-first or breadth-first search. You define this in the search clause.


Depth-First Search

This starts at the root. Then picks one of its children. It then gets the child's child. And so on, down the tree accessing child nodes first. When it hits a leaf, it goes back up the tree until it finds an unvisited child.


So it always goes as far down the tree it can before accessing another row at the same level.


To use depth-first search, specify this in the search clause. The columns you sort by defines which order the database returns siblings. And the set clause defines a new column storing this sequence. It starts with 1 at for the first root. For each new row it increments by 1.


So to return employees in a depth first tree, sorting employees with the same manager by the date they were hired (first to last), use:


search depth first by hire_date set hire_seq

Which gives the following complete query:*/


with org_chart (

  employee_id, first_name, last_name, hire_date, manager_id, lvl

) as (

  select employee_id, first_name, last_name, hire_date, manager_id, 1 lvl

  from   employees

  where  manager_id is null

  union  all

  select e.employee_id, e.first_name, e.last_name, e.hire_date, e.manager_id, oc.lvl + 1

  from   org_chart oc

  join   employees e

  on     e.manager_id = oc.employee_id

) search depth first by hire_date set hire_seq

  select * from org_chart

  order  by hire_seq;

/*Breadth-First Search

Instead of travelling down the tree, you can go across it. This is breadth-first search.


Again, this starts with the root. But it accesses all the rows at the same level before going down to any children. So if you have many roots, it visits all these first. Then all the first generation children before going down to the second generation. And so on.


The sorting columns define which order you access nodes at the same depth.Unlike depth-first search, successive rows at the same level may alternate between parent rows.


So the following returns all the employees at the same rank next to each other. It sorts these by their hire date, first to last:


search breadth first by hire_date set hire_seq

Which completes the query like so:*/


with org_chart (

  employee_id, first_name, last_name, hire_date, manager_id, lvl

) as (

  select employee_id, first_name, last_name, hire_date, manager_id, 1 lvl

  from   employees

  where  manager_id is null

  union  all

  select e.employee_id, e.first_name, e.last_name, e.hire_date, e.manager_id, oc.lvl + 1

  from   org_chart oc

  join   employees e

  on     e.manager_id = oc.employee_id

) search breadth first by hire_date set hire_seq

  select * from org_chart

  order  by hire_seq;

/*Module 9

Try It!

Complete the following query to return employees in depth-first order. You should sort employees with the same manager by first_name:*/


select level, employee_id, first_name, last_name, hire_date, manager_id 

from   employees

start  with manager_id is null

connect by prior employee_id = manager_id

order 

/*This query should give the following output:


LEVEL   EMPLOYEE_ID   FIRST_NAME    LAST_NAME   HIRE_DATE      MANAGER_ID   

      1           100 Steven        King        17-JUN-2003          <null>

      2           102 Lex           De Haan     13-JAN-2001             100 

      3           103 Alexander     Hunold      03-JAN-2006             102 

      4           104 Bruce         Ernst       21-MAY-2007             103 

      4           105 David         Austin      25-JUN-2005             103 

      4           107 Diana         Lorentz     07-FEB-2007             103 

      4           106 Valli         Pataballa   05-FEB-2006             103 

      2           101 Neena         Kochhar     21-SEP-2005             100 

      3           108 Nancy         Greenberg   17-AUG-2002             101 

      4           109 Daniel        Faviet      16-AUG-2002             108 

      4           111 Ismael        Sciarra     30-SEP-2005             108 

      4           110 John          Chen        28-SEP-2005             108 

      4           112 Jose Manuel   Urman       07-MAR-2006             108 

      4           113 Luis          Popp        07-DEC-2007             108

Can you write this using recursive with?*/


Module 10

Detecting Loops

It's possible to store loops in your hierarchy. Usually this is a data error. But some structures may contain loops by design.


For example, the following sets the CEO's manager to be a lowly programmer:*/


update employees

set    manager_id = 107

where  employee_id = 100;

/*This leads to a circle in your data. So you could get stuck in an infinite loop. Luckily Oracle Database has cycle detection to stop this.


Connect By

If you try and build a hierarchy that contains a loop, connect by throws an ORA-1436 error:*/


select * from employees

start with employee_id = 100

connect by prior employee_id = manager_id;

/*You can avoid this using the nocycle keyword. This spots when the query returns to the same row. The database hides the repeated row and continues processing the tree.


To use it, place nocycle after connect by:*/


select * from employees

start with employee_id = 100

connect by nocycle prior employee_id = manager_id;

/*Recursive With

You control loop detection using the cycle clause of recursive with. Here you state which columns mark a loop. The database keeps track of the values it sees in these columns. If the current row's values for these appear in one of it's ancestors, you have a loop.


The syntax for this is:


cycle <columns> set <loop_column> to <loop_value> default <default_value>

The <loop_column> is a new one you define. When the database detects a cycle, it changes this from the default value to the <loop_value>


The org chart has a loop if you visit the same employee_id twice. So you want to check this column for cycles.


The snippet below does this and adds the column looped to your results. This starts with the value N. If the database finds a cycle, it sets it to Y:


cycle employee_id set looped to 'Y' default 'N'

You place this after your with clause, which gives the following complete query:*/


with org_chart (

  employee_id, first_name, last_name, manager_id, department_id

) as (

  select employee_id, first_name, last_name, manager_id, department_id

  from   employees

  where  employee_id = 100

  union  all

  select e.employee_id, e.first_name, e.last_name, e.manager_id, e.department_id 

  from   org_chart oc

  join   employees e

  on     e.manager_id = oc.employee_id

) cycle employee_id set looped to 'Y' default 'N'

  select * from org_chart;

/*Unlike connect by, this includes the rows you visit twice. So the CEO, Steven King, appears twice in the results. If you want to exclude these, filter them out in your final where clause using the loop column you defined.


Using recursive with you can choose any columns in your query to mark a "loop". This allows you to stop processing before you get back to the same row.


For example, you can cycle on department_id. The CEO's direct reports, Neena Kochhar and Lex De Haan, are in the same department as him. So if you cycle on department_id, like so:


cycle department_id set looped to 'Y' default 'N'

The database considers these two employees to form a cycle! So the hierarchy only contains three rows:*/


with org_chart (

  employee_id, first_name, last_name, manager_id, department_id

) as (

  select employee_id, first_name, last_name, manager_id, department_id

  from   employees

  where  employee_id = 100

  union  all

  select e.employee_id, e.first_name, e.last_name, e.manager_id, e.department_id 

  from   org_chart oc

  join   employees e

  on     e.manager_id = oc.employee_id

) cycle department_id set looped to 'Y' default 'N'

  select * from org_chart;

/*This can be useful to detect rows with two parents. Without cycle detection, you may access that row and all its children twice. The cycle clause allows you to avoid this. Connect by doesn't have an inbuilt way to do this.


Remove the loop before continuing the tutorial:*/


update employees

set    manager_id = null

where  employee_id = 100;

/*Module 11

Try It!

Complete the following query to cycle on job_id. Define a cycle column is_repeat which defaults to N. When accessing the same job_id, set it to Y*/


with org_chart (

  employee_id, first_name, last_name, manager_id, job_id

) as (

  select employee_id, first_name, last_name, manager_id , job_id

  from   employees

  where  employee_id = 102

  union  all

  select e.employee_id, e.first_name, e.last_name, e.manager_id, e.job_id

  from   org_chart oc

  join   employees e

  on     e.manager_id = oc.employee_id

) cycle 

  select * from org_chart;

/*This should return the following rows:


EMPLOYEE_ID   FIRST_NAME   LAST_NAME   MANAGER_ID   JOB_ID    IS_REPEAT   

          102 Lex          De Haan              100 AD_VP     N           

          103 Alexander    Hunold               102 IT_PROG   N           

          104 Bruce        Ernst                103 IT_PROG   Y           

          105 David        Austin               103 IT_PROG   Y           

          106 Valli        Pataballa            103 IT_PROG   Y           

          107 Diana        Lorentz              103 IT_PROG   Y*/

Module 12

Displaying Tree Details: Connect By

Using level makes it possible to see how deep the current row is in the tree. But it can still be tricky to see how the rows relate to each other. Connect by has many options to help with this.


Connect_by_root

The operator connect_by_root returns the value of a column from the root row.


Sys_connect_by_path

It can be useful to see values from all the rows between the root and the current row. Sys_connect_by_path allows you to do this. It builds up a string, adding the value from the first argument for the current row to the end of the list. It separates these using the second argument.


Connect_by_isleaf

You can identify the leaf rows using connect_by_isleaf. This returns 1 if the current row is a leaf. Otherwise it returns 0.


Combining these allows you to display:


The last_name of the CEO (the root row) on every row

A comma separated list of the management chain from the current employee up to the CEO

The employees who aren't managers (the leaves)

The complete query for this is:*/


select employee_id, first_name, last_name, manager_id,

       connect_by_root last_name,

       sys_connect_by_path ( last_name, ', ') chart,

       connect_by_isleaf is_leaf

from   employees

start with manager_id is null

connect by prior employee_id = manager_id;

Module 13

Displaying Tree Details: Recursive With

Recursive with doesn't have the built-in options like connect by. But you can emulate them.


Values from the root (connect_by_root)

To return a value from the parent row, select it in the base query. And in the recursive part, return this column from the with clause name.


Showing the path from root to current row (sys_connect_by_path)

Again, start by selecting the value you want in the base query. In the recursive part, append the values you want to add with an appropriate separator.


Identifying the leaves (connect_by_isleaf)

Displaying leaf rows is more complex with recursive with. To find these, add the depth (level) to the tree. Then sort it using depth-first search.


Depth-first search always goes as far down the tree it can. After hitting a leaf, it goes back up to the next unvisited child. So you know a row is a leaf if the next row is at the same or higher depth in the tree. So you need to check if the next row's level is less than or equal to the current.


You can test for this using lead. This gets a value from the next row in the results. Assuming seq follows a depth-first order, the following returns LEAF if the row is a leaf. And null otherwise:*/


case 

  when lead ( lvl, 1, 1 ) over ( order by seq ) <= lvl then 'LEAF'

end is_leaf

--Putting these all together gives:


with org_chart (

  employee_id, first_name, last_name, manager_id, root_emp, chart, lvl

) as (

  select employee_id, first_name, last_name, manager_id, 

         last_name root_emp, last_name chart, 1 lvl

  from   employees

  where  manager_id is null

  union  all

  select e.employee_id, e.first_name, e.last_name, e.manager_id, 

         oc.root_emp, oc.chart || ', ' || e.last_name, oc.lvl+1

  from   org_chart oc

  join   employees e

  on     e.manager_id = oc.employee_id

) search depth first by employee_id set seq

  select oc.*, 

         case 

           when lead ( lvl, 1, 1 ) over ( order by seq ) <= lvl then 'LEAF'

         end is_leaf

  from   org_chart oc;

----------------------------  

Hierarchy Expansion

-----------------------------

I have a table containing a simple hierarchy:


SQL> DESC test_table;

 Name        Null?      Type

 ------      ------     --------------

 A                      NUMBER(38)

 B                      NUMBER(38)

With the following data:


SQL> SELECT * FROM test_table;


          A              B

-----------     ----------

          1             -1

          2              1

          3              1

          4              2

          5              2

          6              4

          7              4

          8              5

          9              5

         10              3

         11              3


11 rows selected.

I want a query that will give me each node and all its ancestors (parents). That is, I need to have as part of the output:


          A              B

-----------     ----------

...

          9              9

          9              5

          9              2

          9              1

...

I need this because 9 relates to 9, 9 relates to 5 (from the original table), 9 indirectly relates to 2 (by 5), and so on. How can I achieve this in a query?


At first glance, this seemed really hard, but it is pretty easy in Oracle9i Database and even easier in Oracle Database 10g. We can get the entire hierarchy easily:


SQL> SELECT a

  2    FROM test_table

  3    CONNECT BY PRIOR b=a

  4  /

That actually gets your column B in the desired output. Now we need to get the root node of each of the rows in this hierarchy. Using the somewhat new SYS_CONNECT_BY_PATH function (Oracle9i Database and later), we can get each root node. Using the following query, we can see what the SYS_CONNECT_BY_PATH function returns:


SQL> SELECT a,

  2    SYS_CONNECT_BY_PATH (a,'.') scbp

  3    FROM test_table

  4    CONNECT BY PRIOR b=a

  5    ORDER BY 2

  6  /


         A    SCBP

----------     ---------

...

         9    .9

         5    .9.5

         2    .9.5.2

         1    .9.5.2.1

    ...

As you can see, we are starting to get what we need—the front of each value of SCBP is the root of the hierarchy, and column A is the tail. Now we work a little substring ( SUBSTR ) magic:


SQL> SELECT a,

  2     TO_NUMBER(

  3      SUBSTR(scbp,1,

  4             INSTR(scbp,'.')-1)

  5      ) b

  6    FROM (

  7  SELECT a,

  8    LTRIM(

  9      SYS_CONNECT_BY_PATH(a,'.'),

 10         '.') ||'.' scbp

 11    FROM test_table

 12   CONNECT BY PRIOR b=a

 13         )

 14   ORDER BY 2

 15 /


          A              B

-----------     ----------

...

          9              9

          9              5

          9              2

          9              1

...

And there we go. You'll be happy to know that in Oracle Database 10g, this is even easier. There are a bunch of new functions for CONNECT BY queries, such as:


CONNECT_BY_ROOT —returns the root of the hierarchy for the current row; this greatly simplifies our query. (See below for an example).


CONNECT_BY_ISLEAF —is a flag to tell you if the current row has child rows.


CONNECT_BY_ISCYCLE —is a flag to tell you if the current row is the beginning of an infinite loop in your hierarchy. For example, if A is the parent of B, B is the parent of C, and C is the parent of A, you would have an infinite loop. You can use this flag to see which row or rows are the beginning of an infinite loop in your data.


NOCYCLE —lets the CONNECT BY query recognize that an infinite loop is occurring and stop without error (instead of returning a CONNECT BY loop error).


For this question, the first new function, CONNECT_BY_ROOT , is relevant. The following query does the work for us in Oracle Database 10g:


SQL> SELECT CONNECT_BY_ROOT a cbr,

  2         a b

  3     FROM test_table

  4   CONNECT BY PRIOR b=a

  5   ORDER BY 1

  6  /


       CBR            B

-----------     ----------

...

         9            9

         9            5

         9            2

         9            1

... 

No comments:

Post a Comment

ORA-00845: MEMORY_TARGET not supported on this system

 The shared memory file system should have enough space to accommodate the MEMORY_TARGET and MEMORY_MAX_TARGET values. To verify: SQL> sh...