M-ary trees, directories, and CTEs
I recently had to solve a somewhat theoretical, somewhat Computer Science 101 problem in a real world application: finding all nodes in an m-ary tree given an arbitrary starting node. In terms of source data, the trees were stored in a SQL database using the parent pointer model. Graph databases are more well suited for modeling such data, but let’s ignore that fact for the time being, because most data still lives in relational databases (of some sort) in the current age. While trying to solve this apparently simple problem in Postgres, I came across some hidden costs of common table expressions (CTEs) that I was not familiar with. Let’s look at this simple theoretical problem, the pure SQL solution, and some things you should be aware of about CTE implementations on the DBMS side.
There are various ways to model this data, some more efficient than others for
identifying all nodes in a tree given an arbitrary starting node. The parent
pointer model that we are considering is one of the simplest. There is a very
good chance that you are already familiar with the parent pointer model in the
context of m-ary trees. If you are using a POSIX shell, you can run ls -a
in
any directory and you will see .
, referencing the current directory, and ..
,
referencing the parent directory. This is basically the parent pointer model.
Additionally, your whole system directory structure can be though of as an m-ary
tree, with the root directory as the parent to all other nodes. Now, file
systems are not germane for the rest of our discussion for many reasons, but
this example shows the universality of this tree data structure and the parent
pointer data model. Linking it back to the discussion at hand, if we were to
utilize the parent pointer model in a SQL database or an application (with, i.e.
multiple instances of a node
class), each record (in the case of a database)
contains the node ID and parent ID, or each node itself (in the case of an
application) contains the node ID and a pointer to the parent node. Unlike what
is visible in a directory, we will assume that a node object or record has no
information about it’s children.
Let’s say we have this tree:
0
/|\
1 2 3
/ \ \
4 5 6
// \\
7 8 9 10
and we want to recover all nodes in the tree starting from the node with ID = 4. If you were working with an application that had objects in memory, you would have something like this in Python:
class Node:
def __init__(self, id: int, parent: "Node"):
self.id = id
self.parent = parent
zero = Node(0, None)
one = Node(1, zero)
two = Node(2, zero)
three = Node(3, zero)
four = Node(4, two)
five = Node(5, two)
six = Node(6, three)
seven = Node(7, four)
eight = Node(8, four)
nine = Node(9, four)
ten = Node(10, four)
or this in C++:
struct Node{
int id;
Node *parent;
Node(int id, Node *parent) : id(id), parent(parent) {};
bool operator==(const Node& other) const {
return id == other.id and other.parent == parent;
}
};
Node zero = Node(0, nullptr);
Node one = Node(1, &zero);
Node two = Node(2, &zero);
Node three = Node(3, &zero);
Node four = Node(4, &two);
Node five = Node(5, &two);
Node six = Node(6, &three);
Node seven = Node(7, &four);
Node eight = Node(8, &four);
Node nine = Node(9, &four);
Node ten = Node(10, &four);
From this, thinking about full tree recovery, it is apparent that we cannot recover the full tree without some additional data: all you can do, given some node, is find the parent, and the parent of the parent, etc. We need some other data that would allow us to:
- Identity the parent and all parent’s parents of the initial node (this is the given in the parent pointer model).
- Identify all children and children’s children of the initial node.
- Identify all children and children’s children of the parent and parent’s parents of the initial node.
An additional hash table that maps parents to children fits the bill, something like:
children_lut = {zero : [one, two, three]}
or
struct NodeHasher {
const std::size_t operator()(Node node) const {
// Just convert the memory address to a size_t as the hash
return reinterpret_cast<size_t>(&node);
}
};
std::unordered_map<Node, std::vector<Node>, NodeHasher> children_lut = {
{zero, {one, two, three}}
};
Of course, there are other, more standard ways (i.e. by storing information on the children themselves) to make a tree fully identifiable, but we chose this example because it is a good analog for how we can model this data in a database while still using the parent pointer model. Hopefully, with this approach using a map of parents to children, the procedure we need to follow to identify all the nodes of a tree is relatively clear: first traverse all the parents of the starting node (given in the node itself), then get all of the children of the starting node using the map, and finally get all of the children of all the parents using the map, This relatively simple exercise is left to the reader; this is just setting the stage for our actual focus: a pure SQL implementation.
In a table, in a relational database, you have something like this:
ID | Parent ID |
---|---|
0 | |
1 | 0 |
2 | 0 |
3 | 0 |
4 | 2 |
5 | 2 |
6 | 3 |
7 | 4 |
8 | 4 |
9 | 4 |
10 | 4 |
Obviously, in the table, we are not holding a direct reference to the parent object (there is no concept of parent “object” in this case), we only have the parent ID. Considering a record in the database to be analogous to an object (like an instance of a class), we can, of course, find the parent trivially;
select * from tree_nodes where id = $parent_id;
and we can also find the children trivially:
select * from tree_nodes where parent_id = $id;
In Postgres, and probably some other DBMSs, we can even implement this as a hash table by adding hash indexes to the table (we will do this when we create our table). Now: how do we identify all nodes in the tree given an arbitrary starting node in pure SQL? The only straightforward way to do this in pure SQL, in a single query, is through a recursive CTE. Let’s set up Postgres locally, through Docker, to run some tests:
docker pull postgres:17.2
docker run --name tree_testing -v .:/scripts -e POSTGRES_PASSWORD=pw -d \
postgres:17.2
Postgres 17.2 is the most recent version at the time of writing. Some behavior will likely be different in later or earlier versions. Let’s generate some random m-ary trees that will be inserted into our table with our target tree:
import random
max_children = 19
min_children = 0
max_children_per_node = 5
current_node_id = 11
all_node_list = []
for i in range(10000):
node_list_i = []
root_i = Node(current_node_id, None)
current_node_id += 1
node_list_i.append(root_i)
n_children = random.randint(min_children,max_children)
current_parent = root_i
current_parent_idx = 0
cntr = 0
while n_children > 0:
n_children_current_node = random.randint(0, min(n_children, max_children_per_node))
for j in range(n_children_current_node):
current_node_i = Node(current_node_id, current_parent)
n_children -= 1
node_list_i.append(current_node_i)
current_node_id += 1
cntr += 1
current_parent_idx = cntr % len(node_list_i)
current_parent = node_list_i[current_parent_idx]
all_node_list.append(node_list_i)
query = "insert into tree_nodes (id, parent_id) values "
for tree in all_node_list:
for node in tree:
id = node.id
parent_id = node.parent.id if node.parent else 'null'
if parent_id == 'null':
query += f"('{id}', {parent_id}),"
else:
query += f"('{id}', '{parent_id}'),"
query = query[0:len(query)-1]
with open("insert_nodes.sql", 'w') as file:
file.write(query)
This Python script generates random trees with a maximum of 20 nodes (1 root node + up to 19 other nodes). We have a nested while loop that generates a random number of children for each node, adding children to each node in the order that they were created until we have reached the total number of nodes needed for the tree. After this, I am generating a SQL DML file that will insert all of the trees.
We can now go into our Postgres prompt:
docker exec -it tree_testing psql -U postgres
Then we can run the following (you will note that I am not a big fan of uppercase keywords in SQL):
drop table if exists tree_nodes cascade;
create table tree_nodes (
id varchar(256) primary key,
parent_id varchar(256)
);
create index idx_tree_nodes_id on tree_nodes using hash (id);
create index idx_tree_nodes_parent_id on tree_nodes using hash (parent_id);
insert into tree_nodes (id, parent_id) values
('0', null),
('1', '0'),
('2', '0'),
('3', '0'),
('4', '2'),
('5', '2'),
('6', '3'),
('7', '4'),
('8', '4'),
('9', '4'),
('10', '4');
\i /scripts/insert_nodes.sql
We now have all of our example data ready in our test DB. Now, how do we find all nodes in the tree containing the node with ID = 4? Again, step 1 is to find the record with ID 4 itself (this will be the base case in our recursive CTE) and then all parents of parents.
with recursive parents(id, parent_id, relation) as (
select id, parent_id, 'individual' as relation
from tree_nodes
where id = '4'
union all
select tn.id, tn.parent_id, 'parent' as relation
from tree_nodes tn
inner join parents p on tn.id = p.parent_id)
select * from parents;
Step 2 is the find all the children of all the children, with the record with ID 4 itself being the base case again. I will add this a second CTE in the same query and take the union of the results.
--Step 1
with recursive parents(id, parent_id, relation) as (
select id, parent_id, 'individual' as relation
from tree_nodes
where id = '4'
union all
select tn.id, tn.parent_id, 'parent' as relation
from tree_nodes tn
inner join parents p on tn.id = p.parent_id),
--Step 2
children(id, parent_id, relation) as (
select id, parent_id, 'individual' as relation
from tree_nodes
where id = '4'
union all
select tn.id, tn.parent_id, 'child' as relation
from tree_nodes tn
inner join children c on tn.parent_id = c.id)
select * from parents
union
select * from children;
Step 3 is to find all the other children of the parents. Selecting all the
already found parents is the base case here. Thus, we need to reference the
already defined parents
CTE in this new CTE.
--Step 1
with recursive parents(id, parent_id, relation) as (
select id, parent_id, 'individual' as relation
from tree_nodes
where id = '4'
union all
select tn.id, tn.parent_id, 'parent' as relation
from tree_nodes tn
inner join parents p on tn.id = p.parent_id),
--Step 2
children(id, parent_id, relation) as (
select id, parent_id, 'individual' as relation
from tree_nodes
where id = '4'
union all
select tn.id, tn.parent_id, 'child' as relation
from tree_nodes tn
inner join children c on tn.parent_id = c.id),
--Step 3
parents_children(id, parent_id, relation) as (
select id, parent_id, 'parent_with_other_children' as relation
from parents where relation = 'parent'
union all
select tn.id, tn.parent_id, 'other_child_of_parent' as relation
from tree_nodes tn
inner join parents_children pc on tn.parent_id = pc.id)
select * from parents
union
select * from children
union
select * from parents_children;
If we run explain analyze
on this query, I get an execution time of ~1 ms,
though your mileage may vary. We are actually getting good performance with this
query, with sensible use of indexes, but let’s look at how easy it is for things
go haywire. Let’s say, for whatever reason, we basically aliased our
tree_nodes
table by running select *
against it in a CTE named all_trees
:
with recursive all_trees as (
select * from tree_nodes),
--Step 1
parents(id, parent_id, relation) as (
select id, parent_id, 'individual' as relation
from all_trees
where id = '4'
union all
select tn.id, tn.parent_id, 'parent' as relation
from all_trees tn
inner join parents p on tn.id = p.parent_id),
--Step 2
children(id, parent_id, relation) as (
select id, parent_id, 'individual' as relation
from all_trees
where id = '4'
union all
select tn.id, tn.parent_id, 'child' as relation
from all_trees tn
inner join children c on tn.parent_id = c.id),
--Step 3
parents_children(id, parent_id, relation) as (
select id, parent_id, 'parent_with_other_children' as relation
from parents where relation = 'parent'
union all
select tn.id, tn.parent_id, 'other_child_of_parent' as relation
from all_trees tn
inner join parents_children pc on tn.parent_id = pc.id)
select * from parents
union
select * from children
union
select * from parents_children;
Running explain analyze
on this yields an execution time of ~2000 ms. We
effectively changed nothing, but we are not using the hash indexes on the
tree_nodes
table now, and have a ~2000x (!) slowdown. This has to do with the
relatively naive method that Postgres uses to determine if a CTE should be
materialized (that is, computed once, and stored for later, thereby removing the
chance to use any indexes on the underlying table) or not (meaning that
subsequent references to the CTE would actually still reference the underlying
source table, like a standard, not materialized view). If a CTE is referenced
more than once in a query, it is materialized by default. We can force the CTE
not to be materialized by adding not materialized
after with recursive
all_trees as
, i.e. with recursive all_trees as not materialized (
.
Now, we could have examined this behavior without all of the detail about m-ary trees, but I think it is useful to examine the behavior for a real problem, in a context where the choice made by the query planner can have massive performance implications. This is definitely something to be keenly aware of when using CTEs in Postgres. In general, it is always important to consider how your programs behave at a low level (i.e. what assembly is my compiler generating from my code, what operations are going to be executed by my DBMS to fetch the data I need given the query I wrote, etc.), especially in performance critical applications.