All Articles

Notes on Tree Data Structure

 

Trees

 

graph TD; id1((A))---id2((B)); id1---id3((C)); id2---id4((D)); id2---id5((E)); id3---id6((F)); id3---id7((G)); id6---id8((H));

 

Trees consist of nodes, the top node is called the root(in this case A), the leaf would be H in this case, the lowest node/level(leaves if more than one at lowest level) of the Tree. The depth is how far the tree goes from the root, in this example the depth is 3. The height is 3 of the tree at node A, an inverse relationship exists between the height and depth(node A for example is of 0 depth 3 height and node H for example is of 3 depth 0 height). In a tree each step from top to bottom is called a level of a tree, levels are equal to the depth value plus one. They also have an edge which is a line between two nodes and a path which is a sequence of nodes and edgens connecting a node with a descendant. Paths include all nodes and all edges along the path. Paths are strictly top to bottom and can’t be changed in the middle.

 

Types of Searches in Trees

 

This search scans from left to right on each level and then descends until it gets to the last possible node. The sequence for the tree diagram above would be A-B-C-D-E-F-G-H  

This search scans the left side first(by convention) checks node visited until it runs into a leaf goes back off and see if there were any other right children present. After it is done with the left side, it goes back to the root and does the same for the right. The sequence for the tree diagram above would be A-B-D-E-C-F-H-G.

This search scans the left side first(by convention) traversing until we hit a leaf and then checks off parents on the way back. We go to the right when needed and then all the way back to the root. If you look at the way we go through the tree, every node is visited in order left to right. The sequence for the tree diagram above would be D-B-E-A-H-F-C-G.

Post-Order Depth First Search

This search scans the left side first(by convention) traveresing until we hit a leaf and then check it off and make sure to check off any other descendants until we are done and then finally check parent nodes. We go to the right and do the same. The sequence for the tree diagram above would be D-E-B-H-F-G-C-A.

Binary Trees

Trees where parents have at most 2 children. Nodes can have 0, 1 or 2 children. A search would be O(n) if looking for a value and using any traversal algorithm. A delete operation also starts with a search, if deleting a leaf you can simply delete it. If you delete a parent for example you can delete the parent and attach the child to its parent. If the parent has 2 children, you can promote the child up(for example if we deleted B it could become E replaces B and D becomes a child of E). What do we do if the parent we delete has 2 children and those 2 children have children. In worst case you go down until you hit a leaf and switch it with the deleted node. Inserting an element just requires tacking a node to a leaf or a parent node with an open spot.

Binary Search Tree

A type of binary tree, BSTs are sorted so every value on the left of a particular node is smaller and every right node is larger. Because of this structure we can do operations like search, insert and delete quickly. You don’t need to search every element, we can make decisions based on values, run time is the height of the tree O(log(n)).

Unbalanced Binary Tree

A binary tree that has skewed distribution of nodes for example. Best case O(log(n)) worst case O(n)

graph TD; id1((5))---id2((10)) id2---id3((15)) id3---id4((20))

Heap

A heap is a type of tree that arranges elements in increasing or decreasing order.The root is either the maximum or minimum value in the tree. Two different type of heaps, max heap and min heap. Heaps don’t need to be binary trees, so they can have any number of children.

Max Heap

graph TD; id1((7))---id3((5)) id1---id2((3)) id3---id5((2)) id3---id4((1))

In a max heap a parent must always have a greater value than its children. So the root is the biggest element. An example of a max binary heap.

Min Heap

graph TD; id1((1))---id2((2)) id1---id3((3)) id2---id4((5)) id2---id5((7))

The parent has a lower value than its children. So the root is the smallest element. An example of a min binary heap.

A binary heap must be a complete tree, meaning all levels except last one must be full. If last one isn’t totally full values are added left to right. in this min heap or max heap a function that gets the root value is called peek. Searching isn’t like a BST, it is O(n) in search. we can use some properties to our advantage, if we know that the value is bigger than the root value then we can quit the search. Worst case is O(n), average case is still approximated to O(n) Reshuffling via the heap property is called heapify. While inserting, we stick bigger values wherever there is open space in the tree and via heapify correct the tree. We compare new element with its parent and swap them when it is bigger. Extract operation causes root to be removed from the tree and use the right-most leaf to replace it. Then compare to children and swap when necessary. Heapify and Extract is O(log(n))(height of tree) in the worst case.

Heap Implementation

Though heaps are represented as trees, they are often stored as arrays. Since we know how many children there are and how many nodes per level we can use math to figure out where next node would fall in array and then traverse the tree.

Sorted Array to Tree Conversion

[19,17,13,11,7,5,3,2,1]

graph TD; id1((19))---id2((B)); id1---id3((13)); id2---id4((11)); id2---id5((7)); id3---id6((5)); id3---id7((3)); id4---id8((2)); id4---id9((1))

We know first element is root since it is the biggest, the next 2 are the children of the root. By convention insert left to right, we can track size of each level in a variable and everything left over fills in the left side of the last level. Not every array can be represented in a heap, they need to be sorted in descended or ascending order. In an array we need to store node value and index in slot. Array saves us space versus node that needs value, left child, right child and parent.

Red-Black Tree