| 1 | +--- |
| 2 | +title: Learning Rust via Binary Search Tree |
| 3 | +description: Let's learn Rust by implementing our own binary search tree! |
| 4 | +slug: 20240629-rust-via-bst |
| 5 | +lang: en |
| 6 | +date: 2024-06-29 |
| 7 | +type: Post |
| 8 | +tags: |
| 9 | + - dsa |
| 10 | + - computer_science |
| 11 | + - rust |
| 12 | +--- |
| 13 | + |
| 14 | +It's been quite a while since my last post about Rust, which didn't really talk about its core philosophy or how it
| 15 | +works, so this post is an introduction to Rust for people who have already dabbled with the C family of programming
| 16 | +languages. Additionally, if you don't know what binary search trees are, I strongly recommend taking a look at
| 17 | +them first.
| 18 | + |
| 19 | +# Ownership |
| 20 | + |
| 21 | +We must first talk about ownership before implementing our BST. In Rust, every value has exactly one owner at a
| 22 | +time. For example, given a type `A` that does not implement `Copy`, the following code results in a
| 23 | +compilation error.
| 24 | + |
| 25 | +```rust |
| 26 | +let x: A = ...; // x owns the value
| 27 | +let y = x; // ownership moves from x to y
| 28 | +x...; // error: x was moved and can no longer be used
| 29 | +``` |
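The snippet above is schematic, so here is a minimal runnable sketch of the same idea. `Label` is a hypothetical non-`Copy` type standing in for `A`; it is not part of the BST code. An explicit `clone` is one way to keep a usable copy around before the move.

```rust
// Hypothetical non-Copy type standing in for `A`.
#[derive(Debug, Clone, PartialEq)]
struct Label(String);

fn move_then_clone() -> (Label, Label) {
    let x = Label(String::from("root")); // x owns the value
    let backup = x.clone(); // explicit deep copy; x still owns the original
    let y = x; // ownership moves from x to y
    // println!("{:?}", x); // would not compile: use of moved value `x`
    (y, backup)
}
```

Uncommenting the `println!` line should make the compiler reject the function, which is the error described above.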
| 30 | + |
| 31 | +Referencing `x` without taking ownership away from it requires "borrowing", which comes in two kinds: shared
| 32 | +references (`&`) and exclusive, or mutable, references (`&mut`). The former is a pointer through which you can only read,
| 33 | +the latter is a pointer through which you can also write. At any point there can be at most one live mutable
| 34 | +reference to `x`, and while that mutable reference is in use, there must not be any shared references to `x` in use
| 35 | +either.
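As a small illustration of the two kinds of borrows, here is a sketch with hypothetical helper functions (`total`, `push_twice`, `borrow_demo` are made up for this example). Several shared references may coexist, and the mutable borrow is accepted once the shared ones are no longer used.

```rust
// Reads through a shared reference; it cannot modify the data.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Writes through an exclusive (mutable) reference.
fn push_twice(values: &mut Vec<i32>, n: i32) {
    values.push(n);
    values.push(n);
}

fn borrow_demo() -> (i32, usize) {
    let mut data = vec![1, 2, 3];

    // Any number of shared borrows may be alive at the same time.
    let a = &data;
    let b = &data;
    let sum = total(a) + total(b);

    // `a` and `b` are not used past this point, so a mutable borrow is allowed.
    push_twice(&mut data, 4);
    // total(a); // would not compile: `data` was mutably borrowed while `a` is still in use

    (sum, data.len())
}
```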
| 36 | + |
| 37 | +# Node and Leaf
| 38 | + |
| 39 | +In C, we can define a BST node like this: |
| 40 | + |
| 41 | +```c |
| 42 | +struct node { |
| 43 | +    int key;
| 44 | +    struct node *left, *right;
| 45 | +}; |
| 46 | +``` |
| 47 | + |
| 48 | +A `node` has two pointers pointing to other `node`s. Note that the struct itself says nothing about how those nodes are
| 49 | +allocated. If the left/right child is a leaf, the pointer is set to `NULL`. This approach has some downsides,
| 50 | +though, most notably that you have to free the memory yourself if you allocate nodes on the heap. Forgetting to do so
| 51 | +results in memory leaks.
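For contrast, Rust's `Box` (a heap pointer with a single owner) releases its allocation automatically when the owner goes out of scope. The sketch below makes that visible with a hypothetical `Tracked` type whose `Drop` implementation bumps a counter; none of these names belong to the BST code.

```rust
use std::cell::Cell;

// Hypothetical type that records when it is dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn drops_in_scope() -> u32 {
    let drops = Cell::new(0);
    {
        // Heap allocation owned by `_boxed`.
        let _boxed = Box::new(Tracked { drops: &drops });
        // No explicit free is written anywhere.
    } // `_boxed` goes out of scope here: the value is dropped and the memory released
    drops.get()
}
```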
| 52 | + |
| 53 | +In Rust, since each node is owned only by its parent, and a node does not need to know its parent, we can define a
| 54 | +BST and a BST node like this:
| 55 | + |
| 56 | +```rust |
| 57 | +struct BSTNode<T> {
| 58 | +    left: BST<T>,
| 59 | +    right: BST<T>,
| 60 | +    pub key: i32,
| 61 | +    pub val: T,
| 62 | +}
| 63 | +
| 64 | +pub struct BST<T> {
| 65 | +    node: Option<Box<BSTNode<T>>>,
| 66 | +}
| 67 | +``` |
| 68 | + |
| 69 | +As you can see, we define an intermediate struct called `BST<T>`, which is a container for an optional `Box<BSTNode<T>>`.
| 70 | +A `Box` is simply a unique pointer to an area of heap memory that is freed once the `Box` is dropped. If the
| 71 | +property is `None`, then the BST is simply an empty tree. Some people might do this
| 72 | + |
| 73 | +```rust |
| 74 | +pub struct BST<T> { |
| 75 | +    left: Option<Box<BST<T>>>,
| 76 | +    right: Option<Box<BST<T>>>,
| 77 | +    pub key: i32,
| 78 | +    pub val: T,
| 79 | +} |
| 80 | +``` |
| 81 | + |
| 82 | +to make their code more succinct, and it does work, but you can't represent an empty tree without using `Option` on the |
| 83 | +top level, so we'll stick to the first implementation for the rest of this article. |
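The functions in this post call `BSTNode::new`, which is never shown. A minimal sketch, assuming it simply creates a node with two empty subtrees (the struct definitions are repeated so the block stands on its own):

```rust
struct BSTNode<T> {
    left: BST<T>,
    right: BST<T>,
    pub key: i32,
    pub val: T,
}

pub struct BST<T> {
    node: Option<Box<BSTNode<T>>>,
}

impl<T> BSTNode<T> {
    // Assumed constructor: a node holding `key`/`val` with no children yet.
    fn new(key: i32, val: T) -> Self {
        Self {
            left: BST { node: None },
            right: BST { node: None },
            key,
            val,
        }
    }
}
```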
| 84 | + |
| 85 | +Let's implement some simple functions for our BST. |
| 86 | + |
| 87 | +```rust |
| 88 | +use std::cmp::Ordering;
| 89 | +impl<T> BST<T> {
| 90 | +    // an empty BST
| 91 | +    pub fn new_empty() -> Self {
| 92 | +        Self { node: None }
| 93 | +    }
| 94 | +    // a BST with one node
| 95 | +    pub fn new(key: i32, val: T) -> Self {
| 96 | +        Self {
| 97 | +            node: Some(Box::new(BSTNode::new(key, val))),
| 98 | +        }
| 99 | +    }
| 100 | +    pub fn is_empty(&self) -> bool {
| 101 | +        self.node.is_none()
| 102 | +    }
| 103 | +    pub fn insert(&mut self, key: i32, val: T) {
| 104 | +        if let Some(root) = &mut self.node {
| 105 | +            // `match` branches on the three possible results of the comparison
| 106 | +            match root.key.cmp(&key) {
| 107 | +                Ordering::Equal => {
| 108 | +                    panic!("BST should not have same id!"); // bad practice: panics on duplicate keys
| 109 | +                }
| 110 | +                Ordering::Greater => { // root.key > key
| 111 | +                    root.left.insert(key, val);
| 112 | +                }
| 113 | +                Ordering::Less => {
| 114 | +                    root.right.insert(key, val);
| 115 | +                }
| 116 | +            }
| 117 | +        } else {
| 118 | +            self.node = Some(Box::new(BSTNode::new(key, val)));
| 119 | +        }
| 120 | +    }
| 121 | +}
| 122 | +``` |
| 123 | + |
| 124 | +Here, `Box::new` allocates heap memory and moves the new node into it.
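To try insertion out, here is a self-contained sketch. It differs from the code above in two ways, both of which are assumptions of this example rather than part of the post's design: `insert` returns `false` on a duplicate key instead of panicking, and a hypothetical `get` lookup is added so the result can be observed.

```rust
use std::cmp::Ordering;

struct BSTNode<T> {
    left: BST<T>,
    right: BST<T>,
    key: i32,
    val: T,
}

pub struct BST<T> {
    node: Option<Box<BSTNode<T>>>,
}

impl<T> BST<T> {
    pub fn new_empty() -> Self {
        Self { node: None }
    }

    // Variant of `insert` that reports duplicates instead of panicking.
    pub fn insert(&mut self, key: i32, val: T) -> bool {
        if let Some(root) = &mut self.node {
            match root.key.cmp(&key) {
                Ordering::Equal => false, // key already present, nothing inserted
                Ordering::Greater => root.left.insert(key, val),
                Ordering::Less => root.right.insert(key, val),
            }
        } else {
            self.node = Some(Box::new(BSTNode {
                left: BST::new_empty(),
                right: BST::new_empty(),
                key,
                val,
            }));
            true
        }
    }

    // Hypothetical lookup: borrows the value stored under `key`, if any.
    pub fn get(&self, key: i32) -> Option<&T> {
        let root = self.node.as_ref()?;
        match root.key.cmp(&key) {
            Ordering::Equal => Some(&root.val),
            Ordering::Greater => root.left.get(key),
            Ordering::Less => root.right.get(key),
        }
    }
}
```

Returning a `bool` (or the old value, as `BTreeMap::insert` in the standard library does) leaves the decision about duplicates to the caller.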
| 125 | + |
| 126 | +# BST Deletion |
| 127 | + |
| 128 | +Next, we'll define deletion in a BST. To delete a node `n` from a BST, we consider three cases:
| 129 | +
| 130 | +1. `n` is a leaf: Make `n` an empty tree.
| 131 | +2. `n` has only one child: Make the child take its place.
| 132 | +3. `n` has two children: Put `n`'s inorder successor in its place without changing `n`'s children.
| 133 | + |
| 134 | +Then, we write down the code. |
| 135 | + |
| 136 | +```rust |
| 137 | +impl<T: Clone> BST<T> {
| 138 | +    fn extract_min(&mut self) -> Option<Box<BSTNode<T>>> {
| 139 | +        let Some(root) = &mut self.node else {
| 140 | +            return None;
| 141 | +        };
| 142 | +
| 143 | +        if !root.left.is_empty() {
| 144 | +            return root.left.extract_min();
| 145 | +        }
| 146 | +        // this node is the minimum: detach it and let its right subtree take its place
| 147 | +        let mut min = self.node.take()?;
| 148 | +        self.node = min.right.node.take();
| 149 | +        Some(min)
| 150 | +    }
| 151 | +
| 152 | +    pub fn remove(&mut self, key: i32) {
| 153 | +        let Some(root) = &mut self.node else {
| 154 | +            return;
| 155 | +        };
| 156 | +        match root.key.cmp(&key) {
| 157 | +            Ordering::Equal => match (root.left.node.as_ref(), root.right.node.as_ref()) {
| 158 | +                (None, None) => {
| 159 | +                    self.node.take();
| 160 | +                    // take itself, become an empty tree
| 161 | +                }
| 162 | +                (Some(_), None) => {
| 163 | +                    self.node = root.left.node.take();
| 164 | +                    // move the left child into self;
| 165 | +                    // the old root is no longer owned by anything, so it is dropped
| 166 | +                }
| 167 | +                (None, Some(_)) => {
| 168 | +                    self.node = root.right.node.take();
| 169 | +                    // move the right child into self;
| 170 | +                    // the old root is no longer owned by anything, so it is dropped
| 171 | +                }
| 172 | +                (Some(_), Some(_)) => {
| 173 | +                    if let Some(x) = root.right.extract_min() {
| 174 | +                        root.key = x.key;
| 175 | +                        root.val = x.val.clone();
| 176 | +                    }
| 177 | +                }
| 178 | +            },
| 179 | +            Ordering::Greater => {
| 180 | +                root.left.remove(key);
| 181 | +            }
| 182 | +            Ordering::Less => {
| 183 | +                root.right.remove(key);
| 184 | +            }
| 185 | +        }
| 186 | +    }
| 187 | +}
| 188 | +``` |
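To check the three deletion cases end to end, here is a self-contained sketch on a simplified, key-only tree. The names `Tree`, `Node`, `build` and `sorted_keys` are assumptions of this example, not the post's types. The detail worth watching is the case where the inorder successor has a right child of its own: that child has to be reattached where the successor used to be, otherwise it would be lost together with the extracted node.

```rust
use std::cmp::Ordering;

struct Node {
    left: Tree,
    right: Tree,
    key: i32,
}

#[derive(Default)]
struct Tree {
    node: Option<Box<Node>>,
}

impl Tree {
    fn insert(&mut self, key: i32) {
        if let Some(root) = &mut self.node {
            match root.key.cmp(&key) {
                Ordering::Equal => {} // duplicates are ignored in this sketch
                Ordering::Greater => root.left.insert(key),
                Ordering::Less => root.right.insert(key),
            }
        } else {
            self.node = Some(Box::new(Node {
                left: Tree::default(),
                right: Tree::default(),
                key,
            }));
        }
    }

    // Detach the smallest node; its right subtree (if any) takes its place.
    fn extract_min(&mut self) -> Option<Box<Node>> {
        if let Some(root) = &mut self.node {
            if root.left.node.is_some() {
                return root.left.extract_min();
            }
        }
        let mut min = self.node.take()?;
        self.node = min.right.node.take();
        Some(min)
    }

    fn remove(&mut self, key: i32) {
        let Some(root) = &mut self.node else { return };
        match root.key.cmp(&key) {
            Ordering::Greater => root.left.remove(key),
            Ordering::Less => root.right.remove(key),
            Ordering::Equal => {
                if root.left.node.is_some() && root.right.node.is_some() {
                    // Two children: overwrite the key with the inorder successor's.
                    if let Some(succ) = root.right.extract_min() {
                        root.key = succ.key;
                    }
                } else {
                    // Zero or one child: the child (or nothing) replaces this node.
                    let child = root.left.node.take().or(root.right.node.take());
                    self.node = child;
                }
            }
        }
    }

    // Inorder traversal: appends the keys in ascending order.
    fn keys(&self, out: &mut Vec<i32>) {
        if let Some(root) = &self.node {
            root.left.keys(out);
            out.push(root.key);
            root.right.keys(out);
        }
    }
}

fn build(keys: &[i32]) -> Tree {
    let mut tree = Tree::default();
    for &k in keys {
        tree.insert(k);
    }
    tree
}

fn sorted_keys(tree: &Tree) -> Vec<i32> {
    let mut out = Vec::new();
    tree.keys(&mut out);
    out
}
```

Building the tree from `[5, 2, 8, 6, 7, 9]` and removing `5` exercises the two-children case: the successor `6` has the right child `7`, which must still be present afterwards.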
| 189 | + |
| 190 | +`Option::take` replaces the value in place with `None` and returns the `Some(x)` that was there before.
| 191 | +When an `Option<Box<T>>` is taken, the `Box` is moved out, not copied: only the pointer changes hands, and the heap
| 192 | +allocation is neither freed nor recreated. The BST that owned the taken node then becomes an empty tree.
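A small sketch of what `Option::take` does to an `Option<Box<T>>`. The function name is made up for this example, and the raw-pointer comparison is only there to show that the same heap allocation is handed over rather than a new one being created.

```rust
fn take_keeps_allocation() -> bool {
    let mut slot: Option<Box<i32>> = Some(Box::new(42));
    // Address of the heap value before taking.
    let before: *const i32 = &**slot.as_ref().unwrap();

    // Moves the Box out of `slot` and leaves `None` behind.
    let taken = slot.take();

    // Address of the heap value now owned by `taken`.
    let after: *const i32 = &**taken.as_ref().unwrap();

    slot.is_none() && before == after
}
```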
| 193 | + |
| 194 | +# A small step |
| 195 | + |
| 196 | +In this article, we gained some insight into how Rust's ownership system works, but we haven't talked about
| 197 | +another important concept in Rust, which is lifetimes! See you in the next post.