pwshub.com

Trie data structures: A guide for UX designers

Trie Data Structures

Whenever you make a search query on Google, YouTube, or Instagram, you get a popover with suggestions that may be close to or exactly what you were thinking. Ever wondered how this auto-complete feature works?

As it turns out, some auto-complete features are powered by a data structure type called a trie (some pronounce it as “tree” and others “try”).

A trie’s structure looks like a tree — the data begins at one origin but branches off to sub-tries and beyond. The trie data structure is ideal for efficiently storing, searching, and retrieving strings from large databases.

In UX design, trie data structures can optimize a user’s experience by outputting logical search results, assisting the user with suggestions, and correcting spelling errors.

Think of spell-checking algorithms and text prediction. We use these additional features on a daily basis, and they seem so effortless, right? Well, it is a bit more complicated, but we’ll break it down in this article.

Let’s further discuss what trie data structures are, review the trie operations, see their applications in UX design, and then get into the pros and cons of using tries.

What is a trie data structure?

First of all, why is it called a trie data structure?

“Trie” comes from the word “retrieval” – one of its main functionalities. Tries both store and retrieve data from a large collection of strings. These strings can be any sequence of letters, numbers, spaces, and even symbols surrounded by double-quotations.

The singular word “hello” and the phrase “how to bake a cake in under 20 minutes?” are both considered strings.

Google Search Trie Data Example
Using Google search to type “how to bake a cake”

There are countless combinations of strings given the volume of words and numbers in any specific language. But managing an extensive set of strings becomes more doable when trie data structures are implemented.

A trie comprises three main components:

  • Root node — The starting point of a trie that contains an array of references*, but has a null value since it includes an empty string (“”)
  • Child node — Contains an array of references that is connected to a root node or other child nodes
  • Edge — Connects two nodes to form a relationship

Components of a Trie

*Note: An array of references can be anything that makes up a string in the trie data structure. For example, there are 26 letters in the English alphabet. If a trie only supports the English alphabet, every node will contain an array that references the 26 possible letters. Since arrays start at 0, the array will span from 0 to 25 (or A to Z).

Let’s look at an example of a trie with the words — baby, bark, bars, and bed.

  • The trie begins at the root node with a null value
  • An edge connects the root node to the first child node; the node references value b
  • The child node branches off into two new child nodes; one node references value a and the other e
  • The remaining child nodes are created until the string is complete
Trie Example With Words
Trie with four words — baby, bark, bars, and bed

As you can see, prefix matching is used to organize the nodes based on shared references. All words share the prefix “b-” and three words share the prefix “ba-.” Instead of having to list each string in a database individually, strings can be grouped based on their prefixes. This supports better efficiency in how the data is searched and retrieved when called on, as well as saves on storage space.

Now that we’ve explored what a trie is and its benefits of organizing and retrieving data, let’s dive deeper to discuss some trie operations.

Trie operations

When a data structure is being built or used, various operations need to be performed to keep the database updated or correct any errors. For trie data structures, the main operations supported include insertion, searching, and deletion.

I’ll discuss each to explain how a trie works:

1. Insertion

The insertion operation can be seen as an “add” tool. If you’re starting with the first few strings in the data structure or inputting the final ones, you’ll use the insertion operation to add the data.

Let’s practice by inserting the string “bark” into a predefined trie with the words — baby and bars.

  • Start at the root node of the trie
  • Check if there are existing child nodes that have a matching prefix to “bark”
  • I noted the trie includes a prefix for “bar-”
  • After the prefix of “bar-,” insert a new node with the value “k”
  • Mark the node as the “end of word”

Here is a code snippet and demo for this insertion operation in JavaScript:

// I have a predefined trie with the words "baby" and "bars"
const trie = {
    "b": {
        "a": {
            "b": {
                "y": {
                    "isEndOfWord": true
                }
            },
            "r": {
                "s": {
                    "isEndOfWord": true
                }
            }
        }
    }
};
// Function that inserts a word into the trie
function insertWord(word) {
    let currentNode = trie;
    for (let i = 0; i < word.length; i++) {
        const char = word[i];
        // If the character doesn't exist in the current node, create a new node
        if (!currentNode[char]) {
            currentNode[char] = {};
        }
        // Moves to the next node
        currentNode = currentNode[char];
    }
    // Marks the end of the word
    currentNode.isEndOfWord = true;
}
// Demo: Insert the word "bark" into the trie
insertWord("bark");
console.log(JSON.stringify(trie, null, 2));
// Output of the console log
{
  "b": {
    "a": {
      "b": {
        "y": {
          "isEndOfWord": true
        }
      },
      "r": {
        "s": {
          "isEndOfWord": true
        },
        "k": {
          "isEndOfWord": true
        }
      }
    }
  }
}

2. Searching

The search operation is similar to insertion since you’re comparing the string you’re looking for to the existing nodes. But instead of adding nodes to the data structure, you’re simply traversing the nodes to see if the string exists.

Let’s search for the string “barn” in a predefined trie with the words — baby, bark, and bars.

  • Start at the root node of the trie
  • Compare each node’s value to spell out “barn”
  • Follow the matching branches until you find the entire string
  • Make sure the end of the string includes a mark for “end of word”

If the trie doesn’t include a matching prefix branch (such as “bar-”), then the string doesn’t exist. Also, if the end of the string doesn’t include an “end of word” mark, the string doesn’t exist. For instance, the word existing in the trie could be “barns” and not “barn.”

Here is a code snippet and demo for this search operation in JavaScript:

// I have a predefined trie with the words "baby", "bark", and "bars"
const trie = {
    "b": {
        "a": {
            "b": {
                "y": {
                    "isEndOfWord": true
                }
            },
            "r": {
                "k": {
                    "isEndOfWord": true
                },
                "s": {
                    "isEndOfWord": true
                }
            }
        }
    }
};
// Function that searches for a word in the trie
function search(word) {
    let currentNode = trie; // Starts at the root node
    // Loops through each character in the word
    for (let char of word) {
        // If the character is not found in the trie, return false
        if (!currentNode[char]) {
            return false;
        }
        // Moves to the next node
        currentNode = currentNode[char];
    }
    // Checks if the current node marks the end of a word
    return currentNode.isEndOfWord === true;
}
// Demo: Search for the word "barn"
const isFound = search("barn");
console.log(isFound ? "Word found!" : "Word not found."); // Searches the trie
// Output of the console log
Word not found.

3. Deletion

Lastly, the deletion operation removes unneeded strings from the trie. Unlike insertion and search, deletion requires some caution to ensure you aren’t affecting strings that share a prefix with the string being removed.

Let’s delete the string “boom” in a predefined trie with the words — boom, boomer, and body.

  • Traverse and find the string in the trie
  • Remove the mark for the “end of word”
  • With caution, remove unnecessary child nodes (if needed)
  • I noted that the string “boomer” exists in the trie, so I can’t remove any nodes without affecting this string

If the string “boomer” didn’t exist in the trie, I could remove the nodes with the values m and o since they would no longer be needed to complete a string.

Here is a code snippet and demo for this deletion operation in JavaScript:

// I have a predefined trie with the words "boom", "boomer", and "body"
const trie = {
  'b': {
    'o': {
      'o': {
        'm': {
          'isEndOfWord': true,
          'e': {
            'r': {
              'isEndOfWord': true
            }
          }
        }
      },
      'd': {
        'y': {
          'isEndOfWord': true
        }
      }
    }
  }
};
// Function that deletes a word from the trie
function deleteWord(trie, word) {
  function deleteHelper(node, word, index) {
    if (index === word.length) {
      // If the end of the word is found
      if (node.isEndOfWord) {
        node.isEndOfWord = false;
        // Checks if node has no other children
        return Object.keys(node).length === 1;
      }
      return false;
    }
    const char = word[index];
    if (node[char]) {
      const shouldDeleteChild = deleteHelper(node[char], word, index + 1);
      if (shouldDeleteChild) {
        delete node[char];
        // Checks if current node has other children
        return Object.keys(node).length === 0;
      }
    }
    return false;
  }
  deleteHelper(trie, word, 0);
}
// Demo: Delete the word 'boom' from the trie
deleteWord(trie, 'boom');
console.log(JSON.stringify(trie, null, 2));
// Output of the console log
{
  "b": {
    "o": {
      "o": {
        "m": {
          "isEndOfWord": false,
          "e": {
            "r": {
              "isEndOfWord": true
            }
          }
        }
      },
      "d": {
        "y": {
          "isEndOfWord": true
        }
      }
    }
  }
}

Applying trie data structures to UX design

Though tries heavily rely on software engineers to be properly implemented, UX designers can still benefit from understanding and advocating for the usage of tries in their websites or applications (unless you’re a unicorn who can design and code).

As we’ve discussed, tries help applications run important features such as auto-complete, spell-checking algorithms, and text prediction. Each of these features improves the interface’s functionality by assisting the user as they complete tasks.

Let’s say I’m making a grocery list in my iPhone’s Notes app. As I type in items in the list, there’s a bar above the keyboard with options that dynamically update depending on what I type.

In the image below, I typed “Oa-” and I received a suggestion for “oatmeal” (which was correct):

Text Prediction Uses Trie
Grocery list made in iPhone’s Notes app

Instead of typing “Oatmeal,” I can simply select the suggestion to replace what I’ve already typed. Not only that, if I accidentally type “Oatmael,” it automatically updates my spelling mistake after I finish typing the word.

Because trie data structures were implemented, it used my input of “Oa-” to search the trie and prefix match for possible options. It then showed me the suggestion of “Oatmeal” since that was a match (I may get more than one suggestion depending on the prefix).

As for spell-checking, my incorrect input of “Oatmael” was traversed in the trie, and nodes with invalid characters were replaced to be valid.

Spelling Correction Using Trie Data
Typing error of “Oatmael” in iPhone’s Notes app

These examples of auto-complete and spell-checking add value to any website or app with tasks that involve typing by taking some of the work off the user’s plate.

Who wants to memorize how to spell “restaurant”? I don’t. Because trie data structures are implemented in the apps I use, all I have to do is type “res-” and let my device do the rest of the work for me.

Pros and cons of a trie

This section will guide you in determining if using a trie is the best option for structuring data on your website or application.

But before we get into the pros and cons, let’s review two similar data structure types to compare tries to — hash tables and binary search trees:

  • Hash tables use a method called a hash function to organize, store, and retrieve data. Hash tables are used in scenarios where exact matches for a string are needed, such as a device’s cache
  • Binary search trees (BSTs) use a sorting method that is dependent on the node’s values. Each node includes two child nodes; the left child holds a smaller value, and the right value holds a larger value. BSTs are used in search algorithms as well as spell-checking (like tries) since they quickly search through datasets

Pros

Trie data structures have already proven to be advantageous since they are used in many applications, but let’s review specific strengths of a trie:

  • Operation-efficient — Tries allow for quick insertion, searching, and deletion, which is beneficial when traversing or updating large datasets (similar to BSTs)
  • Space-managing — Since trie structures use prefix matching versus listing each string individually, many strings share the same prefixes, which saves on memory
  • Versatile application — Tries can be used in many different features and products, such as dictionaries, lookup tables, and auto-complete functions (hash tables and BSTs are also used similarly)
  • Supports multi-languages — Trie structures can store and retrieve strings of many alphabetic languages like English, Italian, and Spanish

Cons

Trie data structures are powerful and useful when implemented, but they do come with limitations. Let’s review some of the weaknesses of a trie:

  • Difficult implementation — Since each node plays a part in constructing a string in the dataset, managing a large trie can be complex to ensure the node-mapping is correct
  • High memory usage — I know this was also listed as a pro, but in comparison to hash tables and BSTs, tries use the most memory since each character in a string requires one node (but there are workarounds to compress tries)
  • Slow lookup speeds — When searching for exact strings in a trie, it is slower compared to hash tables and is worsened as the length of the string increases
  • Language limits — Though tries do support alphabetic languages, they do not work as well with logographic languages (Chinese and Japanese) since symbols are used to represent whole words

Tries, hash tables, and BSTs are efficient data structures but are strongly suited for particular applications.

For instance, if you need exact data matching, hash tables will be the best option since they will be the quickest at finding the match. Or, if you need auto-complete or text-prediction, tries will be the best option due to its prefix matching capabilities.

Conclusion

In this article, I shared how trie data structures efficiently store, search, and retrieve strings from large collections of data. Whether you know it or not, many of the features we use on a daily basis are powered by trie data structures, such as auto-complete and spell-checking algorithms.

Tries are built from a root node and child nodes, and relationships between nodes are made from edges. Each root node contains a null value, and each child node holds one value that references an array of possible values. As you traverse down a trie, you will be able to spell out strings that have been inserted; the strings can spell out anything, such as “hello!” or “buy sunglasses”.

Trie data structures support three main operations — insertion, searching, and deletion:

  • Insertion adds strings to the trie by using prefix matching to position the string with similar ones, like placing “baby” and “bad” under the same prefix of “ba-”
  • Searching simply traverses the trie to find if a particular string exists in the trie, such as searching for “barn”
  • Deletion removes strings not necessary to the trie, but removal must be done cautiously to avoid affecting related strings, such as removing “bark” but keeping “barked”

In terms of UX design, tries are critical to enhancing the functionality of a website or application. If you expect a user to frequently type or use a search function, implementing a trie data structure will assist the user in completing tasks effectively. The user will be able to do tasks like filling out forms or searching a page quickly while reducing the UI costs.

Trie data structures are effective when managing datasets, and they reduce redundancies when strings share common prefixes. Tries are versatile and powerful tools to implement in your projects for the right use cases.

Source: blog.logrocket.com

Other stories
27 minutes ago - Discover some of the best Node.js web scraping libraries, including Axios and Superagent, and techniques for how to use them. The post The best Node.js web scrapers for your use case appeared first on LogRocket Blog.
3 hours ago - Infinite runner games have been a favorite for gamers and developers alike due to their fast-paced action and replayability. These games often feature engaging mechanics like endless levels, smooth character movement, and dynamic...
5 hours ago - Yesterday, Elizabeth Siegle, a developer advocate for CLoudflare, showed off a really freaking cool demo making use of Cloudflare's Workers AI support. Her demo made use of WNBA stats to create a beautiful dashboard that's then enhanced...
5 hours ago - User interviews are great — only if you don't rush them. In this piece, I share how using debrief questions in research can help capture better insights and improve your interview process. The post Using debrief questions to get the most...
5 hours ago - Inertia.js enables you to build SPAs with a traditional backend framework and a modern JavaScript frontend with server-side routing. The post Inertia.js adoption guide: Overview, examples, and alternatives appeared first on LogRocket Blog.