In the previous post we covered the the Minimum Spanning Tree. many of you must have gone through it, and would have liked the algorithms explained there, but today you many of you must have already forgotten their implementation details. So what would you do? You will open your favorite browser and just start typing minimum... , and voila, the link to the blog will be automatically suggested to you by the browser. This happens to us very often, right? We open a link, read something, and then when we again type in something similar in the search bar of our browser, the browser automatically suggests the link.
But how does our browser remembers all this? How does it know what we are looking for?
A network browser keeps a history of the URLs of sites that you have visited. By organizing this history as a trie data stucture, you need only type the prefix of a previously used URL and the browser can complete the URL very easily.
Does this sound something familiar? The autofill feature also uses the same technique. It maintains trie of strings and auto completes the word by searching for a string with the same prefix and having maximum frequency count.
The Trie Data Structure
Tries are made up of a set of nodes connected by pointers, which indicate parent-child relationship between them. Each node can contain multiple branches, with each of the branches representing a possible key character.
Let us consider a trie which stores strings of English alphabets, in this case we will have a total 26 possible characters. In case of binary tries, we would have only 2 characters.
In the usual implementation, we have a blank root node which points to all other starting characters of the set of strings, which then point to the corresponding next character.
So let us build a trie. Suppose we have a list of strings ,
List = { "study", "student", "stuff" , "ton", "tight" }
So we start with a blank node and point it to character 's'. Now we will keep on linking it to the next characters until the end is reached, i.e., we finish the 'study' string. Now for 'student', we start from blank node, search if there is any 's' there. Since we already have one, now we search if it has a connected node with character 't'. We proceed in the same way and create a new node if we are not able to find an existing one.
Trie Implementation in C++
Now that we know how a trie data structure is formed, let us see its implementation in C++.
#include<bits/stdc++.h>
using namespace std;
class trie {
public:
// pointers to all possible child nodes
trie *node[total_size];
int count_words;
/*
a count variable storing frequency of string ending at that node
function to make a new node and initialise it
*/
trie *make_new_node() {
trie *new_node = new trie;
new_node->count_words = 0 ;
for(int i=0;i<total_size;i++)
new_node->node[i]= NULL;
return new_node;
}
// function to insert a new string in trie
void insert(trie *root, string s)
{
trie *temp = root;
int i,id;
bool flag = true; // to check if string is already present
for(i=0;i<s.size();i++)
{
id = s[i] - 'a';
// if there is no next node, make one
if(!temp->node[id])
{
flag = false;
temp->node[id] = make_new_node();
}
temp = temp->node[id];
}
// increment count of the string
temp->count_words = temp->count_words + 1 ;
if(flag)
cout<<"String already present in trie , count updated .\n";
else
cout<<"String inserted !\n";
}
// function to search if a given string exists in trie
bool search(trie *root,string s)
{
trie *temp = root;
int i, id;
for(i=0;i<s.size();i++)
{
id = s[i]-'a';
// if next node is null , but string has more characters left, return false
if(!temp->node[id])
return false;
temp = temp->node[id];
}
// if node has a positive count , return true
if(temp->count_words > 0)
return true;
else
return false;
}
}
// function to count frequency of strings
int frequency(trie *root,string s)
{
trie* temp = root;
int i, id;
for(i=0;i<s.size();i++)
{
id = s[i] - 'a';
if(!temp->node[id])
return 0;
temp = temp->node[id];
}
return temp->count_words;
}
};
// main method
int main()
{
int t, q, i, p;
string s;
trie node;
trie *root = new trie ;
root = node.make_new_node();
// insertion
cin >> t;
while(t--)
{
cin >> s;
node.insert(root,s);
}
//searching
cin >> q;
while(q--)
{
cin >> s;
if(node.search(root,s))
cout << "String Found\n";
else
cout << "String Not Found\n";
}
// output frequency
cin >> q;
while(q--)
{
cin >> s;
p = node.frequency(root,s);
cout << p << endl;
}
return 0;
}
Sample Input:
3
study
student
study
2
top
study
3
lol
study
ss
Output:
String inserted !
String inserted !
String already present in trie , count updated .
String Not Found
String Found
0
2
0
Time Complexity of Trie
Time Complexity of trie creation: O(Length of longest string * Number of strings)
Time Complexity of searching: O(Length of string)
You may also like: