While looking at this video, I learned about the Aho-Corasick algorithm which is an interesting way of searching a string for matches against a given set of strings.
Given a set of words and a phone number, we want to scan the whole phone number and trace which words can be constructed from it or portions of it.
Remember that a phone keypad has the following associations:
0 and 1: no characters2: A, B, C
3: D, E, F
4: G, H, I
5: J, K, L
6: M, N ,O
7: P, Q, R, S
8: T, U, V
Therefore from number 22 we can create the following strings:
AA, AB, AC, BB, BC, BA, CC, CB, CA
Assuming our words are:
AA, CA, CD
We can then say that for phone number "22" we can create only:
AA, CA
If we simplify the problem and consider a static string as input, then Aho-Corasick gives us a nice O(length_of_string) time to find all possible solutions.
In our case we can still apply this logic, however we need to consider all possible characters that can be associated with a digit in our string.
The logic is therefore similar to what we did in findMatchingStringsInString from the last exercise, but with a couple adjustments:
- we create a static map to retrieve the set of possible characters associated with each digit
- when we loop over the phone number we retrieve the possible characters for each digit and loop over them, branching the logic to start with each possible connection from there
- if we encounter a digit that has no characters associated (eg 1 or 0) we restart from the root of the trie and try to match a new word starting from the empty string
- whenever we find a valid node, we add all its words to our result set
The complexity here is different than before: building the Aho-Corasick trie stays same in O(sum of length of all words) + O(nodes), which is O(sum of length of all words) worst case, while the search is now O(possible chars per digit ^ length of string) which in our problem is exactly O(4 ^ length of string) in the worst case, eg when the phone number is composed by only 7s or 9s.
The final result then would be O(4 ^ length of string).
You can find my implementation of findMatchingWordsInPhoneNumber on my Gist, along with some tests in FindWordsInPhoneNumberJTests.
No comments:
Post a Comment
With great power comes great responsibility