18/05/2021

[Java] Find all words that can be created from a given phone number in linear time

The previous solution to this problem has a very high cost, the reason for that is the way we cycle on the input. We keep it as a string of digits, and try all combinations  for each one building an hypothetical string as we go.

A better approach which achieves linear complexity instead is to perform additional preprocessing and also some postprocessing:

  • create a static map from character to the digit it has on the same key on the keypad. Since this is static cost is O(1)
  • convert all words to their digit representation. If multiple words have the same representation, collect all of them. This is another map using the digit representation as key and a set of words matching that representation as value. This is in O(sum of length of all words) and we also use a StringBuffer to save on creating too many String objects
  • build the trie using the digit representation as set of matching words. This is O(number of words)
  • run Aho-Corasick searching our phone number as input. This is O(length of phone number)
  • for each matching word we found (which is now its digit representation), retrieve all words that have that same representation and add them to the result set. This is O(number of words) 

This is another case where using a little extra space provides a huge improvement on the overall runtime.

We now have linear complexity, with the worst case being the highest between: sum of length of all words, number of words or length of phone number.

You can check my implementation of findMatchingWordsInPhoneNumberAsDigits on my Gist along with some tests in FindWordsInPhoneNumberAsDigitsJTests.

No comments:

Post a Comment

With great power comes great responsibility