Scumm Bar: March 2017

31/03/2017

[Java] Remove kth to last node in a singly LinkedList

Here is a simple method to remove the k-th to last element from a singly LinkedList (not the standard one). Runtime is going to be O(N) and space is going to be O(1). Keep scrolling for a better version that still runs in O(N) time - although actual better running time! - and O(1) space using two pointers.

The idea is simply to walk the list first to find out how many elements are there, then calculate which element we want to remove and walk the list again until we find and remove it. It is not possible to remove the head element, so we must return the, potentially new, head node after editing the list in place.

We count elements in the list starting from 1 so k = 0 means remove the last element and k = length - 1 means remove the first element. If k = length, we are asking to remove element 0 which does not exist.

You can check my Gist for the code with tests (RemoveKtoLastNode) and the LinkedList data structure (SingleLinkedListNode):

 //CAUTION: must return a value because otherwise we would not be able to remove the head element  
   public static SingleLinkedListNode removeKtoLastNode(SingleLinkedListNode head, int k){  
     if(head == null) return head;  
   
     int length = head.length();  
     /*  
     list is 1 to length, so check k is in that range  
     CAUTION: we are removing kth to last so k = 0 means remove last  
     k = length -1 means remove first  
     */  
     if(k < 0 || k > length - 1) return head;  
   
     int n, count = 1;  
     SingleLinkedListNode prev = head, next;  
   
     //find out which element we want to remove  
     n = length - k;  
   
     //if it's the head, just return next  
     if(n == 1){  
       return prev.next;  
     }  
   
     //otherwise walk through the list and delete the desired element  
     while(prev != null){  
       next = prev.next;  
       //CAUTION: check with count + 1 so we do not skip the element we want to delete. next will be it!  
       if(n == count + 1){  
         if(next != null) prev.next = next.next;  
         //CAUTION: avoid NullPointerException when we are deleting tail  
         else prev.next = null;  
         return head;  
       }  
       prev = prev.next;  
       count++;  
     }  
   
     return head;  
   }

The alternative method using two pointers instead does not require us to calculate the list length beforehand, instead we let a pointer walk the list and start running a second pointer only after the first one has done K+1 steps. This means that when the first pointer reaches the end, the second will be exactly before the element to delete.
There is just one thing to be careful about: at the end the slow pointer might not be pointing anywhere. This means we might want to remove the head element or we got a K greater than the length of the list.

Here is the snippet from my Gist:

   
   
   //CAUTION: must return a value because otherwise we would not be able to remove the head element  
   public static SingleLinkedListNode removeKtoLastNodePointers(SingleLinkedListNode head, int k){  
     if(head == null || k < 0) return head;  
   
     int fast_idx = 0;  
     SingleLinkedListNode fast = head, slow = null;  
   
     /*  
     keep two pointers running after each other  
     the slow one starts k+1 steps after the fast one  
     when the fast one reaches the end of the list, the slow one should be exactly before the element to remove  
     */  
     while(fast.next != null){  
       if(slow != null) slow = slow.next;  
       fast = fast.next;  
       fast_idx++;  
       if(fast_idx == k + 1) slow = head;  
     }  
   
     //if the slow one is actually pointing somewhere, remove the following node  
     if(slow != null) slow.next = slow.next.next;  
   
     //otherwise if we want to remove the head (k = list length), just return the next from head  
     if(fast_idx == k) return head.next;  
   
     return head;  
   }

[Java] Remove duplicates from a singly LinkedList

Here is a simple method to remove all duplicate nodes from a singly LinkedList (not the standard one) without using additional data structures. This means the runtime is going to be O(N^2) and space is going to be O(1).

The idea is simply to walk the list with a pointer and for each element walk the remainder of the list using two pointers. When we find a match to the element we are currently evaluating, we remove it. It is not possible to remove the head element, so we can edit the list in place.

You can check my Gist for the code with tests (RemoveDuplicates) and the LinkedList data structure (SingleLinkedListNode):

 public static void removeDuplicates(SingleLinkedListNode head){  
     SingleLinkedListNode curr, prev, next;  
   
     curr = head;  
   
     //for each single node  
     while(curr != null){  
       //track 2 nodes at the same time  
       prev = curr;  
       next = prev.next;  
       //walk the whole list  
       while(next != null){  
         /*  
         if the next node is a duplicate, delete it by linking the previous to the next.next  
         CAUTION: do NOT move the previous node because we could lose a value  
         */  
         if(next.data == curr.data) prev.next = next.next;  
         //only move previous if the next one was unique  
         else prev = prev.next;  
         //move next to the new next element  
         if(prev != null) next = prev.next;  
       }  
       curr = curr.next;  
     }  
   }

30/03/2017

[Java] Compress string by counting character repetitions

Here is a simple method to compress a string by counting the character repetitions and producing a new string containing that information; if the compressed string is longer, just keep the original one.

The idea is simply to walk the string keeping track of the current count for each item we encounter in two separate arrays. Then we walk both arrays at the same time and concatenate each character with the number of appearances he had; lastly we check if the compressed string is shorter and decide what to return.

You can check my Gist for the code (stringCompress) with tests:

 public static String stringCompress(String s){  
     if(s == null || s.length() <= 1) return s;  
   
     int[] counts = new int[s.length()];  
     char[] chars = new char[s.length()];  
     char c;  
     int j = 0;  
   
     counts[0] = 1;  
     chars[0] = s.charAt(0);  
   
     //for each character in string, count how many times it appears  
     for(int i = 1; i < s.length(); i++){  
       c = s.charAt(i);  
   
       if(c == chars[j]) counts[j]++;  
       else{  
         j++;  
         chars[j] = c;  
         counts[j] = 1;  
       }  
     }  
   
     //if we compressed the string a bit, then our arrays have some space left, mark the premature end  
     if(j < counts.length - 1) counts[++j] = -1;  
   
     //we do not want to create a new String everytime while we concatenate  
     StringBuilder sb = new StringBuilder();  
   
     j = 0;  
     while(j < counts.length){  
       if(counts[j] == -1) break;  
       //for each character in original string, concatenate it alongside the number of occurrences  
       sb.append(chars[j]);  
       sb.append(counts[j]);  
       j++;  
     }  
   
     //if the compressed string is longer, return original string instead  
     if(sb.length() > s.length()) return s;  
   
     return sb.toString();  
   }

[Java] String replaceAll method from space to %20

Here is a simple method to replace all occurrences of space in a string with '%20' without using the replaceAll method.

The idea is to first count how many spaces are there in the string, then allocate a char array of appropriate size and fill it up correctly.

You can check my Gist for the code (replaceSpace) with tests:

 public static String replaceSpace(String s){  
     if(s == null || s.length() == 0) return null;  
   
     int space_count = 0;  
     char c;  
   
     //count how many white spaces we have  
     for(int i = 0; i < s.length(); i++){  
       c = s.charAt(i);  
       if(c == ' ') space_count++;  
     }  
   
     if(space_count == 0) return s;  
   
     //allocate space for new string replacement. CAREFUL: we replace 1 char with 3 so we need 2 extra spaces!  
     char[] res = new char[s.length() + space_count * 2];  
   
     //prepare result string. CAREFUL: can't use same index otherwise we skip characters in either string or char[]  
     for(int i = 0, j = 0; j < s.length(); i++, j++){  
       c = s.charAt(j);  
   
       if(c == ' '){  
         res[i] = '%';  
         res[++i] = '2';  
         res[++i] = '0';  
       }  
       else res[i] = c;  
     }  
   
     return new String(res);  
   }

29/03/2017

[Java] Find most consecutively repeated character in string

Here is a simple method to find the most consecutively repeated character in a string. In my view punctuation and case should not be ignored.

The idea is simply to walk the string keeping track of the current count for the item we're evaluating. When we encounter a different character, we check if the item we were evaluating appeared more times than the current max and if so, we set it as the current max.

You can check my Gist for the code (maxRepChar) with tests:

 public static Entry maxRepChar(String s){  
     Entry max = new Entry(), curr = new Entry();  
     char c;  
   
     if(s == null || s.length() == 0) return null;  
   
     //initialize with first item in string  
     curr.count = 0;  
     curr.c = s.charAt(0);  
     max.count = 0;  
   
     for(int i = 0; i < s.length(); i++){  
       c = s.charAt(i);  
   
       /*  
       keep incrementing the current counter for the item we're evaluating now  
       as long as we have a subsequent uninterrupted stream  
       */  
       if(c == curr.c) curr.count++;  
       else{  
         /*  
         when we interrupt the stream, check if the item we're evaluating  
         appeared more times than the current maximum, if so, set it as max  
         */  
         if(curr.count > max.count){  
           max.count = curr.count;  
           max.c = curr.c;  
         }  
   
         //and reset the counter for current item  
         curr.c = c;  
         curr.count = 1;  
       }  
     }  
   
     //don't miss the case when the max run untils end of string  
     if(curr.count > max.count){  
       max.count = curr.count;  
       max.c = curr.c;  
     }  
   
     return max;  
   }

The Entry class is simply a helper structure:

char c;
int count = 0;

27/03/2017

[Java] Check whether a string is palindrome

Here is a simple method to test if a string is palindrome. In my view punctuation should be ignored as well as case.

The idea is simply to walk the string from the borders towards the middle at the same time, skipping characters that are not alphanumeric until we either reach the middle or find two characters which are not same. Also, a single character is palindrome.

You can check my Gist for the code (isPalindrome) with tests:

 public static boolean isPalindrome(String s){  
     if(s == null || s.length() < 1)return false;  
   
     for(int i = 0, j = s.length() - 1; j > i; ){  
       if(!Character.isLetterOrDigit(s.charAt(i))){  
         i++;  
         continue;  
       }  
       if(!Character.isLetterOrDigit(s.charAt(j))){  
         j--;  
         continue;  
       }  
       if(Character.toLowerCase(s.charAt(i)) != Character.toLowerCase(s.charAt(j)))return false;  
       else{  
         i++;  
         j--;  
       }  
     }  
     return true;  
   }

[Java] Trie data structure for text input prediction

A Trie is a very useful data structure with a confusing name, which is luckily easier to use than to pronounce.

I have seen it mainly used in input text suggestion or String analysis. The structure is a Tree where each node can have a label and multiple children; nodes with a label represent a full piece of data (eg a word) while nodes without label are used only for navigation or substring evaluation for example.

The list of children can be implemented with different data structures, I propose a version using a HashMap for faster searching: given an input String s, starting from the first character keep calling get(char) until we reach a word to suggest. Each get is O(1) so we are only paying O(M) (worst case is we're going to predict the longest string present).

Using a List would require us to scan the list until we find the entry for the next character which I think due to big-O magic can be assumed to still be O(1) since the list is at most as big as the alphabet we are using and therefore we can call it constant size, but still sounds like an unnecessary operation to me.

LinkedHashMap would not help much unless we associate some meaning with the insertion order eg: most often used words first.

TreeMap would slow the process down but guarantee to retrieve words in the desired order (eg alphabetical).

26/03/2017

[Java] 0/1 Knapsack problem O(n log n) heuristic

The 0/1 knapsack problem is a fairly common problem in daily life and it's unfortunately one of those problems where you must accept that the optimal solution cannot be computed as quickly as you would like, no matter how hard you think about the problem. As a friend put it, "if the solution looks like black magic, there is probably something wrong with it".

The problem description states: given a recipient with a fixed capacity and a set of items, each with a specific weight and value, the goal is to find the subset that maximises the total value of the chosen items while not exceeding the maximum recipient capacity.
In the 0/1 version, you cannot break an object (not possible to only take a fraction of it) so you can only either take it or not.

Sometimes calculating the optimal solution is too expensive or simply unfeasible (eg online knapsack) so a different, possibly suboptimal approach has to be taken. The one described here uses a greedy algorithm and aims at maximising the total value by selecting items after ordering them by desirability in terms of value per space left after they are picked.

20/03/2017

[SQL] Excel COUNTIFS - count columns matching criteria in a row

Excel in his arsenal of useful functions has COUNTIFS, basically a count of how many elements in a one dimensional range match a specific criteria. It says multidimensional, but it's not, it's either the same criteria twice for more dimensions or a different criteria. Key point: one list at a time.

However this is a very basic need which is not immediately achievable in SQL as well since we cannot loop over columns in a row. That is, unless we remember that PIVOTing is actually a thing. In this specific case we use the inverse operation, UNPIVOT.

[Java] Mergesort sorting algorithm for int arrays

Mergesort is well known as one of the best sorting algorithms out there with a O(n log n) worst case runtime; it doesn't get much better than this without strange big-O tricks.

But while understanding how it works is easy, implementing it is a different story given that it combines two nested function calls in the recursion and requires careful array index management. Also Java pass by value adds a small gotcha for good measure.

You can find my implementation for int[] arrays (as in: not Integer, not ArrayList, not List, not whatever else. Pure and simple int primitive data type with old fashioned array, the kind that does not change size once created if you can still remember those) on my Gist along with some test cases.

12/03/2017

[Python] BloggerBackup script to backup Blogger posts

This script allows users to quickly download posts from their Blogger blog in order to keep a backup copy. On Windows there is the amazing BloggerBackupUtility but sadly that's not available for Linux as well, hence this small project comes to life.

You can get the utility here or check out the source code here. To run it, you must have Python 3 installed.

Usage is:

 bloggerbackup --api-key <API key> --blog <blog URL> --backup-dir <backup directory> [--start-date <date>] [--end-date <date>] [--verbose]

options:

   --help - prints this help  
   
   --api-key <api key> - mandatory API key used to issue queries  
   
   --blog <blog URL> - mandatory URL of the blog to operate on  
   
   --backup-dir <backup directory> - mandatory directory where to put the posts backup  
   
   --start-date <date> - optional, date from which to begin fetching posts in RFC 3339 format: yyyy-MM-ddTHH:mm:ss+HH:mm  
   
   --end-date <date> - optional, date where to stop fetching posts in RFC 3339 format: yyyy-MM-ddTHH:mm:ss+HH:mm  
   
   --verbose - optional, prints debug information while processing

The script is extremely barebone and definitely improvable. Also, it requires you to setup an own API key in order to issue queries in the free tier. To do so, visit the Google credentials page.

06/03/2017

[Java] BST element distance - parent pointer

Well, that's a bit anticlimactic. After playing with the other approaches, the one that initially gave me the most pain was also quite fast to implement as soon as the AHA! moment came.

However spoiler alert: the results are quite disappointing.

[Java] BST element distance - lowest common ancestor

So I was thinking about the parent pointer approach to find the distance between to elements in a binary search tree and avoiding using the path lists approach, which means finding the lowest common ancestor; that's the lowest node in the tree that is common to both paths to the two elements.
If the elements are one the child of the other, then it will be equal to the parent, otherwise it will either be the root if the elements are on completely different branches or the divergence node if the elements are in the same subtree.

Then I got down to coding and when I got to the LCA logic, finally, ... I did not use the parent pointer yet again :(

[Java] BST element distance - path lists

The exercise that stole too much time from me last time was about binary search trees. The description is quite simple: find the distance between two elements in a given tree; that means how many hops do you need to do to navigate to element B from element A; if the elements are the same, distance is 0, if one or both elements are missing, distance is -1.

04/03/2017

[Java] Scorecard algorithm

Some time ago I took an online coding test which left me with an unfinished exercise; I played too long with the other one and had no time to finish this :(

However it seemed quite easy and interesting so I decided to finish it later, here is the description of the exercise as I remember it and my solution:

Bill plays some kind of game where the score is tracked on a scorecard; in addition to simple scoring moves, he can attempt (and possibly fail) more rewarding moves that will influence the score calculation.

Bill starts with a score of 0 and cannot earn negative points.

Bill can alter his total score by:

scoring a point (integer) - it will be added to the total score
doubling a previous score ("X") - the previous score will be doubled before adding it to the total score. If there is no previous score, it counts as 0
summing two previous scores ("+") - the two precedent scores will be summed together before adding the result to the total score. If there are no predecessors, it counts as 0; if there is only one predecessor, it will be added again to the total score
failing a move ("Z") - the previous score will be ignored from the total score and its value will also not be considered for subsequent special moves. If there is no previous score, this has no effects

Pages