07/06/2021

[Java] Merge K sorted arrays

Given K sorted arrays of different lengths, merge them in a single sorted array.

If we had two arrays only, we could solve this in O(N+M) time by keeping two pointers, one for each array, and at each iteration we compare the current element for each to decide which one to add to the result set, then we advance only that pointer.

When we can't move anymore, if we still have elements left in one array, we add them all.

Expanding this approach to K > 2 yields a O(NxK^2) time unfortunately since we can't just as easily do the element comparison to find the smallest out of all arrays and we would therefore loop over K elements each time to find our pick.

We know sorting an array takes O(N log N) time, therefore we could simply append all arrays together and then sort them in O(NK log NK) time.

However our approach for K = 2 can be expanded a bit to achieve a slightly worse time than linear which is still better than this append and sort approach.

If we had an efficient way of tracking and retrieving the smallest item of a given set, we'd have our solution. This luckily comes in the form of a MinHeap.

We can therefore for each array, insert into the heap a Pair(value, Pair(index of this array in input, index of this value in this array)) and also provide a comparator that only looks at the values.

By initializing the heap with the very first element of each array, we are ready to start our logic. We keep polling the heap, which will return the smallest of all values and add that to our result (we also need to track how many items we have inserted so far in order to know exactly where to place the next one).

If the source array of that value has more elements, we add the next element to our heap.

When our heap is empty, we have processed all input.

This uses O(K) extra space but results in O(NxK log K) time since we only keep maximum K elements in the heap at each time.

If any input array is null or empty, we raise an exception. We could just as easily decide to skip it altogether instead. 

This has also the added benefit of allowing us to sort huge arrays as we can keep on disk the input and output and only need to keep in memory K items at a time.

You can check my implementation of mergeKSortedArrays on my Gist along with some tests in MergeKSortedArraysJTests.

No comments:

Post a Comment

With great power comes great responsibility