Pages
▼
30/01/2018
[Java] Unbounded knapsack algorithm
Taking up again the knapsack challenge I tried some time ago, I decided to solve an easier problem first, the unbounded knapsack.
We already know we need to compute some values in order to find the perfect solution to the problem from past experience :) so this time we'll build our information data structure incrementally, starting from considering the bag with capacity 0 and working up to the input capacity.
The important thing is to convince ourselves that there is NO BETTER SOLUTION than considering all possibilities, with some interesting ways to reduce the number of possibilities though ;)
At each weight increase we want to know what would the best pick(s) be from our items if the maximum weight was the one we're currently evaluating. Therefore the key part is: we need to check ALL items every time to determine whether it's best to pick the item alone OR the item PLUS some of the previous items OR even MORE of the previous items.
If we care only about the maximum value we can achieve, we can easily track this with an integer array or similar, but we would lose the information regarding which and how many items make up our best solution; we can fix this by using some object to track the partial information and update our result in a nice way before returning from the function.
By knowing the best pick at each step, we will reach in O(NxK) time and O(k) space the best solution where N is the number of items and k is the maximum weight the bag can carry.
We can put some improvements in place, for example we can preprocess the items to discard any item that weighs more than what our bag can carry or any item for which a better item exist (more value for same or less weight).
You can check my implementation of fillUnboundedKnapsack on my Gist alongside with some tests in KnapsackJTests and the supporting structures I use.
We already know we need to compute some values in order to find the perfect solution to the problem from past experience :) so this time we'll build our information data structure incrementally, starting from considering the bag with capacity 0 and working up to the input capacity.
The important thing is to convince ourselves that there is NO BETTER SOLUTION than considering all possibilities, with some interesting ways to reduce the number of possibilities though ;)
At each weight increase we want to know what would the best pick(s) be from our items if the maximum weight was the one we're currently evaluating. Therefore the key part is: we need to check ALL items every time to determine whether it's best to pick the item alone OR the item PLUS some of the previous items OR even MORE of the previous items.
If we care only about the maximum value we can achieve, we can easily track this with an integer array or similar, but we would lose the information regarding which and how many items make up our best solution; we can fix this by using some object to track the partial information and update our result in a nice way before returning from the function.
By knowing the best pick at each step, we will reach in O(NxK) time and O(k) space the best solution where N is the number of items and k is the maximum weight the bag can carry.
We can put some improvements in place, for example we can preprocess the items to discard any item that weighs more than what our bag can carry or any item for which a better item exist (more value for same or less weight).
You can check my implementation of fillUnboundedKnapsack on my Gist alongside with some tests in KnapsackJTests and the supporting structures I use.
28/01/2018
[Java] Find maximum number of concurrent events in a calendar
This problem can be generalized to: find the maximum number of overlapping entries in a time series. We will focus on a calendar example where each event has a start and end time; you can find here the implementation of the CalendarEntry and CalendarEvent classes.
This problem is very similar to the one about verifying if all open parentheses in an expression have a matching closing parenthesis. In both cases we need to be able to find start and end points of an interval and verify where do they reside.
But the calendar entries might not be given in a sorted order already so to solve our problem we need to first split them in bot start and end points, then sort the points by comparing the times and breaking ties by giving precedence to the start points - this is very important! - if precedence is given to the end points we are not tracking the information we need, therefore double check that compareTo override :)
After the input data is sorted, we simply need to walk through the sorted list/array/whatever and keep a counter that we increase every time we encounter a start point and decrease when an end point is found. The maximum value of the counter is also the maximum number of concurrent events in the calendar.
You can check the implementation of getMaxCalendarOverlap on my Gist alongside some test cases in CalendarJTests.
This problem is very similar to the one about verifying if all open parentheses in an expression have a matching closing parenthesis. In both cases we need to be able to find start and end points of an interval and verify where do they reside.
But the calendar entries might not be given in a sorted order already so to solve our problem we need to first split them in bot start and end points, then sort the points by comparing the times and breaking ties by giving precedence to the start points - this is very important! - if precedence is given to the end points we are not tracking the information we need, therefore double check that compareTo override :)
After the input data is sorted, we simply need to walk through the sorted list/array/whatever and keep a counter that we increase every time we encounter a start point and decrease when an end point is found. The maximum value of the counter is also the maximum number of concurrent events in the calendar.
You can check the implementation of getMaxCalendarOverlap on my Gist alongside some test cases in CalendarJTests.
27/01/2018
[Java] Bellman-Ford algorithm
Let's start the new year with something not overly complicated like a proven graph algorithm, Bellman-Ford.
This is a good example of an algorithm for which choosing a good data structure can be the difference between a walk in a park or in hell during implementation.
A key point here is to represent the graph a two lists: Vertices and Edges. Using a different data structure instead can lead to a very messy code and complex implementation; for example I would advise not to keep a list of out edges in each Vertex and track the out vertex in the edge itself - even though the uploaded classes allow you to do so, but I keep them for the next exercises.
Also another important point is to understand exactly how a negative cycle can be detected and tracked into the resulting data structure. While we only need (V - 1) x E iterations to create the table, we need instead VxE iterations to be absolutely sure no negative cycle exists.
Another gotcha: we use "infinity" to mark the unknown distances at the beginning, but when we update the distances themselves if we are starting from infinity it means we don't know yet how to reach that specific node, so we should skip tracking it in the table. At the end, if the node is not present in the table or the distance to the node is "infinity", it means it's unreachable.
The last important step is to remember to update the path cost to "negative infinity" for all nodes in our graph that are in a negative cycle reachable from the given starting node; this gives us a full understanding of which paths are actually valid.
You can check my implementation on my Gist along with some test cases in BellmanFordJTests. Remember to check out the Vertex and Edge classes too.
This is a good example of an algorithm for which choosing a good data structure can be the difference between a walk in a park or in hell during implementation.
A key point here is to represent the graph a two lists: Vertices and Edges. Using a different data structure instead can lead to a very messy code and complex implementation; for example I would advise not to keep a list of out edges in each Vertex and track the out vertex in the edge itself - even though the uploaded classes allow you to do so, but I keep them for the next exercises.
Also another important point is to understand exactly how a negative cycle can be detected and tracked into the resulting data structure. While we only need (V - 1) x E iterations to create the table, we need instead VxE iterations to be absolutely sure no negative cycle exists.
Another gotcha: we use "infinity" to mark the unknown distances at the beginning, but when we update the distances themselves if we are starting from infinity it means we don't know yet how to reach that specific node, so we should skip tracking it in the table. At the end, if the node is not present in the table or the distance to the node is "infinity", it means it's unreachable.
The last important step is to remember to update the path cost to "negative infinity" for all nodes in our graph that are in a negative cycle reachable from the given starting node; this gives us a full understanding of which paths are actually valid.
You can check my implementation on my Gist along with some test cases in BellmanFordJTests. Remember to check out the Vertex and Edge classes too.