Saturday, August 11, 2018

Edit Distance

Find an minimum edit distance between two strings. One edit operation can be insertion, deletion, and substitution at the position from one string to another.

Ideas to solve
1. What is the smallest subproblem?
2. How many variables involved in subproblems?

Cautions:
1. Be careful when dealing with edge cases, and not getting index out of bound error.

Every dynamic programming problems has an directed acyclic graphs, and each node represents a subproblem.

Dynamic Programming

Dynamic Programming is an advanced problem solving technique that utilizes the answers of smaller subproblems. It start with smallest subproblem and iteratively find larger problems one by one, and save the results of each subproblem and use it for larger problems. When we go from smaller and larger subproblem, we need to find a relationship between these. Hence, the key to solve an algorithmatic problem with DP, we need to build abilities to find optimal subproblem structure and the link between them.

The first example is Longest increasing subsequences.
Input: a sequence of numbers a1,...,an,
Output: an increasing subsequence of greater length.
A subsequence is any subset of these numbers taken in order.

The longest subsequence of 5,2,7,6,3,8,9,7 is 2,3,8,9

At first attempt, I spent a lot of hours and did not come up with correct algorithm. Before we tackle into coding, always write your algorithm in a pseudocode. And check if the algorithm looks correct.
Here are few things to remember.
0. Don't look at answer. Solve it on your own.
1. Base condition
2. Did you find optimal subproblem? How to index subproblem?
3. Did you save the result of subproblems, and reuse it in later calculation? What's the format of result?
4. Did you find correct relationship between subproblems ( from smaller and larger)
5. How many functions needed?

It is been a while I am solving this kind of problem again, and it took me almost three hours to get the correct algorithm.

Here is my pseudocode and a program in Java.

1. Base condition
Result[0] = input[0]

2. Optimal subproblem
LongSub(0),...,LongSub(n-1)

3. Result format:
Use Map>

4. Relationship
LongSub(v) = maxSize ( LongSub(u),v), for all v > u where u is less than v by index position.

Java program

public class LongestSub {
public static void main (String[] args) {
System.out.println();
List list1 = new ArrayList();
list1.add(5);
list1.add(2);
list1.add(8);
list1.add(6);
list1.add(3);
list1.add(6);
list1.add(9);
list1.add(7);
System.out.println(list1);

List list2 = findLongSub(list1,8);
Collections.sort(list2);
System.out.println(list2);
}
public static List findLongSub(List input, int n)
{
Map> resultmap = new HashMap>();
List first = new ArrayList();
first.add(input.get(0));
resultmap.put(0, first);
List result = longSub(resultmap, input, n, n-1);
return result;
}
public static List longSub(Map> map, List input, int n, int i)
{
List result = new ArrayList();
result.add(input.get(i));
if (i==0)
{
return result;
}
List subArray = new ArrayList();
subArray.addAll(input.subList(0, i));
int maxsize = 0;
List intermediate = new ArrayList();
for (Integer j=0;j<subArray.size();j++)
{
intermediate.clear();
intermediate.add(input.get(i));
if (input.get(i) > subArray.get(j))
{
List temp = null;
if (map.containsKey(j))
temp = map.get(j);
else
temp = longSub(map, input,n,j);

if (temp.size() + 1 > maxsize)
{
maxsize = temp.size() + 1;
intermediate.addAll(temp);
result.clear();
result.addAll(intermediate);
}
}
}
map.put(i, result);
return result;
}

}