Question :Given strings S and T, find the minimum (contiguous) substring W of S, so that T is a subsequence of W.
If there is no such window in S that covers all characters in T, return the empty string “”. If there are multiple such minimum-length windows, return the one with the left-most starting index.
Example :
Input: S = “abcdebdde”, T = “bde”Output: “bcde”Explanation: There are many substrings with “bde” but the smallest amongst them is “bcde” and “bdde” of length 4. Out of these 2, “bcde” occurs first, Hence it is the answer.
Idea:
We can have the brute force solution where we consider every substring of S and check whether it contains T. Amongst those substrings, our answer will be the one of the shortest length and smallest index. But it?s time complexity would be O(S) (number of substrings * length of substring).
But this can be solved elegantly using DP, which I am going to discuss here.
The idea here is that, we let dp[i][j] = k denote that in substring S(0?i), there exists a subsequence corresponding to T(0?j) starting at index k of S.
So, for our answer, we will have to consider the entire string T. Hence, our starting index would be one amongst dp[0][t-1], dp[1][t-1], ?, dp[s-1][t-1], where s and t are the lengths of s and t respectively.Now, how do we derive dp[i][j] ?
if(S[i] == T[j]) dp[i][j] = dp[i-1][j-1]else dp[i][j] = dp[i-1][j]
Clearly, when S[i] == T[j], S(0?i) contains T(0?j) and the last character is the same, so we have to search for the remaining T, i.e, T(0?j-1) in S(0?i-1) else we would have to search for the entire T(0?j) in S(0?i-1).
Also, dp[i][j] = -1 if subsequence T is not found in S.
Code :
string minWindow(string S, string T) { int s = S.size(); int t = T.size(); vector<vector<int>> dp(s, vector<int>(t, -1)); for(int i = 0; i < s; i++) { if(S[i] == T[0]) { dp[i][0] = i; } else { if(i != 0) { dp[i][0] = dp[i-1][0]; } } } for(int i = 1; i < s; i++) { for(int j = 1; j < t; j++) { if(S[i] == T[j]) { dp[i][j] = dp[i-1][j-1]; } else { dp[i][j] = dp[i-1][j]; } } } int begin = -1, length = INT_MAX; for(int i = 0; i < s; i++) { int index = dp[i][t-1]; if(index != -1) { int curLength = i – index + 1; if(curLength < length) { begin = index; length = curLength; } } } if(begin == -1) return “”; return S.substr(begin, length); }
Code Explanation :
- We first initialise the first column of our dp array
vector<vector<int>> dp(s, vector<int>(t, -1)); for(int i = 0; i < s; i++) { if(S[i] == T[0]) { dp[i][0] = i; } else { if(i != 0) { dp[i][0] = dp[i-1][0]; } } }
- We initialise the entire dp array by the equations explained above
for(int i = 1; i < s; i++) { for(int j = 1; j < t; j++) { if(S[i] == T[j]) { dp[i][j] = dp[i-1][j-1]; } else { dp[i][j] = dp[i-1][j]; } }}
- We start traversing the last column of the dp array because that will have the entire string T covered. Our answer should be the string with the shortest length, i.e., shortest i-beginIndex+1, where beginIndex is already given by dp[i][t-1].
- Our algorithm will automatically find the substring starting at an earlier index in case of length match as we are traversing from i = 0 to s-1, and updating our ans only if we get a string of greater length.
int begin = -1, length = INT_MAX; for(int i = 0; i < s; i++) { int index = dp[i][t-1]; if(index != -1) { int curLength = i – index + 1; if(curLength < length) { begin = index; length = curLength; } } }
- We return an empty string in case we do not find one, i.e., our begin = -1 or we return the string starting at index begin of length length.
if(begin == -1) return “”;return S.substr(begin, length);
Time Complexity : O(st)Space Complexity : O(st)
Well, this was my first attempt at explaining.Improvement on its way ! 🙂