Hello Erevan,
Today we will discuss one more medium complexity problem. We will discuss it because I was asked to solve this problem on one of the interview. And I also had great time on finding more optimal solution now.
So, as always in the beginning let?s analyze the problem first. We need to find longest substring on unrepeated characters. Basically this is all 🙂
- Easiest solution ? is sliding window solution.
We start from index = 0, and pointer = 0, max length string currently is 1. We are moving pointer to the right, till the end of the string, or when string in window stops being unique. When string in window stops being unique ? let?s move index to the right till it become unique again. On each move of any pointer ? we checking max length.
The biggest drawback in this solution is of course validating if string is unique. As a dumb solution we can just go from the beginning of the string in slide window and check if we meet same character more then 1 time.
2. Optimised with hash table solution 1.
Since the biggest problem of the solution 1 was validating of uniqueness, we can optimise this part by adding hashtable.
Once we are moving pointer to the right ? we are adding element to the hashtable, or if it was already there ? increasing value of this element to +1. We will know, that the string is not unique once element in hashtable will be present more then once.
When we are moving index to right ? we are removing elements from hashtable ( decreasing their count to -1 ). Once we find character which was met by pointer, our job is done, and string in sliding window unique again.
So, solution can looks something like this:
Weird and not funny code:) Let?s change on how we are tackling the problem!
3. Count only indexes. Basically we do not need to save any elements, or something. What we need ? is once we found element, which makes our string not unique ? find last occurrence of this char, and calculate length of the string between current pointer AND last occurrence of the char + 1.
Let?s take an example of string:
abcabc
We are starting from index = 0. Once we read the char => we are adding it to the hashtable with index = 0. b with index 1. c with index 2. Then we?ve encountered character a again. it is obvious that unique string starts from the first location of a + 1: abcabc and length 3.
Then after we meet b again, we are moving again: abcababc. And so on.
So, for us ? main thing is to save last occurrence of each symbol, which helps us to calculate length of the unique string.
As you can see the code is way smaller, and more easy to read.
Complexity of the solution: O(n) ( if to not speak about hashtable internal complexity ). Size ? O(1), since we are limited only with 26+ chars.
Common question about the code: why start <= usedChars[s[i]].
This is easy one. Imagine next string:
abcdbac
First occurrence on not unique character bill be ?b?: abcdbac. That means that last string with all unique characters before this b ? will be abcdbac. So, we are pointing start pointer to the element with index 2.
Since we are not saving here elements counts in hashtable, once we will find second a ? we will see, that this is not unique char again, however its position was before even first b, so we kindly ignoring that, since 0 ( which is index of a ) is less then value of start.
Happy coding, Erevan!