Leet Code: Longest Substring Without Repeating Characters
y
Author: Jason Smith
Greetings programs!
The LeetCode problem we will be tackling in this post is Longest Substring Without Repeating Characters. It is classified as a medium problem.
Problem Description
Given a string s, find the length of the longest without duplicate characters.
Constraints:
- 0 <= s.length <= 5 * 10^4
- s consists of English letters, digits, symbols and spaces.
Solution
This problem is all about calculating the longest substring. So first we need to know what that longest substring is. To do that we will be using a technique called a sliding window. We will store our starting and ending indices and slide them around as we process the characters in the string. For our sliding window to work properly we must be able to move it through the characters of the string. The conditions for moving the start and the end positions are different, but relatively simple. The right portion of our sliding window will move to the character every iteration. This ends up becoming our main looping structure for processing each character. The left portion of our sliding window only moves when we find a repeating character. This is what will allow us to remove repeating characters and only look at unique substrings. To keep track of characters we have seen before I used a HashMap. It mapped a character seen to how many times it had been seen. This could also just be an array for each ASCII character allowed. The main thing is that we need to track what characters we have already seen in our current substring. Now it became a simple task of processing the string. Every character processed is first added to the HashMap if it wasn't already there. Then the right portion of the window is incremented until a repeat character is found. Once a repeat character is found we start moving the left portion of the window forward, decrementing the visited count for each character as we pass it. Finally, after each time we know we have a unique substring we take the max() of our current window size and the stored maximum value for this string. At the end we return that stored maximum value.
This runs pretty quick through the data sets. We are just looping through the characters one time after all. However, we do have that inner loop where we move the left portion of our window to make sure the substring is unique. If we had an array of the alphabet with an extra Z at the end the runtime of this could be O(N^2). This algorithm seemed to work well enough to solve the problem, but I went and read some other solutions that other coders wrote. The main thing that they did that seems to have sped up the algorithm significantly was to store the index of the last seen instance of the character instead of the character count. This allowed them to jump the left position of the window forward instantly to where they would have a unique string. If they see a position in the last seen lookup map that is before the position of the left part of the window, they just ignore it as having already been dropped. This simple optimization took their code from a O(N^2) to a O(N). A Great optimization. They also used a fixed size array that is the same size of the characters they could run into. Giving them an O(1) constant memory usage, but I don't see this as big of an approvement over just a HashMap.
Conclusion
This was a good medium problem that showed the sliding window technique for arrays. I learned a good optimization to keep in my back pocket for the next problem I run across that needs a sliding window as well. It is very important to take the time to read some of the other coders solutions to these problems. You never know what you may learn.
End of Line