Date and Time: Oct 11, 2024, 16:38 (EST)
Link: https://leetcode.com/problems/substring-with-concatenation-of-all-words/
You are given a string s
and an array of strings words
. All the strings of words
are of the same length.
A concatenated string is a string that exactly contains all the strings of any permutation of words
concatenated.
- For example, if
words = ["ab","cd","ef"]
, then"abcdef"
,"abefcd"
,"cdabef"
,"cdefab"
,"efabcd"
, and"efcdab"
are all concatenated strings."acdbef"
is not a concatenated string because it is not the concatenation of any permutation ofwords
.
Return an array of the starting indices of all the concatenated substrings in s
. You can return the answer in any order.
Example 1:
Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]
Explanation:
The substring starting at 0 is "barfoo". It is the concatenation of ["bar","foo"] which is a permutation of words.
The substring starting at 9 is "foobar". It is the concatenation of ["foo","bar"] which is a permutation of words.
Example 2:
Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
Output: []
Explanation:
There is no concatenated substring.
Example 3:
Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
Output: [6,9,12]
Explanation:
The substring starting at 6 is "foobarthe". It is the concatenation of ["foo","bar","the"].
The substring starting at 9 is "barthefoo". It is the concatenation of ["bar","the","foo"].
The substring starting at 12 is "thefoobar". It is the concatenation of ["the","foo","bar"].
-
1 <= s.length <= 10^4
-
1 <= words.length <= 5000
-
1 <= words[i].length <= 30
-
s
andwords[i]
consist of lowercase English letters.
-
Use
word_counts{}
to record each word inwords
and their occurancy. -
Start from
range(word_len)
(we can form substring starting from range(0, word_len) and each time we step by word_len), we createl = i, counts = 0, sub_counts{}
. And we check froml
tor in range(l, len(s)-word_len + 1, word_len)
, so each increment ofr
will let us create asub_word
and we check if thissub_word
inword_counts
or not. -
In the for loop, we want to check the range of
l, r
and find if the words in this range equal to the counts ofcounts
, and if we detect repeated word in this range, ie.sub_counts[sub_word] > word_counts[sub_word]
, we incrementl += word_len
and decrementcounts -= 1
. Later, if we find equalcounts == len(words)
, we find a substring of all words, and we append thisl
index tores[]
. -
If a current
sub_word
does not exist inword_counts
, we need to incrementl = r + word_len
and clearsub_counts{}
, resetcounts = 0
. -
Finally, return
res
.
Example 1: from index 0, we can form substring = barfoothefoobarm
, so we can find any valid concatenation from this substring. Later, we go to index 1, we can form arfoothefoobarma
, for index 2, we can form rfoothefoobarman
. This is how we use range(word_len)
to find all the concatenations.
class Solution:
def findSubstring(self, s: str, words: List[str]) -> List[int]:
# Count words frequency
# start from each index of s, try if we can find all words from i
# if yes, append i to res, otherwise, continue
# if word in curr_counts > word_counts, decrement by moving l
# TC: O(n^2), SC: O(n)
res = []
word_counts = {}
word_len = len(words[0])
for word in words:
word_counts[word] = word_counts.get(word, 0) + 1
for i in range(word_len):
l = i
counts = 0
sub_counts = {}
# Check sub_word
for r in range(l, len(s) - word_len + 1, word_len):
sub_word = s[r:r + word_len]
# Check if sub_word exists in word_counts
if sub_word in word_counts:
sub_counts[sub_word] = sub_counts.get(sub_word, 0) + 1
counts += 1
# If repeated word found, shrink the window from left
while sub_counts[sub_word] > word_counts[sub_word]:
sub_counts[s[l:l + word_len]] -= 1
l += word_len
counts -= 1
if counts == len(words):
res.append(l)
# If sub_word doesn't exist, update l
else:
l = r + word_len
sub_counts.clear()
counts = 0
return res
Time Complexity:
Space Complexity: