Given a word and a string S, find all starting indices in S which are anagrams of word.

For example, given that word is “ab”, and S is “abxaba”, return 0, 3, and 4.

Brute force

The brute force solution here would be to go over each word-sized window in S and check if they’re anagrams, like so:

1
2
3
4
5
6
7
8
9
10
11
12
from collections import Counter

def is_anagram(s1, s2):
return Counter(s1) == Counter(s2)

def anagram_indices(word, s):
result = []
for i in range(len(s) - len(word) + 1):
window = s[i:i + len(word)]
if is_anagram(window, word):
result.append(i)
return result

This would take O(|W| * |S|) time. Can we make this any faster?

Rolling hash

One way to solve this problem is to use the Rabin-Karp algorithm. The basic idea is that we could make a frequency-based hash of the target word and check if any window under s hashes to the same value. That is, the hash would be the sum of char * prime_num ** char_freq for each character and their frequency. If the hash of the word and window matches, we could do a manual == of the two strings. Since collisions are expected to be rare, this would be O(S). However, there’s an easier way to solve this problem:

Count difference

Notice that moving along the window seems to mean recomputing the frequency counts of the entire window, when only a little bit of it actually updated. This insight leads us to the following strategy:

  • Make a frequency dictionary of the target word
  • Continuously diff against it as we go along the string
  • When the dict is empty, the window and the word matches

We diff in our frequency dict by incrementing the new character in the window and removing
old one.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
class FrequencyDict:
def __init__(self, s):
self.d = {}
for char in s:
self.increment(char)

def _create_if_not_exists(self, char):
if char not in self.d:
self.d[char] = 0

def _del_if_zero(self, char):
if self.d[char] == 0:
del self.d[char]

def is_empty(self):
return not self.d

def decrement(self, char):
self._create_if_not_exists(char)
self.d[char] -= 1
self._del_if_zero(char)

def increment(self, char):
self._create_if_not_exists(char)
self.d[char] += 1
self._del_if_zero(char)


def anagram_indices(word, s):
result = []

freq = FrequencyDict(word)

for char in s[:len(word)]:
freq.decrement(char)

if freq.is_empty():
result.append(0)

for i in range(len(word), len(s)):
start_char, end_char = s[i - len(word)], s[i]
freq.increment(start_char)
freq.decrement(end_char)
if freq.is_empty():
beginning_index = i - len(word) + 1
result.append(beginning_index)

return result

This should run in O(S) time.

Are you interviewing for programming jobs, or do you just enjoy fun programming questions? Check out our newsletter, Daily Coding Problem, to get a question in your inbox every day.