Nlogk
Code Templates

Rolling Hash

Use Cases for Rolling Hash

Use CaseDescriptionExample
Substring fingerprint in O(1)After O(n) preprocessing, hash any substring [left, right] in constant time.Compare s[2:5] and s[10:13] without rescanning characters.
Pattern search (Rabin–Karp)Slide a fixed-length window; compare needle hash to each window hash, then verify on match.Find first index of needle in haystack. 28. Find the Index of the First Occurrence in a String
Repeated substring detectionTrack window hashes in a set; duplicates imply a repeated substring of that length.187. Repeated DNA Sequences — length-10 windows on a genome string.
Binary search on answer + hash checkFor “longest substring with property X”, binary search length L and use rolling hash to test in O(n).1044. Longest Duplicate Substring
Palindrome / prefix checks (with reverse hash)Maintain forward and backward hashes to test palindromes or prefix equality quickly.214. Shortest Palindrome

Formula: hash(s[left:right+1]) = (prefix[right+1] - prefix[left] * power[length]) % mod

Code Template

Constants and prefix setup

  1. Pick a base (often 26 or 131) and a large mod (e.g. 10**9 + 7) to reduce collisions.
  2. Precompute power[i] = base^i mod mod and prefix[i] = hash of s[0:i).

Build prefix hashes

  1. power[i+1] = power[i] * base % mod.
  2. prefix[i+1] = prefix[i] * base + ord(s[i]) (or map a→0 … z→25 for lowercase letters).

Substring hash query

  1. Inclusive range [left, right]: subtract the prefix before left, scaled by the window length.

Rabin–Karp search helper

  1. Build RollingHash on text and pattern; compare pattern hash to each length-m window.
  2. On hash match, verify with a direct string compare (hash collisions are possible).
class RollingHash:
def __init__(self, s, base=131, mod=10**9 + 7):
self.base = base
self.mod = mod
n = len(s)
self.power = [1] * (n + 1)
self.prefix = [0] * (n + 1)

Common mistakes

  1. Wrong substring length in get_hash
# Wrong: use right instead of window length
return (self.prefix[right + 1] - self.prefix[left] * self.power[right]) % self.mod
# Correct:
length = right - left + 1
return (self.prefix[right + 1] - self.prefix[left] * self.power[length]) % self.mod
  1. Forgetting collision check
# Wrong: return on hash match only
if hash_text.get_hash(start, start + m - 1) == target:
return start
# Correct: verify the actual substring
if hash_text.get_hash(start, start + m - 1) == target:
if text[start:start + m] == pattern:
return start
  1. Mismatched base/mod between two strings
# Wrong: different mod when comparing haystack and needle hashes
hash_text = RollingHash(haystack, mod=10**9 + 7)
hash_pattern = RollingHash(needle, mod=10**9 + 9)
# Correct: same base and mod
hash_text = RollingHash(haystack)
hash_pattern = RollingHash(needle)

Issues and consideration

  • Collisions: Two different strings can share the same hash mod M. Always confirm with a direct compare in contests and interviews when the answer must be exact.
  • Double hashing: Use two (base, mod) pairs and compare both hashes for stricter checks (common on 1044).
  • Negative modulo: Python % handles (a - b) % mod; in other languages normalize negative results after subtraction.
  • Character encoding: ord(s[i]) works for general ASCII; for az only, ord(s[i]) - ord('a') keeps numbers smaller.
  • Complexity: Preprocess O(n); each get_hash O(1); Rabin–Karp scan O(n) windows with O(1) hash each (plus occasional verify).

Same RollingHash class and get_hash; only the outer loop changes.

28. Find the Index of the First Occurrence in a String

class RollingHash:
def __init__(self, s, base=131, mod=10**9 + 7):
self.base = base
self.mod = mod
n = len(s)
self.power = [1] * (n + 1)
self.prefix = [0] * (n + 1)
for i in range(n):
self.power[i + 1] = self.power[i] * self.base % self.mod
self.prefix[i + 1] = (self.prefix[i] * self.base + ord(s[i])) % self.mod
def get_hash(self, left, right):
length = right - left + 1
return (self.prefix[right + 1] - self.prefix[left] * self.power[length]) % self.mod
def rabin_karp(text, pattern):
n, m = len(text), len(pattern)
if m == 0:
return 0
if m > n:
return -1
hash_text = RollingHash(text)
hash_pattern = RollingHash(pattern)
target = hash_pattern.get_hash(0, m - 1)
for start in range(n - m + 1):
if hash_text.get_hash(start, start + m - 1) == target:
if text[start:start + m] == pattern:
return start
return -1
class Solution:
def strStr(self, haystack: str, needle: str) -> int:
return rabin_karp(haystack, needle)

187. Repeated DNA Sequences

class RollingHash:
def __init__(self, s, base=131, mod=10**9 + 7):
self.base = base
self.mod = mod
n = len(s)
self.power = [1] * (n + 1)
self.prefix = [0] * (n + 1)
for i in range(n):
self.power[i + 1] = self.power[i] * self.base % self.mod
self.prefix[i + 1] = (self.prefix[i] * self.base + ord(s[i])) % self.mod
def get_hash(self, left, right):
length = right - left + 1
return (self.prefix[right + 1] - self.prefix[left] * self.power[length]) % self.mod
class Solution:
def findRepeatedDnaSequences(self, s: str):
n = len(s)
window = 10
if n < window:
return []
rh = RollingHash(s)
seen = {}
result = []
for start in range(n - window + 1):
h = rh.get_hash(start, start + window - 1)
if h in seen:
sub = s[start:start + window]
if sub not in result:
result.append(sub)
else:
seen[h] = start
return result