Count the Number of Occurrences of a given Substring in a String
In this article, we will learn how to count the occurrences of a substring in a string in Python. We will discuss codes having built-in functions, without built-in functions. Let's first have a quick look over what is a string in Python.
Python String
The String is a type in python language just like integer, float, boolean, etc. Data surrounded by single quotes or double quotes are said to be a string. A string is also known as a sequence of characters.
string1 = "apple"
string2 = "Preeti125"
string3 = "12345"
string4 = "pre@12"
In Python, we can count the occurrences of a substring from a given string using three different methods. The mentioned codes will return the count of how many times a substring is present in a string.
For Example,
Example: Count the Occurrences of Substring using Pattern Searching Algorithm
This is a simple solution to match characters of a substring one by one and we increment the counter by 1 when we get the complete match for the substring. This program is generally helpful for those who are looking for an algorithm without the use of any built-in functions.
Time Complexity: O(M*N)
def count(sub, s):
M = len(sub)
N = len(s)
res = 0
# A loop to slide sub[] one by one
for i in range(N - M + 1):
# For current index i, check for the match
j = 0
while(j < M):
if (s[i + j] != sub[j]):
break
j += 1
if (j == M):
res += 1
j = 0
return res
# Driver Code
string = "abracadabra"
substring = "bra"
print("Count:", count(substring, string))
Count: 2
Example: Count the Occurrences of Substring using KMP Algorithm
This solution is based on KMP(Knuth Morris Pratt) algorithm. The basic idea behind this algorithm is that it detects the mismatched pattern or substring instead of the matched pattern. lps[] array is used to skip the characters while matching. The following is a self-explanatory code. We will look into this algorithm in detail in another article.
Time Complexity: O(M+N)
def count(sub, s):
M = len(sub)
N = len(s)
# Create lps[] that will hold the longest prefix suffix values for subtern
lps = [None] * M
j = 0 # index for sub[]
# Preprocess the substring (calculate lps[] array)
lps_Array(sub, M, lps)
i = 0 # index for s[]
res = 0
next_i = 0
while (i < N):
if sub[j] == s[i]:
j = j + 1
i = i + 1
if j == M:
# When we find substring first time, we iterate again to check if there exists more substring
j = lps[j - 1]
res = res + 1
# We start i to check for more than once appearance of substring, we will reset i to previous start+1
if lps[j] != 0:
next_i = next_i + 1
i = next_i
j = 0
# Mismatch after j matches
elif ((i < N) and (sub[j] != s[i])):
# Do not match lps[0..lps[j-1]] characters, they will match anyway
if (j != 0):
j = lps[j - 1]
else:
i = i + 1
return res
def lps_Array(sub, M, lps):
# Length of the previous longest prefix suffix
len = 0
i = 1
lps[0] = 0 # lps[0] is always 0
# The loop calculates lps[i] for i = 1 to M-1
while (i < M):
if sub[i] == sub[len]:
len = len + 1
lps[i] = len
i = i + 1
else: # (sub[i] != sub[len])
# search the step
if len != 0:
len = lps[len - 1]
else: # if (len == 0)
lps[i] = len
i = i + 1
# Driver code
string = "abracadabra"
substring = "bra"
print("Count:", count(substring, string))
Count: 2
Example: Count the Occurrences of Substring using count() Function
In this example, we use built-in count()
function to count the occurrences of the substring in the given string. It takes substring as an argument. Also, you can provide substring, start and stop arguments to find a substring within a range.
Time Complexity: O(n)
string = "abracadabra"
substring = "bra"
ct = string.count(substring)
print("Count:",ct)
Count: 2
Conclusion
In this article, we learned to count the occurrences of a substring in a given string in Python by using several methods. We used some simple algorithms like pattern searching without any built-in function, KMP algorithm, and count() function to count the occurrences. We discussed that all these methods along with their time complexities.