Python Program To Check for URL in a String
In this tutorial, we will learn how to check whether a string contains a URL. To know about Strings in Python in detail, refer to the article on strings. For a given string, we have to check if a URL is present and, if one is found, print it.
Look at the examples to understand the input and output format. The first input returns an empty list because the pattern used in this tutorial only recognizes text that starts with http:// or https://, starts with www., or is a domain name followed by a slash.
Input: "studytonight.com"
Output: []
Input: "Profile: https://www.studytonight.com/"
Output: ['https://www.studytonight.com/']
To solve this problem, we will use regular expressions. Python provides the re module, which supports regular expressions. A regular expression is a special sequence of characters, written in a specialized pattern syntax, that helps match or find other strings or sets of strings.
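As a small illustration of what a pattern looks like, the sketch below uses a deliberately simplified URL pattern. This pattern is an assumption made for the example only; it is much less thorough than the one used in the program later in this tutorial.

```python
import re

# Simplified, illustrative pattern (an assumption for this sketch):
# "http", an optional "s", "://", then one or more non-whitespace characters.
pattern = re.compile(r"https?://\S+")

text = "Profile: https://www.studytonight.com/"
match = pattern.search(text)

# search() returns a match object if the pattern occurs anywhere, else None.
found = match.group(0) if match else None
print(found)  # https://www.studytonight.com/
```

A pattern this simple would also capture trailing punctuation such as a closing bracket or full stop, which is one reason a more careful pattern is preferable in practice.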
The re module provides the findall() function, which we will use in our program. findall() scans the string from left to right and returns all non-overlapping matches as a list, in the order in which they are found. If the pattern contains no capturing groups, each item in the list is the matched string. If the pattern contains more than one capturing group, as the pattern in this tutorial does, each item is a tuple holding the text captured by each group.
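The difference between the two return shapes can be seen with a toy pattern. The text and patterns here are made up for illustration and are unrelated to URLs.

```python
import re

text = "a1 b2 c3"

# No capturing groups: each match is returned as a plain string.
no_groups = re.findall(r"[a-z]\d", text)

# Two capturing groups: each match is returned as a tuple of the groups.
two_groups = re.findall(r"([a-z])(\d)", text)

print(no_groups)   # ['a1', 'b2', 'c3']
print(two_groups)  # [('a', '1'), ('b', '2'), ('c', '3')]
```

This is why a program that uses a grouped pattern has to pick one element out of each tuple to get the text it is interested in.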
Algorithm
Look at the algorithm to understand the approach better.
Step 1- Import re module
Step 2- Define a function that will find the URL
Step 3- In the function, define a regular expression that describes the pattern of a URL
Step 4- Declare another variable that will store all the matches of this pattern found in the string
Step 5- Return a list containing the full URL from each match
Step 6- Declare a string with characters
Step 7- Pass the string in the function and print the value returned by it
Python Program
In this program, we use the findall() function of the re module to find a defined pattern in a given string. To use it, we have to import the re module in the program. If there is no URL in the string, the program will print an empty list.
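import re

def findURL(string):
    # Pattern for http(s):// links, www. hosts, and domain/path forms
    regex = r"(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))"
    # The pattern contains groups, so findall() returns one tuple per match
    matches = re.findall(regex, string)
    # The first group of each tuple holds the complete URL
    return [x[0] for x in matches]

s = "Studytonight: https://www.studytonight.com/"
print("Urls:", findURL(s))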
Urls: ['https://www.studytonight.com/']
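If only a yes/no answer is needed, the same pattern can be used with re.search() instead of findall(). The sketch below is one possible way to do this; the names URL_REGEX and contains_url are hypothetical and are not part of the program above.

```python
import re

# Same pattern as in the program above, compiled once so it can be reused.
URL_REGEX = re.compile(
    r"(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))"
)

def contains_url(text):
    # Hypothetical helper: True if the pattern matches anywhere in text.
    # search() stops at the first match, so it does not build a list.
    return URL_REGEX.search(text) is not None

print(contains_url("studytonight.com"))                        # False
print(contains_url("Profile: https://www.studytonight.com/"))  # True
```

The two inputs are the ones from the examples at the start of this tutorial, so the results agree with the empty and non-empty lists shown there.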
Conclusion
In this tutorial, we have learned how to check whether a string contains a URL, and how to print any URLs found, using the findall() function of the re module.