PUBLISHED ON: JULY 15, 2021

Python Program To Check for URL in a String

In this tutorial, we will learn how to check whether a string contains a URL. To learn about strings in Python in detail, refer to the article on strings. For a given string, we have to check whether a URL is present and, if one is found, print it.

Look at the examples to understand the input and output format.

Input: "studytonight.com"

Output: []

Input: "Profile: https://www.studytonight.com/"

Output: ['https://www.studytonight.com/']

To solve this problem, we will use regular expressions. Python provides the re module, which supports them. A regular expression is a sequence of characters, written in a specialized syntax, that describes a pattern used to match or find other strings or sets of strings.
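As a quick illustration of what a pattern looks like, here is a minimal sketch. The pattern SIMPLE_URL is an assumption made for this example only: it is much looser than the pattern used later in this tutorial and simply looks for "http://" or "https://" followed by non-whitespace characters.

```python
import re

# Illustrative pattern (an assumption for this sketch, not the tutorial's pattern):
# "http", an optional "s", "://", then one or more non-whitespace characters.
SIMPLE_URL = re.compile(r"https?://\S+")

text = "Profile: https://www.studytonight.com/"
match = SIMPLE_URL.search(text)  # returns a match object, or None if nothing matches

if match:
    print(match.group(0))  # https://www.studytonight.com/
else:
    print("no URL found")
```

A pattern this simple would also capture trailing punctuation such as a comma after the URL, which is one reason the program below uses a longer expression.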

The re module provides the findall() function, which we will use in our program. findall() scans the string from left to right and returns all non-overlapping matches as a list, in the order in which they are found. If the pattern contains no capturing groups, each item in the list is the matched string. If the pattern contains more than one capturing group, each item is a tuple with one entry per group.
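The return shape of findall() depends on the capturing groups in the pattern, which matters for the program in this tutorial. The short sketch below uses a toy input and toy patterns (both chosen only for illustration) to show the difference.

```python
import re

text = "a1 b2 c3"

# No capturing groups: each match comes back as a plain string.
plain = re.findall(r"[a-z]\d", text)
print(plain)    # ['a1', 'b2', 'c3']

# Three capturing groups: each match comes back as a tuple, one entry per group.
grouped = re.findall(r"(([a-z])(\d))", text)
print(grouped)  # [('a1', 'a', '1'), ('b2', 'b', '2'), ('c3', 'c', '3')]

# The outermost group wraps the whole pattern, so index 0 is the full match.
whole = [g[0] for g in grouped]
print(whole)    # ['a1', 'b2', 'c3']
```

This is why a program whose pattern is wrapped in an outer group can recover the full match by taking the first entry of each tuple.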

Algorithm

Look at the algorithm to understand the approach better.

Step 1- Import the re module

Step 2- Define a function that finds URLs in a string

Step 3- In the function, define a regular expression that describes the pattern of a URL

Step 4- Call findall() with the pattern and the string, and store the matches in a variable

Step 5- Return a list containing the full URL from each match

Step 6- Declare a string to search

Step 7- Pass the string to the function and print the value it returns

Python Program

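In this program, we use the findall() function of the re module to find every part of a given string that matches a URL pattern. To use it, we first import the re module. The pattern contains several capturing groups, so findall() returns a tuple for each match, and the first entry of each tuple is the complete URL. If there is no URL in the string, the program prints an empty list.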

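import re

def findURL(string):
    # Matches URLs starting with http(s)://, www., or a domain followed by "/" (case-insensitive).
    regex = r"(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))"
    # The pattern has several groups, so findall() returns one tuple per match.
    url = re.findall(regex, string)
    # The first entry of each tuple is the outermost group, which holds the whole URL.
    return [x[0] for x in url]

s = "Studytonight: https://www.studytonight.com/"
print("Urls:", findURL(s))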


Urls: ['https://www.studytonight.com/']
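To see how the function behaves on other inputs, the sketch below runs it on the two examples from the start of this tutorial and on one extra sentence. The extra sentence and the samples list are assumptions added for illustration. The regular expression is the same one used above, split across three lines only for readability.

```python
import re

def findURL(string):
    # Same pattern as in the program above, written as adjacent raw strings.
    regex = (
        r"(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)"
        r"(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+"
        r"(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))"
    )
    # Keep only the outermost group (the full URL) from each match tuple.
    return [x[0] for x in re.findall(regex, string)]

# Hypothetical sample inputs; the third one is made up for this sketch.
samples = [
    "studytonight.com",
    "Profile: https://www.studytonight.com/",
    "Read https://docs.python.org/3/library/re.html, then visit www.studytonight.com.",
]

for s in samples:
    print(findURL(s))

# []
# ['https://www.studytonight.com/']
# ['https://docs.python.org/3/library/re.html', 'www.studytonight.com']
```

The first sample returns an empty list because a bare domain is matched only when it starts with "www." or is followed by a "/". In the third sample, the comma and the final period are left out of the results because the pattern does not allow a URL to end in punctuation.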

Conclusion

In this tutorial, we learned how to check whether a string contains a URL, and how to print every URL found, using the findall() function of the re module.



About the author:
Nikita Pandey is a talented author and expert in programming languages such as C, C++, and Java. Her writing is informative, engaging, and offers practical insights and tips for programmers at all levels.