How to get MD5 Sum of a String in Python?
In this article, we will learn how to get the MD5 sum of a given string in Python. We will use a function from Python's standard library to compute it. Let's first take a quick look at what MD5 is in Python.
MD5 Hash in Python
MD5 is one of the hash functions available in Python's hashlib library. It was designed as a cryptographic hash function, but it is no longer considered secure against collision attacks, so today it is best suited to non-security tasks. Hashes are used to verify file checksums, fingerprint data, build caches of large data sets, etc. Password verification is another common use of hashing, although MD5 is a poor choice for it. hashlib.md5() accepts a byte string and returns a hash object, whose hexdigest() method returns the hash as a hexadecimal string. An MD5 hash is always 128 bits (16 bytes) long, regardless of the size of the input.
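As an illustration of the file checksum use case mentioned above, here is a minimal sketch that computes the MD5 sum of a file by reading it in chunks. The helper name file_md5 and the chunk size are assumptions made for this example and are not part of hashlib.

```python
import hashlib
import os
import tempfile

def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of the file at `path` (hypothetical helper)."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:  # binary mode: hashes work on bytes
        # Feed the file to the hash object piece by piece to keep memory use low
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()

# Demo on a temporary file that is larger than one chunk
data = b"This is a string" * 10000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    tmp_path = tmp.name

checksum = file_md5(tmp_path)
os.remove(tmp_path)

# Hashing in chunks gives the same result as hashing all bytes at once
print(checksum == hashlib.md5(data).hexdigest())  # True
```

Calling update() repeatedly is equivalent to hashing the concatenation of all chunks, which is why the chunked result matches the one-shot result.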
Hashing algorithms act on binary data rather than text, so you should be careful about which character encoding is used to convert text to bytes before hashing. The result of a hash is also binary data. In this article, we will import the hashlib library and use the hashlib.md5() function to find the MD5 sum of a given string in Python.
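To see why the choice of encoding matters, the following sketch hashes the same text under two different encodings. The sample text and variable names are chosen only for illustration.

```python
import hashlib

text = "caf\u00e9"  # "café": contains one non-ASCII character

# The same text becomes different bytes under different encodings...
utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9' (5 bytes)
latin1_bytes = text.encode("latin-1")  # b'caf\xe9' (4 bytes)

# ...so the MD5 sums differ as well
utf8_md5 = hashlib.md5(utf8_bytes).hexdigest()
latin1_md5 = hashlib.md5(latin1_bytes).hexdigest()

print(utf8_md5 == latin1_md5)  # False
```

If two programs need to agree on the MD5 sum of a piece of text, they also need to agree on the encoding. UTF-8 is a common choice.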
Three methods are mainly used here:
1. encode() - A str method that converts the given string into bytes so that the hash function can accept it.
2. digest() - Returns the hash as a bytes object (16 bytes for MD5).
3. hexdigest() - Returns the hash as a hexadecimal string, which is 32 characters long for MD5.
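The short sketch below shows how the three methods in the list above fit together. The sample text and variable names are assumptions chosen for illustration.

```python
import hashlib

text = "hello world"

data = text.encode("utf-8")      # str -> bytes
md5_hash = hashlib.md5(data)     # MD5 hash object

raw = md5_hash.digest()          # bytes object, 16 bytes long
hex_str = md5_hash.hexdigest()   # str, 32 hexadecimal characters

print(len(raw))              # 16
print(len(hex_str))          # 32
print(hex_str)               # 5eb63bbbe01eeed093cb22bb8f5acdc3
print(raw.hex() == hex_str)  # True
```

digest() and hexdigest() describe the same 128-bit value. The hexadecimal form is simply each byte written as two hex characters, which is why it is twice as long.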
Example: Use hashlib.md5() to get MD5 Sum of a String
This method imports Python's hashlib library. The examples below call the hashlib.md5() function with a byte string as the argument, which returns an MD5 hash object. In Python 3, the byte string is obtained by calling encode() on the str, which returns the text encoded as bytes. hexdigest() is then called to get the hash in hexadecimal format. Alternatively, you can call digest() to get the hash as raw bytes. The examples print the resulting hash of each string.
Python 2.x Example
import hashlib
# In Python 2, a plain string literal is already a byte string, so no encode() is needed
# using hexdigest()
print hashlib.md5("This is a string").hexdigest()
print hashlib.md5("000005fab4534d05key9a055eb014e4e5d52write").hexdigest()
Output:
41fb5b5ae4d57c5ee528adb00e5e8e74
f927aa1d44b04f82738f38a031977344
Python 3.x Example
import hashlib
# In Python 3, a str must be encoded to bytes before hashing
# using hexdigest(): returns the hash as a hexadecimal string
print(hashlib.md5("This is a string".encode('utf-8')).hexdigest())
print(hashlib.md5("000005fab4534d05key9a055eb014e4e5d52write".encode('utf-8')).hexdigest())
# using digest(): returns the same hash as raw bytes
print(hashlib.md5("This is a string".encode('utf-8')).digest())
print(hashlib.md5("000005fab4534d05key9a055eb014e4e5d52write".encode('utf-8')).digest())
Output:
41fb5b5ae4d57c5ee528adb00e5e8e74
f927aa1d44b04f82738f38a031977344
b'A\xfb[Z\xe4\xd5|^\xe5(\xad\xb0\x0e^\x8et'
b"\xf9'\xaa\x1dD\xb0O\x82s\x8f8\xa01\x97sD"
Note:
1. If you need the output as bytes, use digest() instead of hexdigest().
2. You may have noticed in the above examples that Python 2 does not require UTF-8 encoding but Python 3 does. If you run the program in Python 3 without encode(), you will get a TypeError. Reason: the MD5 function takes a byte string and does not accept Unicode text. Python 3 is explicit: a str ("") is Unicode and has to be encoded to a byte string. In Python 2, a plain str ("") is already a byte string, so it can be passed directly. A Python 2 unicode string (u"") is implicitly encoded as ASCII, so it raises an exception if it contains non-ASCII characters. Encoding text as UTF-8 leaves ASCII characters unchanged and converts all other characters to multi-byte sequences.
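The Python 3 behaviour described in the note can be checked with a small sketch like the one below. In Python 3 the error raised is a TypeError. The variable names are assumptions chosen for illustration.

```python
import hashlib

text = "This is a string"

try:
    hashlib.md5(text)  # passing a str instead of bytes
except TypeError as err:
    error_raised = True
    print("TypeError:", err)
else:
    error_raised = False

# Encoding the string first works as expected
hex_md5 = hashlib.md5(text.encode("utf-8")).hexdigest()
print(hex_md5)
```

The exact wording of the error message varies between Python versions, so the sketch only relies on the exception type.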
Conclusion
In this article, we learned how to use the hashlib.md5() function to get the MD5 sum of a string. We discussed what MD5 hashing is and why it is used. We saw how to call the hash function in both Python 2 and Python 3.