Parse a YAML file in Python
In this article, we will learn various ways to parse YAML files in Python. We will use the yaml module, which comes from the third-party PyYAML package (installed with pip install pyyaml) and is not part of the standard library. Let's first look at what YAML stands for and what it is used for, and then go through the functions that read YAML documents in Python.
Introduction to YAML
YAML stands for YAML Ain't Markup Language. It is a human-readable data serialization language commonly used for configuration files and data storage. Reading the contents of a YAML file and analyzing its logical structure is known as parsing. Parsing a YAML file in Python converts its contents into Python objects: a YAML mapping becomes a dictionary, a sequence becomes a list, and scalar values become strings, numbers, and so on. YAML files use the .yml or .yaml file extension.
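As a quick illustration of how YAML structures map to Python types, the sketch below parses small YAML strings instead of files. It assumes PyYAML is installed; the sample text and variable names are made up for illustration.

```python
import yaml

# A YAML mapping (hypothetical sample text) parses to a dict.
mapping_text = """
cap: 1
name: purse
tags:
  - small
  - leather
"""
mapping = yaml.safe_load(mapping_text)
print(type(mapping).__name__, mapping)

# A top-level YAML sequence parses to a list, not a dict.
sequence = yaml.safe_load("- Algeria\n- Poland\n")
print(type(sequence).__name__, sequence)
```

Note that the integer 1 arrives as a Python int and the nested sequence as a list, without any manual conversion.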
We will parse the following two YAML files in this article: items.yaml and data.yaml.
# items.yaml
cap: 1
purse: 5
books: 23
case: 2
bottles: 12
pens: 6
# data.yaml
country:
- Algeria
- Bangladesh
- Poland
- Guinea
- Denmark
---
company:
- Getsocio
- Flyder
- Powerstorm
- Indofood
Let us discuss various ways to parse the above two YAML files.
Parse YAML file using load() function
The example below imports the yaml module. PyYAML, a widely used YAML parser and emitter for Python, provides the yaml.load() function to parse the contents of a file. The open file object is passed as the first argument together with a Loader (a required argument since PyYAML 6.0). The function converts the YAML document into a Python object, which for items.yaml is a dictionary. Its data can then be read with the standard dictionary methods (dict.keys(), dict.values()).
Example
FullLoader loads the full YAML language and is designed to avoid arbitrary code execution. Use it when the data comes from a trusted source; for untrusted input, prefer SafeLoader.
import yaml

# Open the file and parse it with the full YAML loader
with open('items.yaml') as f:
    data = yaml.load(f, Loader=yaml.FullLoader)
print(data)
{'cap': 1, 'purse': 5, 'books': 23, 'case': 2, 'bottles': 12, 'pens': 6}
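Example 2
This works like the example above, but SafeLoader only constructs basic Python types (dictionaries, lists, strings, numbers, and so on). It is the loader to use when data comes from untrusted sources, and it works equally well for trusted data.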
import yaml

# SafeLoader restricts parsing to basic Python types
with open('items.yaml') as f:
    data = yaml.load(f, Loader=yaml.SafeLoader)
print(data)
{'cap': 1, 'purse': 5, 'books': 23, 'case': 2, 'bottles': 12, 'pens': 6}
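Once the file is parsed, the result behaves like any other dictionary. The following is a minimal sketch of reading it with the standard dictionary methods, assuming PyYAML is installed. The contents of items.yaml are inlined as a string so that it runs without the file on disk, and the variable names are illustrative.

```python
import yaml

# Contents of items.yaml, inlined so the sketch needs no file on disk.
items_text = """
cap: 1
purse: 5
books: 23
case: 2
bottles: 12
pens: 6
"""
items = yaml.load(items_text, Loader=yaml.SafeLoader)

names = list(items.keys())      # item names, in file order
counts = list(items.values())   # quantities, already parsed as integers
total = sum(counts)

print(names)
print(counts)
print(total)
```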
Parse YAML file using full_load() function
The example below imports the yaml module. PyYAML provides the yaml.full_load() function, a shorthand for yaml.load() with Loader=yaml.FullLoader. It takes a file object as its argument and returns the content of the file, here a dictionary of key-value pairs.
import yaml

# full_load() uses FullLoader implicitly
with open('items.yaml') as f:
    data = yaml.full_load(f)
print(data)
{'cap': 1, 'purse': 5, 'books': 23, 'case': 2, 'bottles': 12, 'pens': 6}
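In PyYAML, full_load() is a shorthand for load() with Loader=yaml.FullLoader. The short sketch below checks that equivalence on an inlined sample (the string is illustrative and not read from a file):

```python
import yaml

text = "cap: 1\npurse: 5\nbooks: 23\n"

via_shorthand = yaml.full_load(text)
via_loader = yaml.load(text, Loader=yaml.FullLoader)

# Both calls build the same dictionary from the same text.
print(via_shorthand == via_loader)
```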
Parse multiple YAML documents using load_all() function
The example below imports the yaml module. PyYAML provides the yaml.load_all() function for files that contain multiple YAML documents separated by ---, as in data.yaml. It returns a generator that yields one Python object per document, so the example loops over the documents and prints each key with its list of values.
import yaml

with open('data.yaml') as f:
    docs = yaml.load_all(f, Loader=yaml.FullLoader)
    # Iterate while the file is open: load_all() parses lazily
    for doc in docs:
        for k, v in doc.items():
            print(k, ":", v)
country : ['Algeria', 'Bangladesh', 'Poland', 'Guinea', 'Denmark']
company : ['Getsocio', 'Flyder', 'Powerstorm', 'Indofood']
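Because load_all() returns a generator rather than a list, the documents are parsed lazily and must be consumed while the file is still open. A minimal sketch of collecting them all at once, assuming PyYAML is installed and using a shortened copy of data.yaml inlined as a string; safe_load_all() is the SafeLoader counterpart of load_all():

```python
import yaml

# Shortened copy of data.yaml: two documents separated by ---
data_text = """
country:
- Algeria
- Bangladesh
---
company:
- Getsocio
- Flyder
"""

# Wrap the generator in list() to materialize every document at once.
docs = list(yaml.safe_load_all(data_text))

print(len(docs))
for doc in docs:
    for key, values in doc.items():
        print(key, ":", values)
```

Materializing the list is convenient for small files; for large multi-document streams, iterating over the generator directly keeps memory use lower.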
Parse YAML file using yaml.safe_load()
An alternative way to parse the YAML file is the yaml.safe_load() function, a shorthand for yaml.load() with Loader=yaml.SafeLoader. It is the recommended choice for loading data from untrusted sources.
import yaml

# safe_load() uses SafeLoader implicitly
with open('items.yaml') as f:
    data = yaml.safe_load(f)
print(data)
{'cap': 1, 'purse': 5, 'books': 23, 'case': 2, 'bottles': 12, 'pens': 6}
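What makes safe_load() suitable for untrusted input is that it only builds plain types such as dictionaries, lists, strings, and numbers, and refuses Python-specific tags. A small sketch of that behavior, assuming PyYAML is installed; the tagged input is a harmless, illustrative example:

```python
import yaml

# A Python-specific tag (illustrative input), not part of plain YAML.
tagged_text = "point: !!python/tuple [1, 2]\n"

try:
    yaml.safe_load(tagged_text)
    rejected = False
except yaml.YAMLError as exc:
    # SafeLoader has no constructor for this tag, so parsing fails.
    rejected = True
    print(type(exc).__name__)

print(rejected)
```

Catching yaml.YAMLError, the base class of PyYAML's parsing errors, lets a program report a bad or unexpected file instead of crashing.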
Conclusion
In this article, we learned about YAML files and different ways to parse them with the functions of the yaml module from the PyYAML package, such as load(), safe_load(), load_all(), and full_load(), with a short example for each.