Working with XML in the Python Requests library
What is XML? XML stands for Extensible Markup Language. It is used to store structured data and to group related items. In XML you can create tags with names of your own choosing. Among the most popular examples of XML are sitemaps and RSS feeds.
Example of an XML file:
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
  <food>
    <name>Strawberry Belgian Waffles</name>
    <price>$7.95</price>
    <description>Light Belgian waffles covered with strawberries and whipped cream</description>
    <calories>900</calories>
  </food>
  <food>
    <name>Berry-Berry Belgian Waffles</name>
    <price>$8.95</price>
    <description>Light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
    <calories>900</calories>
  </food>
  <food>
    <name>French Toast</name>
    <price>$4.50</price>
    <description>Thick slices made from our homemade sourdough bread</description>
    <calories>600</calories>
  </food>
  <food>
    <name>Homestyle Breakfast</name>
    <price>$6.95</price>
    <description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
    <calories>950</calories>
  </food>
</breakfast_menu>
In this example, the file contains a top-level breakfast_menu tag, which includes several food entries, and every food entry includes name, price, description and calories tags.
Now let's learn how to work with XML using the Python Requests library. First, we need to prepare our working environment.
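The nesting described above can be checked directly with Python's standard xml.etree.ElementTree module. This is a minimal sketch: it parses a shortened inline copy of the menu (a single food entry, held in the illustrative variable SAMPLE_XML) so that it runs without network access.

```python
import xml.etree.ElementTree as ET

# A shortened, inline copy of the menu (one <food> entry), used here
# only so the sketch runs without downloading anything.
SAMPLE_XML = """<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
</breakfast_menu>"""

root = ET.fromstring(SAMPLE_XML)

# The root element is the outermost tag.
print(root.tag)

# Direct children of the root are the <food> entries.
first_food = root[0]
print(first_food.tag)

# Each <food> entry holds the four leaf tags described above.
child_tags = [child.tag for child in first_food]
print(child_tags)
```

Indexing an element (root[0]) returns its first direct child, and iterating over an element yields its direct children in document order.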
To create a new project with its own virtual environment, install the python3-venv package, which provides the venv module used below on Debian/Ubuntu. A virtual environment keeps the requirements of each project separate. Installation on Debian/Ubuntu:
sudo apt install python3 python3-venv -y
Create the project folder:
mkdir my_project
cd my_project
Create a Python virtual environment in a folder named env:
python3 -m venv env
Activate the virtual environment:
source env/bin/activate
Install the dependencies with pip:
pip3 install requests
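Before moving on, it can be useful to confirm that the virtual environment is really active. The sketch below is one way to check this from Python itself; the helper name in_virtualenv is hypothetical, and the check assumes an environment created with the venv module, where sys.prefix differs from sys.base_prefix.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter prefix:", sys.prefix)
```

If the first line reports False, the environment was probably not activated in the current shell.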
Let's start writing code.
Create a main.py file and insert the code below. This snippet helps us find all inner tags.
import requests
import xml.etree.ElementTree as ET
# Download the sample XML document
request = requests.get('https://www.w3schools.com/xml/simple.xml')
# Parse the response body into an element tree
root = ET.fromstring(request.content)
# Walk the root element and every element nested inside it
for item in root.iter('*'):
    print(item.tag)
The output of this code:
(env) user@localhost:~/my_project$ python3 main.py
breakfast_menu
food
name
price
description
calories
food
name
price
description
calories
food
name
price
description
calories
food
name
price
description
calories
food
name
price
description
calories
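A long list of repeated tag names is hard to scan, so it can help to count how often each tag occurs. The following is a minimal sketch using collections.Counter. It works on a reduced inline document (SAMPLE_XML, an illustrative stand-in for the downloaded file) with only two food entries and two child tags each.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Reduced stand-in for the downloaded menu: two entries, two child tags each.
SAMPLE_XML = """<breakfast_menu>
  <food><name>Belgian Waffles</name><calories>650</calories></food>
  <food><name>French Toast</name><calories>600</calories></food>
</breakfast_menu>"""

root = ET.fromstring(SAMPLE_XML)

# root.iter() with no argument walks the root and every descendant,
# the same set of elements that root.iter('*') yields.
tag_counts = Counter(element.tag for element in root.iter())

for tag, count in tag_counts.items():
    print(f"{tag}: {count}")
```

Applied to the full menu, the same loop would report each tag once with its total count instead of printing every occurrence.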
Now let's write code to get the values of the inner elements. Open main.py and replace the previous code with this:
import requests
import xml.etree.ElementTree as ET
request = requests.get('https://www.w3schools.com/xml/simple.xml')
root = ET.fromstring(request.content)
# iterfind('food') yields only the food elements directly under the root
for item in root.iterfind('food'):
    # findtext returns the text of the first matching child element
    print(item.findtext('name'))
    print(item.findtext('price'))
    print(item.findtext('description'))
    print(item.findtext('calories'))
We get the following result:
(env) user@localhost:~/my_project$ python3 main.py
Belgian Waffles
$5.95
Two of our famous Belgian Waffles with plenty of real maple syrup
650
Strawberry Belgian Waffles
$7.95
Light Belgian waffles covered with strawberries and whipped cream
900
Berry-Berry Belgian Waffles
$8.95
Light Belgian waffles covered with an assortment of fresh berries and whipped cream
900
French Toast
$4.50
Thick slices made from our homemade sourdough bread
600
Homestyle Breakfast
$6.95
Two eggs, bacon or sausage, toast, and our ever-popular hash browns
950
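The values printed above are all strings. If they are needed for calculations, one option is to convert each food entry into a dictionary with typed values. The sketch below is an illustration under some assumptions: the helper food_to_dict is hypothetical, prices are assumed to carry a leading dollar sign as in the sample, and the inline SAMPLE_XML deliberately omits the description of the second entry to show how the default argument of findtext behaves.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Inline stand-in for the downloaded menu. The second entry has no
# <description> on purpose, to exercise the default value.
SAMPLE_XML = """<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
  <food>
    <name>French Toast</name>
    <price>$4.50</price>
    <calories>600</calories>
  </food>
</breakfast_menu>"""


def food_to_dict(food):
    """Convert one <food> element into a dict with typed values."""
    return {
        "name": food.findtext("name", default=""),
        # Strip the leading dollar sign before converting to Decimal.
        "price": Decimal(food.findtext("price", default="$0").lstrip("$")),
        # findtext returns the default when the child tag is absent.
        "description": food.findtext("description", default=""),
        "calories": int(food.findtext("calories", default="0")),
    }


root = ET.fromstring(SAMPLE_XML)
menu = [food_to_dict(food) for food in root.iterfind("food")]

for entry in menu:
    print(entry)
```

Decimal is used for the price instead of float so that amounts such as 5.95 are stored exactly.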
As a final step, we format the output to make it easier to read:
import requests
import xml.etree.ElementTree as ET
request = requests.get('https://www.w3schools.com/xml/simple.xml')
root = ET.fromstring(request.content)
for item in root.iterfind('food'):
    # Print all four fields of each food entry on a single labelled line
    print('Name: {}. Price: {}. Description: {}. Calories: {}'.format(item.findtext('name'), item.findtext('price'), item.findtext('description'), item.findtext('calories')))
Here is the output:
(env) user@localhost:~/my_project$ python3 main.py
Name: Belgian Waffles. Price: $5.95. Description: Two of our famous Belgian Waffles with plenty of real maple syrup. Calories: 650
Name: Strawberry Belgian Waffles. Price: $7.95. Description: Light Belgian waffles covered with strawberries and whipped cream. Calories: 900
Name: Berry-Berry Belgian Waffles. Price: $8.95. Description: Light Belgian waffles covered with an assortment of fresh berries and whipped cream. Calories: 900
Name: French Toast. Price: $4.50. Description: Thick slices made from our homemade sourdough bread. Calories: 600
Name: Homestyle Breakfast. Price: $6.95. Description: Two eggs, bacon or sausage, toast, and our ever-popular hash browns. Calories: 950
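The scripts in this article pass request.content straight to ET.fromstring. If the server returns something that is not well-formed XML, that call raises xml.etree.ElementTree.ParseError. The following is a minimal sketch of catching it. The helper name parse_xml is hypothetical, and the byte strings stand in for request.content so the example runs offline. For HTTP error codes, Requests also offers raise_for_status() on the response object, which is not shown here.

```python
import xml.etree.ElementTree as ET


def parse_xml(content):
    """Return the root element, or None if content is not well-formed XML."""
    try:
        return ET.fromstring(content)
    except ET.ParseError as error:
        print(f"Could not parse XML: {error}")
        return None


# Byte strings standing in for request.content.
good = parse_xml(b"<breakfast_menu><food><name>French Toast</name></food></breakfast_menu>")
# In this document the <food> tag is never closed.
bad = parse_xml(b"<breakfast_menu><food></breakfast_menu>")

print(good.tag)
print(bad)
```

Returning None keeps the sketch short. In a larger script it may be preferable to let the exception propagate or to log it.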
Source materials:
The example XML file is taken from W3Schools.