How to convert XML file into CSV using Python ? ( Quick Steps)

How to convert XML file into CSV using Python

Most of the legacy system’s export system has XML File export. If we need to perform an analysis then CSV format or XLS format works best. Then if we can convert XML file into CSV using python then this automation can save a lot of time for us. In this article, we will explore how can we convert XML file into CSV using python step by step. It will be a custom way for conversion.

 

Convert XML file into CSV using python (Steps )-

It is a two-step process. The first is default but the second is not generic. Lets deep dive into them.

Step 1 : Convert XML file to Python Dict –

To convert XML files into Python dict we can leverage third-party library

!pip install xmltodict

Use the below code to read the XML file into Python dict.

import xmltodict
with open("sample.xml") as xml_file:
    data_dict = xmltodict.parse(xml_file.read())
sample file XML
sample file XML

Step 2: Convert Python Dict to CSV File –

It is not very difficult to convert dict to CSV but it will vary because of datapoint. So we will take this step in two sub-steps.

2.1 Shaping dict into the exportable format –

Here we need to understand the structure of the dictionary. Then we need to shape it either

  •  list of dictionaries with uniform key
  • list of list of values.

If we are only keeping the values and making list of list then we have to create a custom columns list  but if we are converting the dict to list of dictionaries then can take keys of dict as column names.

Example : Here is an example of convertable format.

data_dict=[
  {"Series":"I", "X":10.0, "Y":8.04},
  {"Series":"I", "X":8.0, "Y":6.95},
  {"Series":"I", "X":13.0, "Y":7.58},
  {"Series":"I", "X":9.0, "Y":8.81},
  {"Series":"I", "X":11.0, "Y":8.33},
  {"Series":"I", "X":14.0, "Y":9.96}]

2.2 Exporting dict to Python CSV using Pandas –

we will use pandas library and then we will construct the dataframe out of it. Here is the code for dict to csv conversion.

import pandas as pd
df = pd.DataFrame(data_dict, columns=['Series', 'X', 'Y'])
csv_file_path = 'myoutput.csv'
df.to_csv(csv_file_path, index=False)
export xml to csv
convert XML file into CSV using python

 

 

Thanks
Data Science Learner Team

Join our list

Subscribe to our mailing list and get interesting stuff and updates to your email inbox.

Thank you for signup. A Confirmation Email has been sent to your Email Address.

Something went wrong.

Meet Abhishek ( Chief Editor) , a data scientist with major expertise in NLP and Text Analytics. He has worked on various projects involving text data and have been able to achieve great results. He is currently manages Datasciencelearner.com, where he and his team share knowledge and help others learn more about data science.
 
Thank you For sharing.We appreciate your support. Don't Forget to LIKE and FOLLOW our SITE to keep UPDATED with Data Science Learner