Data scientists need data sources, such as databases, PDFs, or files in various formats that have to be parsed. In the same way, an RSS (Rich Site Summary) feed is a good option for obtaining data. If you are looking for how to read an RSS feed in Python, this article will end your search for a way to parse RSS feeds.
Read RSS feed in Python – Step by Step Guide –
Step 1 –
Install feedparser using pip. Refer to the code below –
pip install feedparser
Step 2 –
After installing, we need to import the module –
import feedparser
Step 3 –
In this step, we will pass the URL of the RSS feed. This will load the feed data into a Python object. Let's look at the complete code first, then we will go through it –
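import feedparser

# Download and parse the feed
Feed = feedparser.parse('http://www.reddit.com/r/python/.rss')
# entries is a list, so pick one entry (here the first) before reading its attributes
pointer = Feed.entries[0]
print(pointer.summary)
print(pointer.link)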
Here we have used the URL ‘http://www.reddit.com/r/python/.rss’. You may choose a different URL to parse another RSS feed. The call returns the complete details of the feed. Each collected entry has many attributes of interest, such as summary and link.
Complete Code and Output –
Here is the complete code in one shot. Let's check out its output –
Why are RSS feeds important?
An RSS feed saves much of the time that a developer spends on ordinary scraping. In the scraping scenario, we first have to crawl the website and then filter out the relevant data. An RSS feed essentially provides that data through an API-like interface, which a module such as feedparser can read. FYI, you may use the Beautiful Soup Python library for web scraping.
This series of short articles is an initiative from the Data Science Learner Team to give developers a time-saving reference. We try to produce sharp and clear content on various small topics. If you have any suggestions on such topics, please contact us. Your suggestions help us create valuable articles.
Data Science Learner Team