Creating a simple web scraper with Python

I want to introduce you to a useful tool that you can make with Python, and it is very easy! It is a web scraper that collects links and other pieces of content from web pages.

[Here are the basic prerequisites]

1. Open Default Settings in PyCharm -> select Project Interpreter

2. Download and install beautifulscraper and beautifulsoup4 if you don't have them yet. The code below also uses requests, so install that as well.

3. Import requests and BeautifulSoup by adding the following code. Here is a great tutorial on BeautifulSoup that I find useful, so check it out too: http://www.pythonforbeginners.com/python-on-the-web/web-scraping-with-beautifulsoup

import requests
from bs4 import BeautifulSoup

4. Add the following code. I will not explain every line, since I believe the readers are already quite familiar with Python.
 
 def trade_spider(max_pages):
     page = 1
     while page <= max_pages:
         # Placeholder: replace with the page you want to scrape.
         # As written, the same url is requested on every pass of the loop.
         url = 'https://www.your url'
         source_code = requests.get(url)
         plain_text = source_code.text
         soup = BeautifulSoup(plain_text, "html.parser")
         # find all the titles and links
         # (or: soup.find_all('a', {'class': 'item-name'}))
         for link in soup.find_all('a', {'class': 'category-name'}):
             href = link.get('href')
             title = link.string  # gets the title of the link
             if href:  # skips links whose href is missing or empty
                 print(title)
                 print(href)
             else:
                 print("Wasn't able to get data")
         page += 1
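
One thing worth noticing: the loop counts pages, but the url never changes, so every pass would fetch the same page. Here is a small sketch of how the page number could go into the url, and how a relative href could be turned into a full link. It only uses Python's standard urllib.parse module. The example.com address, the "page" query parameter and both function names are my own assumptions for illustration, so check how your target site actually numbers its pages.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical address, used for illustration only.
BASE_URL = "https://example.com/items"


def build_page_url(base_url, page, param="page"):
    """Return base_url with the page number added as a query parameter.

    Assumes the site paginates with something like ?page=2, which may
    not be true for the site you want to scrape.
    """
    return base_url + "?" + urlencode({param: page})


def absolute_link(page_url, href):
    """Turn an href found on page_url into an absolute URL."""
    return urljoin(page_url, href)


# One url per page instead of the same url every time.
for page in range(1, 4):
    print(build_page_url(BASE_URL, page))

# A relative href such as "/detail/42" becomes a full link.
print(absolute_link(build_page_url(BASE_URL, 2), "/detail/42"))
```

Inside trade_spider, the idea would be to build url from page at the top of the while loop, and to pass each href through absolute_link before printing it.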

Quite short and simple, isn't it? A web scraper was one of the first things I learned when I started coding in Python, and it was cool. As for now, I'm not sure what I can use this for, since I don't need to scrape data for my apps!

 

See you in the next blog!
