Python Scrapy Installation And Example Crawler

Scrapy is an open-source framework written in Python for extracting data from structured content such as HTML and XML. It can crawl websites and scrape them quickly. First of all, you should install the Python package manager, pip.

Installing with pip:

pip install scrapy

Starting a new project:

scrapy startproject tutorial

When a Scrapy project is created, a file and directory structure is generated as follows.

tutorial/
    scrapy.cfg            # deploy configuration file

    tutorial/             # project's Python module, you'll import your code from here
        __init__.py

        items.py          # project items definition file (see the example below)

        pipelines.py      # project pipelines file

        settings.py       # project settings file

        spiders/          # a directory where you'll later put your spiders
            __init__.py
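
The generated items.py can hold structured item definitions for the data you plan to scrape. A minimal sketch of what it might contain is shown below; the QuoteItem class and its fields are illustrative and are not part of the generated file:

import scrapy

class QuoteItem(scrapy.Item):
    # Each Field() declares an attribute that spiders can populate.
    text = scrapy.Field()
    author = scrapy.Field()
    tags = scrapy.Field()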

Example spider:

import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        'http://quotes.toscrape.com/page/1/',
        'http://quotes.toscrape.com/page/2/',
    ]

    def parse(self, response):
        for quote in response.css('div.quote'):
            yield {
                'text': quote.css('span.text::text').extract_first(),
                'author': quote.css('small.author::text').extract_first(),
                'tags': quote.css('div.tags a.tag::text').extract(),
            }
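
The spider above only visits the two pages listed in start_urls. A common extension, sketched here under the assumption that the site's pagination link matches li.next a (as it does on quotes.toscrape.com), is to follow that link so the crawl continues through the remaining pages. The parse method would then look like this:

    def parse(self, response):
        for quote in response.css('div.quote'):
            yield {
                'text': quote.css('span.text::text').extract_first(),
                'author': quote.css('small.author::text').extract_first(),
                'tags': quote.css('div.tags a.tag::text').extract(),
            }

        # Follow the "next page" link, if any, and parse it with the same
        # callback so the crawl continues until the last page.
        next_page = response.css('li.next a::attr(href)').extract_first()
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)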

Scrapy spiders can crawl one or more URLs and extract data from them. With Scrapy selectors, the desired fields can be selected and filtered; both CSS and XPath expressions are supported.
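
For example, the CSS-based extraction in the spider above could equally be written with XPath; a rough equivalent sketch (e.g. to try out in the scrapy shell):

# XPath equivalents of the CSS selectors used in the spider above
for quote in response.xpath('//div[contains(@class, "quote")]'):
    text = quote.xpath('.//span[contains(@class, "text")]/text()').extract_first()
    author = quote.xpath('.//small[contains(@class, "author")]/text()').extract_first()
    tags = quote.xpath('.//div[contains(@class, "tags")]//a[contains(@class, "tag")]/text()').extract()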

To run the crawl:

scrapy crawl quotes
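
The same crawl can also be started from a plain Python script instead of the scrapy command; a minimal sketch using CrawlerProcess, assuming it is run from inside the project directory so that the project settings are picked up:

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

# Look up the "quotes" spider by name in the project and run it.
process = CrawlerProcess(get_project_settings())
process.crawl('quotes')
process.start()  # blocks until the crawl is finished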

To write the output to a JSON file:

scrapy crawl quotes -o quotes.json
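
The exported file is a single JSON array containing every yielded item, so it can be read back with the standard library; a small sketch, assuming the output file is named quotes.json:

import json

# Load the items written by "scrapy crawl quotes -o quotes.json"
with open('quotes.json', encoding='utf-8') as f:
    quotes = json.load(f)

print(len(quotes), 'quotes scraped')
print(quotes[0]['author'], '-', quotes[0]['text'])

Note that -o appends to an existing file, so remove the old file before re-running the crawl (newer Scrapy versions also provide -O to overwrite instead).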
