
How To Pass Arguments (for FEED_URI) To A Scrapy Spider Instance For Dynamically Naming The Output File

I want to pass arguments to a spider and have the output file (JSON, CSV) named according to those arguments. For example: $ scrapy crawl spider_name -a category=category1 -a subcategory=subcategory1

Solution 1:

You can read those parameters from the kwargs of __init__, store them as spider attributes, and reference them in FEED_URI like this:

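import scrapy


class MySpider(scrapy.Spider):
    name = 'my_spider'

    # %(category)s and %(subcategory)s are filled in from the spider
    # attributes of the same name when the feed is stored.
    custom_settings = {
        'FEED_URI': '%(category)s_%(subcategory)s.json'
    }

    def __init__(self, *args, **kwargs):
        # Values passed with -a arrive as keyword arguments;
        # fall back to an empty string when one is omitted.
        self.category = kwargs.pop('category', '')
        self.subcategory = kwargs.pop('subcategory', '')
        super(MySpider, self).__init__(*args, **kwargs)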
        

Docs: https://doc.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters
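
To see what the placeholders do, here is a minimal sketch in plain Python with no Scrapy dependency. It assumes the substitution behaves like ordinary %-formatting against the spider's attributes, as the storage URI parameters docs describe. FakeSpider and expand_feed_uri are hypothetical names made up for this illustration; they are not part of Scrapy.

```python
class FakeSpider:
    """Hypothetical stand-in for a spider; not a Scrapy class."""
    name = "my_spider"

    def __init__(self, **kwargs):
        # Same pattern as the answer: keep the arguments as attributes.
        self.category = kwargs.pop("category", "")
        self.subcategory = kwargs.pop("subcategory", "")


def expand_feed_uri(template, spider):
    """Illustrative helper: fill %(attr)s placeholders from spider attributes."""
    params = {k: getattr(spider, k) for k in dir(spider) if not k.startswith("__")}
    return template % params


spider = FakeSpider(category="category1", subcategory="subcategory1")
uri = expand_feed_uri("%(category)s_%(subcategory)s.json", spider)
print(uri)  # category1_subcategory1.json
```

Note that with the empty-string defaults, a run without any -a arguments would expand to "_.json" in this sketch, so a non-empty default such as 'all' may be a safer choice.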
