FlipKart.py: the main spider file, which scrapes the name, price, and link from flipkart.com
import scrapy
from ..items import FlipkartScraperItem


class FlipkartSpider(scrapy.Spider):
    name = 'FlipKart'
    allowed_domains = ['www.flipkart.com']
    start_urls = ['https://www.flipkart.com/search?q=mobile']

    def parse(self, response):
        products = response.css('._2kHMtA')
        for product in products:
            item = FlipkartScraperItem()
            item['name'] = product.css('._4rR01T').get(),
            item['price'] = product.css('._2kHMtA ._1_WHN1').get(),
            item['link'] = product.css("._1fQZEK::attr('href')").get()
            yield item
items.py file. Here I wanted to print the name variable:
import scrapy
from scrapy.loader import ItemLoader
from itemloaders.processors import TakeFirst  # TakeFirst text from data
from itemloaders.processors import MapCompose  # For function calling
from w3lib.html import remove_tags  # For removing html tags


def removeRupeeSymbol(value):
    return value.replace('₹', '').strip()


class FlipkartScraperItem(scrapy.Item):
    # define the fields for your item here like:
    name = scrapy.Field(input_processor=MapCompose(remove_tags), output_processor=TakeFirst())
    price = scrapy.Field(input_processor=MapCompose(remove_tags, removeRupeeSymbol), output_processor=TakeFirst())
    link = scrapy.Field()
    print(name)
I want to scrape mobile phone data from Flipkart and store it in a CSV file after making some changes to it.
I have written a function called removeRupeeSymbol to clean the data, and I then want to store the cleaned data in a CSV file, but I am not able to access it.
When I try to print the data, I get what looks like the address of the variable instead of the data itself.
Here is the result when I print the name variable:
{'input_processor': <itemloaders.processors.MapCompose object at 0x000001DE10CBD290>, 'output_processor': <itemloaders.processors.TakeFirst object at 0x000001DE10CBD390>}
Answer:
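Printing name inside the class body shows the scrapy.Field object itself, which is just a dict holding the processors you passed in, not scraped data. The input and output processors declared on a Field only run when the item is populated through an ItemLoader, so fill the item with a loader in the spider and yield the loaded item. The selectors below use ::text, so remove_tags is no longer needed.
Full working code as an example: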
import scrapy
from ..items import FlipkartScraperItem
from itemloaders import ItemLoader


class FlipkartSpider(scrapy.Spider):
    name = 'flipKart'
    allowed_domains = ['www.flipkart.com']
    start_urls = ['https://www.flipkart.com/search?q=mobile']

    def parse(self, response):
        products = response.css('._2kHMtA')
        for product in products:
            # Product links are relative, so prepend the site root.
            u = 'https://www.flipkart.com' + product.css("._1fQZEK::attr('href')").get()
            # The loader applies the processors declared on each Field.
            loader = ItemLoader(item=FlipkartScraperItem(), selector=product)
            loader.add_css('name', '._4rR01T::text')
            loader.add_css('price', '._2kHMtA ._1_WHN1::text')
            loader.add_value('link', u)
            item = loader.load_item()
            yield item
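One caveat about the link: the spider builds it by string concatenation, and get() returns None when a product card has no matching anchor, which would raise a TypeError. A minimal sketch of a safer helper, using only the standard library, could look like this. The names absolute_link and BASE_URL are my own and not part of the original code, and the sample path is shortened from the log output further down.

```python
from typing import Optional
from urllib.parse import urljoin

# Assumption: same host as in start_urls; BASE_URL is a hypothetical name.
BASE_URL = "https://www.flipkart.com"


def absolute_link(href: Optional[str], base: str = BASE_URL) -> Optional[str]:
    """Return an absolute URL for a relative href, or None if the href is missing."""
    if not href:
        return None
    return urljoin(base, href)


# A relative product path (query string shortened for readability).
print(absolute_link("/vivo-t1-5g-rainbow-fantasy-128-gb/p/itm594222523bd8f?pid=MOBGB9TYGW5NGXVH"))
# A product card without a link yields None instead of raising.
print(absolute_link(None))
```

Inside the spider, response.urljoin(href) does the same joining against the page URL, so the helper is mainly useful if you want the None check in one place.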
items.py file:
import scrapy
from scrapy.loader import ItemLoader
from itemloaders.processors import TakeFirst  # TakeFirst text from data
from itemloaders.processors import MapCompose  # For function calling
# from w3lib.html import remove_tags  # For removing html tags


def removeRupeeSymbol(value):
    return value.replace('₹', '').strip()


class FlipkartScraperItem(scrapy.Item):
    # define the fields for your item here like:
    name = scrapy.Field(output_processor=TakeFirst())
    price = scrapy.Field(input_processor=MapCompose(removeRupeeSymbol), output_processor=TakeFirst())
    link = scrapy.Field(output_processor=TakeFirst())
Output:
'name': 'APPLE iPhone 11 (White, 128 GB)',
'price': '44,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-5g-rainbow-fantasy-128-gb/p/itm594222523bd8f?pid=MOBGB9TYGW5NGXVH&lid=LSTMOBGB9TYGW5NGXVHWMF5TV&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_3&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGB9TYGW5NGXVH.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 5G (Rainbow Fantasy, 128 GB)',
'price': '15,990'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-5g-rainbow-fantasy-128-gb/p/itm594222523bd8f?pid=MOBGB9TYFQR3FQZT&lid=LSTMOBGB9TYFQR3FQZTZ6EEUD&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_4&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGB9TYFQR3FQZT.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 5G (Rainbow Fantasy, 128 GB)',
'price': '16,990'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-5g-starlight-black-128-gb/p/itm594222523bd8f?pid=MOBGB9TYF7P7RNYX&lid=LSTMOBGB9TYF7P7RNYX5GJVDV&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_5&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGB9TYF7P7RNYX.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 5G (Starlight Black, 128 GB)',
'price': '16,990'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-5g-starlight-black-128-gb/p/itm594222523bd8f?pid=MOBGB9TYNDFYKNQ6&lid=LSTMOBGB9TYNDFYKNQ6QMGH15&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_6&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGB9TYNDFYKNQ6.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 5G (Starlight Black, 128 GB)',
'price': '15,990'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-44w-starry-sky-128-gb/p/itm2a08ebbea3689?pid=MOBGDRHVHNBBBBP5&lid=LSTMOBGDRHVHNBBBBP5SY2MJL&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_7&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGDRHVHNBBBBP5.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 44W (Starry Sky, 128 GB)',
'price': '14,499'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-44w-starry-sky-128-gb/p/itm2a08ebbea3689?pid=MOBGDRHVMW2UDXZY&lid=LSTMOBGDRHVMW2UDXZYVNQXYN&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_8&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGDRHVMW2UDXZY.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 44W (Starry Sky, 128 GB)',
'price': '15,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-44w-midnight-galaxy-128-gb/p/itm2a08ebbea3689?pid=MOBGDRHVZN29ZJF4&lid=LSTMOBGDRHVZN29ZJF4WEHAX7&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_9&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGDRHVZN29ZJF4.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 44W (Midnight Galaxy, 128 GB)',
'price': '14,499'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-5g-starlight-black-128-gb/p/itm594222523bd8f?pid=MOBGB9TYEDGEXQRA&lid=LSTMOBGB9TYEDGEXQRAKBELKI&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_10&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGB9TYEDGEXQRA.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 5G (Starlight Black, 128 GB)',
'price': '19,990'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-5g-silky-white-128-gb/p/itm594222523bd8f?pid=MOBGHNKGG77MVYBG&lid=LSTMOBGHNKGG77MVYBGQMSTRZ&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_11&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGHNKGG77MVYBG.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 5G (Silky White, 128 GB)',
'price': '15,990'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/samsung-galaxy-f13-nightsky-green-64-gb/p/itmeadfda1bd23fa?pid=MOBGENJWF4KJTPEN&lid=LSTMOBGENJWF4KJTPENAUQVSZ&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_12&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGENJWF4KJTPEN.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'SAMSUNG Galaxy F13 (Nightsky Green, 64 GB)',
'price': '11,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/redmi-10-midnight-black-64-gb/p/itmd93641e4ebb47?pid=MOBGC9GYEBH3GZ4E&lid=LSTMOBGC9GYEBH3GZ4ESWAKTT&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_13&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGC9GYEBH3GZ4E.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'REDMI 10 (Midnight Black, 64 GB)',
'price': '10,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/motorola-e40-carbon-gray-64-gb/p/itm0ca635007c9e2?pid=MOBG2EMWUMUFGSZE&lid=LSTMOBG2EMWUMUFGSZEJNGZMU&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_14&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBG2EMWUMUFGSZE.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'MOTOROLA e40 (Carbon Gray, 64 GB)',
'price': '9,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/motorola-g40-fusion-frosted-champagne-128-gb/p/itm78278061a0e25?pid=MOBFWSF8Q3XAHTZH&lid=LSTMOBFWSF8Q3XAHTZHU8WKTS&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_15&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBFWSF8Q3XAHTZH.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'MOTOROLA G40 Fusion (Frosted Champagne, 128 GB)',
'price': '13,499'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/redmi-10-pacific-blue-64-gb/p/itm0f2a6a2112b75?pid=MOBGC9GYCHQZK9GW&lid=LSTMOBGC9GYCHQZK9GW8N0WII&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_16&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGC9GYCHQZK9GW.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'REDMI 10 (Pacific Blue, 64 GB)',
'price': '10,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/apple-iphone-11-black-64-gb/p/itm4e5041ba101fd?pid=MOBFWQ6BXGJCEYNY&lid=LSTMOBFWQ6BXGJCEYNYZXSHRJ&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_17&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBFWQ6BXGJCEYNY.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'APPLE iPhone 11 (Black, 64 GB)',
'price': '39,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/samsung-galaxy-f13-waterfall-blue-64-gb/p/itm583ef432b2b0c?pid=MOBGENJWBPFYJSFT&lid=LSTMOBGENJWBPFYJSFT1ZY7B0&marketplace=FLIPKART&q=mobile&store=tyy/4io&spotlightTagId=BestsellerId_tyy/4io&srno=s_1_18&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGENJWBPFYJSFT.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'SAMSUNG Galaxy F13 (Waterfall Blue, 64 GB)',
'price': '11,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1x-space-blue-64-gb/p/itm9e8207e7825a9?pid=MOBGG56ZFNPMHBWE&lid=LSTMOBGG56ZFNPMHBWEB8Y2U5&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_19&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGG56ZFNPMHBWE.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1X (Space Blue, 64 GB)',
'price': '11,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1x-gravity-black-64-gb/p/itm9e8207e7825a9?pid=MOBGG56ZMXMNUCYF&lid=LSTMOBGG56ZMXMNUCYFQ5HT8S&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_20&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGG56ZMXMNUCYF.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1X (Gravity Black, 64 GB)',
'price': '11,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-44w-midnight-galaxy-128-gb/p/itm2a08ebbea3689?pid=MOBGDRHVXFVCGS23&lid=LSTMOBGDRHVXFVCGS23RDLBPG&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_21&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGDRHVXFVCGS23.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 44W (Midnight Galaxy, 128 GB)',
'price': '15,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/vivo-t1-44w-midnight-galaxy-128-gb/p/itm2a08ebbea3689?pid=MOBGDRHVWJUFTYQJ&lid=LSTMOBGDRHVWJUFTYQJ2SYXAC&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_22&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGDRHVWJUFTYQJ.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'vivo T1 44W (Midnight Galaxy, 128 GB)',
'price': '17,999'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/samsung-galaxy-f23-5g-aqua-blue-128-gb/p/itme54bc0c2292f4?pid=MOBGBKQF45XPEUHA&lid=LSTMOBGBKQF45XPEUHAYAHBJE&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_23&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBGBKQF45XPEUHA.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB)',
'price': '18,499'}
2022-11-20 22:15:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/search?q=mobile>
{'link': 'https://www.flipkart.com/motorola-e40-pink-clay-64-gb/p/itm5d6f2871d1bbf?pid=MOBG2EMW2ZUR4BFG&lid=LSTMOBG2EMW2ZUR4BFGEC0C0J&marketplace=FLIPKART&q=mobile&store=tyy/4io&srno=s_1_24&otracker=search&fm=organic&iid=57f0e76e-5286-4a0b-a6b6-f37cd935bfd2.MOBG2EMW2ZUR4BFG.SEARCH&ppt=None&ppn=None&ssid=hz6twhlgww0000001668960934301&qH=532c28d5412dd75b',
'name': 'MOTOROLA e40 (Pink Clay, 64 GB)',
'price': '9,999'}
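The prices in the output above are still strings with thousands separators. If you want numeric values in the CSV, a processor along the following lines might help. This is only a sketch: clean_price is a hypothetical name that is not in the original code, the two rows are copied from the log output (links left out), and the standard csv module stands in here for Scrapy's own CSV export so that the snippet runs without a crawl.

```python
import csv
import io


def clean_price(value: str) -> int:
    """Strip the rupee sign, whitespace and thousands separators, then convert to int."""
    return int(value.replace("₹", "").replace(",", "").strip())


# Two rows copied from the log output above (links omitted for brevity).
rows = [
    {"name": "APPLE iPhone 11 (White, 128 GB)", "price": "44,999"},
    {"name": "vivo T1 5G (Rainbow Fantasy, 128 GB)", "price": "15,990"},
]

# Write to an in-memory buffer so the example has no side effects on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
writer.writeheader()
for row in rows:
    writer.writerow({"name": row["name"], "price": clean_price(row["price"])})

csv_text = buffer.getvalue()
print(csv_text)
```

In the project itself, a function like this could take the place of removeRupeeSymbol inside MapCompose, and running scrapy crawl flipKart -o phones.csv (the file name is just an example) should write the yielded items to a CSV file without any hand-written csv code.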