How to write css selector for following html in Scrapy

Time:10-22

I want to get the amount written in the p tag under the td tag, but I am getting the whole p element back as a string. I only want to extract the amount itself.

The HTML code: (screenshot not preserved)

The command I used:

response.css("#__next > div > div:nth-child(2) > div > div.data-table_container__pPKXQ > table > tbody > tr:nth-child(1) > td:nth-child(3) > p:nth-child(1)").get()

CodePudding user response:

With xpath:

In [1]: all_tr = response.xpath('//tbody/tr')

In [2]: for example in all_tr:
   ...:     print(example.xpath('./td/p/text()[2]').get())
   ...:
$533.2M
$10.4B
$811.5M
$518.8M
$39.6M
$264.7M
$390M
$3.2B
$508.1M
$404.3M
$7.4B
$410.3M
$14.2M
$33.3M
$11.9M
$1.4B
$745.2M
$1.9B
$70M
$72.7M
$580M
$100.2M
$1.8B
$143.4M
$150M

With CSS:

In [1]: all_tr = response.css('tbody > tr')

In [2]: for example in all_tr:
   ...:     print(example.css('td > p::text').getall()[1])
   ...:
$533.2M
$10.4B
$811.5M
$518.8M
$39.6M
$264.7M
$390M
$3.2B
$508.1M
$404.3M
$7.4B
$410.3M
$14.2M
$33.3M
$11.9M
$1.4B
$745.2M
$1.9B
$70M
$72.7M
$580M
$100.2M
$1.8B
$143.4M
$150M

CodePudding user response:

You need to select the part of the element you want, which in your case is the text. Without it, .get() returns the serialized p element.

Add ::text to the end of your CSS selector.

response.css("#__next > div > div:nth-child(2) > div > div.data-table_container__pPKXQ > table > tbody > tr:nth-child(1) > td:nth-child(3) > p:nth-child(1)::text").get()

If you are trying to grab the same value from all the rows in the table, then your selector could be simplified as well.

For example:

response.xpath("//tr/td/p/text()").getall()

Also take a look at the examples in SuperUser's answer.
