I'm sending a POST request to an API using scrapy.FormRequest and receiving a TextResponse object back. The body of this object looks like this:
{
"Refiners": ...
"Results": ...
}
I am only interested in the Results portion of the response, as it contains HTML that I would like to parse.
I am therefore trying to create a new TextResponse object containing only the Results portion in its body, so that I can use the response.css method on it.
I tried the following and it yielded an empty response body. Any thoughts on why and how to fix this?
new_response = scrapy.http.TextResponse(response.json()["Results"])
CodePudding user response:
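The body came back empty because the first positional parameter of TextResponse is url, so the HTML string was taken as the URL and body kept its empty default.
You can use the HtmlResponse class instead and pass the body and encoding arguments explicitly in the constructor; encoding is required when body is a str rather than bytes.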
from scrapy.http import HtmlResponse
new_response = HtmlResponse(url="some_url", body=response.json()["Results"], encoding="utf-8")
You can then use new_response.css(...) to select elements.