I'm trying to select and copy all the text in a div, except for the H1 tag. The text looks as shown in the image.
I can select all of the content with the code below:
browser.find_element(By.XPATH, "//*[@id='__next']/div/div/div[2]/div[2]/div/div/div/div/div/div/div/div[2]/div[2]/div").send_keys(Keys.CONTROL + "a")  # select all
But when I try to exclude the H1 tag content, which is "Do You Have Dry Pack...", using its XPath, which is
"//*[@id='__next']/div/div/div[2]/div[2]/div/div/div/div/div/div/div/div[2]/div[2]/div/h1"
with code like the below, I get an error:
browser.find_element(By.XPATH, "//*[@id='__next']/div/div/div[2]/div[2]/div/div/div/div/div/div/div/div[2]/div[2]/div"[not(("//*[@id='__next']/div/div/div[2]/div[2]/div/div/div/div/div/div/div/div[2]/div[2]/div/h1"))]).send_keys(Keys.CONTROL + "a")  # select all
How can I overcome this?
CodePudding user response:
Firstly, you can use // to select nodes in the document from the current node that match the selection, no matter where they are (see XPath Syntax).
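Secondly, about your error: you define a Python string with double quotes, but you close it before the [not(...)] predicate and then open new double quotes inside it, so Python sees invalid syntax. If you ever need a literal double quote inside such a string, escape it with the \ character. Here, though, the inner quotes are not needed at all: the whole expression, predicate included, belongs in one string. For example:
browser.find_element(By.XPATH, "//*[@id='__next']/div/div/div[2]/div[2]/div/div/div/div/div/div/div/div[2]/div[2]/div[not(//*[@id='__next']/div/div/div[2]/div[2]/div/div/div/div/div/div/div/div[2]/div[2]/div/h1)]").send_keys(Keys.CONTROL + "a")
Note that this only fixes the syntax; the predicate still filters the div itself, not its children.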
Lastly, instead of using an element's index (e.g. div[2]), you should find another way to define a unique path to that element. The element's id, class, and other attributes are commonly used. If that doesn't work, you can use // to reach the child nodes, then use .. to reach their parent.
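As a rough illustration of the last point, the container div could be located through the h1 it contains instead of through a long chain of indexes. This is only a sketch: the helper name parent_of_heading_xpath is hypothetical, and it assumes there is exactly one h1 under the element with id '__next'.

```python
def parent_of_heading_xpath(root_id, heading_tag="h1"):
    """Build an XPath for the parent of a heading found anywhere under root_id.

    '//' descends to the heading no matter how deeply it is nested,
    and '..' steps back up to the element that contains it.
    Assumes root_id contains no single quotes.
    """
    return f"//*[@id='{root_id}']//{heading_tag}/.."


# Short, index-free locator for the div that holds the h1.
container_xpath = parent_of_heading_xpath("__next")
```

The resulting string can then be passed to Selenium in the usual way, e.g. browser.find_element(By.XPATH, container_xpath).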
CodePudding user response:
With that XPath you are selecting the entire div, and when you add the "not" predicate you are saying: "give me all the divs that don't have an h1 inside". But you want to exclude only the h1, so you need to select the elements INSIDE THE DIV, not the div itself. You can do that with this XPath:
/div/*[not(self::h1)]
//*[@id='__next']/div/div/div[2]/div[2]/div/div/div/div/div/div/div/div[2]/div[2]/div/*[not(self::h1)]
Then, in the code, you have to fetch these as a list (find_elements), not as a single element (find_element).
And please, try to write cleaner and less fragile XPaths, as the other answer said. Something like //*[@id='__next']//h1 is better.
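Putting this together, a minimal sketch of collecting the text could look like the following. It reads each element's .text rather than simulating Ctrl+A, which is a different technique from the one in the question. The function name text_without_heading is hypothetical, and the sketch assumes that all the text you want lives in child elements of the div (text placed directly inside the div, outside any child tag, would be missed).

```python
def text_without_heading(driver, container_xpath):
    """Return the text of every direct child of the container except <h1>."""
    # '/*' takes the direct children; 'not(self::h1)' drops the heading.
    children_xpath = f"{container_xpath}/*[not(self::h1)]"
    # The string "xpath" is the value of Selenium's By.XPATH constant.
    elements = driver.find_elements("xpath", children_xpath)
    # Skip children that render no text, then join the rest line by line.
    return "\n".join(el.text for el in elements if el.text)
```

With a live session this might be called as text_without_heading(browser, "//*[@id='__next']//h1/.."), where the container XPath is itself an assumption about the page having a single h1.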