Need XPath to target a table cell node as long as the cell above it does NOT contain certain text

So I have a basic table structured:

<tbody>
  <tr>
    <td></td>
    <td></td>
  </tr>
    <td></td>
    <td></td>
  </tr>
  <tr>
    <td></td>
    <td></td>
  </tr>
</tbody> etc....

I'm trying to target a link <a> element in one cell only if the cell above it does NOT contain a word. For example:

<tbody>
  <tr>
    <td></td>
    <td></td>
  </tr>
  <tr>
    <td></td>
    <td><b>Fire Sale!</b></td>
  </tr>
  <tr>
    <td></td>
    <td><a href="something">linktext</a></td>
  </tr>
</tbody>

So I'd want to target the <a> only if the cell above it does NOT contain "Fire Sale!".

The problem is that no matter what I do, I can't keep the conditional axes pointed at the cell directly above. For example, with a second link in the same row:

<tbody>
  <tr>
    <td></td>
    <td><b>Fire Sale!</b></td>
  </tr>
  <tr>
    <td><a href="somethingelse">link I don't want</a></td>
    <td><a href="something">linktext</a></td>
  </tr>
</tbody>

I've tried stuff like:

//tr/td/b/a[@href]/ancestor::tbody/tr/td/b[contains(text(),'Fire Sale!')]

But no matter what, because of the odd relationship between tr and td, the condition always evaluates to true. The cells share most of the same ancestor tree, so navigating back down to the <td> directly above my main target seems impossible. Is there some way to use variables? I also feel count() might help, but I'm not sure of the syntax for the whole thing.

Any ideas?

EDIT: Here is the real HTML

<table border="0" width="100%" style="border-collapse: collapse">
    <tr>
        <td width="33%" valign="top" height="225" align="center"><img border="0" src="" width="296" height="225"></td>
        <td width="33%" valign="top" height="225" align="center"><br><br><br><br><b>Unassigned</b></td>
        <td width="33%" valign="top" height="225" align="center"></td>
    </tr>
    <tr>
        <td width="33%" valign="top" height="30" align="center"><b><a href="">AAAAA</a></b><br>
                <b>XXXXXXX</b><br><b><font color="#FF0000">YYYYYYYYY</font><br></b><br></td>
        <td width="33%" valign="top" height="30" align="center"><b><a href="">BBBBB</a><br></b><br></td>
        <td width="33%" valign="top" height="30" align="center"></td>
    </tr>
    <tr>
        <td width="100%" colspan="4" height="80" align="center">
        | <a href=""> Home</a> |<br>
        | <a href="">Design</a> 
        | <a href="">Styles</a> 
        | <a href="">X Listings</a> 
        | <a href="">Y Listings</a> |<br>
        | <a href="">About the Author</a> |</td>
    </tr>
    <tr>
        <td width="100%" colspan="4" height="60" align="center">
        Copyright Some Dude, 2020<br>
        Email: <a href="">[email protected]</a></td>
    </tr>
</table>

So basically I want the link containing BBBBB only if the word 'Unassigned' does not appear above it.

EDIT 2 to clarify that the links should only be targeted when text in the above cell does NOT exist.

CodePudding user response：

Try the following somewhat complex XPath 1.0 expression. It returns the href attribute of each <a> link whose cell has the same column index as the cell in the preceding row that contains a given string:

//tr/td[count(../preceding-sibling::tr[1]/td[contains(.,'Fire Sale!')]/preceding-sibling::td) + 1]/a/@href

EDIT1:
A stricter version, which selects the link only if the new value "Unassigned" is present in the row above, is the following:

//tr[preceding-sibling::tr[1]/td[contains(.,'Unassigned')]]/td[count(../preceding-sibling::tr[1]/td[contains(.,'Unassigned')]/preceding-sibling::td) + 1]//a
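
EDIT 2 of the question asks for the inverse condition: keep a link only when the cell directly above it does not contain the text. As a rough sketch of that column-by-column check, here is the same idea written procedurally with Python's standard xml.etree.ElementTree rather than as a single XPath. The sample markup is a hypothetical, well-formed reduction of the real HTML (the hrefs a.html and b.html and the function name links_not_below are made up for illustration), and a cell with no cell above it is assumed to pass the check.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed reduction of the question's real HTML.
SAMPLE = """
<table>
  <tr>
    <td></td>
    <td><b>Unassigned</b></td>
  </tr>
  <tr>
    <td><b><a href="a.html">AAAAA</a></b></td>
    <td><b><a href="b.html">BBBBB</a></b></td>
  </tr>
</table>
"""

def links_not_below(table, needle):
    """Return the text of links whose cell directly above lacks `needle`."""
    rows = table.findall("tr")
    found = []
    # Pair every row with the row immediately above it.
    for above, current in zip(rows, rows[1:]):
        cells_above = above.findall("td")
        for index, cell in enumerate(current.findall("td")):
            # Same column index == same number of preceding <td> siblings.
            if index < len(cells_above):
                text_above = "".join(cells_above[index].itertext())
            else:
                text_above = ""  # assumption: no cell above counts as "no text"
            if needle not in text_above:
                found.extend(a.text for a in cell.iter("a"))
    return found

links = links_not_below(ET.fromstring(SAMPLE), "Unassigned")
print(links)
```

With this sample only AAAAA is kept, because BBBBB sits under the cell containing "Unassigned".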

CodePudding user response：

You might first get the tbody/tr/td/b that contains Fire Sale! and then navigate to the next tr through the ancestor tr.

Note that in your expression this part //tr/td/b/a[@href] would not match as there is no anchor wrapped in a b tag in the example data.

//tbody/tr/td/b[contains(text(),'Fire Sale!')]/ancestor::tr/following-sibling::tr[1]/td/a[@href]
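
To make the navigation concrete, here is a small procedural approximation of that expression using Python's standard xml.etree.ElementTree, run against the question's two-link example. The function name links_in_row_after is made up for illustration. Note that this path works at row level: it does not compare column positions, so it returns every link in the row that follows a match.

```python
import xml.etree.ElementTree as ET

# The question's second example, with the closing tag repaired.
SAMPLE = """
<tbody>
  <tr>
    <td></td>
    <td><b>Fire Sale!</b></td>
  </tr>
  <tr>
    <td><a href="somethingelse">link I don't want</a></td>
    <td><a href="something">linktext</a></td>
  </tr>
</tbody>
"""

def links_in_row_after(tbody, needle):
    """Return hrefs of td/a links in the row right after a row with a matching td/b."""
    rows = tbody.findall("tr")
    hrefs = []
    for row, next_row in zip(rows, rows[1:]):
        # Rough equivalent of td/b[contains(text(), needle)]
        has_match = any(needle in (b.text or "") for b in row.findall("td/b"))
        if has_match:
            # Rough equivalent of ancestor::tr/following-sibling::tr[1]/td/a[@href]
            hrefs.extend(
                a.get("href")
                for a in next_row.findall("td/a")
                if a.get("href") is not None
            )
    return hrefs

hrefs = links_in_row_after(ET.fromstring(SAMPLE), "Fire Sale!")
print(hrefs)
```

Both links from the second row come back here, including the one in the first column.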

CodePudding user response：

I'll use a simpler search text ("xx") and a slightly more complex element structure in order to demonstrate my approach.

Using this input:

<tbody>
  <tr>
    <td></td>
    <td><b>xx</b></td>
  </tr>
  <tr>
    <td><a href="somethingelse">link I don't want</a></td>
    <td><a href="something">linktext</a></td>
  </tr>
  
  <tr>
    <td></td>
    <td>xx</td>
    <td>  </td>
    <td>xx</td>
  </tr>
  <tr>
    <td><a href="nok">don't want it</a></td>
    <td><a href="ok">want it</a></td>
    <td><a href="nok">don't want it</a></td>
    <td><a href="ok">want it</a></td>
  </tr>
  
</tbody>

and applying this XPath expression:

    //td[a and count(preceding-sibling::td) =
         parent::tr/preceding-sibling::tr[1]/td[.//text() = 'xx']/count(preceding-sibling::td)]/a

I get the three wanted <a> elements. The idea is to count the number of <td>s before "me" and check whether the row above (parent::tr/preceding-sibling::tr[1]) has a <td> that contains the search string and has the same number of <td>s before it. Note that the trailing /count(preceding-sibling::td) step is only valid in XPath 2.0 or later.
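
For readers without an XPath 2.0 processor, the same counting idea can be sketched procedurally. The following is a minimal illustration using Python's standard xml.etree.ElementTree on the input above; the function name links_below is made up for illustration, and the exact-match test on text nodes is meant to mirror .//text() = 'xx'.

```python
import xml.etree.ElementTree as ET

# Same input as in the answer above.
SAMPLE = """
<tbody>
  <tr>
    <td></td>
    <td><b>xx</b></td>
  </tr>
  <tr>
    <td><a href="somethingelse">link I don't want</a></td>
    <td><a href="something">linktext</a></td>
  </tr>
  <tr>
    <td></td>
    <td>xx</td>
    <td>  </td>
    <td>xx</td>
  </tr>
  <tr>
    <td><a href="nok">don't want it</a></td>
    <td><a href="ok">want it</a></td>
    <td><a href="nok">don't want it</a></td>
    <td><a href="ok">want it</a></td>
  </tr>
</tbody>
"""

def links_below(tbody, needle):
    """Return hrefs of links whose cell sits directly below a cell with text `needle`."""
    rows = tbody.findall("tr")
    hrefs = []
    for above, current in zip(rows, rows[1:]):
        cells_above = above.findall("td")
        for index, cell in enumerate(current.findall("td")):
            if index >= len(cells_above):
                continue  # nothing above this column
            # Any text node in the cell above equals the search string.
            if any(text == needle for text in cells_above[index].itertext()):
                hrefs.extend(a.get("href") for a in cell.findall("a"))
    return hrefs

hrefs = links_below(ET.fromstring(SAMPLE), "xx")
print(hrefs)
```

The enumerate index plays the role of count(preceding-sibling::td): a link is kept only when the cell at the same index in the previous row matches.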