How to extract all attributes value inside nested android tag or how to extract the value/string ins

Time:07-19

I want to parse the HTML (or Android content?) from a mobile app, and I am doing something like:

pageSource = driver.page_source
print("page = ",pageSource)

and what I got is the following:

   page =  <?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<hierarchy index="0"  rotation="0" width="1080" height="2274">
  <android.widget.FrameLayout index="0" package="testapp"  text="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
    <android.widget.LinearLayout index="0" package="testapp"  text="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
      <android.widget.FrameLayout index="0" package="testapp"  text="" resource-id="android:id/content" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
        <android.widget.FrameLayout index="0" package="testapp"  text="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
          <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
            <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
              <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
                <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
                  <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,220]" displayed="true">
                    <android.widget.ImageView index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[22,66][154,220]" displayed="true" />
                    <android.widget.ImageView index="1" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[198,77][880,209]" displayed="true" />
                  </android.view.View>
                  <android.widget.ImageView index="1" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[54,264][142,396]" displayed="true" />
                  <android.view.View index="2" package="testapp"  text="" content-desc="1" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[430,289][650,371]" displayed="true" />
                  <android.widget.ScrollView index="3" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="true" selected="false" bounds="[0,440][1080,2102]" displayed="true">
                    <android.widget.Button index="0" package="testapp"  text="" content-desc="2;Po Lam" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[54,440][1026,608]" displayed="true" />
                    <android.view.View index="1" package="testapp"  text="" content-desc="3" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[68,666][1011,770]" displayed="true" />
                    <android.widget.ImageView index="2" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[68,770][115,823]" displayed="true" />
                    <android.view.View index="3" package="testapp"  text="" content-desc="4" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[115,770][446,823]" displayed="true" />
                    <android.view.View index="4" package="testapp"  text="" content-desc="5" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[540,770][1012,823]" displayed="true" />
                    <android.view.View index="5" package="testapp"  text="" content-desc="6" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[54,940][540,992]" displayed="true" />
                    <android.view.View index="6" package="testapp"  text="" content-desc="about" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[714,940][750,992]" displayed="true" />
                
                  </android.widget.ScrollView>
                  <android.widget.ImageView index="4" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2046][1080,2102]" displayed="true" />
                  <android.view.View index="5" package="testappp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2102][1080,2274]" displayed="true">
                    <android.widget.ImageView index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2102][1080,2274]" displayed="true" />
                    <android.widget.ImageView index="1" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2114][216,2262]" displayed="true" />
                    <android.widget.ImageView index="2" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[432,2114][648,2262]" displayed="true" />
                    <android.widget.ImageView index="3" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[648,2114][864,2262]" displayed="true" />
                    <android.widget.ImageView index="4" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[864,2114][1080,2262]" displayed="true" />
                  </android.view.View>
                </android.view.View>
              </android.view.View>
            </android.view.View>
          </android.view.View>
        </android.widget.FrameLayout>
      </android.widget.FrameLayout>
    </android.widget.LinearLayout>
    <android.view.View index="2" package="testapp"  text="" resource-id="android:id/navigationBarBackground" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2274][1080,2340]" displayed="true" />
  </android.widget.FrameLayout>
</hierarchy>

I want to get the values of all "content-desc" attributes.

Updated with the full page source returned by the webdriver. What I want is the "number" vs the text inside "content-desc".

I have tried

soup = BeautifulSoup(pageSource,"lxml")

but the soup comes back as null.

CodePudding user response:

Since you want every tag that has a content-desc attribute, you can use a regex to match them: the XML has nested android.view.View tags, and the content-desc attribute is also present on other tags in the XML.

Here is how we can do this:

Creating data

page = '''<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<hierarchy index="0"  rotation="0" width="1080" height="2274">
  <android.widget.FrameLayout index="0" package="testapp"  text="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
    <android.widget.LinearLayout index="0" package="testapp"  text="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
      <android.widget.FrameLayout index="0" package="testapp"  text="" resource-id="android:id/content" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
        <android.widget.FrameLayout index="0" package="testapp"  text="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
          <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
            <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
              <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
                <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,2274]" displayed="true">
                  <android.view.View index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,0][1080,220]" displayed="true">
                    <android.widget.ImageView index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[22,66][154,220]" displayed="true" />
                    <android.widget.ImageView index="1" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[198,77][880,209]" displayed="true" />
                  </android.view.View>
                  <android.widget.ImageView index="1" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[54,264][142,396]" displayed="true" />
                  <android.view.View index="2" package="testapp"  text="" content-desc="1" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[430,289][650,371]" displayed="true" />
                  <android.widget.ScrollView index="3" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="true" selected="false" bounds="[0,440][1080,2102]" displayed="true">
                    <android.widget.Button index="0" package="testapp"  text="" content-desc="2;Po Lam" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[54,440][1026,608]" displayed="true" />
                    <android.view.View index="1" package="testapp"  text="" content-desc="3" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[68,666][1011,770]" displayed="true" />
                    <android.widget.ImageView index="2" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[68,770][115,823]" displayed="true" />
                    <android.view.View index="3" package="testapp"  text="" content-desc="4" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[115,770][446,823]" displayed="true" />
                    <android.view.View index="4" package="testapp"  text="" content-desc="5" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[540,770][1012,823]" displayed="true" />
                    <android.view.View index="5" package="testapp"  text="" content-desc="6" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[54,940][540,992]" displayed="true" />
                    <android.view.View index="6" package="testapp"  text="" content-desc="about" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[714,940][750,992]" displayed="true" />
                
                  </android.widget.ScrollView>
                  <android.widget.ImageView index="4" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2046][1080,2102]" displayed="true" />
                  <android.view.View index="5" package="testappp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2102][1080,2274]" displayed="true">
                    <android.widget.ImageView index="0" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="false" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2102][1080,2274]" displayed="true" />
                    <android.widget.ImageView index="1" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2114][216,2262]" displayed="true" />
                    <android.widget.ImageView index="2" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[432,2114][648,2262]" displayed="true" />
                    <android.widget.ImageView index="3" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[648,2114][864,2262]" displayed="true" />
                    <android.widget.ImageView index="4" package="testapp"  text="" resource-id="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[864,2114][1080,2262]" displayed="true" />
                  </android.view.View>
                </android.view.View>
              </android.view.View>
            </android.view.View>
          </android.view.View>
        </android.widget.FrameLayout>
      </android.widget.FrameLayout>
    </android.widget.LinearLayout>
    <android.view.View index="2" package="testapp"  text="" resource-id="android:id/navigationBarBackground" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" long-clickable="false" password="false" scrollable="false" selected="false" bounds="[0,2274][1080,2340]" displayed="true" />
  </android.widget.FrameLayout>
</hierarchy>'''

Extracting attribute

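import re
from bs4 import BeautifulSoup

soup = BeautifulSoup(page, 'lxml')
# Match any tag name, keeping only tags that have a content-desc attribute
for content_desc_element in soup.findAll(re.compile(r".*"), {"content-desc" : re.compile(r".*")}):
    print(content_desc_element['content-desc'])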

Output:

This prints the value of every content-desc attribute:


1
2;Po Lam
3
4
5
6
about
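
Since the page source is XML rather than HTML, the same values can also be read with the standard library's xml.etree.ElementTree instead of BeautifulSoup. This is a swapped-in technique, not what the answer above uses. The following is a minimal sketch: the sample string is a shortened, hypothetical excerpt of the page source, and content_descs is a helper name made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical excerpt of the page source (not the full dump).
sample = """<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<hierarchy index="0" rotation="0" width="1080" height="2274">
  <android.widget.ScrollView index="3" package="testapp" text="">
    <android.widget.Button index="0" package="testapp" text="" content-desc="2;Po Lam" />
    <android.view.View index="1" package="testapp" text="" content-desc="3" />
    <android.widget.ImageView index="2" package="testapp" text="" />
  </android.widget.ScrollView>
</hierarchy>"""


def content_descs(xml_text):
    """Return the content-desc value of every element that has one, in document order."""
    # strip() guards against whitespace before the XML declaration,
    # which the parser would otherwise reject.
    root = ET.fromstring(xml_text.strip())
    return [
        el.get("content-desc")
        for el in root.iter()  # walks the root and all nested elements
        if el.get("content-desc") is not None
    ]


print(content_descs(sample))  # ['2;Po Lam', '3']
```

With a live session, the same helper could presumably be called as content_descs(driver.page_source). An XML parser also keeps the tag names (android.view.View and so on) in their original case.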

CodePudding user response:

The problem can be solved more simply, without any regex:

from bs4 import BeautifulSoup


page =  '''...'''

soup = BeautifulSoup(page, 'lxml')
elems = soup.find_all()
for x in elems:
    if x.has_attr('content-desc'):
        print(x['content-desc'])

This will return:

1
2;Po Lam
3
4
5
6
about
8
store;Po Lam
add
open
value
10
11
32
13
14
15
16
17
18
19
20
33
34
35
20

Also, findAll is a legacy name kept from Beautiful Soup 3; with bs4 you should use find_all instead.
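
If the goal is to separate the "number" from the text in values such as "2;Po Lam", one option is to split each value on the first semicolon. This is only a sketch: it assumes the semicolon is the separator and that values without one (such as "1" or "about") have no text part. split_content_desc is a hypothetical helper name.

```python
def split_content_desc(value):
    """Split 'head;tail' on the first semicolon; tail is None if there is no semicolon."""
    head, sep, tail = value.partition(";")
    return head, (tail if sep else None)


# A few of the values printed above
for value in ["1", "2;Po Lam", "about"]:
    print(split_content_desc(value))
# ('1', None)
# ('2', 'Po Lam')
# ('about', None)
```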
