I am trying to create a DataFrame from an XML file, with a condition so that only the nodes under an element with a certain attribute value, say name="stack", are picked up.
I have pandas 1.3.4 installed on Windows 10 and use the Spyder IDE.
import pandas as pd
df_employee = pd.read_xml('employee.xml',xpath='employee_name')
df_cor = pd.read_xml('employee.xml',xpath='employee_name/email')
df_id = pd.read_xml('employee.xml',xpath='employee_name/email/id')
df_id2 = pd.read_xml('employee.xml',xpath='//id')
df_address = pd.read_xml('employee.xml',xpath='employee_name/email/id/address')
df_address2 = pd.read_xml('employee.xml',xpath='//address')
df_street = pd.read_xml('employee.xml',xpath='employee_name/email/id/address/street')
df_street2 = pd.read_xml('employee.xml',xpath='//street')
df_state = pd.read_xml('employee.xml',xpath='employee_name/email/id/address/street/state')
df_cell_cap = pd.read_xml('employee.xml',xpath='employee_name/email/id[@name="stack"]//address')
df_street_stack = pd.read_xml('employee.xml',xpath='employee_name/email/id[@name="stack"]//street')
The code works fine up to this stage, but when it reaches the line below it throws an error.
df_cell_cap = pd.read_xml('employee.xml',xpath='employee_name/email/id[@name="stack"]//address')
I have tried the following variants, but the error is still there.
df_address_stack = pd.read_xml('employee.xml',xpath='employee_name/email/id[contains(@name,"stack")]//address')
df_address_stack = pd.read_xml('employee.xml',xpath="employee_name/email/id/*[name() = 'stack']/address")
Error:
ValueError: xpath does not return any nodes. Be sure row level nodes are in xpath. If document uses namespaces denoted with xmlns, be sure to define namespaces and use them in xpath.
Is there anything that I am missing? Below is my sample XML:
<?xml version="1.0" encoding="UTF-8"?>
<employee_name name="ndlkjfidm" date="dfhkryi">
<email name="nnn" P="ffgnp" V="0.825" T="125c">
<id name="stack">
<address name="adas_jk3" type="entry">
<street name="VSS" voltage="0.000000" vector="!ENXB" active_input="NA" active_ouput="ENX">
<temp name="ADS_DEFAULT_temp_LOW">
<raw nod="VBP" alt="7.05537e-15" jus="74.4619" />
<raw nod="VDDC" alt="4.63027e-10" jus="115.178" />
</temp>
</street>
<street name="VSS" voltage="0.000000" vector="ENXB" active_input="NA" active_ouput="ENX">
<temp name="ADS_DEFAULT_temp_HIGH">
<raw nod="VBP" alt="7.05537e-15" jus="74.4644" />
<raw nod="VDDC" alt="1.52578e-14" jus="311.073" />
</temp>
</street>
</address>
</id>
</email>
</employee_name>
CodePudding user response:
Your sample XML:
<employee_name name="ndlkjfidm" date="dfhkryi">
<email name="nnn" P="ffgnp" V="0.825" T="125c">
<id name="stack">
<address name="adas_jk3" type="entry">
<street name="VSS" voltage="0.000000" vector="!ENXB" active_input="NA" active_ouput="ENX">
<temp name="ADS_DEFAULT_temp_LOW">
<raw nod="VBP" alt="7.05537e-15" jus="74.4619" />
<raw nod="VDDC" alt="4.63027e-10" jus="115.178" />
</temp>
</street>
<street name="VSS" voltage="0.000000" vector="ENXB" active_input="NA" active_ouput="ENX">
<temp name="ADS_DEFAULT_temp_HIGH">
<raw nod="VBP" alt="7.05537e-15" jus="74.4644" />
<raw nod="VDDC" alt="1.52578e-14" jus="311.073" />
</temp>
</street>
</address>
</id>
</email>
</employee_name>
If you go through the documentation for pandas.read_xml, you will find that xpath needs to be specified with a // prefix. A leading . (period) before the slashes is optional.
So your code should work fine with the following change (here filename is the path to your XML file, e.g. 'employee.xml'):
df_employee = pd.read_xml(filename,xpath='//employee_name')
        name     date  email
0  ndlkjfidm  dfhkryi    NaN
df_cor = pd.read_xml(filename,xpath='//employee_name/email')
   name      P      V     T   id
0   nnn  ffgnp  0.825  125c  NaN
df_id = pd.read_xml(filename,xpath='//employee_name/email/id')
    name  address
0  stack      NaN
df_address_stack = pd.read_xml(filename,xpath='//employee_name/email/id[contains(@name,"stack")]//address')
       name   type  street
0  adas_jk3  entry     NaN
This gives the expected output.
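Note that the nested columns (email, id, address, street) show NaN in the outputs here, because read_xml only reads the attributes and immediate children of the matched nodes. If the goal is one row per street under the id named "stack", one option is to select those nodes directly. The following is a minimal sketch that uses the standard-library xml.etree.ElementTree rather than pandas (a swapped-in technique, not what read_xml does internally). SAMPLE_XML is a trimmed copy of the XML from the question, and XPATH and street_rows are illustrative names, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML from the question (assumption: same nesting).
SAMPLE_XML = """\
<employee_name name="ndlkjfidm" date="dfhkryi">
  <email name="nnn" P="ffgnp" V="0.825" T="125c">
    <id name="stack">
      <address name="adas_jk3" type="entry">
        <street name="VSS" voltage="0.000000" vector="!ENXB" active_input="NA" active_ouput="ENX">
          <temp name="ADS_DEFAULT_temp_LOW">
            <raw nod="VBP" alt="7.05537e-15" jus="74.4619" />
          </temp>
        </street>
        <street name="VSS" voltage="0.000000" vector="ENXB" active_input="NA" active_ouput="ENX">
          <temp name="ADS_DEFAULT_temp_HIGH">
            <raw nod="VBP" alt="7.05537e-15" jus="74.4644" />
          </temp>
        </street>
      </address>
    </id>
  </email>
</employee_name>
"""

# fromstring returns the root element, here <employee_name>.
root = ET.fromstring(SAMPLE_XML)

# ElementTree paths are evaluated relative to the root element, hence the
# leading "."; the predicate keeps only <id> elements whose name is "stack".
XPATH = ".//id[@name='stack']//street"
streets = root.findall(XPATH)

# One dict of attributes per matched <street>. With pandas installed, this
# list of dicts could be turned into a frame via pd.DataFrame(street_rows).
street_rows = [dict(street.attrib) for street in streets]

print(len(street_rows))
for row in street_rows:
    print(row["vector"])
```

Running the selection outside pandas first is also a quick way to check whether an XPath matches any nodes before passing it to read_xml, although ElementTree supports only a limited subset of XPath.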
CodePudding user response:
Use:
df_id = pd.read_xml('employee.xml',xpath="//employee_name/email/id[contains(@name, 'stack')]")
Output:
    name  address
0  stack      NaN