I'm trying to parse a Visual Studio project file with lxml and Python 2.7. However, no matter what I do, I cannot get the xpath() function to return anything besides an empty list. I even pretty-printed my etree right before calling xpath() to make sure everything in the etree looked good.
Here is an example of one of the many XPath expressions I've tried:
v = self.tree.xpath('/Project/ItemDefinitionGroup[1]/Link/LinkerScript')
And here is a snippet of the Visual Studio project file:
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup Label="ProjectConfigurations">
<ProjectConfiguration Include="Debug|VisualGDB">
<Configuration>Debug</Configuration>
<Platform>VisualGDB</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|VisualGDB">
<Configuration>Release</Configuration>
<Platform>VisualGDB</Platform>
</ProjectConfiguration>
</ItemGroup>
<PropertyGroup Label="Globals">
<VCProjectVersion>16.0</VCProjectVersion>
<ProjectGuid>{52B4E371-970C-43AA-AE3C-3D3C44EB7627}</ProjectGuid>
<BSP_ID>com.sysprogs.arm.stm32</BSP_ID>
<BSP_VERSION>2021.02</BSP_VERSION>
<InPlaceBSPSubdir />
<RelativeBSPPath />
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
<PropertyGroup Label="Configuration" Condition="'$(Configuration)|$(Platform)'=='Debug|VisualGDB'">
</PropertyGroup>
<PropertyGroup Label="Configuration" Condition="'$(Configuration)|$(Platform)'=='Release|VisualGDB'">
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
</ImportGroup>
<ImportGroup Label="Shared">
</ImportGroup>
<PropertyGroup Label="UserMacros" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|VisualGDB'">
<GNUConfigurationType>Debug</GNUConfigurationType>
<ToolchainID>e368e833-a86e-4937-91b5-de07ceafe604</ToolchainID>
<ToolchainVersion>10.3.1/(GNU/r0</ToolchainVersion>
<MCUPropertyListFile>$(ProjectDir)stm32.props</MCUPropertyListFile>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|VisualGDB'">
<ToolchainID>e368e833-a86e-4937-91b5-de07ceafe604</ToolchainID>
<ToolchainVersion>10.3.1/(GNU/r0</ToolchainVersion>
<MCUPropertyListFile>$(ProjectDir)stm32.props</MCUPropertyListFile>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|VisualGDB'">
<ClCompile>
<AdditionalIncludeDirectories></AdditionalIncludeDirectories>
<PreprocessorDefinitions></PreprocessorDefinitions>
<AdditionalOptions>-fms-extensions</AdditionalOptions>
<CLanguageStandard>C11</CLanguageStandard>
<CPPLanguageStandard />
<ForcedIncludeFiles>..\Source\Assert\Assert.h;%(ForcedIncludeFiles)</ForcedIncludeFiles>
<CharSign>Unsigned</CharSign>
</ClCompile>
<Link>
<LibrarySearchDirectories>%(Link.LibrarySearchDirectories)</LibrarySearchDirectories>
<AdditionalLibraryNames>%(Link.AdditionalLibraryNames)</AdditionalLibraryNames>
<AdditionalLinkerInputs>%(Link.AdditionalLinkerInputs)</AdditionalLinkerInputs>
<AdditionalOptions>-specs=nano.specs -specs=nosys.specs -lc -lm</AdditionalOptions>
<GenerateMapFile>true</GenerateMapFile>
<MapFileName>Project.map</MapFileName>
<LinkerScript>STM32F437VI_flash.lds</LinkerScript>
</Link>
</ItemDefinitionGroup>
</Project>
There's more to the file, but the file name in the <LinkerScript>STM32F437VI_flash.lds</LinkerScript> element near the end of the snippet is what I'm trying to get.
I've tried writing my own paths as well as using some generated ones from: Online XPath Generator
I've tried the simplest xpaths I can think of, but xpath() still returns nothing but an empty list. Does anyone have any ideas what could be going on?
CodePudding user response:
The code below seems to work (no external library is used, just ElementTree).
The idea is to use the namespace as part of the search string.
Read more here.
import xml.etree.ElementTree as ET
# Raw string, so the backslashes in the Windows paths are kept literally
xml = r'''<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup Label="ProjectConfigurations">
<ProjectConfiguration Include="Debug|VisualGDB">
<Configuration>Debug</Configuration>
<Platform>VisualGDB</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|VisualGDB">
<Configuration>Release</Configuration>
<Platform>VisualGDB</Platform>
</ProjectConfiguration>
</ItemGroup>
<PropertyGroup Label="Globals">
<VCProjectVersion>16.0</VCProjectVersion>
<ProjectGuid>{52B4E371-970C-43AA-AE3C-3D3C44EB7627}</ProjectGuid>
<BSP_ID>com.sysprogs.arm.stm32</BSP_ID>
<BSP_VERSION>2021.02</BSP_VERSION>
<InPlaceBSPSubdir />
<RelativeBSPPath />
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
<PropertyGroup Label="Configuration" Condition="'$(Configuration)|$(Platform)'=='Debug|VisualGDB'">
</PropertyGroup>
<PropertyGroup Label="Configuration" Condition="'$(Configuration)|$(Platform)'=='Release|VisualGDB'">
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
</ImportGroup>
<ImportGroup Label="Shared">
</ImportGroup>
<PropertyGroup Label="UserMacros" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|VisualGDB'">
<GNUConfigurationType>Debug</GNUConfigurationType>
<ToolchainID>e368e833-a86e-4937-91b5-de07ceafe604</ToolchainID>
<ToolchainVersion>10.3.1/(GNU/r0</ToolchainVersion>
<MCUPropertyListFile>$(ProjectDir)stm32.props</MCUPropertyListFile>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|VisualGDB'">
<ToolchainID>e368e833-a86e-4937-91b5-de07ceafe604</ToolchainID>
<ToolchainVersion>10.3.1/(GNU/r0</ToolchainVersion>
<MCUPropertyListFile>$(ProjectDir)stm32.props</MCUPropertyListFile>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|VisualGDB'">
<ClCompile>
<AdditionalIncludeDirectories></AdditionalIncludeDirectories>
<PreprocessorDefinitions></PreprocessorDefinitions>
<AdditionalOptions>-fms-extensions</AdditionalOptions>
<CLanguageStandard>C11</CLanguageStandard>
<CPPLanguageStandard />
<ForcedIncludeFiles>..\Source\Assert\Assert.h;%(ForcedIncludeFiles)</ForcedIncludeFiles>
<CharSign>Unsigned</CharSign>
</ClCompile>
<Link>
<LibrarySearchDirectories>%(Link.LibrarySearchDirectories)</LibrarySearchDirectories>
<AdditionalLibraryNames>%(Link.AdditionalLibraryNames)</AdditionalLibraryNames>
<AdditionalLinkerInputs>%(Link.AdditionalLinkerInputs)</AdditionalLinkerInputs>
<AdditionalOptions>-specs=nano.specs -specs=nosys.specs -lc -lm</AdditionalOptions>
<GenerateMapFile>true</GenerateMapFile>
<MapFileName>Project.map</MapFileName>
<LinkerScript>STM32F437VI_flash.lds</LinkerScript>
</Link>
</ItemDefinitionGroup>
</Project>'''
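root = ET.fromstring(xml)
# Element names are stored as '{namespace-uri}tag', so the search string must include the namespace
print(root.find('.//{http://schemas.microsoft.com/developer/msbuild/2003}LinkerScript').text)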
output
STM32F437VI_flash.lds
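For what it's worth, ElementTree also accepts a prefix-to-URI mapping, so the URI does not have to be spelled out in braces for every tag. Below is a minimal sketch of that, which also shows why the unqualified search comes back empty. It assumes a trimmed-down stand-in for the project file; the `doc` string and the `n` prefix are placeholders of mine, not something from the original file.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the project file (assumption: same default namespace)
doc = '''<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemDefinitionGroup>
    <Link>
      <LinkerScript>STM32F437VI_flash.lds</LinkerScript>
    </Link>
  </ItemDefinitionGroup>
</Project>'''

root = ET.fromstring(doc)

# The default namespace applies to every element, so a bare tag name matches nothing
unqualified = root.find('.//LinkerScript')

# Map an arbitrary prefix ('n' here) to the namespace URI and use it in the path
ns = {'n': 'http://schemas.microsoft.com/developer/msbuild/2003'}
qualified = root.find('./n:ItemDefinitionGroup[1]/n:Link/n:LinkerScript', ns)

print(unqualified)      # None
print(qualified.text)   # STM32F437VI_flash.lds
```

Note that ElementTree paths are relative to the element they are called on, so the path starts at `./` below the root rather than at `/Project`.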
CodePudding user response:
After going through some of the other queries that I needed to write for this parser, I figured out the solution I was looking for using xpath(), thanks to the comment from @mzjn. There might be more elegant ways to do this, but here it is:
The namespace reference was the big thing I was missing. I based my solution on this post: how to query xml data with namespaces using xpath in python
I defined my namespace and then added the namespace prefix on the front of each XML tag in my query.
ns = {'n': 'http://schemas.microsoft.com/developer/msbuild/2003'}
v = self.tree.xpath('/n:Project/n:ItemDefinitionGroup[1]/n:Link/n:LinkerScript', namespaces=ns)
Works like a charm, and it was easy to adapt for more complex queries!
Thanks also to @balderman whose answer helped me make some progress after being stuck for a while.
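If there are many queries to convert, adding the prefix by hand gets tedious. Here is a rough sketch of a helper that does it for simple paths. The `add_prefix` name is hypothetical (it is not part of lxml), and it only handles plain '/'-separated steps with an optional predicate such as [1]. It will not cope with predicates that contain a '/' or element names, so treat it as a starting point rather than a general XPath rewriter.

```python
import re

def add_prefix(path, prefix='n'):
    """Prefix every plain element name in a simple location path.

    Hypothetical helper: steps such as '.', '..', '*', attributes (@x),
    already-prefixed names and function calls like text() are left untouched.
    """
    steps = []
    for step in path.split('/'):
        # A plain element step: a name, optionally followed by a predicate
        if re.match(r'^[A-Za-z_][\w.-]*(\[.*\])?$', step):
            step = '%s:%s' % (prefix, step)
        steps.append(step)
    return '/'.join(steps)

query = add_prefix('/Project/ItemDefinitionGroup[1]/Link/LinkerScript')
print(query)  # /n:Project/n:ItemDefinitionGroup[1]/n:Link/n:LinkerScript
```

The resulting string can then be passed to xpath() together with the namespaces=ns mapping defined above.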