Background
Third-party software libraries are sometimes offered under several licenses, from which developers may choose one when using the library. Some licenses can be detected using ORT to produce an SPDX identifier. We'd like to eliminate the manual process of selecting a particular license each time, deferring instead to a predefined, prioritized list.
Not all licenses are included in the list.
Code
This section defines the source files.
priorities.xml
The order of entries determines the license to choose when a selection is available:
<priorities>
  <license>Apache-2.0</license>
  <license>MIT</license>
  <license>BSD-2-Clause</license>
  <license>BSD-3-Clause</license>
  <license>CDDL-1.0</license>
  <license>EPL-2.0</license>
  <license>MPL-2.0</license>
  <license>LGPL-3.0</license>
</priorities>
This document is loaded using:
<xsl:variable
    name="PRIORITY"
    select="document( resolve-uri( 'priorities.xml', base-uri( / ) ) )" />
libraries.xml
These simplified license entries form the input document:
<copyrights>
  <copyright>
    <title>Grizzly HTTP framework</title>
    <licenses>
      <license>CDDL-1.0</license>
      <license>GPL-2.0-or-later</license>
    </licenses>
  </copyright>
  <copyright>
    <title>Java™ JSON Tools Jackson Coreutils</title>
    <licenses>
      <license>LGPL-3.0-or-later</license>
      <license>Apache-2.0</license>
    </licenses>
  </copyright>
  <copyright>
    <title>Javassist</title>
    <licenses>
      <license>LGPL-2.1-only</license>
      <license>MPL-1.1</license>
      <license>Apache-2.0</license>
    </licenses>
  </copyright>
  <copyright>
    <title>Linux Kernel</title>
    <licenses>
      <license with="Linux-syscall-note" order="before">GPL-2.0-only</license>
    </licenses>
  </copyright>
  <copyright>
    <title>Eclipse Temurin™</title>
    <licenses>
      <license with="Classpath-exception-2.0">GPL-2.0-only</license>
    </licenses>
  </copyright>
</copyrights>
Output
The desired output reduces the licenses to a single entry:
<copyrights>
  <copyright>
    <title>Grizzly HTTP framework</title>
    <licenses>
      <license>CDDL-1.0</license>
    </licenses>
  </copyright>
  <copyright>
    <title>Java™ JSON Tools Jackson Coreutils</title>
    <licenses>
      <license>Apache-2.0</license>
    </licenses>
  </copyright>
  <copyright>
    <title>Javassist</title>
    <licenses>
      <license>Apache-2.0</license>
    </licenses>
  </copyright>
  <copyright>
    <title>Linux Kernel</title>
    <licenses>
      <license with="Linux-syscall-note" order="before">GPL-2.0-only</license>
    </licenses>
  </copyright>
  <copyright>
    <title>Eclipse Temurin™</title>
    <licenses>
      <license with="Classpath-exception-2.0">GPL-2.0-only</license>
    </licenses>
  </copyright>
</copyrights>
Problem
Conceptually, I'd like to rank the licenses by the position of each input license in the priorities list, then select the first. As a series of transforms, the relevant section of the input document might start as:
<licenses>
  <license>LGPL-2.1-only</license>
  <license>MPL-1.1</license>
  <license>Apache-2.0</license>
</licenses>
Then we could assign the priority based on the position in the priorities list, using INFINITY for licenses that are not listed:
<licenses>
  <license priority="INFINITY">LGPL-2.1-only</license>
  <license priority="INFINITY">MPL-1.1</license>
  <license priority="1">Apache-2.0</license>
</licenses>
Then sort based on priority:
<licenses>
  <license priority="1">Apache-2.0</license>
  <license priority="INFINITY">LGPL-2.1-only</license>
  <license priority="INFINITY">MPL-1.1</license>
</licenses>
Then select the first child:
<licenses>
  <license priority="1">Apache-2.0</license>
</licenses>
I believe this "algorithm" would ensure that a license without a corresponding priority is still selected by default when it is the only license listed for an entry.
Constraints
These constraints will have been met before the transformation step occurs:
It is an error for an entry in the input document to list multiple licenses without at least one match in the license priorities listing. I don't think we can realistically select the first license in such cases.
It is an error for any entry in the input document to lack a license (we can enforce this using schema validation).
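As an illustration only, since these checks are expected to happen before the transformation, the first constraint could also be asserted inside the stylesheet. The sketch below assumes the PRIORITY variable is loaded as shown earlier; the message text is made up, and the template has not been run against a processor.

```xml
<xsl:stylesheet
    version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: priorities.xml sits next to the input document. -->
  <xsl:variable
      name="PRIORITY"
      select="document( resolve-uri( 'priorities.xml', base-uri( / ) ) )" />

  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Several licenses, none of them in the priority list: stop with an error. -->
  <xsl:template
      match="licenses[count(license) gt 1][not(license = $PRIORITY//license)]">
    <xsl:message terminate="yes"
        select="'No prioritized license for: ' || ../title"/>
  </xsl:template>
</xsl:stylesheet>
```

Entries with a single unlisted license, such as the Linux Kernel one, do not match this template and pass through unchanged.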
Question
What would be an expedient way to filter the licenses using XSLT 3.0?
-or-
How would you inject the priority attribute as shown in the algorithmic steps?
CodePudding user response:
Here's my suggestion.
To simplify my test I defined the PRIORITY variable inline instead of using document(), but of course you can stick with your approach of reading it from an external file.
An explanation:
The priority XML document is converted to a map in which the keys are the license names and the values are the integer positions of those licenses in the priority list. So the key 'Apache-2.0' has the value 1, etc.
The template matching licenses copies only one of the license child elements.
First it uses the XPath 3.1 sort function to sort the licenses. The last parameter to that function is a function which maps an item to a sort key; the supplied function looks up the item (i.e. the license name) in the $license-priority map, and returns the resulting priority integer, or if it's not found in the map, it returns infinity.
Then the first (highest priority) license from that sorted sequence is copied.
<xsl:stylesheet
    version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map">

  <!-- Inline copy of priorities.xml; reading it with document() works equally well. -->
  <xsl:variable name="PRIORITY">
    <priorities>
      <license>Apache-2.0</license>
      <license>MIT</license>
      <license>BSD-2-Clause</license>
      <license>BSD-3-Clause</license>
      <license>CDDL-1.0</license>
      <license>EPL-2.0</license>
      <license>MPL-2.0</license>
      <license>LGPL-3.0</license>
    </priorities>
  </xsl:variable>

  <!-- Map from license name to its 1-based position in the priority list. -->
  <xsl:variable name="license-priority" select="
    map:merge(
      $PRIORITY//license ! map:entry(., position())
    )
  "/>

  <!-- Copy everything that no other template matches. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Keep only the license with the lowest priority number; unlisted licenses sort last. -->
  <xsl:template match="licenses" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xsl:copy>
      <xsl:copy-of select="
        sort(
          license,
          (),
          function($license) {
            ($license-priority($license), xs:double('INF'))[1]
          }
        )[1]
      "/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
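For the second form of the question, the priority attribute from the algorithmic steps could be injected with a template like the one below. This is only a sketch and has not been run against a processor. It builds the same $license-priority map as above, loads priorities.xml the way the question does, and writes the literal string INFINITY for unlisted licenses so the output matches the question's illustration (the serialized form of xs:double('INF') would be INF).

```xml
<xsl:stylesheet
    version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map">

  <!-- Assumption: priorities.xml sits next to the input document. -->
  <xsl:variable
      name="PRIORITY"
      select="document( resolve-uri( 'priorities.xml', base-uri( / ) ) )" />

  <!-- Map from license name to its 1-based position in the priority list. -->
  <xsl:variable name="license-priority" select="
    map:merge(
      $PRIORITY//license ! map:entry(string(.), position())
    )
  "/>

  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Copy each input license and add its priority, or INFINITY if unlisted. -->
  <xsl:template match="licenses/license">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="priority"
          select="($license-priority(string(.)), 'INFINITY')[1]"/>
      <xsl:copy-of select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Existing attributes such as with and order are kept, and the priority attribute is added alongside them. Sorting and selecting the first child would then be a separate pass.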
CodePudding user response:
Isn't it just
<xsl:template match="copyright/licenses">
  <licenses>
    <xsl:copy-of
        select="doc('priorities.xml')
                //license[. = current()/license][1]"/>
  </licenses>
</xsl:template>
or have I missed something?
This is taking advantage of the ability to do a one-to-many comparison using "=".
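One difference from the desired output is worth noting. An entry whose only license is absent from priorities.xml (the Linux Kernel and Eclipse Temurin entries in the question) would come out with an empty licenses element. Also, the copied element comes from priorities.xml, so any attributes on the input license would not carry over. A possible variant, offered as an untested sketch, picks the preferred name the same way but copies the matching input element and falls back to the first input license when nothing matches:

```xml
<xsl:stylesheet
    version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="copyright/licenses">
    <!-- First entry in priorities.xml that equals any license of this entry. -->
    <xsl:variable name="preferred"
        select="doc('priorities.xml')//license[. = current()/license][1]"/>
    <licenses>
      <!-- Prefer the matching input license; otherwise keep the first one as-is. -->
      <xsl:copy-of select="(license[. = $preferred], license)[1]"/>
    </licenses>
  </xsl:template>
</xsl:stylesheet>
```

Under the question's constraints the fallback only applies to entries with a single license, so it never has to guess between several unlisted ones.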