Can I validate that nodes exist that edges in a graph point to with JSON SCHEMA?

Time:12-01

I want to describe a network graph of vertices and edges with JSON Schema.

An example JSON could look like this:

{
  "V":["1","2","3"],
  "E":[{
      "v1":"1",
      "v2":"2"
    },{
      "v1":"2",
      "v2":"3"
    }]
}

I have a set of 3 vertices and 2 edges to connect them. I want all vertices to have an arbitrary string identifier, so it could also be "node1" or "panda". However, is there a way to validate that the endpoints of my edges only point to existing vertices?

For example, this should NOT pass:

{
  "V":["n1","n2","n3"],
  "E":[{
      "v1":"n1",
      "v2":"IdThatDoesNotExistAbove"
    }]
}

I looked at the enum keyword, but I could not find a way to make it refer to data in the JSON document being validated rather than to values written in the schema itself.

CodePudding user response:

This task can be solved with jq:

jq -r '([.E[] | to_entries[].value] | unique) - .V |
       if length == 0
       then "all vertices defined"
       else "undefined vertices: \(.)\n" | halt_error(1)
       end
' "$FILE"
echo "exit code: $?"

Output for a valid file:

all vertices defined
exit code: 0

Output for an invalid file:

undefined vertices: ["IdThatDoesNotExistAbove"]
exit code: 1

If you are not interested in which vertices are undefined, you can use a shorter version:

jq -e '([.E[] | to_entries[].value]) - .V | length == 0' "$FILE"
echo "exit code: $?"

Output for a valid file:

true
exit code: 0

Output for an invalid file:

false
exit code: 1
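
For readers without jq, the same set-difference check can be sketched in Python. This is only an illustration under a few assumptions: the function name undefined_vertices is made up for this example, and it reads only the "v1" and "v2" keys of each edge, whereas the jq filter above collects every value of each edge object.

```python
def undefined_vertices(graph):
    """Return edge endpoints that are not listed in graph["V"], sorted."""
    known = set(graph["V"])
    # Collect both endpoints of every edge into one set.
    endpoints = {edge[key] for edge in graph["E"] for key in ("v1", "v2")}
    return sorted(endpoints - known)


valid = {"V": ["1", "2", "3"],
         "E": [{"v1": "1", "v2": "2"}, {"v1": "2", "v2": "3"}]}
invalid = {"V": ["n1", "n2", "n3"],
           "E": [{"v1": "n1", "v2": "IdThatDoesNotExistAbove"}]}

print(undefined_vertices(valid))    # []
print(undefined_vertices(invalid))  # ['IdThatDoesNotExistAbove']
```

An empty result corresponds to the "all vertices defined" case; a non-empty result lists the dangling endpoints.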

CodePudding user response:

JSON Schema doesn't define a way to reference data like this, but it does have extension vocabularies, which allow the definition of custom keywords. I have created a data vocabulary that does precisely what you're looking to do.

{
    "$schema": "https://json-everything.net/meta/data-2022",
    "type": "object",
    "$defs": {
        "user-defined-vertex": {
            "data": {
                "enum": "/V"
            }
        }
    },
    "properties": {
        "V": {
            "type": "array",
            "items": {"type": "string"}
        },
        "E": {
            "type": "array",
            "items": {
                "type": "object",
                "properties":{
                    "v1": { "$ref": "#/$defs/user-defined-vertex" },
                    "v2": { "$ref": "#/$defs/user-defined-vertex" }
                },
                "required": ["v1", "v2"],
                "additionalProperties": false
            }
        }
    },
    "additionalProperties": false
}

The key part of this is the data keyword in #/$defs/user-defined-vertex.

data takes an object with schema keywords as keys and JSON Pointers or URIs as values. If you want to extract values from the instance data, you'll use JSON Pointers. For anything else, you'll use a URI.

So for this case, I have

{
    "data": {
        "enum": "/V"
    }
}

which says to take the value from /V in the instance data and use that as the value for the enum keyword.

In #/properties/V you define that /V must be an array with string values.
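
To make the idea concrete, here is a rough Python sketch of what resolving the data keyword amounts to: look up each JSON Pointer in the instance and use the result as the value of the named keyword. It is not how JsonSchema.Net implements the vocabulary. The helper names (resolve_pointer, resolve_data_keyword, endpoint_is_valid) are hypothetical, only JSON Pointers are handled (not URIs), and only the enum keyword is evaluated.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer such as "/V" against parsed JSON."""
    if pointer == "":
        return document
    current = document
    for token in pointer.split("/")[1:]:
        # Undo the JSON Pointer escapes: "~1" means "/", "~0" means "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


def resolve_data_keyword(data, instance):
    """Turn {"enum": "/V"} into {"enum": [...]} using the instance data."""
    return {keyword: resolve_pointer(instance, pointer)
            for keyword, pointer in data.items()}


def endpoint_is_valid(value, data, instance):
    """Check a value against the enum produced from the instance."""
    return value in resolve_data_keyword(data, instance)["enum"]


instance = {"V": ["n1", "n2", "n3"],
            "E": [{"v1": "n1", "v2": "IdThatDoesNotExistAbove"}]}
data = {"enum": "/V"}

print(resolve_data_keyword(data, instance))  # {'enum': ['n1', 'n2', 'n3']}
print(endpoint_is_valid("n1", data, instance))                       # True
print(endpoint_is_valid("IdThatDoesNotExistAbove", data, instance))  # False
```

In other words, the subschema behaves as if enum had been written with the contents of /V, which is why the second endpoint fails validation.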


However, to my knowledge, this vocabulary is only implemented in my library, JsonSchema.Net, and you'll need the extension package JsonSchema.Net.Data.
