Merge two JSON objects recursively with proper handling of arrays

Time:06-07

I have to merge two JSON files in PowerShell, but I'm having problems when it comes to a list.

$j1 = Get-Content 'json1.json' | Out-String | ConvertFrom-Json
$j2 = Get-Content 'json2.json' | Out-String | ConvertFrom-Json
function Join-Objects($source, $extend){
    if($source.GetType().Name -eq "PSCustomObject" -and $extend.GetType().Name -eq "PSCustomObject"){
        foreach($Property in $source | Get-Member -type NoteProperty, Property){
            if($null -eq $extend.$($Property.Name)){
              continue;
            }
            $source.$($Property.Name) = Join-Objects $source.$($Property.Name) $extend.$($Property.Name)
        }
    }else{
       $source = $extend;
    }
    return $source
}

function AddPropertyRecurse($source, $toExtend){
    if($source.GetType().Name -eq "PSCustomObject"){
        foreach($Property in $source | Get-Member -type NoteProperty, Property){
            if($null -eq $toExtend.$($Property.Name)){
              $toExtend | Add-Member -MemberType NoteProperty -Value $source.$($Property.Name) -Name $Property.Name
            }
            else{
               $toExtend.$($Property.Name) = AddPropertyRecurse $source.$($Property.Name) $toExtend.$($Property.Name)
            }
        }
    }
    return $toExtend
}
function Merge-Json($extend, $source){
    $merged = Join-Objects $extend $source
    $extended = AddPropertyRecurse $source $merged
    return $extended
}

Merge-Json $j1 $j2 | ConvertTo-Json -Depth 10

This is the json1 file:

{
    "AppConfig": {
        "Host": {
            "JobServer3Camadas": "true",
            "Port": "8050",
            "ApiPort": "8051",
            "HttpPort": "8051"
        },
        "RM": {
            "JobServer3Camadas": "true",
            "ActionsPath": "D:\\totvs\\CorporeRM\\RM.Net;D:\\totvs\\CorporeRM\\Corpore.Net\\Bin",
            "LibPath": "D:\\totvs\\CorporeRM\\RM.Net"
        },
        "Portal": {
            "JobServer3Camadas": "true",
            "ActionsPath": "D:\\totvs\\CorporeRM\\RM.Net;D:\\totvs\\CorporeRM\\Corpore.Net\\Bin",
            "LibPath": "D:\\totvs\\CorporeRM\\RM.Net"
        }
    },
    "DbConfig": {
        "AppServer": [
            {
                "Alias": "rapha139686",
                "DbType": "SqlServer"
            }
        ],
        "JobServer": [
            {
                "Alias": "rapha139686",
                "DbType": "SqlServer"
            }
        ]
    }
}

And this is the json2 file:

{
    "AppConfig": {
        "Host": {
            "Teste": "AAA",
            "JobServer3Camadas": "false"
        },
        "RM": {
            "Teste": "AAA",
            "JobServer3Camadas": "false"
        },
        "Portal": {
            "Teste": "AAA",
            "JobServer3Camadas": "false"
        }
    },
    "DbConfig": {
        "AppServer": [
            {
                "Alias": "teste"
            }
        ],
        "JobServer": [
            {
                "Alias": "teste"
            }
        ]
    }
}

When the code runs, the AppConfig output is OK: the Teste property is created and JobServer3Camadas is changed. The problem occurs in DbConfig, which comes back as:

{
    "AppConfig": {
        "Host": {
            "JobServer3Camadas": "false",
            "Port": "8050",
            "ApiPort": "8051",
            "HttpPort": "8051",
            "Teste": "AAA"
        },
        "RM": {
            "JobServer3Camadas": "false",
            "ActionsPath": "D:\\totvs\\CorporeRM\\RM.Net;D:\\totvs\\CorporeRM\\Corpore.Net\\Bin",
            "LibPath": "D:\\totvs\\CorporeRM\\RM.Net",
            "Teste": "AAA"
        },
        "Portal": {
            "JobServer3Camadas": "false",
            "ActionsPath": "D:\\totvs\\CorporeRM\\RM.Net;D:\\totvs\\CorporeRM\\Corpore.Net\\Bin",
            "LibPath": "D:\\totvs\\CorporeRM\\RM.Net",
            "Teste": "AAA"
        }
    },
    "DbConfig": {
        "AppServer": {
            "Alias": "teste"
        },
        "JobServer": {
            "Alias": "teste"
        }
    }
}

And this is the expected response:

 "DbConfig": {
        "AppServer": [
            {
                "Alias": "teste",
                "DbType": "SqlServer"
            }
        ],
        "JobServer": [
            {
                "Alias": "teste",
                "DbType": "SqlServer"
            }
        ]
    }

CodePudding user response:

As I already commented, you need to specifically handle the array type. Otherwise the else branches of your type checks will always overwrite arrays, without recursing into their elements. Furthermore, due to PowerShell's automatic enumeration of arrays, you end up with objects when the array consists of only a single element.
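
The enumeration behaviour described above can be reproduced in isolation. This is only a minimal sketch: Get-Unrolled and Get-Wrapped are hypothetical helper names made up for illustration and are not part of the question's code.

```powershell
# Hypothetical helper: outputs a one-element array the usual way.
# PowerShell enumerates the array on output, so the caller receives
# the single element instead of the array.
function Get-Unrolled {
    $items = @( [PSCustomObject]@{ Alias = 'teste' } )
    $items
}

# Hypothetical helper: same array, but output through the comma operator.
# The comma wraps $items in an outer one-element array; only that wrapper
# is enumerated, so the inner array reaches the caller intact.
function Get-Wrapped {
    $items = @( [PSCustomObject]@{ Alias = 'teste' } )
    , $items
}

$unrolled = Get-Unrolled
$wrapped  = Get-Wrapped

$unrolled -is [array]   # False: the array was unrolled to its only element
$wrapped  -is [array]   # True:  still an array with one element
```

This is why a single-element JSON array can silently turn into a JSON object once the result is passed to ConvertTo-Json.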

As I found it too difficult to modify your existing code, I just rewrote it completely:

function Merge-Json( $source, $extend ){
    if( $source -is [PSCustomObject] -and $extend -is [PSCustomObject] ){

        # Ordered hashtable for collecting properties
        $merged = [ordered] @{}

        # Copy $source properties or overwrite by $extend properties recursively
        foreach( $Property in $source.PSObject.Properties ){
            if( $null -eq $extend.$($Property.Name) ){
                $merged[ $Property.Name ] = $Property.Value
            }
            else {
                $merged[ $Property.Name ] = Merge-Json $Property.Value $extend.$($Property.Name)
            }
        }

        # Add $extend properties
        foreach( $Property in $extend.PSObject.Properties ){
            if( $null -eq $source.$($Property.Name) ) {
                $merged[ $Property.Name ] = $Property.Value
            }
        }

        # Convert hashtable into PSCustomObject and output
        [PSCustomObject] $merged
    }
    elseif( $source -is [Collections.IList] -and $extend -is [Collections.IList] ){

        $maxCount = [Math]::Max( $source.Count, $extend.Count )

        [array] $merged = for( $i = 0; $i -lt $maxCount; $i++ ){
            if( $i -ge $source.Count ) { 
                # extend array is bigger than source array
                $extend[ $i ]
            }              
            elseif( $i -ge $extend.Count ) {
                # source array is bigger than extend array
                $source[ $i ]
            }
            else {
                # Merge the elements recursively
                Merge-Json $source[$i] $extend[$i]
            }
        }

        # Output merged array, using comma operator to prevent enumeration 
        , $merged
    }
    else{
        # Output extend object (scalar or different types)
        $extend
    }
}
  • It is only a single function instead of separate join and add operations.
  • Instead of modifying the source object, the function outputs a completely new object. This makes the code much easier to reason about and avoids problems such as modifying properties that are currently being enumerated (that's why you needed two separate functions).
  • Instead of comparing type names, the code uses the -is operator, which is more idiomatic and has the advantage of handling derived types as well. For arrays, for example, I don't need to know the exact collection type used by ConvertFrom-Json (which is an implementation detail); I only check for the basic IList interface, which is implemented by all array-like types, such as ArrayList or its generic List variant.
  • Note that I don't use the return statement anywhere. PowerShell implicitly outputs anything that is not assigned or piped/redirected. You only need the return statement to exit early from a function.
  • The $merged array is created automatically by capturing output of the for loop. I'm using the type constraint [array] to make sure we always get an array, even if it contains only a single element. Likewise, I'm using the comma operator in the , $merged line to prevent the array from being unrolled by PowerShell, which again prevents PowerShell from changing a 1-element-array into a single object.
  • PSObject is a hidden member available to all PowerShell objects which allows us to enumerate properties directly, instead of using Get-Member.
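
The building blocks from the notes above can be tried on their own. The following is a rough sketch rather than part of the answer's function; the sample JSON string and the variable names ($config, $names, $one, $plain) are assumptions chosen for illustration.

```powershell
# Sample data (assumed), shaped like part of the DbConfig section in the question.
$config = '{ "AppServer": [ { "Alias": "teste" } ], "Name": "x" }' | ConvertFrom-Json

# PSObject.Properties lists the properties directly, without Get-Member.
$names = $config.PSObject.Properties.Name      # AppServer, Name

# The nested array satisfies the IList interface check; its concrete type
# name differs from "PSCustomObject", which is why name comparisons miss it.
$config.AppServer -is [Collections.IList]      # True
$config.AppServer.GetType().Name               # Object[]

# Capturing loop output: the [array] constraint keeps a single result as an array.
[array] $one = for( $i = 0; $i -lt 1; $i++ ){ "item$i" }
$one -is [array]                               # True

# Without the constraint, the single output value is stored as a plain string.
$plain = for( $i = 0; $i -lt 1; $i++ ){ "item$i" }
$plain -is [array]                             # False
```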