Sort JSON with PHP based on aggregated nested values


I need to:

  1. Sort the arrays by the aggregate price of each combination.
  2. Return only 40% of them with the highest aggregate price.
$combinations = '[
    [                          //1st combination
        {"id":1,"price":11900},
        {"id":2,"price":499},
        {"id":3,"price":2099}
    ],
    [                          //2nd combination
        {"id":1,"price":11900},
        {"id":2,"price":499},
        {"id":4,"price":999}
    ],
    [                          //3rd combination
        {"id":1,"price":11900},
        {"id":2,"price":499},
        {"id":5,"price":899}
    ],
    [                          //4th combination
        {"id":1,"price":11900},
        {"id":2,"price":499},
        {"id":6,"price":2999}
    ]
]';
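
Note: the //1st combination style comments in the string above are only labels for the reader; comments are not valid JSON, so json_decode() would return null on that exact string. Both answers below use the same data with the comments removed.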

CodePudding user response:

<?php

$json = json_decode('[
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":3,"price":2099}],
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":4,"price":999}],
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":5,"price":899}],
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":6,"price":2999}]
    ]');

// var_dump($json);

// ($a, $b) for ASC sorting
// ($b, $a) for DESC sorting
usort($json, function ($b, $a) {

    $a_prices = 0;
    foreach ($a as $aa)
        $a_prices += $aa->price;

    $b_prices = 0;
    foreach ($b as $bb)
        $b_prices += $bb->price;

    return $a_prices - $b_prices;
});

// Find where 40% stops
// It is up to you to choose between round(), ceil() or floor()
$breakpoint = round(sizeof($json) * 40 / 100);

$sorted_chunk = array_slice($json, 0, $breakpoint);
var_dump($sorted_chunk);
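
For the sample data, the aggregate prices are 14498, 13398, 13298 and 15398, so the descending sort puts the 4th combination first, followed by the 1st; with round(4 * 40 / 100) = 2, those two combinations are what the slice returns.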

CodePudding user response:

While the answer by @medilies is simple and correct, here's a more economical way to sort the data. If we're working with a large dataset, a straight-up usort can turn out rather expensive — as the comparison values have to be re-calculated for each $a vs. $b comparison. We can instead calculate the sums up-front and use the "cached" values for our comparison.

// Here's the data; decoded as an array:

$json = json_decode('[
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":3,"price":2099}],
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":4,"price":999}],
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":5,"price":899}],
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":6,"price":2999}]
    ]', true);
    
// Calculate the sums for all prices per row up-front.
// Map array into sums: Get the sum for each row's "price" columns

$sums = array_map(fn($v) => array_sum(array_column($v, 'price')), $json);

// Use $sums in our key-based sorter for the comparison values:

uksort($json, function($b, $a) use ($sums) {
    return $sums[$a] <=> $sums[$b];
});

// See the sums, get the sorted data:

var_dump($sums, $json);

Here we use uksort instead of usort, since we only need to know the keys of the array members being sorted. Our "comparison cache", the $sums array, has keys matching the target array and is passed into the sorting function with use(). Inside the function we simply compare the matching values $sums[$a] and $sums[$b] without repeating the sum calculations. Demo: https://3v4l.org/sNluJ#v8.1.3

In this case it'd take a large dataset to make a significant difference. If there were more expensive iterations (say, multiple "heavy" function calls) required to arrive at the value to compare, the "up-front and only once" evaluation would save a lot of unnecessary computing cycles.
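
As a rough illustration of that point, here is a minimal sketch of the same "compute once, then sort" pattern, where metric() is a hypothetical stand-in for a heavier calculation (it is not part of either answer above):

<?php

// metric() is a hypothetical placeholder for a more expensive calculation.
function metric(array $combination): int {
    // imagine several costly calls here instead of a simple sum
    return array_sum(array_column($combination, 'price'));
}

$json = json_decode('[
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":3,"price":2099}],
    [{"id":1,"price":11900},{"id":2,"price":499},{"id":6,"price":2999}]
]', true);

// Each combination is evaluated exactly once, not on every comparison.
$scores = array_map('metric', $json);

// The ($b, $a) parameter order gives a descending sort by the cached score.
uksort($json, fn($b, $a) => $scores[$a] <=> $scores[$b]);

var_dump($scores, $json);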

For returning the final top-40% result the OP wants, please refer to the accepted answer.
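
For completeness, a minimal sketch of that last step applied to this answer's result (assuming the sorted $json from the code above; ceil() is just one of the possible rounding choices):

// Keep only the highest-priced 40% of the sorted combinations.
$breakpoint = (int) ceil(count($json) * 0.4);
$top_chunk  = array_slice($json, 0, $breakpoint);

var_dump($top_chunk);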
