Group rows of data by subarray in column and create subsets with variable depth

Time:11-09

I have an array of rows, each holding two items that are themselves arrays: product and countries.

There are cases in which the countries array is the same for more than one product, like basic and pro in the example below.


Given this array:

$array = [

    [
        'product'   => [
            'value' => 'basic',
            'label' => 'Basic'
        ],
        'countries'  => [
            'Japan', // these
            'Korea'  // two...
        ],
    ],

    [
        'product'   => [
            'value' => 'pro',
            'label' => 'Pro'
        ],
        'countries'  => [
            'Japan', // ...and these two
            'Korea'  // are identical...
        ],
    ],

    [
        'product'   => [
            'value' => 'expert',
            'label' => 'Expert'
        ],
        'countries'  => [
            'Japan',
            'France'
        ],
    ]

];

I would like to create new arrays grouped by countries. More precisely, this is the result I'm after:

$array = [

    [
        'product'   => [
            [
                'value' => 'basic',
                'label' => 'Basic'
            ],
            [
                'value' => 'pro',
                'label' => 'Pro'
            ]
        ],
        'countries'  => [
            'Japan', // ...so they are now one single array
            'Korea'  // as the two products 'basic' and 'pro' have been grouped
        ],
    ],

    [
        'product'   => [
            'value' => 'expert',
            'label' => 'Expert'
        ],
        'countries'  => [
            'Japan',
            'France'
        ],
    ]

];

As you can see in the second snippet, what I'm trying to do is to group basic and pro together in the same array, since they both share the exact same countries (Korea and Japan).

I've been trying for days to play around with this code, but it only seems to work if product and countries are strings rather than arrays:

$grouped = array();
foreach ($array as $element) {
    $grouped[$element['countries']][] = $element;
}
var_dump($grouped);

CodePudding user response:

This might be what you want:

$productsByCountrySet = [];
foreach ($array as $product) {
    $countries = $product['countries'];
    sort($countries);
    $countrySet = implode('/', $countries);
    if (isset($productsByCountrySet[$countrySet])) {
        $productsByCountrySet[$countrySet]['product'][] = $product['product'];
    } else {
        $productsByCountrySet[$countrySet] = [
            'product' => [$product['product']],
            'countries' => $countries,
        ];
    }
}
$products = [];
foreach ($productsByCountrySet as $p) {
    if (count($p['product']) == 1) {
        $p['product'] = $p['product'][0];
    }
    $products[] = $p;
}
print_r($products);

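It produces the output you're aiming for, except that each countries list comes out sorted (so ['Japan', 'France'] becomes ['France', 'Japan']). It assumes that the order of countries is not significant (i.e. ['Japan', 'Korea'] is the same as ['Korea', 'Japan']).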

It works by turning your countries array into a string (['Japan', 'Korea'] becomes 'Japan/Korea') and using that string as a unique key for the entries. For each product it first assembles this key (I called it the 'country set') and then checks whether the key has already been seen. If it has, the product is appended to the existing entry; if not, a new entry is added to the output array.

The final section handles the case where there is only one product for a country set. A second loop catches this state and unwraps the single-item product list, so that such entries keep the flat product structure.
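
The key derivation is the core of this approach, so here it is in isolation. This is only a minimal sketch: the helper name countrySetKey() is hypothetical, and json_encode() is swapped in for implode('/') on the assumption that a country name could itself contain the delimiter.

```php
<?php

// Hypothetical helper: build an order-independent string key for a list of countries.
function countrySetKey(array $countries): string
{
    sort($countries);               // sorts the local copy; the caller's array is untouched
    return json_encode($countries); // assumption: JSON instead of implode('/') to avoid delimiter clashes
}

var_dump(countrySetKey(['Japan', 'Korea']) === countrySetKey(['Korea', 'Japan']));  // bool(true)
var_dump(countrySetKey(['Japan', 'Korea']) === countrySetKey(['Japan', 'France'])); // bool(false)
```

With implode('/'), the lists ['A/B'] and ['A', 'B'] would both produce 'A/B'; the JSON form keeps them apart. For plain country names either key works.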

CodePudding user response:

I personally would not build the result structure that you are seeking because it would make the array processing code more convoluted, but hey, it's your project.

You need to establish consistent, first-level string keys in your result array so that you can determine if a set of countries has been encountered before.

If the set has never been encountered, save the full row data to the group.

If it is encountered for the second time specifically, you need to restructure the group's data: the stored flat product and the new product are wrapped together in a list (this is the elseif() logic).

If it is encountered more than twice, you can safely push the product's row data as a new child of the deeper structure.

Code:

$result = [];
foreach ($array as $row) {
    sort($row['countries']);
    $compositeKey = implode('_', $row['countries']);
    if (!isset($result[$compositeKey])) {
        $result[$compositeKey] = $row;
    } elseif (isset($result[$compositeKey]['product']['value'])) {
        $result[$compositeKey]['product'] = [
            $result[$compositeKey]['product'],
            $row['product']
        ];
    } else {
        $result[$compositeKey]['product'][] = $row['product'];
    }
}
echo json_encode(array_values($result), JSON_PRETTY_PRINT);

This general approach is efficient and direct because it only makes one pass over the array of data.
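
For comparison, the uniform structure preferred at the top of this answer could be sketched as below. This is an illustration rather than part of the requested output: the function name groupByCountries() is hypothetical, every group stores its products as a list even when there is only one, and the ??= operator assumes PHP 7.4 or later.

```php
<?php

// Hypothetical helper: group rows by country set, always storing products as a list.
function groupByCountries(array $rows): array
{
    $result = [];
    foreach ($rows as $row) {
        sort($row['countries']);
        $compositeKey = implode('_', $row['countries']);
        // Create the group the first time this country set is seen...
        $result[$compositeKey] ??= ['product' => [], 'countries' => $row['countries']];
        // ...then append unconditionally; no special case for the second occurrence.
        $result[$compositeKey]['product'][] = $row['product'];
    }
    return array_values($result);
}

$rows = [
    ['product' => ['value' => 'basic', 'label' => 'Basic'], 'countries' => ['Japan', 'Korea']],
    ['product' => ['value' => 'pro', 'label' => 'Pro'], 'countries' => ['Korea', 'Japan']],
    ['product' => ['value' => 'expert', 'label' => 'Expert'], 'countries' => ['Japan', 'France']],
];
$groups = groupByCountries($rows);
echo count($groups), "\n";               // 2
echo count($groups[0]['product']), "\n"; // 2
```

Because 'product' always has the same depth, code that consumes the result can loop over it without first checking whether it holds one product or several.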

See my related answer: Group array row on one column and form subarrays of varying depth/structure

CodePudding user response:

<?php

$array = [

    [
        'product'   => [
            'value' => 'basic',
            'label' => 'Basic'
        ],
        'countries'  => [
            'Japan', // these
            'Korea'  // two...
        ],
    ],

    [
        'product'   => [
            'value' => 'pro',
            'label' => 'Pro'
        ],
        'countries'  => [
            'Japan', // ...and these two
            'Korea'  // are identical...
        ],
    ],

    [
        'product'   => [
            'value' => 'expert',
            'label' => 'Expert'
        ],
        'countries'  => [
            'Japan',
            'France'
        ],
    ]

];

// print(serialize($array));

$newarr = [];

//Sort the countries and product of each row so that rows can be compared, and collect them in a new array
foreach ($array as $key) {
    sort($key['countries']);
    sort($key['product']); //Note: sort() discards the 'value' and 'label' keys
    $newarr[] = $key;
}

$result = [];
foreach($newarr as $key => $value) {

    //Generating a unique key for each array so that it can be compared
    $ckey = md5(serialize($value['countries']));
    $pkey = md5(serialize($value['product']));

    //The unique countries key identifies the group, which will contain the product & countries
    //The product key is used to avoid redundant entries in the product array
    $result[$ckey]['product'][$pkey] = $value['product'];

    $result[$ckey]['countries'] = $value['countries'];

    //This loop compares the other rows and groups them together
    //Note: a match is stored under the current row's $pkey, so it overwrites the entry written above
    foreach($newarr as $key2 => $value2) {
        if($key != $key2 && $value['countries'] == $value2['countries']) {
            $result[$ckey]['product'][$pkey] = $value2['product'];
        }
    }
}


print_r($result);

And the output is:

Array
(
    [00a9d5d0be04135916148f84706a2073] => Array
        (
            [product] => Array
                (
                    [1c24c036cffc896aebf291da101ff88d] => Array
                        (
                            [0] => Pro
                            [1] => pro
                        )

                    [712ef34513bad5c6dd490337c22a5807] => Array
                        (
                            [0] => Basic
                            [1] => basic
                        )

                )

            [countries] => Array
                (
                    [0] => Japan
                    [1] => Korea
                )

        )

    [ae57f65be4cd65148d6f4ed3def12c8f] => Array
        (
            [product] => Array
                (
                    [be5b95a64169e073ed0b6a72dfb79a83] => Array
                        (
                            [0] => Expert
                            [1] => expert
                        )

                )

            [countries] => Array
                (
                    [0] => France
                    [1] => Japan
                )

        )

)

This approach is a little hacky and not the fastest solution, but it works for the sample data. It uses unique hash keys rather than numeric indexes, which avoids more complex array operations. Note that, as the output shows, sort() replaces the 'value' and 'label' keys of each product with numeric ones.

Please read the code comments; they explain how it works.
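
The same hashing idea can also be expressed in a single pass. The sketch below is a hedged variant, not the code above: the function name groupByHashedCountries() is hypothetical, the inner comparison loop is dropped on the reasoning that rows with equal sorted countries already share the same $ckey, and ksort() is used on the product on the assumption that the 'value' and 'label' keys should survive.

```php
<?php

// Hypothetical single-pass variant of the hashing idea.
function groupByHashedCountries(array $rows): array
{
    $result = [];
    foreach ($rows as $row) {
        sort($row['countries']); // normalise order so equal sets hash equally
        ksort($row['product']);  // normalise key order; ksort() keeps 'value' and 'label'
        $ckey = md5(serialize($row['countries']));
        $pkey = md5(serialize($row['product']));
        // Equal country sets share $ckey and equal products share $pkey,
        // so groups form and duplicates collapse without a second loop.
        $result[$ckey]['product'][$pkey] = $row['product'];
        $result[$ckey]['countries'] = $row['countries'];
    }
    return $result;
}

$rows = [
    ['product' => ['value' => 'basic', 'label' => 'Basic'], 'countries' => ['Japan', 'Korea']],
    ['product' => ['value' => 'pro', 'label' => 'Pro'], 'countries' => ['Korea', 'Japan']],
    ['product' => ['label' => 'Basic', 'value' => 'basic'], 'countries' => ['Japan', 'Korea']], // duplicate
    ['product' => ['value' => 'expert', 'label' => 'Expert'], 'countries' => ['Japan', 'France']],
];
$out = groupByHashedCountries($rows);
echo count($out), "\n"; // 2
```

The duplicate 'basic' row hashes to the same $pkey as the first one, so the Japan/Korea group ends up with two products rather than three.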

CodePudding user response:

My solution builds the result in a new array called $newArray, seeded with the first element of $array. I then loop through the remaining elements of $array (skipping the first one by its key) and, for each of them, loop through the elements of $newArray. If both of the element's countries are present in the countries of an element of $newArray, I just add its product array to that element. If no element of $newArray contains both countries, I add the full $array element to $newArray. It does give your required array for your input array.

I had to change the way the product array appears in $newArray which explains the second and third lines of code below.

The & in &$subNewArr means that $subNewArr is assigned by reference, so changes made to it inside the loop are applied to the corresponding element of $newArray itself rather than to a copy (see https://www.php.net/manual/en/language.types.array.php).

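$newArray = [$array[0]];
$newArray[0]['product'] = [];
$newArray[0]['product'][] = $array[0]['product'];

foreach($array as $key => $subArr){
  if($key > 0){
    foreach($newArray as &$subNewArr){
      // Note: this check assumes every row lists exactly two countries
      if(
        in_array($subArr['countries'][0], $subNewArr['countries']) &&
        in_array($subArr['countries'][1], $subNewArr['countries'])
      ){
        // A group added further down still holds a single flat product; wrap it in a list first
        if(isset($subNewArr['product']['value'])){
          $subNewArr['product'] = [$subNewArr['product']];
        }
        $subNewArr['product'][] = $subArr['product'];
        continue 2; // matched: move on to the next element of $array
      }
    }
    $newArray[] = $subArr;
  }
}
unset($subNewArr); // break the reference left over from the inner foreach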
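
The in_array() check above reads only the first two countries of each row and is really a subset test: a row listing Japan and Korea would also match a group whose countries are Japan, Korea and France. If the rows can hold any number of countries, the comparison could be replaced by something like the sketch below. The helper name sameCountries() is hypothetical, and treating the order of countries as irrelevant is an assumption.

```php
<?php

// Hypothetical helper: true when both lists hold the same countries, in any order.
function sameCountries(array $a, array $b): bool
{
    sort($a); // sort() reindexes from 0, so the sorted copies can be compared strictly
    sort($b);
    return $a === $b;
}

var_dump(sameCountries(['Japan', 'Korea'], ['Korea', 'Japan']));           // bool(true)
var_dump(sameCountries(['Japan', 'Korea'], ['Japan', 'Korea', 'France'])); // bool(false)
var_dump(sameCountries(['Japan'], ['Japan', 'Korea']));                    // bool(false)
```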