First of all, I have following data returned from database. I will have two different data as below respectively
sum1 | count1 | sm__state_name__ | om__order_date__year | om__order_date__quarter | om__order_date__month |
---|---|---|---|---|---|
5645000 | 4 | Luanda | 2017 | 3 | 8 |
213985939.8600001 | 1606 | Luanda | 2017 | 3 | 9 |
7729331.52 | 119 | Benguela | 2017 | 3 | 9 |
1012936 | 17 | Zaire | 2017 | 3 | 9 |
1054883 | 19 | Bie | 2017 | 3 | 9 |
2347944 | 26 | Cuando Cubango | 2017 | 3 | 9 |
428769.6000000001 | 60 | Bengo | 2017 | 3 | 9 |
6444569 | 86 | Huila | 2017 | 3 | 9 |
4914030 | 25 | Cunane | 2017 | 3 | 9 |
1167200 | 26 | Cuanza North | 2017 | 3 | 9 |
750080 | 10 | Cuanza Sul | 2017 | 3 | 9 |
2178100 | 6 | Huambo | 2017 | 3 | 9 |
1099934 | 25 | Lunda North | 2017 | 3 | 9 |
410135 | 12 | Malange | 2017 | 3 | 9 |
In array format
array (
0 =>
array (
'sum1' => '5645000',
'count1' => '4',
'sm__state_name__' => 'Luanda',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '8',
),
1 =>
array (
'sum1' => '213985939.8600001',
'count1' => '1606',
'sm__state_name__' => 'Luanda',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
2 =>
array (
'sum1' => '352839.60000000003',
'count1' => '9',
'sm__state_name__' => NULL,
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
3 =>
array (
'sum1' => '7729331.52',
'count1' => '119',
'sm__state_name__' => 'Benguela',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
4 =>
array (
'sum1' => '1012936',
'count1' => '17',
'sm__state_name__' => 'Zaire',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
5 =>
array (
'sum1' => '1054883',
'count1' => '19',
'sm__state_name__' => 'Bie',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
6 =>
array (
'sum1' => '2347944',
'count1' => '26',
'sm__state_name__' => 'Cuando Cubango',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
7 =>
array (
'sum1' => '428769.6000000001',
'count1' => '60',
'sm__state_name__' => 'Bengo',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
8 =>
array (
'sum1' => '6444569',
'count1' => '86',
'sm__state_name__' => 'Huila',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
9 =>
array (
'sum1' => '4914030',
'count1' => '25',
'sm__state_name__' => 'Cunane',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
10 =>
array (
'sum1' => '1167200',
'count1' => '26',
'sm__state_name__' => 'Cuanza North',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
11 =>
array (
'sum1' => '750080',
'count1' => '10',
'sm__state_name__' => 'Cuanza Sul',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
12 =>
array (
'sum1' => '2178100',
'count1' => '6',
'sm__state_name__' => 'Huambo',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
13 =>
array (
'sum1' => '1099934',
'count1' => '25',
'sm__state_name__' => 'Lunda North',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
14 =>
array (
'sum1' => '410135',
'count1' => '12',
'sm__state_name__' => 'Malange',
'om__order_date__year' => '2017',
'om__order_date__quarter' => '3',
'om__order_date__month' => '9',
),
)
Array
(
"sm__state_name__",
"om__order_date__year",
"om__order_date__quarter",
"om__order_date__month",
)
ABOBE ARRAY INCLUDES CAN BE ANY NUMBER OF FIELDS
Below is sample output of What I want in return
{
"data": [
{
"key": "Luanda",
"items": [
{
"key": 2017,
"items": [
{
"key": 3,
"items": [
{
"key": 8,
"items": null,
"count": 4,
"summary": [
438380.9935
]
},
{
"key": 9,
"items": null,
"count": 1606,
"summary": [
438380.9935
]
},
],
"summary": [
1285085.9636
]
}
],
"summary": [
1285085.9636
]
}
],
"summary": [
1285085.9636
]
},
{
"key": "Benguela",
"items": [
{
"key": 2017,
"items": [
{
"key": 3,
"items": [
{
"key": 9,
"items": null,
"count": 679,
"summary": [
4781987.8575
]
},
],
"summary": [
15017212.0305
]
}
],
"summary": [
15017212.0305
]
}
],
"summary": [
15017212.0305
]
},
{...},
{...},
{...},
],
"totalCount": 22854
}
Don't mind the summary value. I just put dummy values there.
Is this kind of process even possible? Because I think of many different things, recursion, multiple loops, triple loop but couldn't think of way this could work.
I know it's not an issue or bug. Sorry for that. But it would be great if someone could point me to right direction.
CodePudding user response:
Ok, so basically if I understand this correctly, you have a database with a bunch of records. Then you want to create some massive JSON based off the database that will have 4 dimensions based on the columns for the state, year, quarter, and month. Then the outer-most array will contain the records of the database grouped by the state, then the items for any specific state will further constrict into a set of items based on the key for the year, etc.
I'm thinking the most efficient way to do this is a single loop where you'd pull all the records in the database then loop through it in PHP and construct some new arrays based on the current record iteration which can be referred to later for look-up purposes.
Begin by setting:
$dataItems = [];
This would be used for storing and structuring data for easily look up and calculations.
Iterate through the list. The first item would be:
sum1 | count1 | state_name | order_year | order_quarter | order_month |
---|---|---|---|---|---|
5645000 | 4 | Luanda | 2017 | 3 | 8 |
Then you'd run code for the iteration that would look something like this to help populate the array or arrays you'd be constructing:
if (empty($dataItems['Luanda'])) {
$dataItems['Luanda'] = [];
}
if (empty($dataItems['Luanda'][2017])) {
$dataItems['Luanda'][2017] = [];
}
if (empty($dataItems['Luanda'][2017][3])) {
$dataItems['Luanda'][2017][3] = [];
}
if (empty($dataItems['Luanda'][2017][3][8])) {
$dataItems['Luanda'][2017][3][8] = [];
}
$dataItems['Luanda'][2017][3][8][] = ['sum1' => 5645000, 'count1' => 4];
The second item would be:
sum1 | count1 | state_name | order_year | order_quarter | order_month |
---|---|---|---|---|---|
213985939.8600001 | 1606 | Luanda | 2017 | 3 | 9 |
The PHP code for this iteration would look like this:
if (empty($dataItems['Luanda'])) {
$dataItems['Luanda'] = [];
}
if (empty($dataItems['Luanda'][2017])) {
$dataItems['Luanda'][2017] = [];
}
if (empty($dataItems['Luanda'][2017][3])) {
$dataItems['Luanda'][2017][3] = [];
}
if (empty($dataItems['Luanda'][2017][3][9])) {
$dataItems['Luanda'][2017][3][9] = [];
}
$dataItems['Luanda'][2017][3][9][] = ['sum1' => 213985939.8600001, 'count1' => 1606];
Etc.
Then you'd loop through the $dataItems
structure once it's built and do stylizing logic like having a key called "items", etc and create your desired output structure, then finally output in JSON format by using json_encode.
If you need something like the count for the entire year, in your initial loop, you can write to a separate array to help keep track of it, adding it together along the way, then refer to it later when creating an array with the desired output structure. Things like summary you can easily add into the loop and keep track of that along the way.
Sounds like some fun, but yeah you'd only need one loop to create a lookUp or multiple lookUp sort of arrays then a second loop to refer to the loopUp array(s) to get your data in the desired output format.
CodePudding user response:
With a variable array of columns you need to group in hierarchical order, you'll certainly want a recursive solution to this problem. For each step in your recursive calls, check to see if a particular grouping level exists yet, and if not, then initialize it. Group using associative arrays for your items, then convert to flat arrays after. It's very simple conceptually, although perhaps a little confusing to look at:
function aggregateData($data, $db_row, $columns, $first_column = true) {
// Base case: with no more columns left, we just take the sum and return.
if(empty($columns)) {
$data['summary'] = $db_row['sum1'];
return $data;
}
$column = array_shift($columns);
$value = $db_row[$column];
if($first_column) {
// First column is a special case. We don't add anything here because every level's summary is the sum of its nested items.
if(!array_key_exists($value, $data)) {
$data[$value] = [
'key'=>$value,
'items'=>empty($columns) ? null : [],
'summary'=>0
];
}
$data[$value] = aggregateData($data[$value], $db_row, $columns, false);
} else {
// For all other columns, we add the sum to each nested level.
if(!array_key_exists($value, $data['items'])) {
$data['items'][$value] = [
'key'=>$value,
'items'=>empty($columns) ? null : [],
'summary'=>0
];
}
$data['summary'] = $db_row['sum1'];
$data['items'][$value] = aggregateData($data['items'][$value], $db_row, $columns, false);
}
return $data;
}
function flattenData($data) {
foreach($data as $key=>$value) {
if(is_null($value['items'])) {
break;
}
$data[$key]['items'] = flattenData($value['items']);
}
return array_values($data);
}
$db_rows = /* your DB retrieval code here */;
$columns = /* columns to group by in hierarchical order */;
$data = [];
foreach($db_rows as $db_row) {
$data = aggregateData($data, $db_row, $columns);
}
$data = flattenData($data);
To help understand what's going on, consider the top-most level, grouping by state. After the aggregateData()
calls, before flattening the arrays, it will produce a structure that looks like the following:
{
"Luanda": {
"key": "Luanda",
"items": {...},
"summary": ...,
},
"Benguela": {
"key": "Benguela",
"items": {...},
"summary": ...,
}
}
Notice that because each entry is associated with its key in an object, instead of an index in an array, this allows for easy lookups so we can aggregate information at each level. After flattening, we instead get this:
[
{
"key": "Luanda",
"items": [...],
"summary": ...,
},
{
"key": "Benguela",
"items": [...],
"summary": ...,
}
]
Each entry is no longer associated with its key, instead being the desired flat array. We lose the ability to do simple lookups, but we no longer need that capability at the end of our calculations.
The above doesn't solve the entirety of your problem as there are points of data not being included in this result, but as stackoverflow is not a free coding service and you have not provided any of your own code, I will be leaving the necessary modifications as an exercise. This should, however, remove the bulk of the work required and serve as a strong starting point for your solution.