Home > Blockchain >  How to iterate over json output and get the desirable result in dataframe?
How to iterate over json output and get the desirable result in dataframe?

Time:10-20

I am connecting to API & trying to fetch the data. Data extracted in JSON format but I want to convert it into dataframe with certain records. Could you please help me?

Example:

queue_id = 'a72dba75-0bc6-4a65-b120-8803364f8dc3'

queryResult: {'results': [{'data': [{'interval': '2021-10-11T11:46:25.000Z/2021-10-12T11:46:25.000Z',
                        'metrics': [{'metric': 'nOffered',
                                     'qualifier': None,
                                     'stats': {'count': 1,
                                               'count_negative': None,
                                               'count_positive': None,
                                               'current': None,
                                               'denominator': None,
                                               'max': None,
                                               'min': None,
                                               'numerator': None,
                                               'ratio': None,
                                               'sum': None,
                                               'target': None}}],
                        'views': None},
                       {'interval': '2021-10-13T11:46:25.000Z/2021-10-14T11:46:25.000Z',
                        'metrics': [{'metric': 'nOffered',
                                     'qualifier': None,
                                     'stats': {'count': 2,
                                               'count_negative': None,
                                               'count_positive': None,
                                               'current': None,
                                               'denominator': None,
                                               'max': None,
                                               'min': None,
                                               'numerator': None,
                                               'ratio': None,
                                               'sum': None,
                                               'target': None}}],
                        'views': None},
                       {'interval': '2021-10-14T11:46:25.000Z/2021-10-15T11:46:25.000Z',
                        'metrics': [{'metric': 'nOffered',
                                     'qualifier': None,
                                     'stats': {'count': 3,
                                               'count_negative': None,
                                               'count_positive': None,
                                               'current': None,
                                               'denominator': None,
                                               'max': None,
                                               'min': None,
                                               'numerator': None,
                                               'ratio': None,
                                               'sum': None,
                                               'target': None}}],
                        'views': None},
                       {'interval': '2021-10-15T11:46:25.000Z/2021-10-16T11:46:25.000Z',
                        'metrics': [{'metric': 'nOffered',
                                     'qualifier': None,
                                     'stats': {'count': 1,
                                               'count_negative': None,
                                               'count_positive': None,
                                               'current': None,
                                               'denominator': None,
                                               'max': None,
                                               'min': None,
                                               'numerator': None,
                                               'ratio': None,
                                               'sum': None,
                                               'target': None}}],
                        'views': None}],
              'group': {'mediaType': 'voice',
                        'queueId': '73643cff-799b-41ae-9a67-efcf5e593155'}}]}

I want result in below format in dataframe-

                queue_id            nOffered_count  nOffered_sum       interval_start   interval_end                       
0  a72dba75-0bc6-4a65-b120-8803364f8dc3   6         None        2021-10-11T11:46:25.000Z  2021-10-12T11:46:25.000Z
1  a72dba75-0bc6-4a65-b120-8803364f8dc3   1         None        2021-10-12T11:46:25.000Z  2021-10-13T11:46:25.000Z
2  a72dba75-0bc6-4a65-b120-8803364f8dc3  12         None        2021-10-13T11:46:25.000Z  2021-10-14T11:46:25.000Z
3  a72dba75-0bc6-4a65-b120-8803364f8dc3   6         None        2021-10-14T11:46:25.000Z  2021-10-15T11:46:25.000Z
4  a72dba75-0bc6-4a65-b120-8803364f8dc3   6         None        2021-10-15T11:46:25.000Z  2021-10-16T11:46:25.000Z

nOffered_sum & nOffered_count column name is coming like - metric_sum & metric_count

I tried below code but not giving proper result -

        column_names = []
        final_data_lst = []
        
        if(query_result.results != None):
            for item in query_result.results:
                data_lst = []
                for lst_data in item.data:
                    print("####################################")
                    print(lst_data)
                    print("####################################")
                    for met in lst_data.metrics:
                        metric_name = met.metric
                        column_names.append('Queue_Id')
                        column_names.append(metric_name '_count')
                        column_names.append(metric_name '_sum')
                        column_names.append('Interval Start')
                        column_names.append('Interval End')
                        data_lst.append(queue_id)
                        data_lst.append(met.stats.count)
                        data_lst.append(met.stats.sum)
                        data_lst.append(lst_data.interval.split('/')[0])
                        data_lst.append(lst_data.interval.split('/')[1])

CodePudding user response:

One of the approaches:

import pandas
from collections import defaultdict
queue_id = 'a72dba75-0bc6-4a65-b120-8803364f8dc3'
out = defaultdict(list)
for data in queryResult['results']:
    for d in data['data']:
        for k,v in d.items():
            out['queue_id'].append(queue_id)
            
            for metric in d['metrics']:
                out[metric['metric'] "_count"].append(metric['stats']['count'])
                out[metric['metric'] "_sum"].append(metric['stats']['sum'])
            interval = d['interval'].split('/')
            out['interval_start'].append(interval[0])
            out['interval_end'].append(interval[1])
df = pandas.DataFrame(out)
print (df)

CodePudding user response:

There were few mistakes in your code to fetch data from JSON but I solved it now you can Try the Below code and let me know if any help is needed.

import pandas as pd
queue_id_ls = []
nOffered_count = []
nOffered_sum=[]
interval_start=[]
interval_end=[]
global df1,matric_name

queue_id="qxsdweqe78784"
if(query_result["results"] != None):
  matric_name=query_result["results"][0]["data"][0]["metrics"][0]["metric"]
  df1 = pd.DataFrame({"queue_id":[],
                    matric_name "_count":[],
                    matric_name "_sum":[],
                    "interval_start":[],
                    "interval_end":[]})
  for item in query_result["results"]:
    data_lst = []
    for lst_data in item["data"]:
      for met in lst_data["metrics"]:
        metric_name = met["metric"]
        queue_id_ls.append(queue_id)
        nOffered_count.append(met["stats"]["count"])
        nOffered_sum.append(met["stats"]["sum"])
        interval_start.append(lst_data["interval"].split('/')[0])
        interval_end.append(lst_data["interval"].split('/')[1])
df1["queue_id"]=queue_id_ls
df1[matric_name "_count"]=nOffered_count
df1[matric_name "sum"]=nOffered_sum
df1["interval_start"]=interval_start
df1["interval_end"]=interval_end

print(df1)
  • Related