Home > Net >  How extract value from a JSON value stored as a list object in a column?
How extract value from a JSON value stored as a list object in a column?

Time:09-13

Dataframe for example:

data =[['a056cf7d3aeadfc7c4d5d6e37f424554', 11089528586286, [{'rate': 0.0625, 'price': '0.52', 'title': 'IL STATE TAX', 'price_set': {'shop_money': {'amount': '0.52', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.52', 'currency_code': 'USD'}}, 'channel_liable': False}, {'rate': 0.0, 'price': '0.00', 'title': 'IL COUNTY TAX', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'USD'}}, 'channel_liable': False}, {'rate': 0.0, 'price': '0.00', 'title': 'IL CITY TAX', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'USD'}}, 'channel_liable': False}, {'rate': 0.0, 'price': '0.00', 'title': 'IL SPECIAL TAX', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'USD'}}, 'channel_liable': False}]], ['a056cf7d3aeadfc7c4d5d6e37f424554', 11089528651822, [{'rate': 0.0625, 'price': '0.75', 'title': 'IL STATE TAX', 'price_set': {'shop_money': {'amount': '0.75', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.75', 'currency_code': 'USD'}}, 'channel_liable': False}, {'rate': 0.0, 'price': '0.00', 'title': 'IL COUNTY TAX', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'USD'}}, 'channel_liable': False}, {'rate': 0.0, 'price': '0.00', 'title': 'IL CITY TAX', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'USD'}}, 'channel_liable': False}, {'rate': 0.0, 'price': '0.00', 'title': 'IL SPECIAL TAX', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'USD'}}, 'channel_liable': False}]], ['a056cf7d3aeadfc7c4d5d6e37f424554', 11089528717358, [{'rate': 0.0625, 'price': '0.48', 'title': 'IL STATE TAX', 'price_set': {'shop_money': {'amount': '0.48', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.48', 'currency_code': 'USD'}}, 'channel_liable': False}, {'rate': 0.0, 'price': '0.00', 'title': 'IL COUNTY TAX', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'USD'}}, 'channel_liable': False}, {'rate': 0.0, 'price': '0.00', 'title': 'IL CITY TAX', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'USD'}}, 'channel_liable': False}, {'rate': 0.0, 'price': '0.00', 'title': 'IL SPECIAL TAX', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'USD'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'USD'}}, 'channel_liable': False}]], ['a8911732ef329fccffd45fd4ff945b36', 11805558440126, [{'rate': 0.2, 'price': '35.42', 'title': 'AT VAT', 'price_set': {'shop_money': {'amount': '35.42', 'currency_code': 'EUR'}, 'presentment_money': {'amount': '35.42', 'currency_code': 'EUR'}}, 'channel_liable': False}]], ['5506eecfc2ffa7f2b1832e1ccab121fa', 12019803881631, [{'rate': 0.2, 'price': '0.00', 'title': 'GB VAT', 'price_set': {'shop_money': {'amount': '0.00', 'currency_code': 'GBP'}, 'presentment_money': {'amount': '0.00', 'currency_code': 'GBP'}}, 'channel_liable': False}]]]
headers = [['token','line','data']]
df = pd.DataFrame(data, columns =headers)
df

Output: enter image description here

The "data" column stored as json in list. How it is possible to extract price for "data" column for each "token" and "line" ? When I try :

s = df.explode('data', ignore_index=True)
dfn=s.join(pd.DataFrame([*s.pop('data')], index=s.index))

Get error : TypeError: explode() missing 1 required positional argument: 'column'

Also can not to use json.loads because of this list and not string...

By this example :Pandas expand json field across records I get error: TypeError: the JSON object must be str, bytes or bytearray, not Series

Any ideas?

CodePudding user response:

Your data is not JSON if its type is a list and not str. The following code will produce a series of lists containing the prices.

df['data'].apply(lambda l: [item['price'] for item in l])

CodePudding user response:

hope this help iterate through each element of each element

for d in data:
    i = 0
    for r in d:
        if i == 2:
            for m in r:
                print(m['price'])
        i = i   1

if i use the i counter it's to avoid looping on the int elements token and line only continue to go further deep with the data list

CodePudding user response:

I am sure there is a better way, but since your object is regularly nested, I would simply parse each entry and build a list of non-nested dictionaries you can later use to build the DataFrame.

I am assuming you want to port all the data into a single DataFrame.

new_data = []
for entry in data:
    d = {'label':[], 'line':[], 'rate':[], 'price':[], 'title':[],'shop_money_amount':[], 'shop_money_currency_code':[], 'presentment_money_amount':[], 'presentment_money_currency_code':[], 'channel_liable':[]}
    for item in entry[2]: #list of dicts
        d['label'].append(entry[0])
        d['line'].append(entry[1])
        d['rate'].append(item['rate'])
        d['price'].append(item['price'])
        d['title'].append(item['title'])
        d['shop_money_amount'].append(item['price_set']['shop_money']['amount'])
        d['shop_money_currency_code'].append(item['price_set']['shop_money']['currency_code'])
        d['presentment_money_amount'].append(item['price_set']['presentment_money']['amount'])
        d['presentment_money_currency_code'].append(item['price_set']['presentment_money']['currency_code'])
        d['channel_liable'].append(item['channel_liable'])
        new_data.append(d)
        
df = pd.concat([pd.DataFrame(d) for d in new_data])
  • Related