I am trying to extract all of the 'uid'
values into a simple python list for each row of the 'data'
column and have been unsuccessful.
Here is the code so far:
import json
df = pd.DataFrame({
'bank_account': [101, 102, 201, 301],
'data': [
'[{"uid": 100, "account_type": 1, "account_data": {"currency": {"current": 1000, "minimum": -500}, "fees": {"monthly": 13.5}}, "user_name": "Alice"},{"uid": 150, "account_type": 1, "account_data": {"currency": {"current": 1000, "minimum": -500}, "fees": {"monthly": 13.5}}, "user_name": "jer"}]',
'[{"uid": 100, "account_type": 2, "account_data": {"currency": {"current": 2000, "minimum": 0}, "fees": {"monthly": 0}}, "user_name": "Alice"}]',
'[{"uid": 200, "account_type": 1, "account_data": {"currency": {"current": 3000, "minimum": 0}, "fees": {"monthly": 13.5}}, "user_name": "Bob"}]',
'[{"uid": 300, "account_type": 1, "account_data": {"currency": {"current": 4000, "minimum": 0}, "fees": {"monthly": 13.5}}, "user_name": "Carol"}]'
]},
index = ['Alice', 'Alice', 'Bob', 'Carol']
)
# df["data"] = df["data"].apply(lambda x: pd.read_json(x, lines=True)["uid"][0])
df["data"] = [[json.loads(d)['uid'] for d in li] for li in df['data']]
print(df)
CodePudding user response:
You could iterate over the list created by json.loads
and get "uid"
values:
import json
df['data'] = [[item['uid'] for item in json.loads(d)] for d in df['data']]
Output:
bank_account data
Alice 101 [100, 150]
Alice 102 [100]
Bob 201 [200]
Carol 301 [300]
You could also explode
"data":
df = df.explode('data')
Output:
bank_account data
Alice 101 100
Alice 101 150
Alice 102 100
Bob 201 200
Carol 301 300
CodePudding user response:
Modified apply a bit as you're using apply
import pandas as pd
df = pd.DataFrame(
{
"bank_account": [101, 102, 201, 301],
"data": [
'[{"uid": 100, "account_type": 1, "account_data": {"currency": {"current": 1000, "minimum": -500}, "fees": {"monthly": 13.5}}, "user_name": "Alice"},{"uid": 150, "account_type": 1, "account_data": {"currency": {"current": 1000, "minimum": -500}, "fees": {"monthly": 13.5}}, "user_name": "jer"}]',
'[{"uid": 100, "account_type": 2, "account_data": {"currency": {"current": 2000, "minimum": 0}, "fees": {"monthly": 0}}, "user_name": "Alice"}]',
'[{"uid": 200, "account_type": 1, "account_data": {"currency": {"current": 3000, "minimum": 0}, "fees": {"monthly": 13.5}}, "user_name": "Bob"}]',
'[{"uid": 300, "account_type": 1, "account_data": {"currency": {"current": 4000, "minimum": 0}, "fees": {"monthly": 13.5}}, "user_name": "Carol"}]',
],
},
index=["Alice", "Alice", "Bob", "Carol"],
)
df["uid"] = df["data"].apply(lambda x: eval(x)[0]["uid"])
print(df)
Then for list you can do print(list(df["uid"]))