Creating a new column by extracting values from dict in Pandas

I have a data structure like this:

name	targets	imp
Bob	{'codes':[3,4,6,199], 'region':'us', 'meta':''}	200
Diana	{'codes':[3,33,199], 'region':'us', 'meta':''}	100

I want to add one more column containing the codes extracted from targets, like this:

name	targets	imp	targets.code
Bob	{'codes':[3,4,6,199], 'region':'us', 'meta':''}	200	[3,4,6,199]
Diana	{'codes':[3,33,199], 'region':'us', 'meta':''}	100	[3,33,199]

I tried this:

df['targets.code'] = df['targets'].apply(lambda x: x['codes'])

But that line fails with the following error:

[2022-01-14 19:53:33,660] {{taskinstance.py:1150}} ERROR - 'NoneType' object is not subscriptable

I really tried digging into a lot of posts but didn't find a solution. What am I doing wrong?

CodePudding user response:

It looks like your source data has empty values (None) in the "targets" column, because the following works:

import pandas as pd

df = pd.DataFrame([{'name': 'Bob', 'targets': {'codes':[3,4,6,199], 'region':'us', 'meta':''}}])
print(df)
#   name                                            targets
# 0  Bob  {'codes': [3, 4, 6, 199], 'region': 'us', 'met...

df['targets.code'] = df['targets'].apply(lambda x: x['codes'])
print(df)
#   name                                            targets    targets.code
# 0  Bob  {'codes': [3, 4, 6, 199], 'region': 'us', 'met...  [3, 4, 6, 199]
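
To check whether missing values really are the cause, something like the sketch below may help. It assumes the empty fields are None or NaN; the third row ("Test") is made up for illustration and is not from the original data. It lists the rows without a targets dict, then extracts the codes only where a dict is present:

```python
import pandas as pd

# Sample data from the question, plus one hypothetical row with no targets.
df = pd.DataFrame([
    {'name': 'Bob', 'targets': {'codes': [3, 4, 6, 199], 'region': 'us', 'meta': ''}, 'imp': 200},
    {'name': 'Diana', 'targets': {'codes': [3, 33, 199], 'region': 'us', 'meta': ''}, 'imp': 100},
    {'name': 'Test', 'targets': None, 'imp': 50},
])

# Rows whose targets value is missing; these are the ones that
# make x['codes'] fail with "'NoneType' object is not subscriptable".
missing = df[df['targets'].isna()]
print(missing['name'].tolist())

# Guarded extraction: subscript only values that are actually dicts,
# and leave None for everything else.
df['targets.code'] = df['targets'].apply(
    lambda x: x['codes'] if isinstance(x, dict) else None
)
print(df[['name', 'targets.code']])
```

If the first print shows any names, those rows explain the error, and you can decide whether to drop them, fill them, or keep the guard.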

CodePudding user response:

You can use pd.json_normalize:

df['targets.code'] = pd.json_normalize(df['targets'])['codes']
print(df)

# Output
    name                                            targets  imp    targets.code
0    Bob  {'codes': [3, 4, 6, 199], 'region': 'us', 'met...  200  [3, 4, 6, 199]
1  Diana  {'codes': [3, 33, 199], 'region': 'us', 'meta'...  100    [3, 33, 199]

You can also use a comprehension:

df['targets.code'] = [x['codes'] if x else [] for x in df['targets']]
print(df)

# Output
    name                                            targets  imp    targets.code
0    Bob  {'codes': [3, 4, 6, 199], 'region': 'us', 'met...  200  [3, 4, 6, 199]
1  Diana  {'codes': [3, 33, 199], 'region': 'us', 'meta'...  100    [3, 33, 199]
2   Test                                               None   50              []
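
One caveat with the `if x else []` test: it only skips falsy values such as None. If a missing entry arrives as NaN instead (a float, which is truthy), `x['codes']` still raises a TypeError. A slightly more defensive variant, sketched here with a hypothetical helper `extract_codes` and made-up extra rows, checks for a dict explicitly and also tolerates a dict without a 'codes' key:

```python
import numpy as np
import pandas as pd

def extract_codes(value):
    """Return the 'codes' list from a targets dict, or [] if unavailable."""
    if isinstance(value, dict):
        # .get avoids a KeyError when the dict has no 'codes' entry.
        return value.get('codes', [])
    # None, NaN, or any other non-dict value.
    return []

# Hypothetical rows covering each case: a dict, None, NaN, and a dict without 'codes'.
df = pd.DataFrame({
    'name': ['Bob', 'Test', 'NaNRow', 'NoCodes'],
    'targets': [
        {'codes': [3, 4, 6, 199], 'region': 'us', 'meta': ''},
        None,
        np.nan,
        {'region': 'us'},
    ],
})

df['targets.code'] = [extract_codes(x) for x in df['targets']]
print(df)
```

Whether an empty list is the right placeholder depends on what you do with the column afterwards; returning None instead would keep "no data" distinguishable from "no codes".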