I have a datframe with the following values
Call Data
1 [{'b_id': '31358658', 'incentive': 0}, {'b_id': 'D8384E90', 'incentive': 0}, {'b_id': '681B405A','incentive': 100}]
2 [{'b_id': 'D8384E90','incentive': 0 }, {'b_id': '31358658', 'incentive': 0}, {'b_id': '681B405A', 'incentive': 120}]
3 [{'b_id': '971C0B58','incentive': 0,}]
4 [{'b_id': '00450AAA','incentive': 0}, {'b_id': '0BCAEC4F','incentive': 0}, {'b_id': 'F2AD1313''incentive': 220},{'b_id': '971C0B58', 'incentive': 0}]
Ideally I would like the output in this format
Call B_id incentive
1 [31358658,D8384E90,681B405A] [0,0,100]
2 [D8384E90,31358658,681B405A] [0,0,120]
3 [971C0B58] [0]
4 [00450AAA,0BCAEC4F,F2AD1313,971C0B58] [0,0,220,0]
The length of the data column can wary So far I have tried
df1 = df1.join(df1['Data'].str.split('b_id',expand=True).add_prefix('data'))
is there a way to search for each b_id in the sting and then take the value followed by the ":" and then add it to the list
#sample data code
Call = [1,2,3,4,5,6,7,8,9]
Data= [
[{'b_id': '31358658', 'incentive': 0}, {'b_id': 'D8384E90', 'incentive': 0}, {'b_id': '681B405A','incentive': 100}],
[{'b_id': 'D8384E90','incentive': 0 }, {'b_id': '31358658', 'incentive': 0}, {'b_id': '681B405A', 'incentive': 120}],
[{'b_id': '971C0B58','incentive': 0}],
[{'b_id': '00450AAA','incentive': 0}, {'b_id': '0BCAEC4F','incentive': 0}, {'b_id': 'F2AD1313','incentive': 220},{'b_id': '971C0B58', 'incentive': 0}],
[{'b_id': '90591CC5','incentive': 0}, {'b_id': '31358658','incentive': 0,}],
[{'b_id': '20E32751', 'incentive': 0}, {'b_id': '339A574F','incentive': 0}],
[{'b_id': '971C0B58','incentive': 0}],
[],
]
df = pd.DataFrame(list(zip(Call,Data)), columns =['Call','Data'])
All help appreciated
CodePudding user response:
you can use a loop in lambda function.:
import ast
df['Data']=df['Data'].apply(ast.literal_eval)
df['B_id']=df['Data'].apply(lambda x: [i['b_id'] for i in x])
df['incentive']=df['Data'].apply(lambda x: [i['incentive'] for i in x])
print(df.head(1)
'''
Call Data B_id incentive
0 1 [......] ['31358658', 'D8384E90', '681B405A'] [0, 0, 100]
....
'''