And I want to generate other dataframe like this:
Where sku are the list of codes of products column, combined with "code" column (1, 2, 3, 4). I developed this code, but I want to do this in a short or better way:
skus_with_layoutSegments = pd.DataFrame(columns = ['sku','code'])
for i in range(len(segments_telynet)):
skus = pd.DataFrame(segments_telynet.loc[i,'products']).rename(columns={'code': 'sku'})
skus['code'] = int(segments_telynet.loc[i,'code'])
skus_with_layoutSegments = skus_with_layoutSegments.append(skus).reset_index(drop=True)
Can anyone help me with a better solution, I want to avoid doing a for loop. Thanks!
CodePudding user response:
Use df.explode
with Series.str.get
:
out = df.explode('products')
out['products'] = out['products'].str.get('code')
CodePudding user response:
In order to better illustrate this solution I reproduced part of your dataframe:
list1 = [{'Code':'ALBN100043'}, {'Code':'ALBN100044'}, {'Code':'ALBN100045'}, {'Code':'ALBN100046'}, {'Code':'ALBN100047'}, {'Code':'HOG900035'}, {'Code':'HOG900036'}, {'Code':'HOG900037'}]
list2 = [{'Code':'ALBN100043'}, {'Code':'ALBN100044'}, {'Code':'ALBN100045'}, {'Code':'ALBN100046'}, {'Code':'ALBN100047'}, {'Code':'HOG900035'}, {'Code':'HOG900036'}, {'Code':'HOG900037'}]
list3 = [{'Code':'3462456'}, {'Code':'23453'}, {'Code':'74573475'}, {'Code':'34535'}, {'Code':'2452'}, {'Code':'174564'}, {'Code':'865785'}, {'Code':'34535'}, {'Code':'3543743'},]
df = pd.DataFrame({'code' :[1,2,3],
'dateIni' :['20200101','20200101','20200101'],
'dateEnd' :['99992131','99992131','99992131'],
'products':[list1,list2,list3]})
df
code dateIni dateEnd products
0 1 20200101 99992131 [{'Code': 'ALBN100043'}, {'Code': 'ALBN100044'...
1 2 20200101 99992131 [{'Code': 'ALBN100043'}, {'Code': 'ALBN100044'...
2 3 20200101 99992131 [{'Code': '3462456'}, {'Code': '23453'}, {'Cod...
The first part is just like @Mayank's answer:
out = df.explode('products')
Output:
out
code dateIni dateEnd products
0 1 20200101 99992131 {'Code': 'ALBN100043'}
0 1 20200101 99992131 {'Code': 'ALBN100044'}
0 1 20200101 99992131 {'Code': 'ALBN100045'}
0 1 20200101 99992131 {'Code': 'ALBN100046'}
0 1 20200101 99992131 {'Code': 'ALBN100047'}
0 1 20200101 99992131 {'Code': 'HOG900035'}
0 1 20200101 99992131 {'Code': 'HOG900036'}
0 1 20200101 99992131 {'Code': 'HOG900037'}
1 2 20200101 99992131 {'Code': 'ALBN100043'}
1 2 20200101 99992131 {'Code': 'ALBN100044'}
1 2 20200101 99992131 {'Code': 'ALBN100045'}
1 2 20200101 99992131 {'Code': 'ALBN100046'}
1 2 20200101 99992131 {'Code': 'ALBN100047'}
1 2 20200101 99992131 {'Code': 'HOG900035'}
1 2 20200101 99992131 {'Code': 'HOG900036'}
1 2 20200101 99992131 {'Code': 'HOG900037'}
2 3 20200101 99992131 {'Code': '3462456'}
2 3 20200101 99992131 {'Code': '23453'}
2 3 20200101 99992131 {'Code': '74573475'}
2 3 20200101 99992131 {'Code': '34535'}
2 3 20200101 99992131 {'Code': '2452'}
2 3 20200101 99992131 {'Code': '174564'}
2 3 20200101 99992131 {'Code': '865785'}
2 3 20200101 99992131 {'Code': '34535'}
2 3 20200101 99992131 {'Code': '3543743'}
Now we differ:
out['products'] = out['products'].apply(lambda x: x.get('Code'))
Output:
code dateIni dateEnd products
0 1 20200101 99992131 ALBN100043
0 1 20200101 99992131 ALBN100044
0 1 20200101 99992131 ALBN100045
0 1 20200101 99992131 ALBN100046
0 1 20200101 99992131 ALBN100047
0 1 20200101 99992131 HOG900035
0 1 20200101 99992131 HOG900036
0 1 20200101 99992131 HOG900037
1 2 20200101 99992131 ALBN100043
1 2 20200101 99992131 ALBN100044
1 2 20200101 99992131 ALBN100045
1 2 20200101 99992131 ALBN100046
1 2 20200101 99992131 ALBN100047
1 2 20200101 99992131 HOG900035
1 2 20200101 99992131 HOG900036
1 2 20200101 99992131 HOG900037
2 3 20200101 99992131 3462456
2 3 20200101 99992131 23453
2 3 20200101 99992131 74573475
2 3 20200101 99992131 34535
2 3 20200101 99992131 2452
2 3 20200101 99992131 174564
2 3 20200101 99992131 865785
2 3 20200101 99992131 34535
2 3 20200101 99992131 3543743