Please help with the below request:
need to clean up below df to df_1: 'SKU' has multiple required data, and this column needs to be exploded to multiple rows
df = pd.DataFrame([['1','[{"Result":"00028"},{"Result":"00063"}]','abj'],
['2','[{"Result":"00018"},{"Result":"00065"}]','abj'],
['3','[{"Result":"00021"},{"Result":"00053"}]','abj']],
columns = ['ID','SKU','NOTES'])
df_1 = pd.DataFrame([['1','00028','abj'],
['1','00063','abj'],
['2','00018','abj'],
['2','00065','abj'],
['3','00021','abj'],
['3','00053','abj']],
columns = ['ID','SKU','NOTES'])
CodePudding user response:
To convert the SKU values from str
to dictionary
, I used json
module as follows:
import pandas as pd
import json
df = pd.DataFrame([
['1', '[{"Result":"00028"},{"Result":"00063"}]', 'abj'],
['2', '[{"Result":"00018"},{"Result":"00065"}]', 'abj'],
['3', '[{"Result":"00021"},{"Result":"00053"}]', 'abj']],
columns=['ID','SKU','NOTES']
)
new_df = {
'ID': [],
'SKU': [],
'NOTES': []
}
for i, row in df.iterrows():
results = json.loads(row['SKU'])
for res in results:
new_df['ID'].append(row['ID'])
new_df['SKU'].append(res['Result'])
new_df['NOTES'].append(row['NOTES'])
df_1 = pd.DataFrame(new_df)
print(df_1)
# ID SKU NOTES
#0 1 00028 abj
#1 1 00063 abj
#2 2 00018 abj
#3 2 00065 abj
#4 3 00021 abj
#5 3 00053 abj
For another way of converting str
to dictionary
, you can see: https://www.geeksforgeeks.org/python-convert-string-dictionary-to-dictionary/
CodePudding user response:
If all records are following the same pattern, this should clean it for you:
Beware that the code below is modifying df.
import pandas as pd
import json
df.SKU = df.SKU.apply(lambda x : [ i['Result'] for i in json.loads(x)])
df.explode('SKU')