Convert a DataFrame column of comma-separated values to a string with a different format

Time:07-08

I have a pandas DataFrame column of strings:

parameters

param1,param2
param1,param2,param3
param1,param2,param3,param4

I want the column to be converted to strings like:

parameters

[{"param" : "param1"}, {"param" : "param2"}]
[{"param" : "param1"}, {"param" : "param2"},{"param" : "param3"}]
[{"param" : "param1"}, {"param" : "param2"},{"param" : "param3"}, {"param": "param4"}]

I could achieve this with a comprehension for a single sampled value, but I am not sure how to apply it to the entire Series:

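# Take one random row and flatten it to {column: value}
a = df2.sample().to_dict()
ss = {k: list(a[k].values())[0] for k in a}

# Split the comma-separated string into tokens.
# Note: [:-1] drops the last token, which only makes sense if the string ends with a comma.
pl = ss["parameters"].split(",")[:-1]
ss["templateParameters"] = [{"param": each} for each in pl]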

CodePudding user response:

Use a nested list comprehension; the output is a list of dictionaries per row:

df2["templateParameters"] = [[{"param": each} for each in pl.split(',')] 
                             for pl in df2["parameters"]]
print (df2)
                    parameters  \
0                param1,param2   
1         param1,param2,param3   
2  param1,param2,param3,param4   

                                  templateParameters  
0         [{'param': 'param1'}, {'param': 'param2'}]  
1  [{'param': 'param1'}, {'param': 'param2'}, {'p...  
2  [{'param': 'param1'}, {'param': 'param2'}, {'p... 
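
The question asks for a string, while the comprehension above stores Python lists. If a JSON-formatted string per row is what is actually needed, one option is to wrap each list in `json.dumps`. A minimal sketch, assuming a small stand-in `df2` with the same column name (the sample rows are made up for illustration):

```python
import json

import pandas as pd

# Stand-in data; the real df2 is assumed to have a "parameters" column like this
df2 = pd.DataFrame({"parameters": ["param1,param2", "param1,param2,param3"]})

# Build the list of dicts for each row, then serialize it to a JSON string
df2["templateParameters"] = [
    json.dumps([{"param": each} for each in pl.split(",")])
    for pl in df2["parameters"]
]

print(df2["templateParameters"].iloc[0])
# [{"param": "param1"}, {"param": "param2"}]
```

Each cell is then a `str` rather than a list, and `json.loads` turns it back into a list of dictionaries if needed.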

EDIT: The error evidently means there are missing values in the column; the solution is to add an if-else:

df2["templateParameters"] = [[{"param": each} for each in pl.split(',')] 
                             if pd.notna(pl) else None 
                             for pl in df2["parameters"]]
print (df2)
                    parameters  \
0                          NaN   
1                param1,param2   
2         param1,param2,param3   
3  param1,param2,param3,param4   

                                  templateParameters  
0                                               None  
1         [{'param': 'param1'}, {'param': 'param2'}]  
2  [{'param': 'param1'}, {'param': 'param2'}, {'p...  
3  [{'param': 'param1'}, {'param': 'param2'}, {'p...   
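
If the same conversion is needed in several places, the if-else can be moved into a small function and applied with `Series.map`. This is only a sketch: `to_param_dicts` is a hypothetical helper name, and the stripping of whitespace and empty tokens (for example from a trailing comma) is an assumption about the data, not something the question states.

```python
import pandas as pd


def to_param_dicts(value):
    """Hypothetical helper: 'a,b' -> [{'param': 'a'}, {'param': 'b'}]."""
    # Missing values (None / NaN) are passed through as None
    if pd.isna(value):
        return None
    # Strip whitespace and skip empty tokens, e.g. the one left by a trailing comma
    return [{"param": tok.strip()} for tok in value.split(",") if tok.strip()]


# Made-up sample rows: a missing value, a trailing comma, and stray whitespace
df2 = pd.DataFrame({"parameters": [None, "param1,param2,", "param1, param2,param3"]})
df2["templateParameters"] = df2["parameters"].map(to_param_dicts)
print(df2["templateParameters"].tolist())
```

Drop the `if tok.strip()` filter if empty tokens should be kept as `{"param": ""}` entries.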

CodePudding user response:

Alternative solution:

import numpy as np
import pandas as pd

df = pd.DataFrame({"parameters": [np.nan, "param1,param2", "param1,param2,param3", "param1,param2,param3,param4"]})

(
    df["parameters"].fillna('').str.split(",")
    .apply(lambda x: [{"param": i} for i in x] if x[0] else None)
)

-------------------------------------------------------
0    None
1    [{'param': 'param1'}, {'param': 'param2'}]
2    [{'param': 'param1'}, {'param': 'param2'}, {'p...
3    [{'param': 'param1'}, {'param': 'param2'}, {'p...
Name: parameters, dtype: object
-------------------------------------------------------
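
The printed Series abbreviates the longer rows with `...`, so it is hard to confirm that every parameter made it through. One way to check, sketched here with a shortened copy of the sample data, is to pull the values out with `tolist()` and print them row by row (the name `result` is introduced only for this example):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"parameters": [np.nan, "param1,param2", "param1,param2,param3"]})

result = (
    df["parameters"].fillna("").str.split(",")
    # An empty first token means the original value was missing
    .apply(lambda x: [{"param": i} for i in x] if x[0] else None)
)

# tolist() returns plain Python objects, so nothing is abbreviated
rows = result.tolist()
for row in rows:
    print(row)
```

Note that the `x[0]` check also returns `None` for any string that starts with a comma, since its first token is empty as well.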