Map Pandas Series Containing key/value pairs to a new columns with data-CodePudding

I have a dataframe containing a pandas series (column 2) as below:

column 1	column 2	column 3
1123	Requested By = John Doe 1\n Requested On = 12 October 2021\n Comments = This is a generic request	INC29192
1251	NaN	INC18217
1918	Requested By = John Doe 2\n Requested On = 2 September 2021\n Comments = This is another generic request	INC19281

I'm struggling to extract, split and map column 2 data to a series of new column names with the appropriate data for that record (where possible, that is where there is data available as I have NaNs).

The Desired output is something like (where Ive dropped the column 3 data for legibility):

column 1	column 3	Requested By	Requested On	Comments
1123	INC29192	John Doe 1	12 October 2021	This is a generic request
1251	INC18217	NaN	NaN	NaN
1918	INC19281	John Doe 2	2 September 2021	This is another generic request

I have spent quite some time, trying various approaches, from lambda functions to comprehensions to explode methods but havent quite found a solution that provides the desired output - any thoughts greatly appreciated!

CodePudding user response：

First I would convert column 2 values to dictionaries and then convert them to Dataframes and join them to your df:

df['column 2'] = df['column 2'].apply(lambda x: 
                                       {y.split(' = ',1)[0]:y.split(' = ',1)[1]
                                       for y in x.split(r'\n ')}
                                       if not pd.isna(x) else {})
df = df.join(pd.DataFrame(df['column 2'].values.tolist())).drop('column 2', axis=1)
print(df)

Output:

   column 1  column 3 Requested By      Requested On                         Comments
0      1123  INC29192   John Doe 1   12 October 2021        This is a generic request
1      1251  INC18217          NaN               NaN                              NaN
2      1918  INC19281   John Doe 2  2 September 2021  This is another generic request