I have a pandas DataFrame with two columns:
col1: a column of lists
col2: an integer giving the index of the list element that I would like to extract and store in col3. It can be NaN, in which case the result should be NaN as well.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [['A', 'B'], ['C', 'D', 'E'], ['F', 'G']],
    'col2': [0, 2, np.nan]})
# Desired output
df_out = pd.DataFrame({
    'col1': [['A', 'B'], ['C', 'D', 'E'], ['F', 'G']],
    'col2': [0, 2, np.nan],
    'col3': ['A', 'E', np.nan]})
CodePudding user response:
You can use a basic apply over the rows:
def func(row):
    if np.isnan(row.col2):
        return np.nan
    else:
        return row.col1[int(row.col2)]

df['col3'] = df.apply(func, axis=1)
Output:
        col1  col2 col3
0     [A, B]   0.0    A
1  [C, D, E]   2.0    E
2     [F, G]   NaN  NaN
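For reference, here is a self-contained sketch of the same row-wise idea. The helper name `pick_element` is hypothetical, `pd.isna` is swapped in for `np.isnan`, and the out-of-range guard is an added assumption: the question only asks for NaN handling, but without the guard a bad index would raise an IndexError.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [['A', 'B'], ['C', 'D', 'E'], ['F', 'G']],
    'col2': [0, 2, np.nan],
})

def pick_element(row):
    # Hypothetical helper: return the element of col1 at position col2.
    # A missing index yields NaN, as the question requires.
    if pd.isna(row.col2):
        return np.nan
    idx = int(row.col2)  # col2 is stored as float because of the NaN
    # Assumption (not in the question): an out-of-range index also yields NaN.
    if not -len(row.col1) <= idx < len(row.col1):
        return np.nan
    return row.col1[idx]

df['col3'] = df.apply(pick_element, axis=1)
```

If you would rather see the error for a bad index, drop the range check and let the list lookup raise.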
CodePudding user response:
You can also use a list comprehension that walks the two columns in parallel with zip:
df['col3'] = [i[int(j)] if not np.isnan(j) else j for i,j in zip(df.col1,df.col2)]
        col1  col2 col3
0     [A, B]   0.0    A
1  [C, D, E]   2.0    E
2     [F, G]   NaN  NaN
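A runnable variant of the comprehension, as a sketch: it assumes the sample frame from the question, uses `pd.isna` for the missing check, and uses the illustrative loop names `items` and `idx`. The `int(...)` cast is needed because the NaN forces col2 to be stored as float64.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [['A', 'B'], ['C', 'D', 'E'], ['F', 'G']],
    'col2': [0, 2, np.nan],
})

# Pair each list with its index; keep NaN where the index is missing,
# otherwise cast the float index back to int and look the element up.
df['col3'] = [
    np.nan if pd.isna(idx) else items[int(idx)]
    for items, idx in zip(df['col1'], df['col2'])
]
```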