I have a DataFrame whose column A holds lists, plus a column match_idx of known indexes. For each row, I want to extract the list element at that row's index. My code:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A':[[7,8],[4,5,np.nan],[np.nan,1,9]],'match_idx':[1,0,np.nan]})
df
             A  match_idx
0       [7, 8]          1
1  [4, 5, nan]          0
2  [nan, 1, 9]        NaN
# For each row, find the value of A at position match_idx
Current attempt:
df['A_element'] = df.apply(lambda x: x['A'][x['match_idx']] if ~x['match_idx'].isnan() else np.nan,axis=1)
AttributeError: 'float' object has no attribute 'isnan'
Expected output:
df =
             A  match_idx  A_element
0       [7, 8]          1          8
1  [4, 5, nan]          0          4
2  [nan, 1, 9]        NaN        NaN
CodePudding user response:
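A plain Python float has no isnan method, which is what raises the AttributeError. To test for non-missing values, use pd.notna, and convert the index to an integer, because the NaN makes match_idx a float column: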
df['A_element'] = [a[int(i)] if pd.notna(i) else np.nan
                   for a, i in zip(df['A'], df['match_idx'])]
Or:
df['A_element'] = df.apply(lambda x: x['A'][int(x['match_idx'])]
                           if pd.notna(x['match_idx']) else np.nan, axis=1)
print(df)
             A  match_idx  A_element
0       [7, 8]        1.0        8.0
1  [4, 5, nan]        0.0        4.0
2  [nan, 1, 9]        NaN        NaN
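If it helps, here is a self-contained sketch of the list-comprehension approach that can be run as-is. The helper name element_at is hypothetical (it is not part of pandas), and the out-of-range guard is an extra assumption that the question did not ask for; drop it if an IndexError is the behaviour you prefer.

```python
import numpy as np
import pandas as pd


def element_at(values, idx):
    """Return values[idx], or NaN when idx is missing or out of range.

    Hypothetical helper for illustration; not a pandas function.
    """
    if pd.isna(idx):
        return np.nan
    i = int(idx)  # match_idx is a float column because it contains NaN
    # Assumption: positions outside the list yield NaN instead of raising.
    if not -len(values) <= i < len(values):
        return np.nan
    return values[i]


df = pd.DataFrame({
    'A': [[7, 8], [4, 5, np.nan], [np.nan, 1, 9]],
    'match_idx': [1, 0, np.nan],
})

# Pair each list with its index and look the element up row by row.
df['A_element'] = [element_at(a, i) for a, i in zip(df['A'], df['match_idx'])]
print(df)
```

Because the result contains NaN, A_element is stored as float, which is why the values print as 8.0 and 4.0 rather than 8 and 4.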