How do I add an extra column to a DataFrame that splits each value on "|" and converts the pieces to integers, but holds np.nan for values that are plain strings?
Col1
1|2|3
"string"
so that the result is:
Col1 ExtraCol
1|2|3 [1,2,3]
"string" nan
I tried a long, contorted approach, but it failed:
df['extracol'] = df["col1"].str.strip().str.split("|").str[0].apply(lambda x: x.astype(np.float) if x.isnumeric() else np.nan).astype("Int32")
CodePudding user response:
Another possible solution:
import re
import numpy as np
df['ExtraCol'] = df['Col1'].apply(
    lambda x: [int(y) for y in re.split(r'\|', x)]
    if x.replace('|', '').isnumeric() else np.nan)
Output:
     Col1   ExtraCol
0   1|2|3  [1, 2, 3]
1  string        NaN
CodePudding user response:
You can use a regex with Series.str.match to find the rows whose values can be split into lists of integers:
df['ExtraCol'] = df.loc[df['Col1'].str.match(r'\|?\d+\|?'), 'Col1'].str.split('|')
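Note that the str.split call in this answer yields lists of strings rather than integers, and str.match only anchors the pattern at the start of the value, so something like "1abc" would also pass. A minimal sketch of a stricter variant follows. It assumes pandas 1.1 or later (for Series.str.fullmatch), and the pattern and the small sample frame are my own illustration, not part of the answer above:

```python
import pandas as pd

# Sample data mirroring the question (illustrative only)
df = pd.DataFrame({"Col1": ["1|2|3", "string"]})

# True only where the whole value is digits separated by single pipes;
# na=False treats missing values as non-matching
mask = df["Col1"].str.fullmatch(r"\d+(?:\|\d+)*", na=False)

# Split the matching rows and convert each piece to int. Rows left out
# of the assignment are filled with NaN through index alignment.
df["ExtraCol"] = df.loc[mask, "Col1"].str.split("|").apply(
    lambda parts: [int(p) for p in parts]
)

print(df)
```

Because the right-hand side only contains the matching rows, no explicit else branch is needed for the string rows.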