Say I have a dataframe:
import pandas as pd

df = pd.DataFrame({
    'column_1': ['ABC DEF', 'JKL', 'GHI ABC', 'ABC ABC', 'DEF GHI', 'DEF', 'DEF DEF', 'ABC GHI DEF ABC'],
    'column_2': [9, 2, 3, 4, 6, 2, 7, 1]})
          column_1  column_2
0          ABC DEF         9
1              JKL         2
2          GHI ABC         3
3          ABC ABC         4
4          DEF GHI         6
5              DEF         2
6          DEF DEF         7
7  ABC GHI DEF ABC         1
I am using str.extractall to get the matched patterns from my dataframe:
df['column_1'].str.extractall('(ABC)|(DEF)').groupby(level=0).first()
I get:
      0     1
0   ABC   DEF
2   ABC  None
3   ABC  None
4  None   DEF
5  None   DEF
6  None   DEF
7   ABC   DEF
However, the expected output keeps the row with no match (note index 1):
      0     1
0   ABC   DEF
1  None  None
2   ABC  None
3   ABC  None
4  None   DEF
5  None   DEF
6  None   DEF
7   ABC   DEF
CodePudding user response:
You can just reindex the new dataframe with the original dataframe's index:
out = df['column_1'].str.extractall('(ABC)|(DEF)').groupby(level=0).first().reindex(df.index, fill_value="None")
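To see why the row disappears in the first place, here is a minimal step-by-step sketch of the same idea. The intermediate names `matches` and `first` are introduced only for illustration; the data is the dataframe from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'column_1': ['ABC DEF', 'JKL', 'GHI ABC', 'ABC ABC', 'DEF GHI',
                 'DEF', 'DEF DEF', 'ABC GHI DEF ABC'],
    'column_2': [9, 2, 3, 4, 6, 2, 7, 1]})

# extractall emits one row per match, so a row with no match at all
# (index 1, 'JKL') produces no rows and drops out of the result.
matches = df['column_1'].str.extractall('(ABC)|(DEF)')

# Collapse back to one row per original index label, keeping the first
# non-null value found in each capture-group column.
first = matches.groupby(level=0).first()

# Reindexing against the original index restores the dropped label
# as a row of missing values.
out = first.reindex(df.index)
print(out)
```

Note that `fill_value` in `reindex` only applies to the rows that reindexing introduces. With `fill_value="None"`, row 1 would hold the string "None", while cells that were already missing in other rows stay as real missing values.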
CodePudding user response:
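A simple solution might be to fill the missing index rows with None values, like so (here df appears to refer to the extracted result rather than the original dataframe, since only that one is missing a row):
df.reindex(list(range(df.index.min(), df.index.max() + 1)), fill_value="None")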
Output:
      0     1
0   ABC   DEF
1  None  None
2   ABC  None
3   ABC  None
4  None   DEF
5  None   DEF
6  None   DEF
7   ABC   DEF
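One caveat with the range-based approach: it assumes a contiguous integer index. The sketch below uses a small hypothetical Series (not from the question) whose index has gaps, to show how the two reindexing strategies can differ.

```python
import pandas as pd

# Hypothetical data whose index labels are 0, 2 and 5 (gaps at 1, 3, 4).
s = pd.Series(['ABC', 'JKL', 'DEF ABC'], index=[0, 2, 5])
first = s.str.extractall('(ABC)|(DEF)').groupby(level=0).first()

# Range-based: fills every integer between min and max, including
# labels that never existed in the original data.
by_range = first.reindex(list(range(s.index.min(), s.index.max() + 1)))

# Index-based: restores exactly the original labels.
by_index = first.reindex(s.index)

print(len(by_range), len(by_index))
```

With the default RangeIndex from the question, both approaches give the same rows.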