Pandas extractall: get None value if no match was found

Time:08-04

Say I have a dataframe

import pandas as pd

df = pd.DataFrame({
    'column_1': ['ABC DEF', 'JKL', 'GHI  ABC', 'ABC ABC', 'DEF GHI', 'DEF', 'DEF DEF', 'ABC GHI DEF ABC'],
    'column_2': [9, 2, 3, 4, 6, 2, 7, 1]})
          column_1  column_2
0          ABC DEF         9
1              JKL         2
2         GHI  ABC         3
3          ABC ABC         4
4          DEF GHI         6
5              DEF         2
6          DEF DEF         7
7  ABC GHI DEF ABC         1

I am using str.extractall to get the matched patterns in my dataframe.

df['column_1'].str.extractall('(ABC)|(DEF)').groupby(level=0).first()

I get

      0     1
0   ABC   DEF
2   ABC  None
3   ABC  None
4  None   DEF
5  None   DEF
6  None   DEF
7   ABC   DEF

However, the expected output was the following (note the row at index 1, which has no match):

      0     1
0   ABC   DEF
1  None  None
2   ABC  None
3   ABC  None
4  None   DEF
5  None   DEF
6  None   DEF
7   ABC   DEF

CodePudding user response:

You can just reindex the new dataframe with the old one's index:

out = df['column_1'].str.extractall('(ABC)|(DEF)').groupby(level=0).first().reindex(df.index, fill_value="None")
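
For reference, here is a minimal end-to-end sketch of that reindex approach, split into steps. The intermediate names (matches, first_hits) are only illustrative. Two things worth keeping in mind: extractall returns one row per match, which is why the 'JKL' row disappears in the first place, and fill_value="None" inserts the string "None" rather than a real missing value. The sketch below leaves fill_value out, so the restored row holds actual missing values (how they display, None or NaN, may depend on the pandas version).

```python
import pandas as pd

df = pd.DataFrame({
    'column_1': ['ABC DEF', 'JKL', 'GHI  ABC', 'ABC ABC', 'DEF GHI', 'DEF', 'DEF DEF', 'ABC GHI DEF ABC'],
    'column_2': [9, 2, 3, 4, 6, 2, 7, 1]})

# extractall yields one row per match, indexed by (original index, match number);
# a row with no match at all (index 1, 'JKL') contributes no rows
matches = df['column_1'].str.extractall('(ABC)|(DEF)')

# collapse back to one row per original index, taking the first non-null value per column
first_hits = matches.groupby(level=0).first()

# reindex against the original index to restore the rows that had no match
out = first_hits.reindex(df.index)

print(out)
print(out.isna())  # row 1 is missing in both columns
```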

CodePudding user response:

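A simple solution might be to fill the missing index rows with None values, like so (here df stands for the result of the extractall(...).groupby(level=0).first() call, not the original dataframe):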

df.reindex(list(range(df.index.min(), df.index.max() + 1)), fill_value="None")

Output:

    0       1
0   ABC     DEF
1   None    None
2   ABC     None
3   ABC     None
4   None    DEF
5   None    DEF
6   None    DEF
7   ABC     DEF
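
One caveat with the range-based reindex: it assumes the original index is a contiguous run of integers. A small sketch with made-up data (the Series s and its index values are hypothetical, chosen only to show the difference) suggests that reindexing with the original index is the safer choice when that assumption does not hold:

```python
import pandas as pd

# hypothetical data with a non-contiguous integer index
s = pd.Series(['ABC', 'JKL', 'DEF'], index=[10, 20, 30])

first_hits = s.str.extractall('(ABC)|(DEF)').groupby(level=0).first()

# range-based reindex: creates a row for every integer between min and max
by_range = first_hits.reindex(list(range(s.index.min(), s.index.max() + 1)))

# index-based reindex: creates exactly one row per original row
by_index = first_hits.reindex(s.index)

print(len(by_range))  # 21
print(len(by_index))  # 3
```

With the default 0..n-1 index from the question, both variants give the same rows.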