Regex that captures and filters the "steps" strings that have only one sole number at the

Time:12-24

So I have a pandas.Series like this:

s = pd.Series(['1-Onboarding   Retorno', '1.1-Onboarding escolha de bot',
                  '2-Seleciona produto', '3-Informa localizacao e cpf',
                  '3.1-CPF valido (V.2.0)', '3.2-Obtencao de CEP'],name = 'Steps')

0           1-Onboarding   Retorno
1    1.1-Onboarding escolha de bot
2              2-Seleciona produto
3      3-Informa localizacao e cpf
4           3.1-CPF valido (V.2.0)
5              3.2-Obtencao de CEP

The idea here is to "filter" the Series so that I keep only the strings whose prefix is a single number (1, 2, 3) rather than a sub-step (1.1, 3.1). The expected result:

s = pd.Series(['1-Onboarding   Retorno',
                  '2-Seleciona produto', '3-Informa localizacao e cpf'],name = 'Steps')

0         1-Onboarding   Retorno
1            2-Seleciona produto
2    3-Informa localizacao e cpf
Name: Steps, dtype: object

Any ideas on how I could do that? I am having difficulty formulating the regex. I know I should use something like the following to build such a filter in pandas:

s.str.contains('',regex = True) 

CodePudding user response:

We can use str.contains here:

df_out = s[s.str.contains(r'^\d+-', regex=True)]

The resulting Series df_out will contain only the step values that begin with a major version (integer) number followed directly by a hyphen.
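
For reference, here is a minimal self-contained sketch of that filter, assuming the data is the Series s from the question. The names mask and filtered are only illustrative, and reset_index(drop=True) is added just to reproduce the 0..2 index shown in the desired output.

```python
import pandas as pd

s = pd.Series(['1-Onboarding   Retorno', '1.1-Onboarding escolha de bot',
               '2-Seleciona produto', '3-Informa localizacao e cpf',
               '3.1-CPF valido (V.2.0)', '3.2-Obtencao de CEP'], name='Steps')

# ^    anchors the match at the start of the string
# \d+  one or more digits (the whole step number)
# -    the hyphen must follow immediately, so "1.1-" and "3.2-" do not match
mask = s.str.contains(r'^\d+-', regex=True)

# Boolean indexing keeps the matching rows; reset_index renumbers them from 0.
filtered = s[mask].reset_index(drop=True)
print(filtered)
```

Because the pattern only looks at the prefix before the hyphen, dots later in the text (such as "V.2.0") do not affect the result.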

CodePudding user response:

You can use this:

l = []
for i in range(len(s)):
    # keep only the steps without a dot, i.e. without a sub-step number such as 3.1
    if '.' not in s[i]:
        l.append(s[i])
new_s = pd.Series(l, name='Steps')

out:

0         1-Onboarding   Retorno
1            2-Seleciona produto
2    3-Informa localizacao e cpf
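
The same dot test can probably be written without an explicit loop. This is only a sketch, assuming the Series s from the question and assuming a dot never appears in the text of a top-level step (a top-level step whose description contained something like "V.2.0" would be dropped too). The name has_dot is illustrative.

```python
import pandas as pd

s = pd.Series(['1-Onboarding   Retorno', '1.1-Onboarding escolha de bot',
               '2-Seleciona produto', '3-Informa localizacao e cpf',
               '3.1-CPF valido (V.2.0)', '3.2-Obtencao de CEP'], name='Steps')

# regex=False makes '.' a literal dot instead of the regex "any character".
has_dot = s.str.contains('.', regex=False)

# Invert the mask to keep the rows without a dot, then renumber from 0.
new_s = s[~has_dot].reset_index(drop=True)
print(new_s)
```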