How can I filter the dataframe df1 to keep only the rows where the SYMBOL column starts with a "." followed by a digit?
df1
SYMBOL TYPE
.1E09UOV Exchange code
.2E09UP0 Exchange code
.AT0013F Exchange code
.BT0013G Exchange code
.CT002MS Exchange code
.DT002MT Exchange code
.7T003MT Exchange code
.7T004MT Exchange code
.7T001MT Exchange code
.7T003MT Exchange code
Expected output
SYMBOL TYPE
.1E09UOV Exchange code
.2E09UP0 Exchange code
.7T003MT Exchange code
.7T004MT Exchange code
.7T001MT Exchange code
.7T003MT Exchange code
Tried code:
df1.loc[(df1['SYMBOL'].re.sub(r'.\d')]
CodePudding user response:
You can use the following:
df1 = df1[df1['SYMBOL'].str.match(r'^\.[0-9].*')]
^ = start of string
\. = a literal period
[0-9] = a single digit
.* = zero or more of any character
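To see how the pieces of the pattern behave, here is a small sketch that uses Python's built-in re module on a few symbols from the question. The names pattern, samples and matched are only illustrative, and the last sample (without a leading period) is an assumption added to show a non-match. re.match only matches at the start of the string, which is also how Series.str.match behaves.

```python
import re

# Same pattern as in the answer; the raw string keeps the backslash intact.
pattern = re.compile(r'^\.[0-9].*')

# A few symbols from the question, plus one without a leading period.
samples = ['.1E09UOV', '.AT0013F', '.7T003MT', '1E09UOV']

# Keep only the symbols that start with a period followed by a digit.
matched = [s for s in samples if pattern.match(s)]
print(matched)  # ['.1E09UOV', '.7T003MT']
```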
Here is an example showing the full code:
Code:
import pandas as pd
df1 = pd.DataFrame({ 'SYMBOL': ['.1E09UOV', '.2E09UP0', '.AT0013F', '.BT0013G', '.CT002MS', '.DT002MT', '.7T003MT', '.7T004MT', '.7T001MT', '.7T003MT'],
'TYPE': ['Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code']})
# Raw string so that \. reaches the regex engine unchanged
df1 = df1[df1['SYMBOL'].str.match(r'^\.[0-9].*')]
print(df1)
Output:
SYMBOL TYPE
0 .1E09UOV Exchange code
1 .2E09UP0 Exchange code
6 .7T003MT Exchange code
7 .7T004MT Exchange code
8 .7T001MT Exchange code
9 .7T003MT Exchange code
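A possible variation: because str.match only matches at the beginning of each string, a shorter pattern should select the same rows, and na=False treats missing symbols as non-matches. This is a minimal sketch rather than the only way to do it. The frame df and the None row are assumptions added to show the missing-value case, and reset_index is optional.

```python
import pandas as pd

# Illustrative frame; the None entry stands in for a missing SYMBOL.
df = pd.DataFrame({
    'SYMBOL': ['.1E09UOV', '.AT0013F', None, '.7T003MT'],
    'TYPE': ['Exchange code'] * 4,
})

# str.match anchors at the start, so '^' and a trailing '.*' are optional.
# na=False makes rows with a missing SYMBOL evaluate to False.
# Note: \d can also match non-ASCII digits; use [0-9] for ASCII digits only.
mask = df['SYMBOL'].str.match(r'\.\d', na=False)

# Drop the old index labels so the result is numbered from 0.
filtered = df[mask].reset_index(drop=True)
print(filtered)
```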