I have a column in my dataframe where there will be multiple values. I need to keep only the values that match my condition.
For example:
df
col1
Tesla
Audi
BMW-N2204281200PE
SUPRA2204241300.75CE
TATA230612133.50PE
I need to keep only values like the last 3 rows. Each is a string that starts with letters, may contain symbols (-, &, $) followed by more letters, then has a 6-digit value, then a price such as 1300 or 1300.75, and ends with PE or CE.
How could I do this using pandas?
Also, how could I split each matching value into parts, like ['BMW-N', '220428', '1200PE'], ['SUPRA', '220424', '1300.75CE']?
CodePudding user response:
You can use the following regex:
df['col1'].str.extract(r'([a-zA-Z-&$]+)(\d{6})(\d+(?:\.\d+)?[PC]E)')
output:
       0       1          2
0    NaN     NaN        NaN
1    NaN     NaN        NaN
2  BMW-N  220428     1200PE
3  SUPRA  220424  1300.75CE
4   TATA  230612   133.50PE
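The extract call keeps the non-matching rows as NaN, while the question also asks to filter them out. One possible way to do both is sketched below, assuming the sample data from the question and pandas 1.1 or later (for `str.fullmatch`). The column names `symbol`, `digits` and `price_side` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"col1": [
    "Tesla",
    "Audi",
    "BMW-N2204281200PE",
    "SUPRA2204241300.75CE",
    "TATA230612133.50PE",
]})

# Letters/symbols, then 6 digits, then a price ending in PE or CE.
PATTERN = r"([a-zA-Z\-&$]+)(\d{6})(\d+(?:\.\d+)?[PC]E)"

# Filter: keep only rows whose entire value matches the pattern.
mask = df["col1"].str.fullmatch(PATTERN)
filtered = df[mask]

# Split: one column per capture group (hypothetical column names).
parts = filtered["col1"].str.extract(PATTERN)
parts.columns = ["symbol", "digits", "price_side"]

# Plain Python lists, one per matching row.
as_lists = parts.values.tolist()
print(filtered)
print(as_lists)
```

Using `str.fullmatch` rather than `str.contains` rejects values that merely contain a matching substring, which seems closer to what the question describes. If NaN values can appear in the column, passing `na=False` to `str.fullmatch` keeps the mask boolean.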