I'm using regex (regular expressions) in Pandas.
NIFTY30DEC219000CE.NFO
NIFTY30DEC2116000CE.NFO
NIFTY30DEC2116000CE
NIFTY30DEC2116111PE
NIFTY30DEC218100PE
I have strings of this type, where the number I need is either
4 digits long (such as '9000') or 5 digits long (such as '16000'),
and the same pattern holds for the other rows.
The output should be:
9000
16000
16000
16111
8100
I don't need the 30DEC21 part
in the output.
This is the syntax I'm using (posted as an image), but I'm getting the wrong output.
This is my code: image of my code
CodePudding user response:
I would use str.extract
with the following regex pattern:
\d{2}[A-Z]{3}\d{2}(\d+)
Python script:
df["output"] = df["col"].str.extract(r'\d{2}[A-Z]{3}\d{2}(\d+)')
Here is a demo showing that the extraction logic is working.
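As a minimal sketch of the extraction on the sample strings from the question: the column name "col" comes from the script above, while the DataFrame itself is built here purely for illustration. Passing expand=False is an optional choice that makes str.extract return a Series when there is a single capture group.

```python
import pandas as pd

# Sample symbols copied from the question; the DataFrame is illustrative only.
df = pd.DataFrame({"col": [
    "NIFTY30DEC219000CE.NFO",
    "NIFTY30DEC2116000CE.NFO",
    "NIFTY30DEC2116000CE",
    "NIFTY30DEC2116111PE",
    "NIFTY30DEC218100PE",
]})

# Skip the date part (2 digits, 3 capital letters, 2 digits),
# then capture the run of digits that follows it.
df["output"] = df["col"].str.extract(r"\d{2}[A-Z]{3}\d{2}(\d+)", expand=False)

print(df["output"].tolist())
# ['9000', '16000', '16000', '16111', '8100']
```

The extracted values are strings; if numbers are needed, df["output"].astype(int) should convert them, assuming every row matched.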
CodePudding user response:
r"NIFTY30DEC21(\d{4,5})(CE\.NFO|CE|PE)"
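A rough sketch of how this pattern behaves, using Python's built-in re module on the sample strings from the question. The names PATTERN, symbols, strikes and suffixes are made up for this example. Note that the pattern hard-codes NIFTY30DEC21, so it only matches that underlying and expiry.

```python
import re

# Group 1 is the 4- or 5-digit number, group 2 is the option-type suffix.
PATTERN = re.compile(r"NIFTY30DEC21(\d{4,5})(CE\.NFO|CE|PE)")

symbols = [
    "NIFTY30DEC219000CE.NFO",
    "NIFTY30DEC2116000CE.NFO",
    "NIFTY30DEC2116000CE",
    "NIFTY30DEC2116111PE",
    "NIFTY30DEC218100PE",
]

matches = [PATTERN.search(s) for s in symbols]
strikes = [m.group(1) for m in matches]
suffixes = [m.group(2) for m in matches]

print(strikes)
# ['9000', '16000', '16000', '16111', '8100']
print(suffixes)
# ['CE.NFO', 'CE.NFO', 'CE', 'PE', 'PE']
```

In Pandas, the same pattern passed to str.extract yields one column per capture group, so the number would be in column 0 of the result.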