I am new to coding and only recently started learning. I am currently stuck trying to split a column. Please help me.
I have this dataframe:
import pandas as pd
data = ['TOOK22JAN1515100HG', 'BOOK22FEB1643200GH', 'TOOK22MAR1742200HG']
df = pd.DataFrame(data)
and I want to split it into:
0 TOOK22JAN1515100HG TOOK 22-01-15 15100 HG
1 BOOK22FEB1643200GH BOOK 22-02-16 43200 GH
2 TOOK22MAR1742200HG TOOK 22-03-17 42200 HG
I really appreciate you taking the time to answer my question.
PS: This is just an example of an option symbol, which is a combination of index, date, strike, and type (stock market).
CodePudding user response:
Use str.extract with named capture groups to split your string into columns:
pattern = r'(?P<id>[A-Z]{4})(?P<date>\w{7})(?P<val>\d+)(?P<misc>[A-Z]{2})'
df = df.join(df[0].str.extract(pattern))
df['date'] = pd.to_datetime(df['date'], format='%d%b%y')  # day, month abbreviation, two-digit year
df['val'] = df['val'].astype(int)
print(df)
# Output
                    0    id       date    val misc
0  TOOK22JAN1515100HG  TOOK 2015-01-22  15100   HG
1  BOOK22FEB1643200GH  BOOK 2016-02-22  43200   GH
2  TOOK22MAR1742200HG  TOOK 2017-03-22  42200   HG
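Note that the output above reads 22JAN15 as 22 January 2015, while the question asks for 22-01-15. If the symbol is actually year, month, day (this is an assumption, so check it against your data source), a minimal sketch along the same lines could look like the following. The names parts and result are just illustrative, and %b assumes English month abbreviations.

```python
import pandas as pd

data = ['TOOK22JAN1515100HG', 'BOOK22FEB1643200GH', 'TOOK22MAR1742200HG']
df = pd.DataFrame(data)

# Same str.extract approach, with a stricter date group:
# two digits, three letters, two digits.
pattern = r'(?P<id>[A-Z]{4})(?P<date>\d{2}[A-Z]{3}\d{2})(?P<val>\d+)(?P<misc>[A-Z]{2})'
parts = df[0].str.extract(pattern)

# ASSUMPTION: the date is year, month, day (22JAN15 -> 2022-01-15).
parsed = pd.to_datetime(parts['date'], format='%y%b%d')

# Format back to a yy-mm-dd string, as requested in the question.
parts['date'] = parsed.dt.strftime('%y-%m-%d')
parts['val'] = parts['val'].astype(int)

result = df.join(parts)
print(result)
```

If you would rather keep real datetime values for sorting or filtering, skip the strftime step and assign parsed to the date column directly.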