Home > Back-end >  how to split one dataframe column string(letter and number) into 2 columns? to separate the letters
how to split one dataframe column string(letter and number) into 2 columns? to separate the letters

Time:09-23

I have a dataframe with this as index

Index(['LCOc1', 'LCOc2', 'LCOc3', 'LCOc4', 'LCOc5', 'LCOc6', 'LCOc7', 'LCOc8',
       'LCOc9', 'LCOc10', 'LCOc11', 'LCOc12', 'CLc1', 'CLc2', 'CLc3', 'CLc4',
       'CLc5', 'CLc6', 'CLc7', 'CLc8', 'CLc9', 'CLc10', 'CLc11', 'CLc12',
       'OQc1', 'OQc2', 'OQc3', 'OQc4', 'OQc5', 'OQc6', 'OQc7', 'OQc8', 'OQc9',
       'OQc10', 'OQc11', 'OQc12'],
      dtype='object', name='CLOSE')

I wish to create 2 column one with only letters and the other with only numbers, I tested with str.split and str.extract but haven't succeeded

Basically I want one new column with values like ['LCOc', 'LCOc', 'LCOc'] and another one with values like [1,2,3.....11,12,1]

this is part of the dataframe with "CLOSE" as the index name

        2022-09-02
CLOSE             
LCOc1        93.02
LCOc2        91.81
LCOc3        90.66
LCOc4        89.52
LCOc5        88.52
LCOc6        87.68
LCOc7        86.94
LCOc8        86.24
LCOc9        85.63
LCOc10       85.02
LCOc11       84.40
LCOc12       83.81
CLc1         86.87
CLc2         86.48
CLc3         85.90
CLc4         85.19
CLc5         84.41

CodePudding user response:

You can use .str.extract with the index like this:

df[['letters', 'digits']] = df.index.str.extract('(\D )(\d )').to_numpy()

Output:

        2022-09-02 letters digits
CLOSE                            
LCOc1        93.02    LCOc      1
LCOc2        91.81    LCOc      2
LCOc3        90.66    LCOc      3
LCOc4        89.52    LCOc      4
LCOc5        88.52    LCOc      5
LCOc6        87.68    LCOc      6
LCOc7        86.94    LCOc      7
LCOc8        86.24    LCOc      8
LCOc9        85.63    LCOc      9
LCOc10       85.02    LCOc     10
LCOc11       84.40    LCOc     11
LCOc12       83.81    LCOc     12
CLc1         86.87     CLc      1
CLc2         86.48     CLc      2
CLc3         85.90     CLc      3
CLc4         85.19     CLc      4
CLc5         84.41     CLc      5

CodePudding user response:

You can also use re package to extract and then assign them as your new indices (if that's what you wanted)

import re
df.index = [(re.search('[a-zA-Z] ', df.index[i])[0], re.search('[0-9] ', df.index[i])[0]) for i in range(len(df.index))]

or if you want to add them to your dataframe

df[['letters','digits']] = ([[re.search('[a-zA-Z] ', df.index[i])[0], re.search('[0-9] ', df.index[i])[0]] for i in range(len(df.index))])

  • Related