Extract values within the quotation marks into two separate columns with Python

Time:07-13

How can I extract the values inside the quotation marks into two separate columns with Python? The DataFrame is given below:

    df = pd.DataFrame(["'FRH02';'29290'", "'FRH01';'29300'", "'FRT02';'29310'", "'FRH03';'29340'", 
                       "'FRH05';'29350'", "'FRG02';'29360'"], columns = ['postcode'])
    
    df

          postcode
    0   'FRH02';'29290'
    1   'FRH01';'29300'
    2   'FRT02';'29310'
    3   'FRH03';'29340'
    4   'FRH05';'29350'
    5   'FRG02';'29360'

I would like to get output like the one below:

   postcode1  postcode2
     FRH02     29290
     FRH01     29300
     FRT02     29310
     FRH03     29340
     FRH05     29350
     FRG02     29360

I have tried several str.extract patterns but haven't been able to figure this out. Thanks in advance.

CodePudding user response:

Finishing Quang Hoang's solution that he left in the comments:

import pandas as pd

df = pd.DataFrame(["'FRH02';'29290'", 
                   "'FRH01';'29300'", 
                   "'FRT02';'29310'", 
                   "'FRH03';'29340'", 
                   "'FRH05';'29350'", 
                   "'FRG02';'29360'"], 
                  columns = ['postcode'])

# Remove the quotes and split the strings, which results in a Series made up of 2-element lists
postcodes = df['postcode'].str.replace("'", "", regex=False).str.split(';')

# Unpack the transposed postcodes into 2 new columns
df['postcode1'], df['postcode2'] = zip(*postcodes)

# Delete the original column
del df['postcode']

print(df)

Output:

  postcode1 postcode2
0     FRH02     29290
1     FRH01     29300
2     FRT02     29310
3     FRH03     29340
4     FRH05     29350
5     FRG02     29360
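
As a possible variation on the approach above, `Series.str.split` accepts `expand=True`, which returns the pieces as DataFrame columns directly, so the `zip` step is not needed. The sketch below uses a shortened two-row copy of the data, and the name `parts` is only illustrative:

```python
import pandas as pd

# Shortened sample of the question's data (two rows only)
df = pd.DataFrame(["'FRH02';'29290'", "'FRH01';'29300'"], columns=['postcode'])

# Remove the literal quote characters, then split on ';' into two columns
parts = df['postcode'].str.replace("'", "", regex=False).str.split(';', expand=True)

# Name the resulting columns as in the desired output
parts.columns = ['postcode1', 'postcode2']

print(parts)
```

Both columns hold strings here; if the second one should be numeric, it would need an explicit conversion afterwards.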

CodePudding user response:

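You can use Series.str.split and strip the quotes from each part:

p1 = []
p2 = []

# Split each row on ';' and strip the surrounding single quotes from both parts
for row in df['postcode'].str.split(';'):
    p1.append(row[0].strip("'"))
    p2.append(row[1].strip("'"))

# Collect the two lists into a new DataFrame
df2 = pd.DataFrame()
df2["postcode1"] = p1
df2["postcode2"] = p2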
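
Since the question mentions `str.extract`, here is a minimal sketch of how that route might look. It assumes every row has exactly the form `'…';'…'`. The regular expression and the name `extracted` are illustrative and do not come from either answer. Named capture groups become the column names of the result:

```python
import pandas as pd

# Shortened sample of the question's data (two rows only)
df = pd.DataFrame(["'FRH02';'29290'", "'FRT02';'29310'"], columns=['postcode'])

# Two named groups, each capturing the text between a pair of single quotes
pattern = r"'(?P<postcode1>[^']*)';'(?P<postcode2>[^']*)'"

# str.extract returns one column per capture group
extracted = df['postcode'].str.extract(pattern)

print(extracted)
```

Rows that do not match the pattern would come back as NaN in both columns, which can help in spotting malformed entries.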