Python Pandas Split strings into two Columns using str.split()

Time:12-05

How do I split the text in a column on "(" and ")" to create a new column in a DataFrame? Current DataFrame:

    Item     Description
0   coat   Boys (Target)
1  boots    Womens (DSW)
2  socks   Girls (Kohls)
3  shirt  Mens (Walmart)
4  boots    Womens (DSW)
5   coat   Boys (Target)

What I want to create:

    Item Description Retailer
0   coat        Boys   Target
1  boots      Womens      DSW
2  socks       Girls    Kohls
3  shirt        Mens  Walmart
4  boots      Womens      DSW
5   coat        Boys   Target

I've tried the following:

df[['Description'], ['Retailer']] = df['Description'].str.split("(")

I get an error: "TypeError: unhashable type: 'list'"

CodePudding user response:

Try this:

import pandas as pd

# creating the df
item = ['coat','boots']
dec = ["Boys (Target)", "Womens (DSW)"]
df = pd.DataFrame(item, columns=['Item'])
df['Description'] = dec


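def extract_brackets(row):
    # Text between the first "(" and the following ")".
    return row.split('(', 1)[1].split(')')[0].strip()


def extract_first_value(row):
    # Everything before the first "(", so multi-word descriptions are kept whole.
    return row.split('(', 1)[0].strip()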


df['Retailer'] = df['Description'].apply(extract_brackets)
df['Description'] = df['Description'].apply(extract_first_value)

print(df)

CodePudding user response:

Hi, I ran this small test and it seems to work. Note the space and the backslash in the split pattern: the pattern is treated as a regular expression, so the "(" has to be escaped.

import pandas as pd
df = pd.Series(['Boys (Target)','Womens (DSW)','Girls (Kohls)'])
print(df)
d1 = df.str.split(r' \(')
print(d1)
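
To go from that Series of lists to the two columns the question asks for, one option is to pass expand=True and then strip the closing parenthesis. This is only a sketch: it assumes every value contains exactly one " (" separator, and the names s and parts are illustrative, not from the original post.

```python
import pandas as pd

s = pd.Series(['Boys (Target)', 'Womens (DSW)', 'Girls (Kohls)'])

# r' \(' is a regular expression: a space followed by a literal "(".
# Patterns longer than one character are treated as regex by default.
parts = s.str.split(r' \(', expand=True)
parts.columns = ['Description', 'Retailer']

# Drop the closing ")" left at the end of each retailer name.
parts['Retailer'] = parts['Retailer'].str.rstrip(')')

print(parts)
```

Because the space is part of the split pattern, the Description values come out without trailing whitespace.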

CodePudding user response:

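You need to pass expand=True to str.split so that it returns a DataFrame instead of a Series of lists, and assign the result to a single list of column names, df[['Description', 'Retailer']]. Writing df[['Description'], ['Retailer']] passes two lists as one key, which is what raises the TypeError. Consider using the following code: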

df[['Description', 'Retailer']] = df['Description'].str.replace(')', '', regex=False)\
    .str.split('(', expand=True)

print(df)

    Item Description Retailer
0   coat       Boys    Target
1  boots     Womens       DSW
2  socks      Girls     Kohls
3  shirt       Mens   Walmart
4  boots     Womens       DSW
5   coat       Boys    Target

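I first remove the closing parenthesis from Description and then split on the opening parenthesis. Note that splitting on "(" alone leaves a trailing space in Description ("Boys "); call .str.strip() on that column if that matters.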
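
As an alternative to replace plus split, the same result can probably be reached in one step with str.extract and a regular expression. This is a different technique from the one in the answer above, shown only as a sketch. The pattern and the name extracted are assumptions for illustration, and the sample frame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    'Item': ['coat', 'boots', 'socks'],
    'Description': ['Boys (Target)', 'Womens (DSW)', 'Girls (Kohls)'],
})

# Group 1: text before the parenthesis (surrounding whitespace excluded).
# Group 2: text inside the parentheses.
pattern = r'^(.*?)\s*\((.*?)\)\s*$'

# str.extract returns one column per capture group (labelled 0 and 1 here);
# rows that do not match the pattern become NaN.
extracted = df['Description'].str.extract(pattern)

df['Description'] = extracted[0]
df['Retailer'] = extracted[1]

print(df)
```

Since the whitespace before "(" sits outside the first group, no separate strip step is needed.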
