split before first comma in pandas and output

Time:06-17

I want to produce the following table in pandas. So far I only have the description column; I need a commondescrip column that holds whatever comes before the first comma in description (or the whole value when there is no comma).

description    commondescrip
00001          00001
00002          00002
00003,Area01   00003
00004          00004
00005,Area02   00005

I tried

splitword = df2["description"].str.split(",", n=1, expand = True)
df2["commondescrip"] = splitword[0]

but it gives me NaN for those rows that have Area.

How can I fix this so that I get the table above, with only the text before the comma in the new column?

CodePudding user response:

Here is one way to do it

df['description'].apply(lambda x: x.strip().split(',')[0])
0    00001
1    00002
2    00003
3    00004
4    00005
Name: description, dtype: object
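One caveat with the lambda approach, shown as a small sketch: it calls Python string methods directly, so it assumes every value in the column is a string. The sample values and the missing entry below are made up for illustration and are not from the question. If the column contains a missing value, the lambda raises, while the `.str` accessor passes the missing value through.

```python
import pandas as pd

# Hypothetical column with one missing value (not from the original question).
s = pd.Series(["00003,Area01", None, "00004"], dtype=object)

# The plain lambda fails on the missing entry because it has no .strip().
try:
    s.apply(lambda x: x.strip().split(",")[0])
    lambda_failed = False
except AttributeError:
    lambda_failed = True

# The vectorized .str methods skip missing values and leave them as NaN.
safe = s.str.strip().str.split(",", n=1).str[0]
```

If missing values are possible in your data, the `.str` form is the more forgiving choice.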

CodePudding user response:

Don't split: splitting makes you handle several parts when you are only interested in one. Remove or extract instead.

Removing everything from the first comma onward:

df['commondescrip'] = df['description'].str.replace(',.*', '', regex=True)

Or extracting everything before the first comma:

df['commondescrip'] = df['description'].str.extract('([^,]+)')

output:

    description commondescrip
0         00001         00001
1         00002         00002
2  00003,Area01         00003
3         00004         00004
4  00005,Area02         00005
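For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample rows in the question, and it assumes the values are stored as strings (so the leading zeros survive). The variable names `removed`, `extracted` and `split_first` are only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; values are assumed to be strings.
df = pd.DataFrame(
    {"description": ["00001", "00002", "00003,Area01", "00004", "00005,Area02"]}
)

# Option 1: drop the first comma and everything after it.
removed = df["description"].str.replace(",.*", "", regex=True)

# Option 2: capture the leading run of non-comma characters.
# expand=False returns a Series instead of a one-column DataFrame.
extracted = df["description"].str.extract(r"([^,]+)", expand=False)

# Option 3: split once on the comma and keep only the first piece.
split_first = df["description"].str.split(",", n=1).str[0]

df["commondescrip"] = removed
print(df)
```

All three options give the same column on this sample, so the choice is mostly a matter of taste.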