I have a pandas dataframe like below:
df = pd.DataFrame({'id': [1,2,3],
'fea1':['high', 'med', ''],
'fea2': ['med', '', 'high'],
'fea3': ['low', 'high', 'low']})
id fea1 fea2 fea3
1 high med low
2 med high
3 high low
I need to create a new column fea
which concatenate columns fea1
, fea2
, fea3
with a separator /
. Ignore those empty cells.
id fea1 fea2 fea3 fea
1 high med low high/med/low
2 med high med/high
3 high low high/low
Here is what I tried:
df = pd.DataFrame({'id': [1,2,3],
'fea1':['high', 'med', ''],
'fea2': ['med', '', 'high'],
'fea3': ['low', 'high', 'low']})
df['fea'] = df['fea1'] '/' df['fea2'] '/' df['fea3']
id fea1 fea2 fea3 fea
1 high med low high/med/low
2 med high med//high
3 high low /high/low
In row 2 and 3, I have unwanted extra /
. Does anyone know how to concatenate the columns without creating these extra separators /
? Thanks.
CodePudding user response:
This will remove the /
at the start and in the middle of your column
df = pd.DataFrame({'id': [1,2,3],
'fea1':['high', 'med', ''],
'fea2': ['med', '', 'high'],
'fea3': ['low', 'high', 'low']})
df['Column'] = df[[x for x in df.columns.drop('id')]].apply(lambda x : '/'.join(x), axis = 1).str.replace('//', '/').str.strip('/')
CodePudding user response:
using apply and regex to remove leading "/"
df['fea'] = df[['fea1','fea2','fea3']].apply(lambda x: "/".join(x.str.strip()), axis=1).replace(r'^/','', regex=True)
df
id fea1 fea2 fea3 fea
0 1 high med low high/med/low
1 2 med high med//high
2 3 high low high/low