I have a dataset, df, where I would like to combine certain values from separate columns into one 'cell':
Data
hello hi ok bye
q122 q222 q422 q222
hi hi hi hi
Logic
The first two rows are joined
Desired
hello_q122 hi_q222 ok_q422 bye_q222
hi hi hi hi
Doing
df.columns = (df.iloc[0] '_' df.iloc[1])
df = out.iloc[0:].reset_index(drop=True)
However, the first row keeps getting removed.
Any suggestion is appreciated.
CodePudding user response:
You need set second row ([1]
) by empty strings and filter out first row by [1:]
:
#solution if default header
print (df)
0 1 2 3
0 hello hi ok bye
1 q122 q222 q422 q222
2 hi hi hi hi
df.columns = df.iloc[0] '_' df.iloc[1]
df.iloc[1] = ''
df = df.iloc[1:].reset_index(drop=True)
print (df)
hello_q122 hi_q222 ok_q422 bye_q222
0
1 hi hi hi hi
df.to_csv(file, index=False)
#solution if first row is header
print (df)
hello hi ok bye
0 q122 q222 q422 q222
1 hi hi hi hi
df.columns = (df.columns '_' df.iloc[0])
df.iloc[0] = ''
df = df.reset_index(drop=True)
print (df)
hello_q122 hi_q222 ok_q422 bye_q222
0
1 hi hi hi hi