i have the following dataframe:
Column1 Column2 Column3 Column4 Column5
0 value1 x1 y1 na na
1 value2 x2 y2 na na
2 value3 x3 na z1 na
3 value4 x4 na z2 na
4 value5 x5 na na w1
I want the following
Column1 Column2 Column
0 value1 x1 y1
1 value2 x2 y2
2 value3 x3 z1
3 value4 x4 z2
4 value5 x5 w1
How can I achieve this? It seems that stack() doesn't works for this task.
I'll appreciate any help.
CodePudding user response:
Set the initial columns as index, then back fill on axis 1 and select the first column:
cols = ['Column1','Column2']
out = df.mask(df.eq('na')).set_index(cols).bfill(axis=1,).iloc[:,0].reset_index()
print(out)
Column1 Column2 Column3
0 value1 x1 y1
1 value2 x2 y2
2 value3 x3 z1
3 value4 x4 z2
4 value5 x5 w1
CodePudding user response:
new_column = pd.Series()
for col in ["Column3", "Column4", "Column5"]:
new_column = pd.concat([new_column, df[col].dropna()])
df = df.drop(col, axis=1)
df["Column3"] = new_column
>>> df
Column1 Column2 Column3
0 value1 x1 y1
1 value2 x2 y2
2 value3 x3 z1
3 value4 x4 z2
4 value5 x5 w1
CodePudding user response:
One option is with coalesce
from pyjanitor
to abstract the process(under the hood, it's just bfill/ffill):
# pip install pyjanitor
import pandas as pd
import janitor
df.coalesce('Column3', 'Column4', 'Column5').dropna(axis=1)
Column1 Column2 Column3
0 value1 x1 y1
1 value2 x2 y2
2 value3 x3 z1
3 value4 x4 z2
4 value5 x5 w1