Home > database >  How to combine multiple columns to single column
How to combine multiple columns to single column

Time:05-27

This is the sample data provided and I need to combine the columns to single column which consists of all the fruit names without any Nan Values.

|Fruit1  | Fruit2| Fruit3 |
|:------ |:-----:| ------:|
|NaN     | Apple | NaN    |
|Pear    | NaN   | NaN    |
|NaN     | NaN   | Orange |
|Mango   | NaN   | NaN    |
|NaN     | banana| NaN    |

This should be the output:

|Fruits  | 
|------- | 
|Apple   |
|Pear    |
|Orange  | 
|Mango   |
|banana  |

CodePudding user response:

I would use bfill():

df = pd.DataFrame({
    'fruit_1': [None, 'Pear', None, None], 
    'fruit_2': ['Apple', None, None, None], 
    'fruit_3': [None, None, 'Orange', None]})

df.bfill(axis=1).iloc[:,0].rename('fruits') # returns
0     Apple
1      Pear
2    Orange
3      None
Name: fruits, dtype: object

(or ffill() and use the last column)

It also works for rows containing None only.

CodePudding user response:

Assuming you have only one non-NaN per row, you can stack:

df.stack().droplevel(1).to_frame(name='Fruits')

Output:

   Fruits
0   Apple
1    Pear
2  Orange
3   Mango
4  banana

handling rows with only NaNs:

df.stack().droplevel(1).to_frame(name='Fruits').reindex(df.index)

output assuming banana is a NaN:

   Fruits
0   Apple
1    Pear
2  Orange
3   Mango
4     NaN

CodePudding user response:

I think this should give the desired output -

df['Fruit1'].fillna(df['Fruit2'])

CodePudding user response:

We can use combine_first here:

df["Fruits"] = df["Fruit1"].combine_first(df["Fruit2"])

We can also use np.where:

df["Fruits"] = np.where(df["Fruit1"].isnull(), df["Fruit2"], df["Fruit1"])

CodePudding user response:

We can use a modification of an approach from Transform Multiple Columns Into One With Pandas to combine columns:

df['new'] = df.fillna('').sum(1) 

Explanation

  • replace all nan values with an empty string
  • sum(1), is summing the df row by row. Since values in row are strings, it will join them together

Example Usage

from io import StringIO

# Create DataFrame from OP data
s = '''Fruit1,Fruit2,Fruit3
NaN,Apple,NaN
Pear,NaN,NaN
NaN,NaN,Orange
Mango,NaN,NaN
NaN,banana,NaN'''

df = pd.read_csv(StringIO(s))
print(df)

Initial DataFrame

    Fruit1  Fruit2  Fruit3
0   NaN     Apple   NaN
1   Pear    NaN     NaN
2   NaN     NaN     Orange
3   Mango   NaN     NaN
4   NaN     banana  NaN

df['New']=df.fillna('').sum(1)
print(df)

Updated DataFrame

    Fruit1  Fruit2  Fruit3  new
0   NaN     Apple   NaN     Apple
1   Pear    NaN     NaN     Pear
2   NaN     NaN     Orange  Orange
3   Mango   NaN     NaN     Mango
4   NaN     banana  NaN     banana
  • Related