in pandas, how to combine two dataframes vertically while two dataframes have different number of co

Time:05-05

There are two dataframes, and one might have fewer columns than the other. For instance,

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'col1': ['A', 'B'],
    'col2': [2, 9],
    'col3': [0, 1]
})
df1 = pd.DataFrame({
    'col1': ['G'],
    'col2': [3]
})

Printed, df and df1 look like this:

df:
  col1  col2  col3
0    A     2     0
1    B     9     1

df1:
  col1  col2
0    G     3

I would like to combine these two dataframes vertically, with any missing values set to a given value such as -100. How can I perform this kind of combination?


CodePudding user response:

You could reindex the DataFrames first to "preserve" the dtypes; then concatenate:

cols = df.columns.union(df1.columns)
out = pd.concat([d.reindex(columns=cols, fill_value=-100) for d in [df, df1]], 
                ignore_index=True)

Output:

  col1  col2  col3
0    A     2     0
1    B     9     1
2    G     3  -100
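
To make the point about "preserving" dtypes concrete, here is a minimal sketch that runs both routes on the frames from the question and prints the resulting dtypes. The variable names `reindexed` and `filled` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'B'], 'col2': [2, 9], 'col3': [0, 1]})
df1 = pd.DataFrame({'col1': ['G'], 'col2': [3]})

# Route 1: add the missing columns with the fill value before concatenating,
# so no NaN is ever introduced and col3 can stay an integer column.
cols = df.columns.union(df1.columns)
reindexed = pd.concat(
    [d.reindex(columns=cols, fill_value=-100) for d in [df, df1]],
    ignore_index=True,
)

# Route 2: concatenate first. col3 gets NaN for the df1 row, which turns the
# column into floats, and fillna does not convert it back.
filled = pd.concat([df, df1], ignore_index=True).fillna(-100)

print(reindexed.dtypes)
print(filled.dtypes)
```

With the data above, col3 should come out as an integer column in `reindexed` and as a float column in `filled`.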

CodePudding user response:

Use concat with DataFrame.fillna:

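out = pd.concat([df, df1], ignore_index=True).fillna(-100)
print(out)
  col1  col2   col3
0    A     2    0.0
1    B     9    1.0
2    G     3 -100.0

Concatenating introduces NaN in col3, so that column becomes float. If you need the original dtypes, add DataFrame.astype (Series.append was removed in pandas 2.0, so the dtypes are combined with pd.concat here):

d = pd.concat([df.dtypes, df1.dtypes]).to_dict()
out = pd.concat([df, df1], ignore_index=True).fillna(-100).astype(d)
print(out)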

  col1  col2  col3
0    A     2     0
1    B     9     1
2    G     3  -100
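
If there are more than two frames, the same concat, fillna, astype idea could be wrapped in a small function. This is only a sketch: `concat_with_fill` is a hypothetical name, not a pandas function, and it assumes the fill value can be cast to every original column dtype. Note that fillna will also replace any NaN that was already present in the input frames.

```python
import pandas as pd

def concat_with_fill(frames, fill_value):
    """Hypothetical helper: stack frames, fill gaps, restore original dtypes."""
    # Collect the dtype of every column seen in any frame. If a column
    # appears in several frames, the dtype from the last frame wins.
    dtypes = {}
    for frame in frames:
        dtypes.update(frame.dtypes.to_dict())
    # Missing columns become NaN during concat; fill them, then cast back.
    out = pd.concat(frames, ignore_index=True).fillna(fill_value)
    return out.astype(dtypes)

df = pd.DataFrame({'col1': ['A', 'B'], 'col2': [2, 9], 'col3': [0, 1]})
df1 = pd.DataFrame({'col1': ['G'], 'col2': [3]})

result = concat_with_fill([df, df1], -100)
print(result)
```

The cast in the last step will raise or truncate if the fill value does not fit a column's dtype (for example a string fill value in an integer column), so the reindex approach from the first answer may be the safer choice in that case.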