I have two DataFrames, and one may have fewer columns than the other. For instance:
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'col1': ['A', 'B'],
    'col2': [2, 9],
    'col3': [0, 1]
})
df1 = pd.DataFrame({
    'col1': ['G'],
    'col2': [3]
})
Here df has the columns col1, col2 and col3, while df1 has only col1 and col2.
I would like to concatenate these two DataFrames so that the missing values are filled with a given value, such as -100. How can I perform this kind of combination?
CodePudding user response:
You could reindex the DataFrames first to "preserve" the dtypes, then concatenate:
cols = df.columns.union(df1.columns)
out = pd.concat([d.reindex(columns=cols, fill_value=-100) for d in [df, df1]],
                ignore_index=True)
Output:
  col1  col2  col3
0    A     2     0
1    B     9     1
2    G     3  -100
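The reindex-then-concatenate idea can be wrapped in a small helper that accepts any number of DataFrames. This is only a sketch: the name concat_with_fill is hypothetical, and it assumes the fill value is compatible with the dtype of the columns it fills (an integer fill for integer columns, as in this question).

```python
import pandas as pd

def concat_with_fill(frames, fill_value):
    """Concatenate frames, filling columns that a frame lacks with fill_value."""
    # Collect every column label once, in first-seen order.
    cols = []
    for frame in frames:
        for col in frame.columns:
            if col not in cols:
                cols.append(col)
    # reindex creates the missing columns already filled, so no NaN
    # (and therefore no float upcast) is introduced before concatenating.
    aligned = [frame.reindex(columns=cols, fill_value=fill_value) for frame in frames]
    return pd.concat(aligned, ignore_index=True)

df = pd.DataFrame({'col1': ['A', 'B'], 'col2': [2, 9], 'col3': [0, 1]})
df1 = pd.DataFrame({'col1': ['G'], 'col2': [3]})

out = concat_with_fill([df, df1], fill_value=-100)
print(out)
print(out.dtypes)
```

Unlike df.columns.union(df1.columns), which may sort the labels, this sketch keeps the columns in the order they first appear.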
CodePudding user response:
Use concat with DataFrame.fillna:
# assign to a new name so the original df stays available
out = pd.concat([df, df1], ignore_index=True).fillna(-100)
print(out)
  col1  col2   col3
0    A     2    0.0
1    B     9    1.0
2    G     3 -100.0
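The float values in col3 come from the concatenation step rather than from fillna. A short check, using the same sample data as the question, shows where the dtype changes:

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'B'], 'col2': [2, 9], 'col3': [0, 1]})
df1 = pd.DataFrame({'col1': ['G'], 'col2': [3]})

combined = pd.concat([df, df1], ignore_index=True)
# col3 is missing from df1, so its row gets NaN and the column is upcast to float64.
print(combined.dtypes)

filled = combined.fillna(-100)
# fillna replaces the NaN but leaves the column as float64.
print(filled.dtypes)
```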
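If you need to keep the original dtypes, add DataFrame.astype:
# Series.append was removed in pandas 2.0, so combine the dtypes with pd.concat;
# build the mapping from the original df and df1
d = pd.concat([df.dtypes, df1.dtypes]).to_dict()
out = pd.concat([df, df1], ignore_index=True).fillna(-100).astype(d)
print(out)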
  col1  col2  col3
0    A     2     0
1    B     9     1
2    G     3  -100