How to merge two DataFrames in pandas based on row order

Time:11-05

I have two DataFrames that I want to merge, but they do not have a common column.

So I created a temporary column called tmp on each DataFrame:

y_pred['tmp'] = 1
data['tmp'] = 1 

data looks like:

     mean  year  tmp
4600  2.3  2019  1
2601  5.3  2020  1

whereas y_pred looks like:

     pred_score  tmp
0     2           1
1     5.2         1

and I merge them:

new_df = pd.merge(data, y_pred, on=['tmp'], how='left')
new_df.drop('tmp', inplace=True, axis=1)

I get 900 rows where I need only 30 (suppose each dataset has 30 rows; the merge returns 30 × 30 rows).

What I need is for new_df to have 30 rows, with the pred_score column simply attached to data in the current row order.

So that I would get:

new_df:

     mean  year  pred_score
4600  2.3  2019  2
2601  5.3  2020  5.2

Is there a way to achieve this without having a common column?
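
The row blow-up described above can be reproduced on a small scale. The sketch below uses two-row toy frames modeled on the samples in the question (the values are illustrative, and the `_tmp` variable names are made up for this example). Because every row carries the same key, each row of data matches every row of y_pred:

```python
import pandas as pd

# Toy frames modeled on the question; values are illustrative only.
data = pd.DataFrame({"mean": [2.3, 5.3], "year": [2019, 2020]}, index=[4600, 2601])
y_pred = pd.DataFrame({"pred_score": [2.0, 5.2]})

# Add the constant key without modifying the original frames.
data_tmp = data.assign(tmp=1)
y_pred_tmp = y_pred.assign(tmp=1)

# Every left row matches every right row, so the result is a cross product.
crossed = pd.merge(data_tmp, y_pred_tmp, on="tmp", how="left").drop(columns="tmp")

print(len(crossed))  # 2 x 2 = 4 rows instead of 2
```

With 30 rows on each side, the same mechanism gives 30 × 30 = 900 rows.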

CodePudding user response:

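Assign y_pred.values to a new column. This copies the values by position and ignores the index, so both DataFrames must have the same number of rows: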

>>> data
      mean  year
4600   2.3  2019
2601   5.3  2020

>>> y_pred
   pred_score
0         2.0
1         5.2

>>> data['pred_score'] = y_pred.values

# Output
      mean  year  pred_score
4600   2.3  2019         2.0
2601   5.3  2020         5.2
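
As a hedged aside, here is a minimal sketch of why the underlying array is needed and of two variations on the same idea. It assumes the toy frames from the question and equal row counts; the names `misaligned`, `new_df`, and `joined` are made up for this example. Assigning the Series directly would align on index labels (4600, 2601 versus 0, 1) and produce NaN:

```python
import pandas as pd

data = pd.DataFrame({"mean": [2.3, 5.3], "year": [2019, 2020]}, index=[4600, 2601])
y_pred = pd.DataFrame({"pred_score": [2.0, 5.2]})

# Positional pairing only makes sense when the row counts match.
if len(data) != len(y_pred):
    raise ValueError("data and y_pred must have the same number of rows")

# Pitfall: a Series is aligned on index labels, which do not overlap here.
misaligned = data.assign(pred_score=y_pred["pred_score"])
print(misaligned["pred_score"].isna().all())  # True

# Variation 1: pass a plain NumPy array so values are placed by position.
new_df = data.assign(pred_score=y_pred["pred_score"].to_numpy())

# Variation 2: give y_pred the index of data, then join on that index.
joined = data.join(y_pred.set_index(data.index))

print(new_df)
```

Both variations keep the original index of data (4600, 2601) and leave data itself unmodified, which may be preferable when the original frame is reused later.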