I have two DataFrames that I want to merge, but they do not have a common column.
So I created a temporary column called tmp on each of them:
y_pred['tmp'] = 1
data['tmp'] = 1
data looks like:
mean year tmp
4600 2.3 2019 1
2601 5.3 2020 1
whereas y_pred looks like:
pred_score tmp
0 2 1
1 5.2 1
Then I merge them:
new_df = pd.merge(data, y_pred, on=['tmp'], how='left')
new_df.drop('tmp', inplace=True, axis=1)
I get 900 rows where I need only 30 (suppose each dataset has 30 rows; the merge gives me 30 times 30).
What I need is for new_df to have 30 rows, with the pred_score column simply added to data in the rows' current order.
So that I would get:
new_df:
mean year pred_score
4600 2.3 2019 2
2601 5.3 2020 5.2
Is there a way to achieve this without having a common column?
CodePudding user response:
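Assign y_pred.values to a new column. Because this passes the raw NumPy array rather than a pandas object, no index alignment takes place and the values are filled in by row position: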
>>> data
mean year
4600 2.3 2019
2601 5.3 2020
>>> y_pred
pred_score
0 2.0
1 5.2
>>> data['pred_score'] = y_pred.values
# Output
mean year pred_score
4600 2.3 2019 2.0
2601 5.3 2020 5.2
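To make the difference concrete, here is a minimal sketch that rebuilds the two-row frames from the question, shows why merging on a constant tmp key multiplies the row count, and then attaches pred_score by position. The names cross, new_df and alt are made up for illustration, and the concat variant is an alternative that is not part of the answer above; it assumes both frames have the same number of rows.

```python
import pandas as pd

# Sample frames rebuilt from the question (two rows instead of 30).
data = pd.DataFrame({"mean": [2.3, 5.3], "year": [2019, 2020]}, index=[4600, 2601])
y_pred = pd.DataFrame({"pred_score": [2.0, 5.2]})

# Merging on a constant key pairs every row of data with every row of y_pred,
# so the result has len(data) * len(y_pred) rows.
cross = pd.merge(data.assign(tmp=1), y_pred.assign(tmp=1), on="tmp", how="left")
print(len(cross))

# Positional assignment: to_numpy() discards y_pred's index, so pandas
# cannot align on labels and simply fills the rows in order.
new_df = data.copy()
new_df["pred_score"] = y_pred["pred_score"].to_numpy()
print(new_df)

# Alternative (not from the answer): relabel y_pred with data's index,
# then concatenate column-wise.
alt = pd.concat([data, y_pred.set_axis(data.index, axis=0)], axis=1)
print(alt)
```

Assigning the y_pred['pred_score'] Series directly, without .values or to_numpy(), would align on index labels instead; since 0 and 1 do not appear in data's index (4600, 2601), the new column would be all NaN.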