Home > other >  Create a new column in pandas dataframe with values from another dataframe, based on date
Create a new column in pandas dataframe with values from another dataframe, based on date

Time:04-26

I have 2 dataframes.

First dataframe contains PLN data:

         date  polarity  subjectivity
0  2022-03-25  0.118771      0.517508
1  2022-03-28  0.136532      0.555640
2  2022-04-04  0.447222      0.419444
3  2022-04-11 -0.022222      0.397222
4  2022-04-18 -0.098333      0.526111
5  2022-04-20 -0.077778      0.644444
6  2022-04-22  0.187500      0.437500
7  2022-04-25  0.350000      0.375000

The second dataframe contains stock data:

          open        high         low       close    adjclose    volume        date
0   277.559998  282.500000  272.059998  282.190002  282.190002  42549400  2022-03-28
1   286.950012  289.459991  279.799988  286.559998  286.559998  48898400  2022-03-29
2   283.040009  284.959991  275.029999  276.899994  276.899994  46348700  2022-03-30
3   277.820007  282.480011  272.700012  272.859985  272.859985  52344000  2022-03-31
4   273.750000  274.959991  262.670013  267.119995  267.119995  51653400  2022-04-01
5   267.279999  275.579987  266.130005  273.600006  273.600006  39712000  2022-04-04
6   272.540009  273.190002  258.200012  259.309998  259.309998  43661500  2022-04-05
7   249.339996  253.000000  240.029999  244.070007  244.070007  70383300  2022-04-06
8   244.410004  247.220001  234.779999  242.080002  242.080002  55799200  2022-04-07
9   239.169998  239.229996  230.619995  231.190002  231.190002  52409100  2022-04-08
10  222.139999  223.199997  216.600006  219.169998  219.169998  57520700  2022-04-11

What I want is to create both 'polarity' and 'subjetivity' columns in the second dataframe (the one with stock data) based on the date field the most efficient way.

Edit Adding desired output:

          open        high         low       close    adjclose    volume        date    polarity    subjetivity
0   277.559998  282.500000  272.059998  282.190002  282.190002  42549400  2022-03-28    0.136532       0.555640
1   286.950012  289.459991  279.799988  286.559998  286.559998  48898400  2022-03-29         NaN            NaN
2   283.040009  284.959991  275.029999  276.899994  276.899994  46348700  2022-03-30         NaN            NaN
3   277.820007  282.480011  272.700012  272.859985  272.859985  52344000  2022-03-31         NaN            NaN
4   273.750000  274.959991  262.670013  267.119995  267.119995  51653400  2022-04-01         NaN            NaN
5   267.279999  275.579987  266.130005  273.600006  273.600006  39712000  2022-04-04    0.447222       0.419444
6   272.540009  273.190002  258.200012  259.309998  259.309998  43661500  2022-04-05         NaN            NaN
7   249.339996  253.000000  240.029999  244.070007  244.070007  70383300  2022-04-06         NaN            NaN
8   244.410004  247.220001  234.779999  242.080002  242.080002  55799200  2022-04-07         NaN            NaN
9   239.169998  239.229996  230.619995  231.190002  231.190002  52409100  2022-04-08         NaN            NaN
10  222.139999  223.199997  216.600006  219.169998  219.169998  57520700  2022-04-11   -0.022222       0.397222

CodePudding user response:

df2.merge(df1, how = 'left', on = 'date')

          open        high         low  ...        date  polarity  subjectivity
0   277.559998  282.500000  272.059998  ...  2022-03-28  0.136532      0.555640
1   286.950012  289.459991  279.799988  ...  2022-03-29       NaN           NaN
2   283.040009  284.959991  275.029999  ...  2022-03-30       NaN           NaN
3   277.820007  282.480011  272.700012  ...  2022-03-31       NaN           NaN
4   273.750000  274.959991  262.670013  ...  2022-04-01       NaN           NaN
5   267.279999  275.579987  266.130005  ...  2022-04-04  0.447222      0.419444
6   272.540009  273.190002  258.200012  ...  2022-04-05       NaN           NaN
7   249.339996  253.000000  240.029999  ...  2022-04-06       NaN           NaN
8   244.410004  247.220001  234.779999  ...  2022-04-07       NaN           NaN
9   239.169998  239.229996  230.619995  ...  2022-04-08       NaN           NaN
10  222.139999  223.199997  216.600006  ...  2022-04-11 -0.022222      0.397222

CodePudding user response:

Try this

df2.merge(df, on='date', how='left')
  • Related