I've got two data frames, one has historical prices of stocks in this format:
year | Company1 | Company2 |
---|---|---|
1980 | 4.66 | 12.32 |
1981 | 5.68 | 15.53 |
etc with hundreds of columns, then I have a dataframe specifing a company, its sector and its country.
company 1 | industrials | Germany |
---|---|---|
company 2 | consumer goods | US |
company 3 | industrials | France |
I used the first dataframe to plot the prices of various companies over time, however, I'd like to now somehow group the data from the first table with the second one and create a separate dataframe which will have form of sectors total value of time, ie.
year | industrials | consumer goods | healthcare |
---|---|---|---|
1980 | 50.65 | 42.23 | 25.65 |
1981 | 55.65 | 43.23 | 26.15 |
Thank you
CodePudding user response:
You can do the following, assuming df_1 is your DataFrame with price of stock per year and company, and df_2 your DataFrame with information on the companies:
# turn company columns into rows
df_1 = df_1.melt(id_vars='year', var_name='company')
df_1 = df_1.merge(df_2)
# groupby and move industry to columns
output = df_1.groupby(['year', 'industry'])['value'].sum().unstack('industry')
Output:
industry consumer goods industrials
year
1980 12.32 4.66
1981 15.53 5.68