How can I fill a dataframe on specific column names of another dataframe-CodePudding

I have constructed a pretty basic dataframe where the column names are years:

import pandas as pd
column_names = [x for x in range(2000,2005)]
df = pd.DataFrame(columns=column_names)

Which of course gives me a dataframe where the column names are years and currently has no entries.

2000    2001    2002    2003    2004

I also have a different dataframe where one column has specific dates and the second column has the corresponding year. I call this dataframe set0 because I will continously add more sets which are then numbered (set1,set2,etc.).


data = {'Date': ['2001-06-08', '2002-05-23', '2002-05-24', '2003-06-23'],
        'Year': [2001, 2002, 2002, 2003]}
df2 = pd.DataFrame(data)

Date    Year
0   2001-06-08  2001
1   2002-05-23  2002
2   2002-05-24  2002
3   2003-06-23  2003

Now what I want to do is to create something like this: It takes the first dataframe, adds a first column which has a name of a certain dataset, in this case set0. I will then group this dataset by years and if I have an entry for a year I can create this:

set_name 2000   2001    2002    2003    2004
set0        0      1       2       1       0

I have found nothing similar on the web. I have done the grouping but then wasn't able to add the entries in the corresponding columns. Any help or hint is much appreciated!

CodePudding user response：

Does this answer your question ?

import pandas as pd

column_names = [x for x in range(2000, 2005)]

df = pd.DataFrame(index=column_names)

data = {
    'Date': ['2001-06-08', '2002-05-23', '2002-05-24', '2003-06-23'],
    'Year': [2001, 2002, 2002, 2003]
}

df2 = pd.DataFrame(data)
df2_grouped = df2.groupby('Year').count()['Date']

df['set0'] = df2_grouped
# Pandas version 1.5 
df = df.fillna(0).reset_index(names='set_name').pivot_table(columns='set_name')
# All versions : 
df = df.fillna(0).reset_index().rename(columns={
    'index': 'set_name'
}).pivot_table(columns='set_name')


print(df)

Result :

set_name  2000  2001  2002  2003  2004
set0       0.0   1.0   2.0   1.0   0.0