Add new rows to a data frame, where one column stays the same while the other column changes values

Sorry if my title sounds a bit confusing. What I'm trying to do is add new rows to a data frame: each unique value in one column is duplicated, while the other column takes a new value in each duplicate.

This is what my data frame looks like:

id	year
01	2022
02	2022
03	2022
...	...
99	2022

And I want it to look like this:

id	year
01	2022
01	2023
01	2024
02	2022
02	2023
02	2024
03	2022
...	...
99	2024

I.e., for every id I want to add the years 2023 and 2024 in the year column. I tried doing this with an apply function, but it never worked out. Could you help me solve this?

CodePudding user response：

import numpy as np
import pandas as pd

years = [2022 + i for i in range(3)]
# or
years = [2022, 2023, 2024]
ids = df.id.to_numpy()
pd.DataFrame({
    # repeat each id once per year, keeping equal ids next to each other
    'id': np.repeat(ids, len(years)),
    # cycle through the full list of years once per id
    'year': np.tile(years, len(ids)),
})
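
To see what the two columns end up containing, here is a minimal self-contained sketch of the same repeat/tile idea on a small sample frame. The three-row `df` with string ids is an assumption made for illustration, not the asker's real data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
df = pd.DataFrame({"id": ["01", "02", "03"], "year": [2022, 2022, 2022]})
years = [2022, 2023, 2024]

ids = df["id"].to_numpy()
expanded = pd.DataFrame({
    # np.repeat repeats each element in place: 01, 01, 01, 02, ...
    "id": np.repeat(ids, len(years)),
    # np.tile repeats the whole sequence: 2022, 2023, 2024, 2022, ...
    "year": np.tile(years, len(ids)),
})
print(expanded)
```

Both arrays have length `len(ids) * len(years)`, so they line up row by row without any sorting step.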

CodePudding user response：

You can use a list comprehension to build copies of the dataframe, each with the year shifted by a different increment, and concat them. For example:

pd.concat([df.assign(year=df.year + increment) for increment in range(0,3)]).sort_values(by=['id', 'year']).reset_index(drop=True)

This extends your dataframe to three years per id, as follows. You can play around with range to get the desired number of extra years:

id	year
1	2022
1	2023
1	2024
2	2022
2	2023
2	2024
3	2022
3	2023
3	2024
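
A runnable sketch of this concat approach, assuming a small sample frame with integer ids (the sample `df` and the names `copies` and `result` are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data; the real frame would have ids up to 99.
df = pd.DataFrame({"id": [1, 2, 3], "year": [2022, 2022, 2022]})

# One shifted copy of the frame per increment: +0, +1, +2 years.
copies = [df.assign(year=df["year"] + increment) for increment in range(3)]

result = (
    pd.concat(copies)
    .sort_values(by=["id", "year"])  # group rows by id, years ascending
    .reset_index(drop=True)          # replace the duplicated index labels
)
print(result)
```

Sorting on both columns keeps the years ordered within each id. Sorting on `id` alone relies on how ties are broken, and the default sort algorithm is not guaranteed to preserve the original order of tied rows.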

CodePudding user response：

A quick solution would be to make two copies of your current dataframe and set the year to 2023 in one and 2024 in the other. After that, concatenate all three dataframes using pd.concat.
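
Spelled out, that could look roughly like the sketch below. The sample `df` and the names `df_2023`, `df_2024` and `combined` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data standing in for the original frame.
df = pd.DataFrame({"id": ["01", "02", "03"], "year": [2022, 2022, 2022]})

# Two independent copies, each with the year column overwritten.
df_2023 = df.copy()
df_2023["year"] = 2023
df_2024 = df.copy()
df_2024["year"] = 2024

# Stack the three frames, then order the rows by id and year.
combined = pd.concat([df, df_2023, df_2024], ignore_index=True)
combined = combined.sort_values(by=["id", "year"]).reset_index(drop=True)
print(combined)
```

This is fine for two extra years, but it needs one more copy per additional year, so a loop or list comprehension scales better.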