Add new rows to data frame, where one column stays the same while other column changes values

Time:07-14

Sorry if my title sounds a bit confusing. What I'm basically trying to do is add new rows to a data frame: each unique value in one column is duplicated, while the other column takes new values in the added rows.

This is what my data frame looks like:

id year
01 2022
02 2022
03 2022
... ...
99 2022

And I want it to look like this:

id year
01 2022
01 2023
01 2024
02 2022
02 2023
02 2024
03 2022
... ...
99 2024

I.e., for every id I want to add the years 2023 and 2024 in the year column. I tried doing this with an apply function, but it never worked out. Could you guys help me solve this?

CodePudding user response:

import numpy as np
import pandas as pd

years = [2022 + i for i in range(3)]
# or
years = [2022, 2023, 2024]
ids = df.id.to_numpy()
pd.DataFrame({
    # repeat each id once per year: 1, 1, 1, 2, 2, 2, ...
    'id': np.repeat(ids, len(years)),
    # cycle through the years once per id: 2022, 2023, 2024, 2022, ...
    'year': np.tile(years, len(ids))
})

CodePudding user response:

You can simply use a list comprehension and concat copies of the dataframe, each with the year incremented by the amount you want. For example:

pd.concat([df.assign(year=df.year + increment) for increment in range(0,3)]).sort_values(by=['id', 'year']).reset_index(drop=True)

This extends your dataframe to three years per id, as follows. You can change the range to get the desired number of additional years:

id year
1 2022
1 2023
1 2024
2 2022
2 2023
2 2024
3 2022
3 2023
3 2024
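
As a rough sketch of how the range could be parameterised, the same concat idea can be wrapped in a small helper. The name extend_years and the three-row sample frame are assumptions made up for illustration, not part of the original answer:

```python
import pandas as pd

def extend_years(df, n_years):
    """Return df with every row repeated for n_years consecutive years."""
    # one shifted copy of the frame per year offset: +0, +1, ..., +(n_years - 1)
    frames = [df.assign(year=df["year"] + k) for k in range(n_years)]
    out = pd.concat(frames)
    # sort on both columns so the years are in order within each id
    return out.sort_values(by=["id", "year"]).reset_index(drop=True)

# hypothetical sample data mirroring the question
df = pd.DataFrame({"id": [1, 2, 3], "year": [2022, 2022, 2022]})
result = extend_years(df, 3)
print(result)
```

Sorting on both id and year, rather than id alone, avoids relying on the sort being stable to keep the years in ascending order.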

CodePudding user response:

A quick solution would be to make two copies of your current dataframe and set the year in them to 2023 and 2024 respectively. After you do that, concatenate all 3 dataframes together using pd.concat.
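
A minimal sketch of that approach, assuming a small sample frame with the same id and year columns as in the question (the variable names df_2023, df_2024 and combined are made up for illustration):

```python
import pandas as pd

# hypothetical sample data mirroring the question
df = pd.DataFrame({"id": [1, 2, 3], "year": [2022, 2022, 2022]})

# two copies of the original frame, each with the year overwritten
df_2023 = df.copy()
df_2023["year"] = 2023
df_2024 = df.copy()
df_2024["year"] = 2024

# stack all three frames, then order the rows by id and year
combined = pd.concat([df, df_2023, df_2024], ignore_index=True)
combined = combined.sort_values(by=["id", "year"]).reset_index(drop=True)
print(combined)
```

This is fine for two extra years; for a longer range, building the copies in a loop or list comprehension is probably less repetitive.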
