I have a dataframe like this:
Index Category Sales
01-01-21 Blue 83.20
01-02-21 Yellow 98.57
01-03-21 Green 55.70
01-04-21 Orange 87.32
01-05-21 Black 65.02
01-06-21 Yellow 61.05
01-07-21 White 10.54
01-08-21 Yellow 73.59
01-09-21 Blue 13.88
01-10-21 Black 47.39
01-11-21 Blue 55.75
01-12-21 Yellow 39.64
01-01-22 Green 39.64
01-02-22 Orange 39.64
01-03-22 Black 28.87
01-04-22 Yellow 82.40
01-05-22 White 28.09
01-06-22 Yellow 74.64
01-07-22 Blue 18.27
01-08-22 Black 11.16
01-09-22 Green 79.49
01-10-22 Green 46.74
01-11-22 Yellow 6.81
01-12-22 Blue 8.00
I want to add a new column repeating the values of Sales for Oct, Nov and Dec into the next 12 months. So the result will looks like this:
Index Category Sales New Column
01-01-21 Blue 83.20
01-02-21 Yellow 98.57
01-03-21 Green 55.70
01-04-21 Orange 87.32
01-05-21 Black 65.02
01-06-21 Yellow 61.05
01-07-21 White 10.54
01-08-21 Yellow 73.59
01-09-21 Blue 13.88
--------------------------------------------
01-10-21 Black 47.39
01-11-21 Blue 55.75
01-12-21 Yellow 39.64
--------------------------------------------
01-01-22 Green 39.64 47.39
01-02-22 Orange 39.64 55.75
01-03-22 Black 28.87 39.64
01-04-22 Yellow 82.40 47.39
01-05-22 White 28.09 55.75
01-06-22 Yellow 74.64 39.64
01-07-22 Blue 18.27 47.39
01-08-22 Black 11.16 55.75
01-09-22 Green 79.49 39.64
01-10-22 Green 46.74 47.39
01-11-22 Yellow 6.81 55.75
01-12-22 Blue 8.00 39.64
I have tried in this way:
recurr_oct = new.loc[new.index.isin(['01-10-21']), 'Sales'].values
recurr_nov = new.loc[new.index.isin(['01-11-21']), 'Sales'].values
recurr_dec = new.loc[new.index.isin(['01-12-21']), 'Sales'].values
new.index = pd.to_datetime(new.index)
rating = []
for row in new.index.month:
if row == 1 or row == 4 or row == 7 or row == 10: rating.append(recurr_oct)
elif row == 2 or row == 5 or row == 8 or row == 11: rating.append(recurr_nov)
elif row == 3 or row == 6 or row == 9 or row == 12: rating.append(recurr_dec)
else: rating.append(0)
new['New Column'] = rating
new.head()
but it takes time and it does not looks that good, is there a better way to do it?
Thanks!
CodePudding user response:
I would suggest the modulo %
combined with the .month
attribute for this.
To change your code, I would use something like this:
# your code...
for row in new.index.month:
if (row % 3) == 1: rating.append(recurr_oct)
elif (row % 3) == 2: rating.append(recurr_nov)
elif (row % 3) == 0: rating.append(recurr_dec)
You could also make this into a function and then use df.apply()
with that function. IMHO cleaner and probably faster.
CodePudding user response:
if row % 3 == 1: rating.append(recurr_oct)
elif row % 3 == 2: rating.append(nov)
else: rating.append(recurr_dec)