Home > Software design >  How to repeat values in a new Column in Pandas if condition
How to repeat values in a new Column in Pandas if condition

Time:03-10

I have a dataframe like this:


Index     Category   Sales   
01-01-21    Blue     83.20  
01-02-21    Yellow   98.57  
01-03-21    Green    55.70  
01-04-21    Orange   87.32  
01-05-21    Black    65.02  
01-06-21    Yellow   61.05  
01-07-21    White    10.54  
01-08-21    Yellow   73.59  
01-09-21    Blue     13.88  
01-10-21    Black    47.39  
01-11-21    Blue     55.75  
01-12-21    Yellow   39.64  
01-01-22    Green    39.64  
01-02-22    Orange   39.64  
01-03-22    Black    28.87    
01-04-22    Yellow   82.40   
01-05-22    White    28.09    
01-06-22    Yellow   74.64  
01-07-22    Blue     18.27   
01-08-22    Black    11.16    
01-09-22    Green    79.49  
01-10-22    Green    46.74    
01-11-22    Yellow   6.81    
01-12-22    Blue     8.00    

I want to add a new column repeating the values of Sales for Oct, Nov and Dec into the next 12 months. So the result will looks like this:

Index   Category     Sales   New Column 
01-01-21    Blue     83.20  
01-02-21    Yellow   98.57  
01-03-21    Green    55.70  
01-04-21    Orange   87.32  
01-05-21    Black    65.02  
01-06-21    Yellow   61.05  
01-07-21    White    10.54  
01-08-21    Yellow   73.59  
01-09-21    Blue     13.88  
 -------------------------------------------- 
01-10-21    Black    47.39  
01-11-21    Blue     55.75  
01-12-21    Yellow   39.64  
 -------------------------------------------- 
01-01-22    Green    39.64   47.39 
01-02-22    Orange   39.64   55.75 
01-03-22    Black    28.87   39.64 
01-04-22    Yellow   82.40   47.39 
01-05-22    White    28.09   55.75 
01-06-22    Yellow   74.64   39.64 
01-07-22    Blue     18.27   47.39 
01-08-22    Black    11.16   55.75 
01-09-22    Green    79.49   39.64 
01-10-22    Green    46.74   47.39 
01-11-22    Yellow   6.81    55.75 
01-12-22    Blue     8.00    39.64 

I have tried in this way:

recurr_oct = new.loc[new.index.isin(['01-10-21']), 'Sales'].values
recurr_nov = new.loc[new.index.isin(['01-11-21']), 'Sales'].values
recurr_dec = new.loc[new.index.isin(['01-12-21']), 'Sales'].values

new.index = pd.to_datetime(new.index)
rating = []
for row in new.index.month:
    if row == 1 or row == 4 or row == 7 or row == 10:    rating.append(recurr_oct)
    elif row == 2 or row == 5 or row == 8 or row == 11:   rating.append(recurr_nov)
    elif row == 3 or row == 6 or row == 9 or row == 12:  rating.append(recurr_dec)
    else: rating.append(0)
        
new['New Column'] = rating
new.head()

but it takes time and it does not looks that good, is there a better way to do it?

Thanks!

CodePudding user response:

I would suggest the modulo % combined with the .month attribute for this.

To change your code, I would use something like this:

# your code...
for row in new.index.month:
    if (row % 3) == 1:    rating.append(recurr_oct)
    elif (row % 3) == 2:   rating.append(recurr_nov)
    elif (row % 3) == 0:  rating.append(recurr_dec)

You could also make this into a function and then use df.apply() with that function. IMHO cleaner and probably faster.

CodePudding user response:

if row % 3 == 1: rating.append(recurr_oct)
elif row % 3 == 2: rating.append(nov)
else: rating.append(recurr_dec)
  • Related