Home > Back-end >  Strategy to replace negative count with positive count for time series data
Strategy to replace negative count with positive count for time series data

Time:11-11

Data column contains values of new customer additions. However it contains negative values, it means that no. of customers left in that perticular week. What is the best way to replace negative values with positive values?

For instance, mean of previous values

Here is the sample

Week    New customer added
Week 01 1,881 
Week 02 7 
Week 03 10 
Week 04 10 
Week 05 (22)
Week 06 (18)
Week 07 22 
Week 08 25 
Week 09 3 
Week 10 (36)
Week 11 25 
Week 12 20 
Week 13 (7)
Week 14 (24)
Week 15 10 
Week 16 29 
Week 17 4 
Week 18 (55)
Week 19 (3)
Week 20 17 
Week 21 (1,875)
Week 22 1,867 
Week 23 (13)
Week 24 47 
Week 25 40 
Week 26 (1,916)
Week 27 1,891 
Week 28 0 
Week 29 22 

Current strategy is

df['New customer added'].mask(df['New customer added'].lt(0)).bfill().fillna(0) 

I'm thinking of second strategy i.e. replacing negative number with mean of previous 2-3 positive values.

Expected output of second strategy is

Week    New customer added
Week 01 1,881 
Week 02 7 
Week 03 10 
Week 04 10 
Week 05 9  # Negative no. replaced by mean of prev 3(i.e. 7,10,10) positive no. i.e. 9
Week 06 9  # Negative no. replaced by mean of prev 3(i.e. 7,10,10) positive no. i.e. 9

CodePudding user response:

Use:

#is sample data are ,, so replace to emoty string
df['New customer added'] = df['New customer added'].astype(str).str.replace(',','')

#negatives are in sample with (
m = df['New customer added'].astype(str).str.contains('\(')

Solution filter only positive rows and use rolling with mean, last use DataFrame.reindex by original index with forward filling missing values (for negatives rows)

new = (df.loc[~m, 'New customer added'].astype(float)
         .rolling(3, min_periods=1).mean()
         .reindex(df.index, method='ffill'))

And last replace only negatives:

df['New customer added'] = df['New customer added'].mask(m, new)

print (df)
       Week New customer added
0   Week 01               1881
1   Week 02                  7
2   Week 03                 10
3   Week 04                 10
4   Week 05                9.0
5   Week 06                9.0
6   Week 07                 22
7   Week 08                 25
8   Week 09                  3
9   Week 10          16.666667
10  Week 11                 25
11  Week 12                 20
12  Week 13               16.0
13  Week 14               16.0
14  Week 15                 10
15  Week 16                 29
16  Week 17                  4
17  Week 18          14.333333
18  Week 19          14.333333
19  Week 20                 17
20  Week 21          16.666667
21  Week 22               1867
22  Week 23         629.333333
23  Week 24                 47
24  Week 25                 40
25  Week 26         651.333333
26  Week 27               1891
27  Week 28                  0
28  Week 29                 22
  • Related