Update values in dataframe column based on array with row numbers indicating the start of the range-CodePudding

I have the following dataframe and two arrays fillrows and salescodes. Fillrows is an array that I generated by extracting the rows where a condition is met. In order to be able to identify the person who generate the sales, I need to fill an additional column with the SalesCode. In fillrows I have the starting row of every SalesCode that corresponds to the codes in array salescodes.

df = pd.DataFrame([['', 120], ['', 300], ['', 160],['', 220], ['', 400], ['', 560],['', 860],['',900]], columns=['SalesCode', 'Sales'])
fillrows=[0,3,5]
salescodes =['XA','AC','BC']

  SalesCode  Sales
0              120
1              300
2              160
3              220
4              400
5              560
6              860
7              900

In need to have this output

Output
   SalesCode  Sales
0        XA    120
1        XA    300
2        XA    160
3        AC    220
4        AC    400
5        BC    560
6        BC    860
7        BC    900

CodePudding user response：

You could join a crafted Series and ffill:

s = pd.Series(salescodes, index=fillrows[:-1], name='SalesCode')
df2 = df.drop('SalesCode', axis=1).join(s).ffill()

NB. the drop part is not needed if the empty "SalesCode" columns does not exist in df

output:

>>> df2
   Sales SalesCode
0    120        XA
1    300        XA
2    160        XA
3    220        AC
4    400        AC
5    560        BC
6    860        BC
7    900        BC

CodePudding user response：

You can simply iterate over the array of fillrows and add values for salescode. note: I changed the last index to be Len(df) since there's no 4th value in salescode that can start from idx 7; feel free to play around.

import pandas as pd
from itertools import repeat

df = pd.DataFrame([['', 120], ['', 300], ['', 160],['', 220], ['', 400], ['', 560],['', 860],['',900]], columns=['SalesCode', 'Sales'])
fillrows=[0,3,5, len(df)]
salescodes =['XA','AC','BC']

SalesCodes = []

for idx, _ in enumerate(fillrows[:-1]):
  SalesCodes.extend(repeat(salescodes[idx], fillrows[idx 1] - fillrows[idx]))

df = df.assign(SalesCode=SalesCodes)

> df
  SalesCode  Sales
0        XA    120
1        XA    300
2        XA    160
3        AC    220
4        AC    400
5        BC    560
6        BC    860
7        BC    900