python pandas how to set unique value for null row?

I have a data frame like this:

       sku         new-sku
0  FAT-001     FAT-001
1  FAT-001  FAT-001-01
2  FAT-001  FAT-001-02
3  FAT-002     FAT-002
4  FAT-002  FAT-002-01
5           
6            
7  FAT-003     FAT-003
8            
9

Here is my code:

groups = df.groupby('sku').cumcount()
df['new'] = df['sku'] + ('-' + groups.astype('string').str.zfill(2)).mask(groups.eq(0), '')

My expected result looks like this:

       sku         new-sku
0  FAT-001     FAT-001
1  FAT-001  FAT-001-01
2  FAT-001  FAT-001-02
3  FAT-002     FAT-002
4  FAT-002  FAT-002-01
5           FAT-null-01
6           FAT-null-02
7  FAT-003     FAT-003
8           FAT-null-03
9           FAT-null-04

The suffix should increment by 1 for every new null row.

The constructor:

{'sku': {0: 'FAT-001', 1: ' ', 2: ' ', 3: 'FAT-002', 4: 'FAT-002', 5: ' ', 6: ' ', 7: 'FAT-003', 8: 'FAT-003', 9: 'FAT-004'}}

CodePudding user response:

Building on my answer to your previous question, we could apply a mask to the groups created with groupby.cumcount so that the whitespace rows start counting at 1 instead of 0, and adjust the rest accordingly:

groups = df.groupby('sku').cumcount()
groups = groups.mask(df['sku'].eq(' '), groups + 1)
df['new-sku'] = df['sku'].replace(' ', 'FAT-null') + ('-' + groups.astype('string').str.zfill(2)).mask(groups.eq(0), '')

Output:

   ID      sku      new-sku
0   1  FAT-001      FAT-001
1   2           FAT-null-01
2   3           FAT-null-02
3   4  FAT-002      FAT-002
4   5  FAT-002   FAT-002-01
5   6           FAT-null-03
6   7           FAT-null-04
7   8  FAT-003      FAT-003
8   9  FAT-003   FAT-003-01
9  10  FAT-004      FAT-004
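
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It assumes the blank SKUs are stored as a single space, as in the constructor from the question, and it leaves out the ID column shown in the output above. The helper name add_new_sku and its placeholder parameter are made up for illustration, and plain astype(str) is used in place of the 'string' dtype:

```python
import pandas as pd

# Data from the constructor in the question; blank SKUs are a single space.
df = pd.DataFrame({'sku': ['FAT-001', ' ', ' ', 'FAT-002', 'FAT-002',
                           ' ', ' ', 'FAT-003', 'FAT-003', 'FAT-004']})


def add_new_sku(frame, placeholder='FAT-null'):
    """Hypothetical helper: return a copy of frame with a 'new-sku' column."""
    out = frame.copy()
    blank = out['sku'].eq(' ')

    # 0-based position of each row within its sku group.
    groups = out.groupby('sku').cumcount()
    # Blank rows start counting at 1, so every one of them gets a suffix.
    groups = groups.mask(blank, groups + 1)

    # '-01', '-02', ... for repeated rows; empty string for the first row of a group.
    suffix = ('-' + groups.astype(str).str.zfill(2)).mask(groups.eq(0), '')
    out['new-sku'] = out['sku'].replace(' ', placeholder) + suffix
    return out


result = add_new_sku(df)
print(result)
```

Because all blank rows fall into the same group, the counter keeps running across gaps, so the blanks at rows 5 and 6 continue from the ones at rows 1 and 2 instead of restarting at 01.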