df = pd.DataFrame({'Date': {0:'2020-04-01', 1:'2020-04-01', 2:'2020-04-01',
3:'2020-04-02', 4:'2020-04-02',
5:'2020-04-03', 6:'2020-04-03', 7:'2020-04-03'},
'Name': {0:'AA', 1:'BB', 2:'CC',
3:'AA', 4:'BB',
5:'AA', 6:'BB', 7:'CC'},
'Weight':{0:1, 1:3, 2:6,
3:1, 4:4,
5:2, 6:5, 7:7}})
df=df.set_index(['Date'])
For each unique date in df.index - I would like to compare the Name columns to those present for the previous date, and if a name is missing, add a value of 0 to the weight column
For example..How would I recognise that 2020-04-02 was missing a row for 'CC' and insert a weight of 0 for this row.
Thereby achieving a DataFrame that looks like this...
Name | Weight | |
---|---|---|
2020-04-01 | AA | 1 |
2020-04-01 | BB | 3 |
2020-04-01 | CC | 6 |
2020-04-02 | AA | 1 |
2020-04-02 | BB | 4 |
2020-04-02 | CC | 0 |
2020-04-03 | AA | 2 |
2020-04-03 | BB | 5 |
2020-04-03 | CC | 7 |
CodePudding user response:
You can add Name
to index
, then reshape by DataFrame.unstack
with add 0
with DataFrame.stack
for original format of DataFrame
:
df = df.set_index('Name', append=True).unstack(fill_value=0).stack().reset_index(level=1)
print (df)
Name Weight
Date
2020-04-01 AA 1
2020-04-01 BB 3
2020-04-01 CC 6
2020-04-02 AA 1
2020-04-02 BB 4
2020-04-02 CC 0
2020-04-03 AA 2
2020-04-03 BB 5
2020-04-03 CC 7