I have the following table, where each person's value is repeated across several months:
Costumer | Jan | Feb | Mar | Apr | Jun | Jul |
---|---|---|---|---|---|---|
Adam | 345 | 345 | 345 | 345 | 345 | |
Susan | | | 645 | 645 | 645 | 645 |
Paul | | 153 | 153 | 153 | 153 | |
Only the first value in each row is the actual value, so the table should look like this:
Costumer | Jan | Feb | Mar | Apr | Jun | Jul |
---|---|---|---|---|---|---|
Adam | 345 | | | | | |
Susan | | | 645 | | | |
Paul | | 153 | | | | |
What's the best approach to get this result?
Data:
import numpy as np
import pandas as pd
# np.nan is used instead of np.NaN, which was removed in NumPy 2.0
data = {'Costumer': ['Adam', 'Susan', 'Paul'],
        'Jan': [345.0, np.nan, np.nan],
        'Feb': [345.0, np.nan, 153.0],
        'Mar': [345.0, 645.0, 153.0],
        'Apr': [345.0, 645.0, 153.0],
        'Jun': [345.0, 645.0, 153.0],
        'Jul': [np.nan, 645.0, np.nan]}
df = pd.DataFrame(data)
CodePudding user response:
You could mask the values that are duplicated within each row, keeping only the first occurrence:
out = (df.mask(df.apply(lambda x: x.duplicated(), axis=1)).fillna(''))
Output:
  Costumer    Jan    Feb    Mar Apr Jun Jul
0     Adam  345.0
1    Susan                645.0
2     Paul         153.0
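If you would rather keep the month columns numeric instead of filling them with empty strings, a variation along the same lines might look like the sketch below. It assumes the first column is always `Costumer` and that every other column is a month; the names `months` and `is_repeat` are only introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Costumer': ['Adam', 'Susan', 'Paul'],
    'Jan': [345.0, np.nan, np.nan],
    'Feb': [345.0, np.nan, 153.0],
    'Mar': [345.0, 645.0, 153.0],
    'Apr': [345.0, 645.0, 153.0],
    'Jun': [345.0, 645.0, 153.0],
    'Jul': [np.nan, 645.0, np.nan],
})

# Assumption: every column except 'Costumer' holds monthly values.
months = df.columns.drop('Costumer')

# True wherever a value has already appeared earlier in the same row.
is_repeat = df[months].apply(lambda row: row.duplicated(), axis=1)

# Replace the repeats with NaN; no fillna(''), so the columns stay float.
out = df.copy()
out[months] = df[months].mask(is_repeat)
print(out)
```

Note that `duplicated()` flags any repeat within a row, not only consecutive ones, so a value that legitimately returns later in the same row would be blanked as well.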