We have the following dummy data frame that collects counts of warnings (based on a reason) for a given date:
temp = pd.DataFrame(np.array([['2022-10-13','one',123],['2022-10-13','two',77123],
['2022-10-13','three',451], ['2022-10-13','three',77]]),
columns = ['date','reason','count'])
The problem is in the count column
date reason count
0 2022-10-13 one 123
1 2022-10-13 two 77123
2 2022-10-13 three 451
3 2022-10-13 three 77
The data for reasons one and three needs to be scaled by 100 as it is stored in a minimized way on the database.
Is there a way to traverse the cells and add '00' to the counts or multiply by 100 where reason is not equal to two? And end up with something like this:
date reason count
0 2022-10-13 one 12300
1 2022-10-13 two 77123
2 2022-10-13 three 45100
3 2022-10-13 three 7700
How would this be achieved?
CodePudding user response:
First make sure the column count
are integers (or floats), then use .loc[] and select the rows you wish to adapt (temp.reason != 'two'
) and make the change (temp['count'] * 100
):
temp['count'] = temp['count'].astype(int)
temp.loc[temp.reason != 'two', 'count'] = temp['count'] * 100
print(temp)
Output
date reason count
0 2022-10-13 one 12300
1 2022-10-13 two 77123
2 2022-10-13 three 45100
3 2022-10-13 three 7700
CodePudding user response:
Use DataFrame.loc
with *=100
for multiple values if not matched by Series.ne
:
temp['count'] = temp['count'].astype(int)
temp.loc[temp['reason'].ne('two'), 'count'] *= 100
print (temp)
date reason count
0 2022-10-13 one 12300
1 2022-10-13 two 77123
2 2022-10-13 three 45100
3 2022-10-13 three 7700
CodePudding user response:
You can use np.where
temp['count'] = temp['count'].astype(int)
temp['count'] = np.where(temp['reason']!='two', temp['count']*100, temp['count'])