I am trying to divide all columns (all the months columns -there are way more in the original dataset so I cannot write them manually one by one) but few (Series ID & Country) only when the View Description is Percent change, otherwise keep the values of the other rows as they are.
Input
Series ID View Description Jan Feb Mar Apr Country
Food Percent change 219.98 210.98 205 202 Italy
Drinks Original Data Value 215.46 205.04 206 203 France
Food at Home Original Data Value 202.88 203 207 199 Angola
I tried two ways with no luck.
1st) for loop with an if statement
for i in df1:
if df1['View Description'] == 'Percent change':
df1 = df1.set_index(['Series ID', 'View Description', 'Year']).div(100).reset_index()
else:
2nd)
cols = ['Series ID', 'View Description']
mask = final_df['View Description'] == 'Percent change'
final_df.loc[mask, final_df.columns != cols] = final_df.loc[mask, final_df.columns != cols].div(100).reset_index()
Desired output
Series ID View Description Jan Feb Mar Apr Country
Food Percent change 2.1998 2.1098 2.05 2.02 Italy
Drinks Original Data Value 215.46 205.04 206 203 France
Food at Home Original Data Value 202.88 203 207 199 Angola
Any suggestions that it can be done more efficiently or even correctly possibly with one of the two methods that I've tried?
CodePudding user response:
Unite the analyzes within a single search and do the division, this way you work in detail and can define the specifics:
import pandas as pd
df = pd.DataFrame({
'Series ID': ['Food', 'Drinks', 'Food at Home'],
'View Description': ['Percent change', 'Original Data Value', 'Original Data Value'],
'Jan': [219.98, 'B', 'A'],
'Feb': [210.98, 'B', 'A'],
'Mar': [205, 'B', 'A'],
'Apr': [202, 'B', 'A'],
'Country': ['Italy', 'B', 'A']
})
months = ['Jan', 'Feb', 'Mar', 'Apr']
df.loc[df['View Description'] == 'Percent change', months] /= 100
print(df)
Output:
Series ID View Description Jan Feb Mar Apr Country
0 Food Percent change 2.1998 2.1098 2.05 2.02 Italy
1 Drinks Original Data Value B B B B B
2 Food at Home Original Data Value A A A A A
Note: I left the rest of the values as a string just to confirm that only the desired line would be modified (since divide a string would generate an error).