Home > other >  Apply a condition on all but certain columns only for a certain value of another column
Apply a condition on all but certain columns only for a certain value of another column

Time:07-19

I am trying to divide all columns (all the months columns -there are way more in the original dataset so I cannot write them manually one by one) but few (Series ID & Country) only when the View Description is Percent change, otherwise keep the values of the other rows as they are.

Input

Series ID       View Description       Jan      Feb      Mar   Apr   Country
Food            Percent change         219.98   210.98   205   202   Italy
Drinks          Original Data Value    215.46   205.04   206   203   France
Food at Home    Original Data Value    202.88   203      207   199   Angola

I tried two ways with no luck.

1st) for loop with an if statement

for i in df1:
        if df1['View Description'] == 'Percent change':
                df1 = df1.set_index(['Series ID', 'View Description', 'Year']).div(100).reset_index()
        else:

2nd)

cols = ['Series ID', 'View Description']
mask = final_df['View Description'] == 'Percent change'
final_df.loc[mask, final_df.columns != cols] = final_df.loc[mask, final_df.columns != cols].div(100).reset_index()

Desired output

Series ID       View Description       Jan      Feb      Mar   Apr   Country
Food            Percent change         2.1998   2.1098   2.05  2.02  Italy
Drinks          Original Data Value    215.46   205.04   206   203   France
Food at Home    Original Data Value    202.88   203      207   199   Angola

Any suggestions that it can be done more efficiently or even correctly possibly with one of the two methods that I've tried?

CodePudding user response:

Unite the analyzes within a single search and do the division, this way you work in detail and can define the specifics:

import pandas as pd

df = pd.DataFrame({
    'Series ID': ['Food', 'Drinks', 'Food at Home'],
    'View Description': ['Percent change', 'Original Data Value', 'Original Data Value'],
    'Jan': [219.98, 'B', 'A'],
    'Feb': [210.98, 'B', 'A'],
    'Mar': [205, 'B', 'A'],
    'Apr': [202, 'B', 'A'],
    'Country': ['Italy', 'B', 'A']
})

months = ['Jan', 'Feb', 'Mar', 'Apr']
df.loc[df['View Description'] == 'Percent change', months] /= 100
print(df)

Output:

      Series ID     View Description     Jan     Feb   Mar   Apr Country
0          Food       Percent change  2.1998  2.1098  2.05  2.02   Italy
1        Drinks  Original Data Value       B       B     B     B       B
2  Food at Home  Original Data Value       A       A     A     A       A

Note: I left the rest of the values as a string just to confirm that only the desired line would be modified (since divide a string would generate an error).

  • Related