Write a Function to apply a specific % change to certain rows

Time:07-27

Basically I have a dataframe:

import pandas as pd

# initialize list of lists
data = [['tom', 10], ['nick', 15], ['juli', 14]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Amount'])

and I want to write a function that will apply a percentage change to certain rows based on the values I give it:

def function(x, pct):
    
    if df['Name'] == x:
        df['Amount'] = df['Amount'] - (df['Amount'] * pct), df['Amount']
    else:
        df['Amount'] = df['Amount']
    
    return df

I know that I need to reference the data frame somewhere in the function but I'm struggling to figure out how to do it.

CodePudding user response:

Use apply with axis=1 so that the function receives each row as a Series:

def function(s, x, pct):
    # s is one row of the DataFrame; copy it so the original row is not mutated
    s = s.copy()
    if s['Name'] == x:
        # reduce Amount by the given fraction (0.1 means 10%)
        s['Amount'] = s['Amount'] - (s['Amount'] * pct)

    return s

and then call it, for example with 'tom' and 0.1:

df.apply(lambda s: function(s, 'tom', 0.1), axis=1)

The output of this is:

   Name  Amount
0   tom     9.0
1  nick    15.0
2  juli    14.0

Note: you can do better than this if you keep the names and their percentages in a data structure such as a dict and look them up inside apply.
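The dict idea from the note could look roughly like the sketch below. The `pct_by_name` mapping and the `apply_pct` helper are hypothetical names I made up for illustration, and I am assuming that a name missing from the dict should be left unchanged.

```python
import pandas as pd

data = [['tom', 10], ['nick', 15], ['juli', 14]]
df = pd.DataFrame(data, columns=['Name', 'Amount'])

# Hypothetical mapping: name -> fractional decrease (0.1 means "reduce by 10%")
pct_by_name = {'tom': 0.1, 'juli': 0.5}


def apply_pct(row, pct_by_name):
    # Names that are not in the dict fall back to 0, i.e. no change
    pct = pct_by_name.get(row['Name'], 0)
    return row['Amount'] * (1 - pct)


# Extra keyword arguments to DataFrame.apply are forwarded to the function
df['Amount'] = df.apply(apply_pct, axis=1, pct_by_name=pct_by_name)
print(df)
```

This keeps the per-name percentages in one place, so adding another name only means adding a dict entry.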

CodePudding user response:

There are several ways to accomplish this. Based on your own attempt:

def f(x, name, pct):
    if x['Name'] == name:
        return x['Amount']*(1-pct)
    return x['Amount']


df['Amount'] = df.apply(lambda x: f(x, 'tom', 0.25), axis=1)
df

   Name  Amount
0   tom     7.5
1  nick    15.0
2  juli    14.0

Or using np.where like so:

import numpy as np
pct = 0.25

df['Amount'] = np.where(df['Name'] == 'tom', (1-pct)*df['Amount'], df['Amount'])

Yet another option:

df = pd.DataFrame(data, columns=['Name', 'Amount'])
df.loc[df['Name'] == 'tom', 'Amount'] = df.loc[df['Name'] == 'tom', 'Amount']*(1-pct)

All three approaches give the same output.
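If you want to convince yourself of that, here is a small sketch that runs the three variants on fresh copies of the data and compares the results. The `fresh` helper is hypothetical, and casting `Amount` to float before the `.loc` assignment is my own addition so that the column dtype does not depend on the pandas version.

```python
import numpy as np
import pandas as pd

data = [['tom', 10], ['nick', 15], ['juli', 14]]
pct = 0.25


def fresh():
    # Hypothetical helper: rebuild the original frame for each variant
    return pd.DataFrame(data, columns=['Name', 'Amount'])


# 1) row-wise apply
df_apply = fresh()
df_apply['Amount'] = df_apply.apply(
    lambda x: x['Amount'] * (1 - pct) if x['Name'] == 'tom' else x['Amount'],
    axis=1,
)

# 2) np.where on the whole column
df_where = fresh()
df_where['Amount'] = np.where(
    df_where['Name'] == 'tom', (1 - pct) * df_where['Amount'], df_where['Amount']
)

# 3) .loc with a boolean mask (cast to float first, see the note above)
df_loc = fresh()
df_loc['Amount'] = df_loc['Amount'].astype(float)
mask = df_loc['Name'] == 'tom'
df_loc.loc[mask, 'Amount'] = df_loc.loc[mask, 'Amount'] * (1 - pct)

amounts = [d['Amount'].tolist() for d in (df_apply, df_where, df_loc)]
same = amounts[0] == amounts[1] == amounts[2]
print(same)
```

On a frame this small the choice is a matter of taste; `np.where` and `.loc` work on whole columns, while `apply` with `axis=1` calls the Python function once per row.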

CodePudding user response:

Use boolean indexing:

name = 'tom'
pct = 0.2

df.loc[df['Name'].eq(name), 'Amount'] *= (1-pct)

With a list of names:

names = ['tom']
pct = 0.2

df.loc[df['Name'].isin(names), 'Amount'] *= (1-pct)

output:

   Name  Amount
0   tom       8
1  nick      15
2  juli      14
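To see what boolean indexing is doing here, it can help to look at the masks themselves. The following is only a sketch: the 0.5 percentage and the extra name in the list are values I picked for illustration, and the cast to float is my assumption to keep the column dtype predictable.

```python
import pandas as pd

df = pd.DataFrame([['tom', 10], ['nick', 15], ['juli', 14]],
                  columns=['Name', 'Amount'])

# Assumption: cast up front so writing fractional values never changes the dtype
df['Amount'] = df['Amount'].astype(float)

mask_one = df['Name'].eq('tom')               # True only where Name == 'tom'
mask_many = df['Name'].isin(['tom', 'juli'])  # True where Name is any listed value

print(mask_one.tolist())   # [True, False, False]
print(mask_many.tolist())  # [True, False, True]

# Only the rows where the mask is True are scaled
df.loc[mask_many, 'Amount'] *= (1 - 0.5)
print(df['Amount'].tolist())  # [5.0, 15.0, 7.0]
```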

CodePudding user response:

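def function(dataframe, name, pct_change):
    # work on a copy so the caller's dataframe is left untouched
    dataframe = dataframe.copy()
    # scale Amount only on the rows whose Name matches
    dataframe.loc[dataframe.Name == name, "Amount"] *= (1 - pct_change)
    return dataframe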

#function call example
function(df, "nick", .5)

#function call output
#
#   Name  Amount
#0   tom    10.0
#1  nick     7.5
#2  juli    14.0

Note that the function does not modify df in place; it only returns a modified copy of it. To replace the old dataframe with the new one:

df = function(df, "nick", .5)
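A quick way to check the copy behaviour is to compare the original and the returned frame side by side. This is a sketch rather than the exact function above: `reduce_copy` is a hypothetical name, and it casts `Amount` to float first, which is my assumption to avoid dtype surprises when a fractional value lands in an integer column.

```python
import pandas as pd


def reduce_copy(dataframe, name, pct_change):
    # Hypothetical variant of the function above, with an explicit float cast
    out = dataframe.copy()
    out["Amount"] = out["Amount"].astype(float)
    out.loc[out["Name"] == name, "Amount"] *= (1 - pct_change)
    return out


df = pd.DataFrame([['tom', 10], ['nick', 15], ['juli', 14]],
                  columns=['Name', 'Amount'])

halved = reduce_copy(df, "nick", 0.5)

print(df["Amount"].tolist())      # [10, 15, 14]  (original is unchanged)
print(halved["Amount"].tolist())  # [10.0, 7.5, 14.0]
```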