Write a function to apply a specific % change to certain rows

Basically I have a dataframe:

import pandas as pd

# initialize list of lists
data = [['tom', 10], ['nick', 15], ['juli', 14]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Amount'])

and I want to write a function that will apply a percentage change to certain rows based on the values I give it:

def function(x, pct):
    
    if df['Name'] == x:
        df['Amount'] = df['Amount'] - (df['Amount'] * pct), df['Amount']
    else:
        df['Amount'] = df['Amount']
    
    return df

I know that I need to reference the data frame somewhere in the function but I'm struggling to figure out how to do it.

CodePudding user response：

You need to apply the function row by row, passing axis=1 so that each row arrives as a Series:

def function(s, x, pct):
    # s is one row of the DataFrame; reduce its Amount if the name matches
    if s['Name'] == x:
        s['Amount'] = s['Amount'] - (s['Amount'] * pct)
    return s

Then call it, for example with 'tom' and 0.1:

df.apply(lambda s: function(s, 'tom', 0.1), axis=1)

The output is:

   Name  Amount
0   tom     9.0
1  nick    15.0
2  juli    14.0

Note: you can generalize this by defining a data structure such as a dict that maps names to percentages and looking it up inside apply.
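As a rough sketch of the dict idea in the note above: the mapping `pct_by_name` and the helper `reduce_amount` are hypothetical names chosen for illustration, and names missing from the dict are assumed to get no change.

```python
import pandas as pd

df = pd.DataFrame([['tom', 10], ['nick', 15], ['juli', 14]],
                  columns=['Name', 'Amount'])

# Hypothetical mapping: name -> fractional reduction to apply
pct_by_name = {'tom': 0.1, 'nick': 0.5}

def reduce_amount(row, mapping):
    # Names that are not in the mapping keep their amount (pct defaults to 0)
    pct = mapping.get(row['Name'], 0.0)
    return row['Amount'] * (1 - pct)

# apply with axis=1 passes each row to the function as a Series
df['Amount'] = df.apply(lambda row: reduce_amount(row, pct_by_name), axis=1)
```

With this layout, adding another name only means adding a dict entry; the function itself stays the same.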

CodePudding user response：

There are several ways to accomplish this. Based on your own attempt:

def f(x, name, pct):
    if x['Name'] == name:
        return x['Amount']*(1-pct)
    return x['Amount']


df['Amount'] = df.apply(lambda x: f(x, 'tom', 0.25), axis=1)
df

   Name  Amount
0   tom     7.5
1  nick    15.0
2  juli    14.0

Or using np.where like so:

import numpy as np
pct = 0.25

df['Amount'] = np.where(df['Name'] == 'tom', (1-pct)*df['Amount'], df['Amount'])

Yet another option:

df = pd.DataFrame(data, columns=['Name', 'Amount'])
df.loc[df['Name'] == 'tom', 'Amount'] = df.loc[df['Name'] == 'tom', 'Amount']*(1-pct)

All three approaches give the same output.
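One way to check that claim is to run each approach on a fresh copy of the data and compare the results. This is only a sketch: the helper `fresh_df` is a made-up name, and the Amount column is cast to float up front (an assumption, not part of the answer) so that all three results share one dtype.

```python
import numpy as np
import pandas as pd

data = [['tom', 10], ['nick', 15], ['juli', 14]]
pct = 0.25

def fresh_df():
    # Hypothetical helper: rebuild the frame with a float Amount column
    return pd.DataFrame(data, columns=['Name', 'Amount']).astype({'Amount': float})

# 1. Row-wise apply
df_apply = fresh_df()
df_apply['Amount'] = df_apply.apply(
    lambda row: row['Amount'] * (1 - pct) if row['Name'] == 'tom' else row['Amount'],
    axis=1,
)

# 2. np.where
df_where = fresh_df()
df_where['Amount'] = np.where(df_where['Name'] == 'tom',
                              (1 - pct) * df_where['Amount'],
                              df_where['Amount'])

# 3. Boolean mask with .loc
df_loc = fresh_df()
mask = df_loc['Name'] == 'tom'
df_loc.loc[mask, 'Amount'] = df_loc.loc[mask, 'Amount'] * (1 - pct)

# Raises an AssertionError if any pair of frames differs
pd.testing.assert_frame_equal(df_apply, df_where)
pd.testing.assert_frame_equal(df_where, df_loc)
```

On larger frames the np.where and .loc versions are usually preferable, since they work on whole columns instead of calling a Python function once per row.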

CodePudding user response：

Use boolean indexing:

name = 'tom'
pct = 0.2

df.loc[df['Name'].eq(name), 'Amount'] *= (1-pct)

Or with a list of names:

names = ['tom']
pct = 0.2

df.loc[df['Name'].isin(names), 'Amount'] *= (1-pct)

output:

   Name  Amount
0   tom       8
1  nick      15
2  juli      14
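One caveat, offered as a hedged note: Amount is an integer column here, and depending on the pandas version, writing fractional results into it in place may trigger a warning or an error. A minimal sketch that sidesteps this by casting to float first (the cast and the extra name in the list are assumptions added for illustration):

```python
import pandas as pd

df = pd.DataFrame([['tom', 10], ['nick', 15], ['juli', 14]],
                  columns=['Name', 'Amount'])

# Assumption: cast up front so fractional results fit the column dtype
df['Amount'] = df['Amount'].astype(float)

names = ['tom', 'juli']  # hypothetical selection of rows to change
pct = 0.2

# isin builds a boolean mask; only the matching rows are scaled
mask = df['Name'].isin(names)
df.loc[mask, 'Amount'] *= (1 - pct)
```

Since isin also accepts a one-element list, the same line covers the single-name case.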

CodePudding user response：

def function(dataframe, name, pct_change):
    dataframe = dataframe.copy()
    dataframe.loc[dataframe.Name==name, "Amount"]*=(1-pct_change)
    return dataframe

#function call example
function(df, "nick", .5)

#function call output
#
#   Name  Amount
#0   tom    10.0
#1  nick     7.5
#2  juli    14.0

Note that the function does not modify df in place; it only returns a modified copy. To replace the old dataframe with the new one:

df = function(df, "nick", .5)
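To make the copy behaviour concrete, here is a small sketch under stated assumptions: `apply_pct_change` is a hypothetical name that mirrors the function above, with a float cast added so the fractional result fits the column.

```python
import pandas as pd

def apply_pct_change(dataframe, name, pct_change):
    # Work on a copy so the caller's frame is left untouched
    result = dataframe.copy()
    # Assumption: cast to float so fractional amounts fit the column
    result["Amount"] = result["Amount"].astype(float)
    result.loc[result["Name"] == name, "Amount"] *= (1 - pct_change)
    return result

df = pd.DataFrame([["tom", 10], ["nick", 15], ["juli", 14]],
                  columns=["Name", "Amount"])
updated = apply_pct_change(df, "nick", 0.5)

print(df["Amount"].tolist())       # original amounts, unchanged
print(updated["Amount"].tolist())  # nick's amount is halved in the copy
```

Returning a copy keeps the function free of side effects, at the cost of duplicating the frame on every call.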