(Python)Localizing variable

Time:11-29

My goal is to modify the DataFrame df with my_function and then assign the result back to df. But as soon as I call the function, the df outside the function has already changed. How can I write the function so that it does not affect the df variable outside it?

import pandas as pd
df = pd.DataFrame({'A': [10, 20, 30]}, index=['2021-11-24', '2021-11-25', '2021-11-26'])


def my_function(df_temp):
    df_temp.loc[df_temp.index[0], 'A'] = 100  # how can I do this without affecting the df outside the function?
    return df_temp


something = my_function(df)
print(df)   # df is already altered although I didn't assign the result to it

# df = my_function(df)
# print(df)

CodePudding user response:

Try one of these solutions.

1. Using DataFrame.apply
import pandas as pd
df = pd.DataFrame({'A': [10, 20, 30]}, index=['2021-11-24', '2021-11-25', '2021-11-26'])


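def my_function(col):
    col = col.copy()      # apply passes each column as a Series; mutating the passed object is not supported, so work on a copy
    col.iloc[0] = 100     # positional assignment to the first element
    return col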


something = df.apply(my_function)
print(something)
#               A
# 2021-11-24  100
# 2021-11-25   20
# 2021-11-26   30
print(df)
#               A
# 2021-11-24   10
# 2021-11-25   20
# 2021-11-26   30



2. Using DataFrame.copy

import pandas as pd
df = pd.DataFrame({'A': [10, 20, 30]}, index=['2021-11-24', '2021-11-25', '2021-11-26'])

def my_function(df):
    temp_df = df.copy()
    temp_df.loc[temp_df.index[0], 'A'] = 100  # a single .loc assignment instead of chained indexing
    return temp_df


something = my_function(df)
print(something)
#               A
# 2021-11-24  100
# 2021-11-25   20
# 2021-11-26   30
print(df)
#               A
# 2021-11-24   10
# 2021-11-25   20
# 2021-11-26   30

CodePudding user response:

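Python always passes arguments by assignment: the parameter inside the function is bound to the same object the caller passed, so any in-place change to the DataFrame inside the function is visible outside it. Passing a reference rather than a copy is the default because it avoids the cost of copying the data.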
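
To make the difference between mutating and rebinding concrete, here is a minimal sketch using a plain list instead of a DataFrame. The function names mutate and rebind are made up for illustration; the same rules apply to any mutable object.

```python
def mutate(items):
    # In-place change: acts on the very object the caller passed in.
    items[0] = 100


def rebind(items):
    # Rebinding the local name points it at a NEW list;
    # the caller's list is left untouched.
    items = [100] + items[1:]
    return items


original = [10, 20, 30]

rebound = rebind(original)
after_rebind = list(original)   # snapshot taken after rebind()

mutate(original)
after_mutate = list(original)   # snapshot taken after mutate()

print(after_rebind)   # the caller's list was not changed by rebind()
print(after_mutate)   # the caller's list was changed by mutate()
```

Only the in-place assignment inside mutate is visible to the caller; rebind hands back a separate object and leaves its argument alone.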

If you need to keep the original object unchanged, make a copy explicitly and operate on the copy.

import pandas as pd
df = pd.DataFrame({'A': [10, 20, 30]}, index=['2021-11-24', '2021-11-25', '2021-11-26'])

def my_function(df_temp):
    df_temp.loc[df_temp.index[0], 'A'] = 99  # modifies the passed DataFrame in place

dfc = df.copy()
my_function(dfc) # alter the copy

print(df) # unchanged
print(dfc) # altered
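
One way to confirm that the original really is untouched is to compare it against a snapshot taken before the call. The sketch below is only an illustration: with_first_value is a hypothetical helper name, and the copy is made inside the function rather than at the call site.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30]},
                  index=['2021-11-24', '2021-11-25', '2021-11-26'])


def with_first_value(frame, value):
    # Hypothetical helper: returns a modified copy, never touches `frame`.
    result = frame.copy()                        # deep copy by default
    result.loc[result.index[0], 'A'] = value     # change only the copy
    return result


snapshot = df.copy()                 # state of df before the call
changed = with_first_value(df, 99)

print(changed is df)                 # False: a separate object came back
print(df.equals(snapshot))           # True: the original is unchanged
```

Because the function copies its argument itself, callers cannot forget the copy; the trade-off is that every call pays for copying the data.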

You can read more about passing variables in the documentation: https://docs.python.org/3/faq/programming.html#how-do-i-write-a-function-with-output-parameters-call-by-reference
