How to move a single row from one pandas dataframe to another with an impure function?

Time:11-17

So I have written a short function that moves a single row, selected by its index label, from one dataframe to another while preserving the index.

If I have this test dataframe and an empty one:

df = pd.DataFrame({'lower': ['a','b','c'],
                   'upper': ['A', 'B', 'C'],
                   'number': [1, 2, 3]},
                  index=['first', 'second', 'third'])
print(df, '\n\n')

empty = pd.DataFrame(columns=['lower', 'upper', 'number'])
print(empty, '\n\n')

and I just use the instructions:

line = 'second'
empty = empty.append(df.loc[line])
df = df.drop(index=line)

it works.

But if I try to write an impure function that does the same thing, it only modifies the dataframes inside the function, and outside it they remain unchanged!?

Here is my entire code:

import pandas as pd


def move_line(ind, source, destination):
    row = source.loc[ind]
    destination = destination.append(row)
    source = source.drop(index=ind)
    print('source inside function\n', source, '\n\n')
    print('destination inside function\n', destination, '\n\n')


def main():

    df = pd.DataFrame({'lower': ['a','b','c'],
                       'upper': ['A', 'B', 'C'],
                       'number': [1, 2, 3]},
                      index=['first', 'second', 'third'])
    #print(df, '\n\n')

    empty = pd.DataFrame(columns=['lower', 'upper', 'number'])
    #print(empty, '\n\n')


    move_line('second', df, empty)

    print('source outside function\n', df, '\n\n')
    print('destination outside function\n', empty)

CodePudding user response:

"it only modifies the dataframes inside the function, and outside it they remain unchanged!?"

That is because DataFrame.append doesn't mutate the original DataFrame; it creates a new DataFrame that includes the new row and leaves the original object unchanged. By default, DataFrame.drop doesn't change the original object either, unless you pass inplace=True.

destination = destination.append(row)
source = source.drop(index=ind)

Here you are only rebinding the local names destination and source to the new objects returned by append and drop. Those are not the objects that destination and source originally referred to (the ones the caller passed in), so the originals remain unchanged.
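The difference between rebinding a name and mutating an object is not specific to pandas. Here is a minimal sketch with plain lists; the function names rebind and mutate are only illustrative and do not come from the question:

```python
def rebind(items):
    # "items + [...]" builds a new list; the assignment only changes what the
    # local name refers to, so the caller's list is left untouched.
    items = items + ["new"]
    return items


def mutate(items):
    # list.append changes the very object the caller also refers to.
    items.append("new")


original = ["a", "b"]

rebound = rebind(original)
after_rebind = list(original)   # snapshot of the caller's list after rebind()

mutate(original)
after_mutate = list(original)   # snapshot of the caller's list after mutate()

print(after_rebind)
print(after_mutate)
```

In the question's function, append and drop play the role of rebind here: they hand back new objects, and only the local names inside the function ever see them.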

To mutate the original objects, you can do the following:

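import pandas as pd


def move_line(ind, source, destination):
    row = source.loc[ind]
    # Assigning to a new label with .loc adds the row to the existing object
    destination.loc[ind] = row
    # inplace=True drops the row from the object the caller passed in
    source.drop(index=ind, inplace=True)
    print('source inside function\n', source, '\n\n')
    print('destination inside function\n', destination, '\n\n')


df = pd.DataFrame({'lower': ['a','b','c'],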
                   'upper': ['A', 'B', 'C'],
                   'number': [1, 2, 3]},
                  index=['first', 'second', 'third'])
#print(df, '\n\n')

empty = pd.DataFrame(columns=['lower', 'upper', 'number'])
#print(empty, '\n\n')


move_line('second', df, empty)

print('source outside function\n', df, '\n\n')
print('destination outside function\n', empty)

Output:

source inside function
       lower upper  number
first     a     A       1
third     c     C       3 


destination inside function
        lower upper number
second     b     B      2 


source outside function
       lower upper  number
first     a     A       1
third     c     C       3 


destination outside function
        lower upper number
second     b     B      2
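
If you would rather not mutate the arguments, another option is to return the new frames and rebind them at the call site. This is only a sketch of that alternative, not part of the answer above. It uses pd.concat because DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0. The names new_source and new_destination are just illustrative:

```python
import pandas as pd


def move_line(ind, source, destination):
    # .loc with a list of labels keeps the row as a one-row DataFrame,
    # so its index label is preserved.
    row = source.loc[[ind]]
    if destination.empty:
        # Nothing to concatenate with yet: just line the columns up
        # with the destination's column order.
        new_destination = row.reindex(columns=destination.columns)
    else:
        new_destination = pd.concat([destination, row])
    # drop returns a new DataFrame; the caller's source is not modified.
    new_source = source.drop(index=ind)
    return new_source, new_destination


df = pd.DataFrame({'lower': ['a', 'b', 'c'],
                   'upper': ['A', 'B', 'C'],
                   'number': [1, 2, 3]},
                  index=['first', 'second', 'third'])
empty = pd.DataFrame(columns=['lower', 'upper', 'number'])

# The caller decides what to do with the results.
new_source, new_destination = move_line('second', df, empty)

print('new source\n', new_source, '\n\n')
print('new destination\n', new_destination)
```

To get the same effect as the mutating version, assign the results back to the original names: df, empty = move_line('second', df, empty).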