Function modifies Pandas dataframe but can't access the modified dataframe

Time:06-11

I previously posted a question and code on SO about flattening JSON data. In retrospect it was too convoluted, so I've tried to simplify it and post a new question (original question: How to flatten a pandas dataframe with some columns as json? follow-up).

My function takes a Pandas dataframe as an input parameter and modifies it, but I can't access the modified dataframe after the function runs; I still see the original version. Here's an example. First, import numpy and pandas and create a dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])

Next, create a function to modify the dataframe and run it:

def drop_col(df):
    print(f"original shape: {df.shape}")
    df = df.drop(['B'], axis=1)
    print(f"final shape: {df.shape}")
    return df

The print statements show that the original shape of the dataframe was (3, 4) and the final shape is (3, 3), which indicates that column B was dropped as intended. However, once the function has run and I access the dataframe with df.head(), for example, it still shows the original dataframe with 3 rows and 4 columns.

CodePudding user response:

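I suppose you didn't run the function correctly. df.drop returns a new dataframe rather than changing the one passed in, so the assignment inside the function only rebinds the function's local name. The caller's dataframe stays as it was unless you assign the function's return value to a variable: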

import pandas as pd
import numpy as np

def drop_col(df_in):
    print(f"original shape: {df_in.shape}")
    df_out = df_in.drop(['B'], axis=1)
    print(f"final shape: {df_out.shape}")
    return df_out

df_1 = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])

df_2 = drop_col(df_1)

print(df_2)
   A   C   D
0  0   2   3
1  4   6   7
2  8  10  11
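
To make the difference concrete, here is a minimal sketch that reuses the question's drop_col and compares three ways of calling it. It assumes the original call was a bare drop_col(df) with the result discarded, which the question does not actually show. The drop_col_inplace helper and the shape_after_* names are hypothetical and only serve the illustration.

```python
import numpy as np
import pandas as pd


def drop_col(df):
    # DataFrame.drop returns a new object; this assignment only rebinds
    # the local name df and leaves the caller's dataframe untouched.
    df = df.drop(['B'], axis=1)
    return df


def drop_col_inplace(df):
    # Hypothetical variant: mutates the object the caller passed in
    # and returns nothing.
    df.drop(columns=['B'], inplace=True)


df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])

# 1) Assumed original call: the returned dataframe is thrown away.
drop_col(df)
shape_after_discard = df.shape      # still (3, 4)

# 2) Assigning the return value rebinds the outer name to the new dataframe.
df = drop_col(df)
shape_after_assign = df.shape       # (3, 3)

# 3) In-place variant on a fresh dataframe: no assignment needed.
df_other = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])
drop_col_inplace(df_other)
shape_after_inplace = df_other.shape  # (3, 3)

print(shape_after_discard, shape_after_assign, shape_after_inplace)
```

Returning a new dataframe and assigning it, as in the answer above, is usually the easier pattern to follow, because the caller can see exactly where the data changes.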