Losing variables in Jupyter Notebook

In a Jupyter notebook, I load one variable from a file:

with fits.open('mind_dataset/matrix_CEREBELLUM_large.fits') as data:
    matrix_cerebellum = pd.DataFrame(data[0].data.byteswap().newbyteorder())

In the cells below, I have two methods:

neuronal_web_pixel = 0.32 # microns per pixel (1 micron = 1e-6 meters)

def pixels_to_scale(df, mind=False, cosmos=False):
    
    one_pixel_equals_micron = neuronal_web_pixel
    brain_mask = (df != 0.0)
    df[brain_mask] *= one_pixel_equals_micron
        
    return df

and

def binarize_matrix(df, mind=False, cosmos=False):
    
    brain_Llink = 16.0 # microns
    zero_mask = (df != 0)
    low_mask = (df <= brain_Llink)
    df[low_mask & zero_mask] = 1.0
    higher_mask = (df >= brain_Llink)
    df[higher_mask] = 0.0
       
    return df

Then I pass my variables to these methods to obtain scaled and binary dataframes:

matrix_cerebellum_scaled = pixels_to_scale(matrix_cerebellum, mind=True)

And:

matrix_cerebellum_binary = binarize_matrix(matrix_cerebellum_scaled, mind=True)

However, if I then inspect 'matrix_cerebellum_scaled', it now holds the same values as 'matrix_cerebellum_binary', and I have lost the scaled dataframe.

Why? What am I missing?

CodePudding user response:

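A note on naming: those aren't methods, they're functions. As for the problem: if you modify a DataFrame inside a function, those changes happen to the caller's DataFrame too, because the function receives a reference to the same object, not a copy. Both of your functions modify df in place and then return it, so matrix_cerebellum, matrix_cerebellum_scaled and matrix_cerebellum_binary all end up as names for one and the same DataFrame. If you want a new DataFrame, create it as a copy of the one being passed in.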
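The aliasing is easy to see on a tiny example. This is only a minimal sketch, assuming pandas is installed; the function scale_in_place and the sample values are made up for illustration and merely mimic the in-place pattern used in pixels_to_scale().

```python
import pandas as pd

def scale_in_place(df, factor):
    # Hypothetical stand-in for pixels_to_scale(): it mutates its argument.
    mask = (df != 0.0)
    df[mask] *= factor
    return df  # returns the same object that was passed in

original = pd.DataFrame({"a": [0.0, 10.0], "b": [40.0, 100.0]})
scaled = scale_in_place(original, 0.32)

# Both names are bound to the very same object; no new DataFrame was created.
same_object = scaled is original
print(same_object)

# The "original" therefore shows the scaled values as well.
print(original)
```

Checking with `is` (or comparing `id()` values) in a notebook cell is a quick way to tell whether two names refer to the same DataFrame.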

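At the very least, at the top of binarize_matrix() do: new_df = df.copy(), then apply the masks to new_df and return new_df instead of df. Do the same in pixels_to_scale() if you also want matrix_cerebellum itself to stay unchanged. More detail about why that's necessary in this SO answer and its comments: https://stackoverflow.com/a/39628860/42346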
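Put together, the copy-based fix might look like the sketch below. It assumes pandas is installed and uses a small made-up DataFrame in place of the FITS data; the two constants come from the question, and the unused mind/cosmos parameters are left out for brevity.

```python
import pandas as pd

NEURONAL_WEB_PIXEL = 0.32  # microns per pixel, value taken from the question
BRAIN_LINK = 16.0          # microns, threshold taken from the question

def pixels_to_scale(df):
    new_df = df.copy()  # work on a copy so the caller's DataFrame is untouched
    brain_mask = (new_df != 0.0)
    new_df[brain_mask] *= NEURONAL_WEB_PIXEL
    return new_df

def binarize_matrix(df):
    new_df = df.copy()  # same idea: never mutate the argument
    nonzero_mask = (new_df != 0)
    low_mask = (new_df <= BRAIN_LINK)
    new_df[low_mask & nonzero_mask] = 1.0
    higher_mask = (new_df >= BRAIN_LINK)
    new_df[higher_mask] = 0.0
    return new_df

# Made-up sample data standing in for matrix_cerebellum.
matrix = pd.DataFrame([[0.0, 10.0], [40.0, 100.0]])

scaled = pixels_to_scale(matrix)
binary = binarize_matrix(scaled)

# Three distinct objects now exist, each keeping its own values.
print(matrix)
print(scaled)
print(binary)
```

The trade-off is memory: each copy() duplicates the data, which may matter for a large matrix. If only the binary result needs to be kept separate, copying in binarize_matrix() alone is enough.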