Pandas dataframe normalization returning NaNs

Time:12-21

I'm using Python 3.9.7 and pandas 1.3.4.

I'm trying to create a set of normalized columns in pandas, but the new columns keep coming back as NaN. I broke the steps down and assigned intermediate variables, which hold non-NaN values, but when I do the final assignment back into the dataframe, everything becomes NaN. Here is a simplified example:

import numpy as np
import pandas as pd

time = [1.0, 1.1, 2.0]
col1 = [1.0, 3.0, 6.0]
col2 = [3.0, 5.0, 9.0]
col3 = [1.5, 2.5, 3.5]
junk = ['wow', 'fun', 'times']

df2 = pd.DataFrame({'Time [days]': time, 'col1': col1, 'col2': col2,'col3': col3, 'junk':junk})
df2


num1 = len(df2.columns)
num2 = len(df2.columns[1:-1])
for col in df2.columns[1:-1]:
    df3 = pd.DataFrame({str(col) + '_normalized_values' : df2[str(col)]})
    df2 = df2.join(df3)
    del df3
df2.head()

df2.index = df2['Time [days]'].values
t=df2.index[1]
cols = df2.columns

a = df2.loc[t,cols[1:(num1-1)]]
b = (df2.groupby('Time [days]').sum().loc[t,cols[1:(num1-1)]] + 1.0e-20)
c = a/b #c is coming back as the expected values

df2.loc[t,cols[num1:(num1 + num2)]] = c
df2.loc[t,cols[num1:(num1 + num2)]] #This step always prints all NaNs

I've checked the shapes of c and the LHS assignment, and they're the same. I also checked the dtypes, and they're also the same. At this point, I'm at a loss for what could be causing the issue.

CodePudding user response:

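There is an index mismatch between c and df2. c is a Series labelled with the original column names (col1, col2, col3), while the assignment targets the *_normalized_values columns. When you assign a Series through .loc, pandas aligns it on labels first; none of the labels match, so every target cell is filled with NaN. Changing the RHS of your final assignment to c.values drops the labels, so the values are assigned by position, which solves the problem: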

df2.loc[t,cols[num1:(num1 + num2)]] = c.values
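
To see the alignment behaviour in isolation, here is a minimal sketch on a smaller frame. The column names (col1_norm, col2_norm) and the variables src, dst, aligned, positional and relabelled are made up for illustration and are not from the question. It compares assigning the Series directly, assigning its raw values, and relabelling the Series to match the target columns first.

```python
import pandas as pd

# Hypothetical two-column frame; the *_norm columns are placeholders to fill.
df = pd.DataFrame(
    {"col1": [1.0, 3.0], "col2": [3.0, 5.0],
     "col1_norm": [0.0, 0.0], "col2_norm": [0.0, 0.0]},
    index=[1.0, 1.1],
)
t = 1.1
src = ["col1", "col2"]            # source columns
dst = ["col1_norm", "col2_norm"]  # target columns

# c is a Series labelled "col1", "col2" (row t normalized by its own sum).
c = df.loc[t, src] / df.loc[t, src].sum()

# 1) Assigning the Series: pandas aligns on labels, nothing matches dst.
aligned = df.copy()
aligned.loc[t, dst] = c

# 2) Assigning the raw array: no labels, so values go in by position.
positional = df.copy()
positional.loc[t, dst] = c.values

# 3) Relabelling the Series so its index matches the target columns.
relabelled = df.copy()
relabelled.loc[t, dst] = c.set_axis(dst)

print(aligned.loc[t, dst])     # NaN in both target columns
print(positional.loc[t, dst])  # the normalized values
print(relabelled.loc[t, dst])  # the same normalized values
```

Option 3 keeps label alignment working for you, which can be safer than c.values if the column order on either side might ever differ.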