I'm using Python 3.9.7 and pandas 1.3.4.
I'm trying to create a normalized set of columns in pandas, but the new columns keep coming back as NaN. I broke the steps down and assigned intermediate variables, all of which hold non-NaN values, but as soon as I assign the final result back to the dataframe, everything becomes NaN. Here is a simplified example:
import numpy as np
import pandas as pd
time = [1.0, 1.1, 2.0]
col1 = [1.0, 3.0, 6.0]
col2 = [3.0, 5.0, 9.0]
col3 = [1.5, 2.5, 3.5]
junk = ['wow', 'fun', 'times']
df2 = pd.DataFrame({'Time [days]': time, 'col1': col1, 'col2': col2,'col3': col3, 'junk':junk})
df2
num1 = len(df2.columns)
num2 = len(df2.columns[1:-1])
for col in df2.columns[1:-1]:
    # Add a '<col>_normalized_values' column, initialised as a copy of <col>
    df3 = pd.DataFrame({str(col) + '_normalized_values': df2[str(col)]})
    df2 = df2.join(df3)
    del df3
df2.head()
df2.index = df2['Time [days]'].values
t=df2.index[1]
cols = df2.columns
a = df2.loc[t,cols[1:(num1-1)]]
b = (df2.groupby('Time [days]').sum().loc[t,cols[1:(num1-1)]] + 1.0e-20)
c = a/b #c is coming back as the expected values
df2.loc[t,cols[num1:(num1 + num2)]] = c
df2.loc[t,cols[num1:(num1 + num2)]] #This step always prints all NaNs
I've checked the shapes of c and the LHS assignment, and they're the same. I also checked the dtypes, and they're also the same. At this point, I'm at a loss for what could be causing the issue.
Answer:
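There is an index mismatch between c and df2. c is a Series labelled col1, col2, col3, while the columns on the left-hand side are labelled col1_normalized_values and so on. When the right-hand side is a Series, pandas aligns it to the target by label before assigning, and since none of the labels match, every target cell is filled with NaN. Changing the RHS of your final assignment to c.values drops the labels, so the values are assigned by position, which solves the problem: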
df2.loc[t,cols[num1:(num1 + num2)]] = c.values
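
To see the alignment effect in isolation, here is a minimal sketch with a smaller, made-up dataframe (the column names col1_norm and col2_norm and the variables src and dst are only for illustration, they are not from the question). It compares assigning the Series directly, assigning its underlying array, and relabelling the Series so that its index matches the target columns:

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame; the *_norm columns are placeholders to fill in.
df = pd.DataFrame(
    {
        "col1": [1.0, 3.0],
        "col2": [3.0, 5.0],
        "col1_norm": [0.0, 0.0],
        "col2_norm": [0.0, 0.0],
    },
    index=[1.0, 1.1],
)
t = 1.1
src = ["col1", "col2"]
dst = ["col1_norm", "col2_norm"]

# c is a Series whose index is the *source* column names.
c = df.loc[t, src] / df.loc[t, src].sum()

# 1) Assigning the Series: pandas aligns on labels, nothing matches dst.
aligned = df.copy()
aligned.loc[t, dst] = c

# 2) Assigning the bare NumPy array: no labels, so assignment is positional.
by_position = df.copy()
by_position.loc[t, dst] = c.values

# 3) Relabelling the Series so its index equals the target column names.
relabelled = df.copy()
relabelled.loc[t, dst] = c.rename(lambda name: name + "_norm")

print(aligned.loc[t, dst])      # both entries NaN
print(by_position.loc[t, dst])  # 0.375 and 0.625
print(relabelled.loc[t, dst])   # 0.375 and 0.625
```

Using c.values is the shortest fix, but it relies on the columns being in the same order on both sides. Relabelling, as in the third variant, keeps the label-based safety check if that matters to you.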