I am working on a problem where a column of NaN values
needs to be filled with the row number, starting from one. Like this:
       j  П, k=1  П, k=2
0      1    10.0    40.0
1      2    20.0    50.0
2      3    30.0    60.0
Sum  NaN    60.0   150.0
I tried two approaches. The first gives the wrong result: it fills the column with threes instead of 1, 2, 3. The second does not work at all.
In addition, with the first approach the Sum row also gets a value in this column, and I don't need one there. What should I do?
Code:
import numpy as np
import pandas as pd
m = 4
df1 = pd.DataFrame(data={'j': [np.nan, np.nan, np.nan], 'П, k=1': [10, 20, 30], 'П, k=2': [40, 50, 60]})
df1.loc['Sum'] = df1.sum()
# THE FIRST OPTION:
# for j in range(1, m):
#     df1['j'] = j
# THE SECOND OPTION:
bc = [x for x in range(1, m)]
print(bc)
df1['j'] = bc
print(df1)
CodePudding user response:
In your second option, you need to select only the first three rows (everything except the Sum row):
df1.loc[df1.index[:-1], 'j'] = bc
       j  П, k=1  П, k=2
0    1.0    10.0    40.0
1    2.0    20.0    50.0
2    3.0    30.0    60.0
Sum  0.0    60.0   150.0
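For reference, here is a minimal end-to-end sketch of that fix. The plain df1['j'] = bc fails because bc holds three values while the frame has four rows once the Sum row is appended. Two things in the sketch are assumptions rather than part of the code above: it builds the numbers with np.arange(1, len(df1)) instead of bc, and it resets the Sum cell of j to NaN afterwards, to match the desired table in the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'j': [np.nan, np.nan, np.nan],
                    'П, k=1': [10, 20, 30],
                    'П, k=2': [40, 50, 60]})
df1.loc['Sum'] = df1.sum()

# Number every row except the last one ('Sum'), starting from 1.
df1.loc[df1.index[:-1], 'j'] = np.arange(1, len(df1))

# Assumption: the question wants no total in 'j', so blank out that cell.
df1.loc['Sum', 'j'] = np.nan

print(df1)
```

If the Sum row should keep the 0.0 shown above, the last assignment can simply be dropped.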
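In your first option, df1['j'] = j overwrites the whole j column on every iteration, so only the last value (3) remains. You could assign one row at a time instead, for example df1.loc[j - 1, 'j'] = j (the row labels start at 0, while j starts at 1).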
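A rough sketch of the loop variant, assuming the same df1 and m as in the question. Because the row labels are 0, 1, 2 while j runs over 1, 2, 3, the sketch writes to label j - 1; writing to label j directly would leave row 0 empty and add a new row labelled 3.

```python
import numpy as np
import pandas as pd

m = 4
df1 = pd.DataFrame({'j': [np.nan, np.nan, np.nan],
                    'П, k=1': [10, 20, 30],
                    'П, k=2': [40, 50, 60]})
df1.loc['Sum'] = df1.sum()

for j in range(1, m):
    # Assign a single cell per iteration instead of the whole column.
    df1.loc[j - 1, 'j'] = j

print(df1)
```

The vectorised assignment is usually preferable for larger frames; the loop is shown only because it mirrors the first option in the question.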