I have a dataframe like the one below:
    active  idx
0      nan    0
1     20.0    0
2     32.0    0
3      nan    1
4     38.0    1
5     24.0    1
6      nan    2
7     33.0    2
8     44.0    2
9     59.0    2
10     nan    3
11    17.0    3
12    15.0    3
13     9.0    3
I also have a series like so:
idx
0 3
1 3
2 4
3 4
Name: active, dtype: int64
and I also have a list like so:
list = [[4.0, 4.0], [2.0, 3.0], [1.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
I need to add each list element, whose position corresponds to the idx in the dataframe, to the last n - 1 values of that idx group, where n is the group's count in the series. That way the np.nan at the start of every new idx is skipped.
For example, for idx = 0 the series gives 3, so I take the first list element, [4.0, 4.0], and add it to the 2 values after the nan. Nothing should be added to the np.nan itself.
So it should come out like so:
    active  idx
0      nan    0
1     24.0    0
2     36.0    0
3      nan    1
4     40.0    1
5     27.0    1
6      nan    2
7     34.0    2
8     44.0    2
9     60.0    2
10     nan    3
11    17.0    3
12    15.0    3
13     8.0    3
I know I can loop through, but that's not the most optimised way. I've tried grouping by and applying functions, but I'm struggling with broadcasting the list to the values. Any help appreciated! :)
CodePudding user response:
Since your list is already in the correct order, you can filter out the nan values from the dataframe and do the operation on that. You can use numpy.concatenate to flatten the list. Assuming your dataframe is named df:
df.loc[df.active.notna(), 'active'] += np.concatenate(list)
I would also recommend using a variable name other than list in Python, since it shadows the built-in type.
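For reference, here is a minimal end-to-end sketch of that idea using the data from the question. The name increments is just a hypothetical replacement for list, and the length check is an extra safeguard I added, not something the question asked for. It assumes the flattened list has exactly one value per non-nan row, in row order.

```python
import numpy as np
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "active": [np.nan, 20.0, 32.0, np.nan, 38.0, 24.0, np.nan,
               33.0, 44.0, 59.0, np.nan, 17.0, 15.0, 9.0],
    "idx": [0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
})

# Hypothetical name; the question calls this variable `list`.
increments = [[4.0, 4.0], [2.0, 3.0], [1.0, 0.0, 1.0], [0.0, 0.0, -1.0]]

# Flatten the nested list into one 1-D array, keeping its order.
flat = np.concatenate(increments)

# Boolean mask of the rows that actually hold a number.
mask = df["active"].notna()

# Assumed safeguard: one increment per non-nan row.
assert mask.sum() == len(flat)

# Add (rather than assign) the increments to the non-nan rows only.
df.loc[mask, "active"] += flat

print(df)
```

Using .loc with the mask writes straight into df, which avoids the chained-assignment warning you can get from indexing df.active first. The nan rows are never selected, so nothing is added to them.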