Why does Pandas fill NaN values in the wrong order?

Time:10-29

I am trying to fill the NaN values in one data frame with the values from another data frame. The source data frame is:

    price
timestamp   
2021-10 2.60
2021-11 1.85
2022-01 12.20
2022-02 15.50
2022-03 16.00
2022-04 22.05
2022-05 16.80
2022-06 21.55
2022-07 65.45
2022-08 30.80
2022-09 5.10
2022-10 21.40

As you can see, 2021-12 is missing!

And here is the other data frame, the one I call fillna() on:

    price
timestamp   
2021-10 NaN
2021-11 NaN
2021-12 NaN
2022-01 NaN
2022-02 NaN
2022-03 NaN
2022-04 NaN
2022-05 NaN
2022-06 NaN
2022-07 NaN
2022-08 NaN
2022-09 NaN
2022-10 NaN

I want to add the missing month, 2021-12, from the data frame that contains only NaN values, and keep its price as NaN. Here is the code I used:

creator_primary_sales_dataFrame=creator_primary_sales_dataFrame.reset_index()
creator_primary_sales=creator_primary_sales.reset_index()

creator_primary_sales_dataFrame=creator_primary_sales.fillna(creator_primary_sales_dataFrame)

And the result:


timestamp   price
0   2021-10 2.60
1   2021-11 1.85
2   2021-12 12.20
3   2022-01 15.50
4   2022-02 16.00
5   2022-03 22.05
6   2022-04 16.80
7   2022-05 21.55
8   2022-06 65.45
9   2022-07 30.80
10  2022-08 5.10
11  2022-09 21.40
12  2022-10 NaN

Expected output:

timestamp   price
0   2021-10 2.60
1   2021-11 1.85
2   2021-12 NaN
3   2022-01 12.20
4   2022-02 15.50
5   2022-03 16.00
6   2022-04 22.05
7   2022-05 16.80
8   2022-06 21.55
9   2022-07 65.45
10  2022-08 30.80
11  2022-09 5.10    
12  2022-10 21.40

CodePudding user response:

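Your problem is that you are resetting the index of both dataframes. After reset_index(), the rows are labelled 0, 1, 2, ... and fillna() aligns on those labels, so the prices are matched by row position instead of by timestamp.

Just keep timestamp as the index (drop the two reset_index() calls) and do:

creator_primary_sales_dataFrame = creator_primary_sales.fillna(creator_primary_sales_dataFrame)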
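To see the effect of the index on alignment, here is a minimal sketch on a shortened copy of the data. The names `source`, `target`, `shifted` and `aligned` are made up for illustration, and the timestamps are assumed to be plain strings (the question does not show the index dtype).

```python
import pandas as pd

months = ["2021-10", "2021-11", "2021-12", "2022-01", "2022-02"]

# Hypothetical stand-in for the frame with prices: no row for 2021-12
source = pd.DataFrame(
    {"price": [2.60, 1.85, 12.20, 15.50]},
    index=pd.Index(["2021-10", "2021-11", "2022-01", "2022-02"], name="timestamp"),
)

# Hypothetical stand-in for the all-NaN frame: one row per month
target = pd.DataFrame(
    {"price": [float("nan")] * len(months)},
    index=pd.Index(months, name="timestamp"),
)

# After reset_index() both frames are labelled 0, 1, 2, ...,
# so fillna() pairs the rows by position and the prices shift up.
shifted = target.reset_index().fillna(source.reset_index())

# With timestamp as the index, fillna() pairs the rows by month,
# and 2021-12 stays NaN because source has no such label.
aligned = target.fillna(source).reset_index()

print(shifted)
print(aligned)
```

In `shifted`, the 2021-12 row picks up the 2022-01 price and the last row stays NaN, which is the pattern shown in the question. In `aligned`, each price stays with its own month.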

CodePudding user response:

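Here is one way to do it, using map. It works after reset_index(), when timestamp is a regular column in both dataframes:

# df is your dataframe with prices
# df2 is your dataframe with null values

# map price from original DF having prices over to the new DF
# setting index is required so that map can look up each timestamp, and is done inline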

df2['price']=df2['timestamp'].map(df.set_index('timestamp')['price'])
df2
    timestamp   price
0   2021-10     2.60
1   2021-11     1.85
2   2021-12     NaN
3   2022-01     12.20
4   2022-02     15.50
5   2022-03     16.00
6   2022-04     22.05
7   2022-05     16.80
8   2022-06     21.55
9   2022-07     65.45
10  2022-08     30.80
11  2022-09     5.10
12  2022-10     21.40
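Note that assigning the mapped result overwrites the whole price column. If df2 might already hold some prices worth keeping, one possible variation is to use the mapped values only where the price is missing. This is a sketch on made-up data: the 9.99 value and the `lookup` name are assumptions for illustration, not part of the question.

```python
import pandas as pd

# Hypothetical frame with prices (no row for 2021-12)
df = pd.DataFrame(
    {"timestamp": ["2021-10", "2021-11", "2022-01"], "price": [2.60, 1.85, 12.20]}
)

# Hypothetical target frame: one price is already known (9.99 is invented)
df2 = pd.DataFrame(
    {
        "timestamp": ["2021-10", "2021-11", "2021-12", "2022-01"],
        "price": [float("nan"), 9.99, float("nan"), float("nan")],
    }
)

# Series indexed by timestamp, used as a lookup table by map()
lookup = df.set_index("timestamp")["price"]

# Fill only the missing prices; existing values in df2 are left untouched
df2["price"] = df2["price"].fillna(df2["timestamp"].map(lookup))

print(df2)
```

Timestamps that do not appear in df (here 2021-12) map to NaN, so they remain missing after the fill.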