Conduct the calculation only when the value is not null

Time:05-24

I have a data frame dft:

Date              Total Value
02/01/2022          2
03/01/2022          6 
03/08/2022          4
03/11/2022          
03/15/2022          4
05/01/2022          4

I want to calculate the total value in March, so I used the following code:

Mar22 = dft.loc[dft['Date'].between('03/01/2022', '03/31/2022', inclusive='both'),'Total Value'].sum()

03/11/2022 has a null value, which caused an error. What should I add to my code so that I only sum the values that are not null?

Would that be isnull() == False?

CodePudding user response:

The issue is that you have an empty string where there should be a NaN.

You can make sure the column holds only numbers with pandas.to_numeric, which turns anything non-numeric into NaN when errors='coerce' is set:

out = (pd.to_numeric(dft['Total Value'], errors='coerce')[dft['Date']
         .between('03/01/2022', '03/31/2022', inclusive='both')].sum()
      )

Or, if empty strings are the only non-numeric values:

out = (dft.loc[dft['Date'].between('03/01/2022', '03/31/2022', inclusive='both'), 'Total Value']
          .replace('', float('nan')).sum()
       )

output: 14.0
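
For reference, here is a minimal self-contained sketch of the to_numeric approach. The frame is rebuilt by hand from the table in the question, and it assumes the missing entry is an empty string, as suggested above. The names values, in_march and mar22 are only illustrative.

```python
import pandas as pd

# Hand-built copy of dft; the blank cell is assumed to be an empty string.
dft = pd.DataFrame({
    "Date": ["02/01/2022", "03/01/2022", "03/08/2022",
             "03/11/2022", "03/15/2022", "05/01/2022"],
    "Total Value": [2, 6, 4, "", 4, 4],
})

# Coerce anything non-numeric (here, the empty string) to NaN.
values = pd.to_numeric(dft["Total Value"], errors="coerce")

# Boolean mask for the March rows (string comparison, as in the question).
in_march = dft["Date"].between("03/01/2022", "03/31/2022", inclusive="both")

# sum() skips NaN by default, so only 6 + 4 + 4 is added.
mar22 = values[in_march].sum()
print(mar22)  # 14.0
```

Note that the date filter compares strings, which works here because every date is in the same year and uses zero-padded MM/DD/YYYY. Converting the column with pd.to_datetime would probably be safer for data that spans several years.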

CodePudding user response:

Try the pandas built-in notnull() function.

Mar22 = dft.loc[dft['Total Value'].notnull() & dft['Date'].between('03/01/2022', '03/31/2022', inclusive='both'), 'Total Value'].sum()
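
One caveat, shown as a small sketch with made-up Series rather than the asker's real data: notnull() only flags true missing values such as NaN, so it would not filter out an empty string. Also, sum() already skips NaN by default, so the filter is only needed if the column really does contain NaN and you want to be explicit.

```python
import numpy as np
import pandas as pd

# Illustrative Series only: one with an empty string, one with a real NaN.
s_empty = pd.Series([6, 4, "", 4])
s_nan = pd.Series([6, 4, np.nan, 4])

# An empty string is not considered null, so nothing is filtered out.
print(s_empty.notnull().tolist())  # [True, True, True, True]

# A real NaN is flagged as null.
print(s_nan.notnull().tolist())    # [True, True, False, True]

# sum() ignores NaN by default (skipna=True).
print(s_nan.sum())                 # 14.0
```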
