Calculate some column if it is empty

Time:05-26

I have a problem. I want to calculate m3 only where the column is NaN or None. Unfortunately, I do not know how to write this kind of if condition in pandas. Is there a better and faster way to apply this formula, e.g. directly in pandas without an extra function?

Formula

if m3 is None
    if CMT
        convert dimensions to m
        calculate m3
    if INH
        convert dimensions to cm
        convert dimensions to m
        calculate m3

Dataframe

  dimension_unit  dimension_length  dimension_height  dimension_width    m3
0            CMT              50.0              80.0             60.0  0.24
1            INH              15.0              10.0             10.0   NaN
2            CMT              80.0             120.0             56.0   NaN

Code

import pandas as pd

d = { 
    'dimension_unit': ['CMT', 'INH', 'CMT'], 
    'dimension_length': [50, 15,    80.0], 
    'dimension_height': [80, 10, 120.0], 
    'dimension_width': [60, 10, 56.0],
    'm3': [0.24, None, None]
    }
df = pd.DataFrame(data=d)
print(df)

def calculate_m3(x, unit, length, height, width, m3):
    m3 = x[m3]
    if m3 == None:
        print(x)
        unit = x[unit]
        length = x[length]
        height = x[height]
        width = x[width]
        if(unit == 'CMT'):
            length_in_m = length / 100
            width_in_m = width / 100
            height_in_m = height / 100
            return round(length_in_m * width_in_m * height_in_m,3)
        else:
            length_in_m = (length * 2.54) / 100
            width_in_m = (width * 2.54) / 100
            height_in_m = (height * 2.54) / 100
            return round(length_in_m * width_in_m * height_in_m,3)



df['m3'] = df.apply(lambda x: calculate_m3(x, 'dimension_unit', 'dimension_length', 'dimension_height', 'dimension_width', m3))

[OUT]
KeyError: 'dimension_unit'

What I want

  dimension_unit  dimension_length  dimension_height  dimension_width    m3
0            CMT              50.0              80.0             60.0  0.24
1            INH              15.0              10.0             10.0  0.037
2            CMT              80.0             120.0             56.0  0.538

Answer

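If I understand correctly, you want to fill m3 only for the rows where it is missing. Three things need to change: pass axis=1 so that apply calls the function once per row (the default axis=0 passes each column, so looking up a column name in it raises KeyError), pass the column name 'm3' as a string, and test for NaN instead of None, because pandas stores None in a numeric column as NaN. Rows that already have a value return it unchanged: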

def calculate_m3(x, unit, length, height, width, m3):
    m3 = x[m3]
    if m3 != m3:
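        # NaN is never equal to itself, so this is True only for missing values;
        # comparing with `== None` fails because None is stored as NaN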
        unit = x[unit]
        length = x[length]
        height = x[height]
        width = x[width]
        if(unit == 'CMT'):
            length_in_m = length / 100
            width_in_m = width / 100
            height_in_m = height / 100
            return round(length_in_m * width_in_m * height_in_m,3)
        else:
            length_in_m = (length * 2.54) / 100
            width_in_m = (width * 2.54) / 100
            height_in_m = (height * 2.54) / 100
            return round(length_in_m * width_in_m * height_in_m,3)
    else:
        return m3

df['m3'] = df.apply(lambda x: calculate_m3(x, 'dimension_unit', 'dimension_length', 'dimension_height', 'dimension_width', 'm3'), axis=1)
print(df)

  dimension_unit  dimension_length  dimension_height  dimension_width     m3
0            CMT              50.0              80.0             60.0  0.240
1            INH              15.0              10.0             10.0  0.025
2            CMT              80.0             120.0             56.0  0.538
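
The `m3 != m3` test works because NaN is the only float value that is not equal to itself, and pandas stores the `None` entries of a numeric column as NaN. A minimal sketch of the check on a small stand-alone Series (the names `s` and `value` are only for illustration); `pd.isna` is a more readable equivalent:

```python
import pandas as pd

# None in a numeric column is stored as NaN, so the dtype is float64
s = pd.Series([0.24, None, None])
print(s.dtype)            # float64

value = s[1]
print(value == None)      # False: NaN never compares equal to None
print(value != value)     # True: NaN is not equal to itself
print(pd.isna(value))     # True: the explicit test for a missing value
```

Inside `calculate_m3`, `if pd.isna(m3):` could replace `if m3 != m3:` without changing the result.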

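You can also avoid apply and fill the missing values column-wise with Series.mask, which replaces the values where the condition is True: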

import numpy as np

cols = ['dimension_length', 'dimension_height', 'dimension_width']

df['m3'] = df['m3'].mask(df['dimension_unit'].eq('CMT') & df['m3'].isna(), np.prod(df[cols].div(100), axis=1))
df['m3'] = df['m3'].mask(df['dimension_unit'].eq('INH') & df['m3'].isna(), np.prod(df[cols].mul(2.54).div(100), axis=1))
df['m3'] = df['m3'].round(3)
print(df)

  dimension_unit  dimension_length  dimension_height  dimension_width     m3
0            CMT              50.0              80.0             60.0  0.240
1            INH              15.0              10.0             10.0  0.025
2            CMT              80.0             120.0             56.0  0.538
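
Because the two units differ only by a scale factor, the fill can also be sketched in a single pass: map each unit to its metres-per-unit factor, compute the volume for every row, and use `fillna` so that only missing `m3` values are replaced. The `to_metres` dictionary and the `factor` and `volume` names are illustrative, and the sketch assumes `dimension_unit` contains only `CMT` and `INH`:

```python
import pandas as pd

df = pd.DataFrame({
    'dimension_unit': ['CMT', 'INH', 'CMT'],
    'dimension_length': [50, 15, 80.0],
    'dimension_height': [80, 10, 120.0],
    'dimension_width': [60, 10, 56.0],
    'm3': [0.24, None, None],
})

cols = ['dimension_length', 'dimension_height', 'dimension_width']

# Metres per unit: 1 cm = 0.01 m, 1 inch = 2.54 cm = 0.0254 m
to_metres = {'CMT': 0.01, 'INH': 0.0254}
factor = df['dimension_unit'].map(to_metres)

# Convert each dimension to metres row by row, then multiply them together.
# skipna=False keeps the volume as NaN when a unit is not in the mapping.
volume = df[cols].mul(factor, axis=0).prod(axis=1, skipna=False)

# Keep existing m3 values and fill only the missing ones
df['m3'] = df['m3'].fillna(volume).round(3)
print(df)
```

With `skipna=False`, a row whose unit is not in the mapping keeps a missing `m3` instead of being filled with a misleading number.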