Home > Blockchain >  Python: How to find the near multiple value in pandas?
Python: How to find the near multiple value in pandas?

Time:10-22

Have got a dataframe input_df

Item    Space   Size    Max
Apple   0.375   0.375   0.375
Lemon   0.625   0.375   0.75
Melon   0.5     0.375   0.625

'Space' column value need to be the nearest multiple of 'Size' column. Also, final input_df['Space'] <= input_df['Max']. 'extra' column to be generated which gives how much space value added/reduced.

Expected output_df

Item    Space   Size    Max     extra
Apple   0.375   0.375   0.375   0       
Lemon   0.75    0.375   0.75    0.125   #Space value changed to nearest
Melon   0.375   0.375   0.625   -0.125  #Space value changed to nearest

CodePudding user response:

You can use:

# get max multiple
MAX = input_df['Max'].div(input_df['Size'])

NEW = (input_df['Space']
       .div(input_df['Size'])        # get multiple
       .clip(upper=MAX.astype(int))  # clip to MAX integer
       .round()                      # round to nearest integer
       .mul(input_df['Size'])        # multiply again by step
      )

# compute the difference to original
input_df['extra'] = NEW.sub(input_df['Space'])
# update original
input_df['Space'] = NEW

output:

    Item  Space   Size    Max  extra
0  Apple  0.375  0.375  0.375  0.000
1  Lemon  0.750  0.375  0.750  0.125
2  Melon  0.375  0.375  0.625 -0.125
3  Peach  0.375  0.375  0.625 -0.375
4  Grape  0.375  0.375  0.750  0.075

used input:

    Item  Space   Size    Max
0  Apple  0.375  0.375  0.375
1  Lemon  0.625  0.375  0.750
2  Melon  0.500  0.375  0.625
3  Peach  0.750  0.375  0.625
4  Grape  0.300  0.375  0.750

CodePudding user response:

You can find the nearest multiple using min of floor and round operation on divisible factor of Space and Max respectively:

pd.DataFrame([np.floor(df["Max"]/df["Size"]), np.round(df["Space"]/df["Size"])]).min() * df["Size"]

Full example:

columns = ["Item","Space","Size","Max"]
data = [("Apple",0.375,0.375,0.375),
("Lemon",0.500,0.375,0.75),
("Melon",0.625,0.375,0.75),
("Peach",0.700,0.375,0.625),
("Grape",0.300 ,0.375,0.750)]
df = pd.DataFrame(data=data, columns=columns)
# Input
>>     Item  Space   Size    Max
>> 0  Apple  0.375  0.375  0.375
>> 1  Lemon  0.500  0.375  0.750
>> 2  Melon  0.625  0.375  0.750
>> 3  Peach  0.700  0.375  0.625
>> 4  Grape  0.300  0.375  0.750


df["Extra"] = df["Space"]
df["Space"] = pd.DataFrame([np.floor(df["Max"]/df["Size"]), np.round(df["Space"]/df["Size"])]).min() * df["Size"]
df["Extra"] = df["Space"] - df["Extra"]
# Output
>>     Item  Space   Size    Max  Extra
>> 0  Apple  0.375  0.375  0.375  0.000
>> 1  Lemon  0.375  0.375  0.750 -0.125
>> 2  Melon  0.750  0.375  0.750  0.125
>> 3  Peach  0.375  0.375  0.625 -0.325
>> 4  Grape  0.375  0.375  0.750  0.075
  • Related