have got input dataframe like below and a variable 'extra
'
input_df
Item Space refill Min Max
Apple 0.25 12.53 0.125 0.25
Lemon 0.25 11.2 0.25 0.375
Potato 0.375 10.3 0.375 0.75
Melon 0.375 9.2 0.25 0.75
'extra
' variable can be negative or positive.
Expected output_df if extra = 0.25
(this extra 0.25 space need to be splitted in the multiples of 0.125 and added to items with higher refill order. In this case, 0.125 0.125 = 0.25 so two items space will be updated. Also should consider Max value here, on adding space it should not exceed Max value, if happens skip that item and move to next.)
Item Space refill Min Max
Apple 0.25 12.53 0.125 0.25 #item skipped since it exceeds Max limit if extra space added
Lemon 0.375 11.2 0.25 0.375 #0.125 from extra space added here
Potato 0.5 10.3 0.375 0.75 #remaining 0.125 from extra space added here
Melon 0.375 9.2 0.25 0.75 #no more space to be added
Expected output_df if extra = -0.25
(this extra -0.25 space[which is two times of -0.125] need to be reduced from items with higher refill. Also should consider Min Value here, on reducing space it should not go below Min value, if happens skip that item and move to next.)
Item Space refill Min Max
Apple 0.125 12.53 0.125 0.25 #-0.125 from extra space reduced here
Lemon 0.25 11.2 0.25 0.375 #item skipped since it doesn't satisfy Min limit if reduced
Potato 0.375 10.3 0.375 0.75 #item skipped since it doesn't satisfy Min limit if reduced
Melon 0.25 9.2 0.25 0.75 #rem -0.125 from extra space reduced here
Any help will be appreciated. Thank You!
CodePudding user response:
Assumptions:
- Both
extra
and the values in columnsMin
andMax
should always contain multiples of0.125
. E.g.df.Min % 0.125
should always lead to a series of zeros. - The
df
index contains unique values only. If this is not yet the case, usedf.reset_index(drop=True, inplace=True)
to reset it. Or usedrop=False
, if you want to keep the original index values. You can then reset them afterwards, usingdf.set_index('index', drop=True, inplace=True)
.
Given the assumptions, the following should achieve this:
import pandas as pd
import numpy as np
data = {'Item': {0: 'Apple', 1: 'Lemon', 2: 'Potato', 3: 'Melon'},
'Space': {0: 0.25, 1: 0.25, 2: 0.375, 3: 0.375},
'refill': {0: 12.53, 1: 11.2, 2: 10.3, 3: 9.2},
'Min': {0: 0.125, 1: 0.25, 2: 0.375, 3: 0.25},
'Max': {0: 0.25, 1: 0.375, 2: 0.75, 3: 0.75}}
def f(df, extra):
length = int(abs(extra/0.125))
repeats = np.repeat([0.125 * -1 if np.signbit(extra) else 0.125], len(df))
tmp = np.where((pd.Series(repeats).lt(0) & df.Space.gt(df.Min)) |
(pd.Series(repeats).gt(0) & df.Space.lt(df.Max)),
df.Space.add(repeats), df.Space)
new_series = np.where(df.Space.ne(tmp).cumsum().gt(length), df.Space, tmp)
return new_series
Function applied:
df = pd.DataFrame(data)
df.Space = f(df, 0.25)
print(df)
Item Space refill Min Max
0 Apple 0.250 12.53 0.125 0.250
1 Lemon 0.375 11.20 0.250 0.375
2 Potato 0.500 10.30 0.375 0.750
3 Melon 0.375 9.20 0.250 0.750
df = pd.DataFrame(data)
df.Space = f(df, -0.25)
print(df)
Item Space refill Min Max
0 Apple 0.125 12.53 0.125 0.250
1 Lemon 0.250 11.20 0.250 0.375
2 Potato 0.375 10.30 0.375 0.750
3 Melon 0.250 9.20 0.250 0.750
Suppose we use extra= 0.375
, then the last item will also change:
df = pd.DataFrame(data)
df.Space = f(df, 0.375)
print(df)
Item Space refill Min Max
0 Apple 0.250 12.53 0.125 0.250
1 Lemon 0.375 11.20 0.250 0.375
2 Potato 0.500 10.30 0.375 0.750
3 Melon 0.500 9.20 0.250 0.750 # also changed
Explanation:
- Inside function
f
, we first create annp.array
namedrepeats
with a length equal to yourdf
, filled either with-0.125
or0.125
, depending on the sign ofextra
(usingnp.signbit
). - Next, we use
np.where
and check either (values in columnSpace
< 0 and > values in columnMin
) or (values in columnSpace
> 0 and < values in columnMax
). For eachTrue
we wantdf.Space.add(repeats)
, for eachFalse
we keep the values fromdf.Space
. We store the result in a variabletmp
. SeeSeries.lt
andSeries.gt
. - Finally, we apply
np.where
again to limit the amount of changes to be made. In order to figure this out, we first applySeries.ne
totmp
and getTrue
for each mismatch (i.e. a change). ApplyingSeries.cumsum
will get us a cumulative sum of all theTrue
values. Once, a value inside this series becomes greater thanint(abs(extra/0.125))
we need to stop making changes. So, from that point onwards, we fall back on the original values in columnSpace
.