I have a pandas DataFrame of assets ranked from most to least valuable ("1" being the most valuable and "5" the least). I want to "sell" the least valuable assets first until I reach a required amount. In the given scenario, I want to keep selling the least valuable assets until I have received $1000, and I want the result stored in a new DataFrame. I've tried using .cumsum but haven't managed to produce the wanted output.
Recreate Scenario:
import pandas as pd

df = pd.DataFrame({'Tickers': ['AAPL', 'TSLA', 'SGOL', 'BA', 'V'],
                   'Value': [400, 1300, 200, 600, 400],
                   'Rank': [1, 4, 3, 5, 2]})
df = df.sort_values(by=["Rank"])
assets_value = df['Value'].sum()
sell_amount = 1000
print(f'assets value = {assets_value} and the needed sell amount is {sell_amount}')
print(df)
Wanted Outcome:
Tickers  Sold  Available
BA        600          0
TSLA      400        900
Answer:
Here's a way to do it:
df2 = df.sort_values('Rank', ascending=False)
df2 = df2.assign(cum=df2.Value.cumsum())
df2 = df2[df2.cum - df2.Value < sell_amount]
df2 = df2.assign(Available=df2.cum - sell_amount)
df2 = df2.assign(Sold=df2.Value - df2.Available * (df2.Available > 0))
df2 = df2.assign(Available=df2.Available * (df2.Available > 0)).set_index('Tickers')[['Sold', 'Available']]
Output:
         Sold  Available
Tickers
BA        600          0
TSLA      400        900
Explanation:
- sort by decreasing Rank and add a column cum containing the cumsum of Value
- select rows where cum of the prior rows is < sell_amount
- set Available to be the surplus of cum over sell_amount
- set Sold to be Value less (if positive) Available
- set negative Available rows to zero
- make Tickers the index and keep only the Sold and Available columns.
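As a minimal sketch, the same steps can be wrapped in a reusable function. The function name sell_least_valuable is hypothetical (it is not part of pandas), the column names are the ones from the question, and Series.clip(lower=0) is swapped in for the multiplication by (Available > 0) used above; the result should be the same for this data.

```python
import pandas as pd


def sell_least_valuable(df, sell_amount):
    """Hypothetical helper: sell from the worst-ranked asset upward.

    Assumes df has the columns 'Tickers', 'Value' and 'Rank' from the question.
    """
    # Worst rank (largest number) first, so those assets are sold first.
    ordered = df.sort_values("Rank", ascending=False)
    cum = ordered["Value"].cumsum()
    # Keep a row only if the rows before it have not yet covered the target.
    needed = (cum - ordered["Value"]) < sell_amount
    ordered, cum = ordered[needed], cum[needed]
    # Surplus of the running total over the target; negative values become 0.
    available = (cum - sell_amount).clip(lower=0)
    sold = ordered["Value"] - available
    out = ordered.assign(Sold=sold, Available=available)
    return out.set_index("Tickers")[["Sold", "Available"]]


df = pd.DataFrame({
    "Tickers": ["AAPL", "TSLA", "SGOL", "BA", "V"],
    "Value": [400, 1300, 200, 600, 400],
    "Rank": [1, 4, 3, 5, 2],
})
result = sell_least_valuable(df, 1000)
print(result)
```

Note that if sell_amount is larger than the total value of all assets, this sketch sells every asset in full and does not report the shortfall, so a check against df['Value'].sum() beforehand may be worthwhile.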