names = ['john', 'pete', 'jack']
funds = [1000, 3600, 2390]
df = pd.DataFrame({'Name': names, 'Fund': funds})
df['m1'] = np.nan
df['m2'] = np.nan
df['m3'] = np.nan
df['m4'] = np.nan
df['m5'] = np.nan
df['m6'] = np.nan
df['m7'] = np.nan
df['m8'] = np.nan
df['m9'] = np.nan
df['m10'] = np.nan
df['m11'] = np.nan
df['m12'] = np.nan
original df
Name Fund m1 m2 m3 m4 m5 m6 m7 m8 m9 m10 m11 m12
0 john 1000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 pete 3600 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 jack 2390 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
The recreated df
structure represents the structure of a typical csv file which I receive. I am required to populate based on a set of 'release of fund conditions' which vary from csv table to the next.
Below I have provided an example of a set of 'release of fund conditions' for which I would like to create a script that I can easily adapt when the conditions are changed.
Release of fund conditions
john 10% of fund monthly for 10 months starting after m1
pete 12.5% of fund monthly for 8 months starting after m2
jack 20% of fund monthly for 5 months starting after m4
I want to be able to apply the Release of fund conditions as code to produce the output shown below:
Desired Output
Name Fund m1 m2 m3 m4 m5 m6 m7 m8 m9 m10 m11 m12
0 john 1000 NaN 100 100 100 100 100 100 100 100 100 100 NaN
1 pete 3600 NaN NaN 450 450 450 450 450 450 450 450 NaN NaN
2 jack 2390 NaN NaN NaN NaN 478 478 478 478 478 NaN NaN NaN
I think maybe a tuple or dictionary might be needed, but honestly I do not know how to tackle this task.
Many thanks in advance
CodePudding user response:
You can try this snippet as a starting point -
# target columns range for each Name
cols = list(df.columns)
john = cols[cols.index('m2'):cols.index('m2') 10]
pete = cols[cols.index('m3'):cols.index('m3') 8]
jack = cols[cols.index('m5'):cols.index('m5') 5]
# Calculate funds for each Name and distribute in the targeted columns
df.loc[df.Name=='john', john] = [df[df.Name=='john'].iloc[0].Fund*0.1] * len(john)
df.loc[df.Name=='pete', pete] = [df[df.Name=='pete'].iloc[0].Fund*0.125] * len(pete)
df.loc[df.Name=='jack', jack] = [df[df.Name=='jack'].iloc[0].Fund*0.20] * len(jack)
This is a basic pandas
logic of row indexer and column indexer to assign a list of values. The first section of the snippet targets the column names according to your logic. For example, targeted column names for john start from m2
and continue to the next 10 months.
cols[cols.index('m2'):cols.index('m2') 10]
returns these column lists where 10
is number of next months. This list of columns has been used to target columns in df
for john
to update values.
The same logic has been applied to other Name
s too.