Data manipulation in multiple columns (absolute, percentage, and categorical) in a pandas dataframe

Time:02-11

I need to write a function that takes a dataframe and a dictionary of the form {"col_1": % change, "col_2": absolute value, "col_3": 0/1 (categorical)} and applies those changes to the dataframe.

I have a dataframe like this:

Date col_1 col_2 col_3
01/01/2022 90 100 0
01/02/2022 80 110 1
01/03/2022 92 120 0
01/04/2022 96 130 0
01/05/2022 99 150 1
01/06/2022 105 155 1

Now I pass a dictionary, say:

{"Date":["01/01/2022","01/02/2022"],"col_1":[-10,-10],"col_2":10,"col_3":[1,0]}
  • for "col_1" I am passing -10, -10: a percentage change applied to the existing values on the specified dates.
  • for "col_2" I am passing an absolute number, 10, which should replace the existing values on the specified dates.
  • for "col_3" I am passing binary values, which should overwrite the existing values on the specified dates.

Then my desired output would look like this:

Date col_1 col_2 col_3
01/01/2022 81 10 1
01/02/2022 72 10 0
01/03/2022 92 120 0
01/04/2022 96 130 0
01/05/2022 99 150 1
01/06/2022 105 155 1

I tried this code:

def per_change(df,cols,d):
    df[cols] = df[cols].add(df[cols].div(100).mul(pd.Series(d)), fill_value=0)
    return df

but it didn't work. Please help!

Answer:

You could build a boolean mask from dic["Date"] and use it to update the matching rows of df with the values under the other keys in dic:

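import numpy as np

msk = df['Date'].isin(dic['Date'])           # rows whose Date is listed in dic
df['col_1'] = df['col_1'].astype(float)      # cast first so fractional results fit
df.loc[msk, 'col_1'] *= (1 + np.array(dic['col_1']) / 100)   # percentage change
df.loc[msk, 'col_2'] = dic['col_2']          # absolute value: overwrite
df.loc[msk, 'col_3'] = dic['col_3']          # categorical flag: overwrite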

Output:

         Date  col_1  col_2  col_3
0  01/01/2022   81.0     10      1
1  01/02/2022   72.0     10      0
2  01/03/2022   92.0    120      0
3  01/04/2022   96.0    130      0
4  01/05/2022   99.0    150      1
5  01/06/2022  105.0    155      1
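
Since the question asks for a function, the steps above could be wrapped up roughly as follows. This is only a sketch under a few assumptions: the name apply_changes and the pct_cols parameter are made up for illustration (the dictionary itself does not say which columns hold percentages, so the caller has to name them), the dates in the dictionary are assumed to be unique, and the values are matched to rows by date rather than by position, so the order of the dates in the dictionary does not need to follow the order of the rows.

```python
import numpy as np
import pandas as pd


def apply_changes(df, changes, pct_cols=("col_1",), date_col="Date"):
    """Return a copy of df with `changes` applied on the listed dates.

    Hypothetical helper: columns named in pct_cols receive a percentage
    change, every other column in `changes` is overwritten.
    """
    out = df.copy()
    dates = pd.Index(changes[date_col])      # assumed to contain no duplicates
    msk = out[date_col].isin(dates)
    for col, val in changes.items():
        if col == date_col:
            continue
        # A scalar is repeated for every listed date; a list pairs up with them.
        per_date = pd.Series(val, index=dates)
        # Look up the new value for each selected row by its date.
        new = out.loc[msk, date_col].map(per_date)
        if col in pct_cols:
            out[col] = out[col].astype(float)
            out.loc[msk, col] = out.loc[msk, col] * (1 + new / 100)
        else:
            out.loc[msk, col] = new
    return out


df = pd.DataFrame({
    "Date": ["01/01/2022", "01/02/2022", "01/03/2022",
             "01/04/2022", "01/05/2022", "01/06/2022"],
    "col_1": [90, 80, 92, 96, 99, 105],
    "col_2": [100, 110, 120, 130, 150, 155],
    "col_3": [0, 1, 0, 0, 1, 1],
})
dic = {"Date": ["01/01/2022", "01/02/2022"],
       "col_1": [-10, -10], "col_2": 10, "col_3": [1, 0]}

result = apply_changes(df, dic)
print(result)
```

The function works on a copy, so the original df is left untouched; drop the copy if you would rather modify the dataframe in place.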