I have a data as below
Cust_ID Age Open_Balance
1 3 $50,000.00
1 -15 $51,000.00
1 59 $48,000.00
2 -1 $150,216.00
2 -1 $247,523.00
3 New columns need to be created in the same data frame with the help of data in [Age] and [Open_Balance] as below.
**df['0 to 30']=(df['Open Balance'] if (df['Age ']>0 and df['Age ']<=30) else 0)**
**df['30 to 60']=(df['Open Balance'] if (df['Age ']>30 and df['Age ']<=60) else 0)**
**df['Negative_amount']=(df['Open Balance'] if (df['Age ']<0) else 0)**
Result:
Cust_ID Age Open Balance 0 to 30 30 to 60 Negative amount
1 3 $50,000.00 $50,000.00 0 0
1 -15 $51,000.00 0 0 $51,000.00
1 59 $48,000.00 0 $48,000.00 0
2 -1 $150,216.00 0 0 $150,216.00
2 -1 $247,523.00 0 0 $247,523.00
I'm new to python and tried a lot to build logic with these conditions but couldn't able to make it. kindly help.
CodePudding user response:
A bunch of numpy.where
statements would do here
df['0 to 30'] = np.where( (df['Age']>0) & (df['Age']<=30) , df['Open_Balance'],0)
df['30 to 60'] = np.where( (df['Age']>30) & (df['Age']<=60) , df['Open_Balance'],0)
df['Negative_amount'] = np.where( df['Age']<0, df['Open_Balance'],0)
Cust_ID Age Open_Balance 0 to 30 30 to 60 Negative_amount
0 1 3 $50,000.00 $50,000.00 0 0
1 1 -15 $51,000.00 0 0 $51,000.00
2 1 59 $48,000.00 0 $48,000.00 0
3 2 -1 $150,216.00 0 0 $150,216.00
4 2 -1 $247,523.00 0 0 $247,523.00
CodePudding user response:
A pure Pandas solution would be this:
df['0 to 30'] = df['Open Balance'][df['Age '] > 0 & df['Age '] <= 30]
df['30 to 60'] = df['Open Balance'][df['Age '] > 30 & df['Age '] <= 60]
df['Negative_amount'] = df['Open Balance'][df['Age '] < 0]
This would leave your dataframe with a bunch of NaN
values. You can get rid of these by running df.fillna()
after those lines.
Note that pandas uses a slightly different syntax for conditions than pure Python code. Where we use and
and or
in a regular python if-statement, these are replaced by &
and |
respectively in Pandas and Numpy.
You can have a closer look at Pandas indexing here: Pandas indexing documentation