I have a data frame of rental data and would like to annualise the rent: where the frequency column states that the rent is monthly, the annual figure should be price * 12.
The frequency column contains the following values: 'Yearly', 'Monthly', nan
I have tried: np.where(df['frequency'] == "Monthly", df['price'].apply(lambda x: x*12), 0)
However, where the data is monthly, the figure seems to be repeated 12 times rather than multiplied by 12.
I need the price multiplied by 12 but can't figure out how to do this.
CodePudding user response:
The problem is that your price column contains strings, not numeric values.
If you load your dataframe from a file (csv, xlsx), pass thousands=',' as a parameter of pd.read_csv or pd.read_excel to interpret a string like '4,500' as the number 4500.
Demo:
import pandas as pd
import io
csvdata = """\
frequency;price
Monthly;4,500
Yearly;30,200
"""
df1 = pd.read_csv(io.StringIO(csvdata), sep=';')
df2 = pd.read_csv(io.StringIO(csvdata), sep=';', thousands=',')
For df1:
>>> df1
frequency price
0 Monthly 4,500
1 Yearly 30,200
>>> df1.dtypes
frequency object
price object # not numeric
dtype: object
>>> df1['price'] * 2
0 4,5004,500
1 30,20030,200
Name: price, dtype: object
For df2:
>>> df2
frequency price
0 Monthly 4500
1 Yearly 30200
>>> df2.dtypes
frequency object
price int64 # numeric
dtype: object
>>> df2['price'] * 2
0 9000
1 60400
Name: price, dtype: int64
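If the dataframe is already in memory and re-reading the file is not convenient, the column can also be converted in place. The following is a minimal sketch, assuming the comma is a thousands separator; the sample rows are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; prices are strings with a thousands separator
df = pd.DataFrame({
    "frequency": ["Monthly", "Yearly", None],
    "price": ["4,500", "30,200", "1,000"],
})

# Strip the commas, then convert to a numeric dtype.
# errors="coerce" turns any value that still cannot be parsed into NaN
# instead of raising an exception.
df["price"] = pd.to_numeric(
    df["price"].str.replace(",", "", regex=False),
    errors="coerce",
)
```

After this, df['price'] * 12 performs arithmetic rather than string repetition.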
CodePudding user response:
It seems the price column contains strings instead of floats, so first replace , with . and then convert to floats, and finally multiply by 12:
np.where(df['frequency'] == "Monthly", df['price'].str.replace(',','.').astype(float)*12, 0)
If the , is a thousands separator, replace it with an empty string instead:
np.where(df['frequency'] == "Monthly", df['price'].str.replace(',','').astype(float)*12, 0)
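Note that both calls above return 0 for every row that is not monthly, including the yearly ones. If the goal is an annual figure for all rows, one possible variant is sketched below. It assumes yearly prices should be kept unchanged and rows with a missing frequency should become NaN; the annual_rent column name and the sample rows are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data matching the values described in the question
df = pd.DataFrame({
    "frequency": ["Monthly", "Yearly", np.nan],
    "price": ["4,500", "30,200", "1,000"],
})

# Convert the string prices to floats (comma assumed to be a thousands separator)
price = df["price"].str.replace(",", "", regex=False).astype(float)

# Monthly rows are multiplied by 12, yearly rows are kept as they are,
# and rows with any other or missing frequency get NaN
df["annual_rent"] = np.select(
    [df["frequency"] == "Monthly", df["frequency"] == "Yearly"],
    [price * 12, price],
    default=np.nan,
)
```

np.select is used here rather than nested np.where calls only because it reads more clearly with more than one condition; either approach should give the same result.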