I've read a CSV file into Python using:
import pandas as pd
import numpy as np
water_consumption = pd.read_csv('Self_Data.csv')
and I'm trying to square the columns using:
exponent = 2
water_consumption['x2'] = np.power(water_consumption['Consumption_(HCF)'], exponent)
water_consumption['y2'] = np.power(water_consumption['Water&Sewer_Charges'], exponent)
I keep getting the error:
TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'
I'm fairly new to Python. Is there an easy way to fix this?
CodePudding user response:
You can use apply with a lambda function, but this only works if the column's dtype is numeric, not object.
To check, look at water_consumption['x2'].dtype (type(water_consumption['x2']) only tells you that it is a Series).
If the column is numeric, try this:
water_consumption['x2'] = water_consumption['x2'].apply(lambda x: x**2)
>>> import pandas as pd
>>> water_consumption = {"x2": [1, 2, 3, 4], "y2": [5, 6, 7, 8]}
>>> water_consumption = pd.DataFrame(water_consumption)
>>> water_consumption
   x2  y2
0   1   5
1   2   6
2   3   7
3   4   8
>>> water_consumption['x2'] = water_consumption['x2'].apply(lambda x: x**2)
>>> water_consumption
   x2  y2
0   1   5
1   4   6
2   9   7
3  16   8
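If it helps, here is a minimal sketch of how the dtype check could look. The DataFrame and its column names (as_text, as_int) are made up for illustration and are not from the question's file.

```python
import pandas as pd

# Hypothetical sample: one column holds strings, the other holds integers.
df = pd.DataFrame({"as_text": ["1", "2", "3"], "as_int": [1, 2, 3]})

# .dtypes reports the dtype of every column at once.
print(df.dtypes)

# is_numeric_dtype gives a True/False answer for a single column.
text_is_numeric = pd.api.types.is_numeric_dtype(df["as_text"])
int_is_numeric = pd.api.types.is_numeric_dtype(df["as_int"])
print(text_is_numeric, int_is_numeric)
```

A column that reports as non-numeric here would raise the same TypeError when squared.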
CodePudding user response:
Never use apply with a lambda for straightforward mathematical operations; it can be orders of magnitude slower than direct, vectorized operations.
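As a rough sketch of what "direct operations" means here, assuming a column that is already numeric (the column name x and its values are invented for the example):

```python
import numpy as np
import pandas as pd

# Hypothetical numeric column standing in for the real data.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})

squared_direct = df["x"] ** 2                    # vectorized operator
squared_numpy = np.power(df["x"], 2)             # NumPy ufunc, also vectorized
squared_apply = df["x"].apply(lambda v: v ** 2)  # one Python call per row
```

All three give the same values; the first two simply avoid calling a Python function for every row.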
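The problem click004 is having is that the columns are stored as strings (str, which pandas usually reports as object dtype). That is why the error mentions 'str' and 'int'.
They should be converted to a numeric type first, typically with pd.to_numeric(). Note that .convert_dtypes() only picks better dtypes for the values as they already are and will not parse strings into numbers: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.convert_dtypes.html
Pandas is quite good at inferring the type of a column, so the fact that these were read as str probably means that some of the values are not plain numbers and may contain units or something else.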
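A possible way to find the offending values and convert the column is sketched below. The sample values and the "$" and "," formatting are assumptions, since the actual contents of Self_Data.csv are not shown; adjust the cleanup pattern to whatever the real data contains.

```python
import pandas as pd

# Hypothetical sample: the values and their "$" / "," formatting are assumptions.
df = pd.DataFrame({"Water&Sewer_Charges": ["$1,200.50", "35.25", "n/a"]})
col = df["Water&Sewer_Charges"]

# Show which raw values cannot be parsed as numbers as they stand.
unparsed = col[pd.to_numeric(col, errors="coerce").isna()]
print(unparsed.tolist())

# Strip the assumed currency symbol and thousands separator, then convert.
# errors="coerce" turns anything still unparseable into NaN instead of raising.
cleaned = col.str.replace(r"[$,]", "", regex=True)
df["charges_num"] = pd.to_numeric(cleaned, errors="coerce")

# Once the column is numeric, squaring works directly.
df["y2"] = df["charges_num"] ** 2
```

Printing the unparsed values first is worth doing before choosing a cleanup rule, because coercing silently hides bad rows as NaN.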