How do I square a column from an Excel file with pandas?

I've read an Excel file into Python using:

import pandas as pd
import numpy as np

water_consumption = pd.read_csv('Self_Data.csv')

and I'm trying to square the columns using:

exponent = 2
water_consumption['x2'] = np.power(water_consumption['Consumption_(HCF)'], exponent)
water_consumption['y2'] = np.power(water_consumption['Water&Sewer_Charges'], exponent)

I keep getting the error:

TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'

I'm fairly new to Python. Is there an easy way to fix this?

CodePudding user response：

You can use apply with a lambda function, but this only works if the column's dtype is numeric rather than object. To check, look at water_consumption['x2'].dtype. Then try:

water_consumption['x2']=water_consumption['x2'].apply(lambda x:x**2)



>>> import pandas as pd
>>> water_consumption={"x2":[1,2,3,4],"y2":[5,6,7,8]}
>>> water_consumption=pd.DataFrame(water_consumption)
>>> water_consumption
   x2  y2
0   1   5
1   2   6
2   3   7
3   4   8
>>> water_consumption['x2']=water_consumption['x2'].apply(lambda x:x**2)
>>> water_consumption
   x2  y2
0   1   5
1   4   6
2   9   7
3  16   8
>>>

CodePudding user response：

Avoid apply with a lambda for straightforward mathematical operations; on large columns it is typically far slower than operating on the whole column directly. The problem that click004 is having is that the columns are stored as str.
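To illustrate the difference, here is a minimal sketch using a small made-up DataFrame (the column names x, x_squared and x_squared_apply are hypothetical, not from the question). Both approaches give the same values on a numeric column; the direct form simply lets pandas operate on the whole column at once.

```python
import pandas as pd

# Hypothetical numeric column, used only for illustration
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})

# Direct (vectorized) operation on the whole column
df["x_squared"] = df["x"] ** 2

# Same result with apply, computed one element at a time
df["x_squared_apply"] = df["x"].apply(lambda v: v ** 2)

print(df)
```

The direct form is also what np.power(df["x"], 2) does, so the code in the question should work unchanged once the columns are numeric.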

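They should be converted to a numeric type first, typically with pd.to_numeric(). Note that .convert_dtypes() (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.convert_dtypes.html) only picks the best dtype for the values as they are stored, so a column of strings remains a string column; it does not parse text into numbers.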

Pandas is usually good at inferring column types when it reads a file, so the fact that these columns were read as str probably means that some of the values are not plain numbers and may include units or other non-numeric text.
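As a rough sketch of that cleanup, the example below builds a small stand-in for Self_Data.csv. The sample values (a "$" prefix, a thousands separator, an "n/a" entry) are assumptions about what might be in the file, not something shown in the question. It strips the non-numeric characters, converts with pd.to_numeric, and then squares the columns directly.

```python
import pandas as pd

# Hypothetical sample data standing in for Self_Data.csv
water_consumption = pd.DataFrame({
    "Consumption_(HCF)": ["10", "12", "n/a", "15"],
    "Water&Sewer_Charges": ["$101.50", "$1,200.00", "$98.75", "$130.00"],
})

# Remove currency symbols and thousands separators before converting
charges = water_consumption["Water&Sewer_Charges"].str.replace(r"[$,]", "", regex=True)

# errors="coerce" turns anything that still cannot be parsed into NaN
water_consumption["Water&Sewer_Charges"] = pd.to_numeric(charges, errors="coerce")
water_consumption["Consumption_(HCF)"] = pd.to_numeric(
    water_consumption["Consumption_(HCF)"], errors="coerce"
)

# The columns are numeric now, so squaring works directly
water_consumption["x2"] = water_consumption["Consumption_(HCF)"] ** 2
water_consumption["y2"] = water_consumption["Water&Sewer_Charges"] ** 2

print(water_consumption.dtypes)
print(water_consumption)
```

Rows that could not be parsed end up as NaN, so water_consumption[water_consumption["x2"].isna()] is a quick way to see which original values need attention.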