I am trying to run the following code, but it will not let me convert the columns Amount currency and Amount to float. My code is below:
import pandas as pd
from datetime import datetime
df= pd.read_csv(r'C:\Users\Desktop\CustomerData.csv')
parsed = pd.to_datetime(df["Date"], errors="coerce").fillna(pd.to_datetime(df["Date"],format="%Y-%d-%m",errors="coerce"))
ordinal = pd.to_numeric(df["Date"], errors="coerce").apply(lambda x: pd.Timestamp("1899-12-30") + pd.Timedelta(x, unit="D"))
df["Date"] = parsed.fillna(ordinal)
newdf = df.loc[(df.Type == "Sales Invoice")]
df2 = newdf.groupby(['Date','Customer','Type'])[["Amount currency", "Amount"]].apply(lambda x : x.astype(float).sum())
But I am getting the error below:
could not convert string to float: '10,084.80'
I have checked the contents of the number columns in the CSV file, and they do not appear to contain any special characters.
Below is the general info of my data set:
Data columns (total 5 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   Date             80 non-null     datetime64[ns]
 1   Customer         80 non-null     object
 2   Amount currency  80 non-null     object
 3   Type             80 non-null     object
 4   Amount           80 non-null     object
dtypes: datetime64[ns](1), object(4)
The columns Amount currency and Amount cannot be converted to float with the code above.
These are the steps I followed:
1. Importing the data set
2. Cleaning the date column and arranging the data
3. Converting the date column to YYYY-MM format
4. Creating a group by after filtering Type to "Sales Invoice"
5. The group-by sum generates a set of multiple values instead of the sum of the values, so the values need to be converted to float.
Can anyone help?
Thanks
CodePudding user response:
You can use the following code:
df['Amount currency'] = df['Amount currency'].str.replace(r'[^0-9\.]', '', regex=True)
df['Amount currency'] = pd.to_numeric(df['Amount currency'])
The first line deletes any character that is not a digit or a dot. So, instead of '10,084.80', you will get '10084.80'. The second line then converts that string to a float.
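To see the effect of those two lines in isolation, here is a minimal sketch on a small Series. The sample values are made up for illustration and only stand in for the real "Amount currency" column.

```python
import pandas as pd

# Hypothetical sample values standing in for the "Amount currency" column.
amounts = pd.Series(["10,084.80", "250.00", "1,000"])

# Remove everything except digits and the decimal point, then convert.
cleaned = amounts.str.replace(r"[^0-9.]", "", regex=True)
as_float = pd.to_numeric(cleaned)

print(as_float.dtype)     # float64
print(as_float.tolist())  # [10084.8, 250.0, 1000.0]
```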
Beware: if you have negative numbers (e.g. -10,084.80), the first line will also delete the '-', so you'll end up with a positive float. If you need to keep the '-' to get a negative float, just change the regex to r'[^0-9\.\-]', which keeps the minus sign and gives you -10084.80.
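Putting it together, the following is a rough sketch of how the minus-preserving regex could be applied to both amount columns before the group-by sum from the question. The DataFrame here is invented sample data shaped like the question's data set, and `amount_cols` is just an illustrative name.

```python
import pandas as pd

# Hypothetical rows shaped like the question's data set.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-05", "2021-01-05", "2021-01-06"]),
    "Customer": ["A", "A", "B"],
    "Type": ["Sales Invoice", "Sales Invoice", "Credit Note"],
    "Amount currency": ["10,084.80", "-84.80", "5.00"],
    "Amount": ["1,000.50", "-0.50", "7.00"],
})

amount_cols = ["Amount currency", "Amount"]
for col in amount_cols:
    # Keep digits, the decimal point and the minus sign; drop everything else.
    cleaned = df[col].str.replace(r"[^0-9.\-]", "", regex=True)
    df[col] = pd.to_numeric(cleaned)

# Same filter and grouping as in the question, now on numeric columns.
sales = df.loc[df["Type"] == "Sales Invoice"]
totals = sales.groupby(["Date", "Customer", "Type"])[amount_cols].sum()
print(totals)
```

Once the columns are numeric, a plain `.sum()` on the group-by is enough and the `astype(float)` inside `apply` is no longer needed. If the commas are thousands separators throughout the file, passing `thousands=","` to `pd.read_csv` may also avoid the cleanup step altogether.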