I am working with a data set in which the variables' data types no longer match their original types.
The output of str() is below:
str(df)
'data.frame': 15 obs. of 13 variables:
$ Invoice_No : int 1 2 3 4 5 6 7 8 9 10 ...
$ Invoice_Date : chr "1-Apr-21" "1-Apr-21" "1-Apr-21" "3-Apr-21" ...
$ Customer_Name: chr "I" "F" "J" "C" ...
$ Product : chr "AY201" "AY201" "GR70171" "SUB547" ...
$ Qty : int 150 50 25 200 200 100 25 2300 420 60 ...
$ Price : int 2350 2300 6950 390 1760 390 2450 260 267 390 ...
$ Credit_Terms : int 45 10 1 30 30 30 30 30 30 30 ...
$ Zone : chr "West" "West" "North" "North" ...
$ Stock_in_date: chr "2-Mar-21" "2-Mar-21" "8-Jan-21" "15-Jan-21" ...
$ Purchase_Cost: num 1611 1611 4788 285 1611 ...
$ Gross_Profit : num 100523 31084 26910 16651 15356 ...
$ UOM : chr "KG" "KG" "KG" "KG" ...
$ Invoice_Value: int 415950 135700 205025 92040 415360 46020 72275 706346 132325 27612 ...
A description of the data is given below.
Sales data with 15 rows and 13 columns:
- Invoice_No: Invoice number (Data Type: Numeric)
- Invoice_Date: Date of invoice (Data Type: Date)
- Customer_Name: Name of the customer (Data Type: Factor)
- Product: Product name (Data Type: Factor)
- Qty: Quantity purchased (Data Type: Integer)
- Price: Price of the product (Data Type: Integer)
- Credit_Terms: Credit terms agreed by the customer 1/30/60/90 (Data Type: Factor)
- Zone: Zone location of the customer NORTH/WEST/SOUTH/EAST (Data Type: Factor)
- Stock_in_date: Purchase date of the product (Data Type: Date)
- Purchase_Cost: Purchase price of the product (Data Type: Integer)
- Gross_Profit: Profit (Data Type: Numeric)
- UOM: Unit of Measurement of the product KG/LTR/NOS (Data Type: Factor)
- Invoice_Value: Invoice value of the product (Data Type: Numeric)
As you can see, the actual data set should have 3 numeric variables, 5 factor variables, 2 date variables, and 3 integer variables.
I am trying to change all the variables of a given type in one go, but it is not working:
df[c("Invoice_No","Gross_Profit","Invoice_Value")] <- as.numeric(df[c("Invoice_No","Gross_Profit","Invoice_Value")])
However, this works:
df["Invoice_No"] <- as.numeric(df["Invoice_No"])
How do I change the data type of multiple variables to the same data type in one go?
I don't know what mistake I am making. Please help with this.
CodePudding user response:
# Columns to convert
x <- c("Invoice_No","Gross_Profit","Invoice_Value")
# Apply as.numeric() to each column and assign the results back
df[x] <- lapply(df[x], as.numeric)
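Why this works: as.numeric() expects a vector, while df[x] is a data frame (a list of columns), so the conversion has to be applied column by column, which is what lapply() does. The same pattern should extend to the factor and date columns from the question. Below is a minimal sketch, assuming a small made-up stand-in for the data (only five of the thirteen columns, three rows) and assuming the dates always look like "1-Apr-21"; the names num_cols, fct_cols and date_cols are hypothetical helpers, not part of the original code.

```r
# Hypothetical three-row stand-in for the asker's data frame
df <- data.frame(
  Invoice_No   = 1:3,
  Invoice_Date = c("1-Apr-21", "1-Apr-21", "3-Apr-21"),
  Zone         = c("West", "West", "North"),
  Credit_Terms = c(45L, 10L, 1L),
  Gross_Profit = c(100523, 31084, 26910),
  stringsAsFactors = FALSE
)

# Group the column names by target type (illustrative names)
num_cols  <- c("Invoice_No", "Gross_Profit")
fct_cols  <- c("Zone", "Credit_Terms")
date_cols <- "Invoice_Date"

# Convert each group column by column and assign the results back
df[num_cols] <- lapply(df[num_cols], as.numeric)
df[fct_cols] <- lapply(df[fct_cols], as.factor)
# "%d-%b-%y" is assumed to match strings such as "1-Apr-21";
# %b (abbreviated month name) depends on the current locale
df[date_cols] <- lapply(df[date_cols], as.Date, format = "%d-%b-%y")

str(df)
```

If the month names are not parsed (NA dates), the session locale is probably not English, and the format or locale would need adjusting.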
CodePudding user response:
A tidyverse solution is as follows:
library(dplyr)
df1 <- df %>%
  mutate(across(c("Invoice_No","Gross_Profit","Invoice_Value"), as.numeric))
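Several across() calls can sit in a single mutate(), so the other type groups from the question could be handled in the same step. This is only a sketch under assumptions: the data frame is a made-up three-row stand-in with a subset of the columns, the vectors num_cols, fct_cols and date_cols are hypothetical, and the date format "%d-%b-%y" is assumed from values such as "1-Apr-21".

```r
library(dplyr)

# Hypothetical three-row stand-in for the asker's data frame
df <- data.frame(
  Invoice_No   = 1:3,
  Invoice_Date = c("1-Apr-21", "1-Apr-21", "3-Apr-21"),
  Zone         = c("West", "West", "North"),
  UOM          = c("KG", "KG", "KG"),
  Gross_Profit = c(100523, 31084, 26910),
  stringsAsFactors = FALSE
)

# Column names grouped by target type (illustrative names)
num_cols  <- c("Invoice_No", "Gross_Profit")
fct_cols  <- c("Zone", "UOM")
date_cols <- "Invoice_Date"

df1 <- df %>%
  mutate(
    # all_of() selects columns from a character vector of names
    across(all_of(num_cols), as.numeric),
    across(all_of(fct_cols), as.factor),
    # %b (abbreviated month name) depends on the current locale
    across(all_of(date_cols), ~ as.Date(.x, format = "%d-%b-%y"))
  )

str(df1)
```

Keeping the column names in vectors and wrapping them in all_of() makes a misspelled name fail with an error instead of being silently skipped.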