I have a CSV file with information about cars (price, model, color, and more), which I loaded into R with read.csv. Some variables are text-based categorical variables, such as Model, color, and fuel type. I came up with a for loop to find these text-based categorical variables:
for(i in 1:dim(car)[2]){
  if(is.character(car[,i])){
    print(names(car)[i])
  }
}
Here, car is the name of the data frame. Now I want to extend the loop so that it also finds the index of each such column. For example, the Model column is column 2, but how should I integrate that into this loop? Below is what I have so far, but the output is "integer(0)".
for(i in 1:dim(car)[2]){
  if(is.character(car[,i])){
    print(which(i==colnames(car)))
  }
}
CodePudding user response:
dim(car)[2] is the number of columns of car (ncol() is a more common way to get this number for a data frame). 1:dim(car)[2] is therefore 1, 2, 3, ... up to the number of columns.
So for(i in ...) means i will be 1, then i will be 2, ... up to the number of columns.
When your if statement is TRUE, the current value of i is the column number. So you want print(i) inside your if() statement.
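As a minimal sketch of that suggestion: the data frame below is a hypothetical stand-in for the real CSV (its column names and values are assumptions for illustration only), and char_idx is a helper name that is not in the original post. It prints the index and name of each character column and also collects the indices.

```r
# Hypothetical stand-in for the data loaded with read.csv()
car <- data.frame(
  Price = c(15000, 22000, 18000),
  Model = c("Civic", "Corolla", "Focus"),
  Color = c("red", "blue", "white"),
  Year  = c(2015L, 2018L, 2016L),
  stringsAsFactors = FALSE
)

char_idx <- integer(0)               # will hold the indices of character columns
for (i in seq_len(ncol(car))) {      # seq_len() is also safe for zero columns
  if (is.character(car[, i])) {
    print(i)                         # the column number
    print(names(car)[i])             # the column name
    char_idx <- c(char_idx, i)
  }
}
```

With this made-up data, the loop reports columns 2 and 3 (Model and Color).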
Your attempt, print(which(i==colnames(car))), fails because colnames(car) are the names of the columns, and i is the number of the column. Names and numbers are different, so the comparison never matches and which() returns integer(0).
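To see the mismatch concretely, here is a small sketch on a hypothetical data frame (the columns are made up for illustration, and cmp, by_number, and by_name are helper names not in the original post). Comparing a number with a character vector coerces the number to a string, so 2 is compared as "2" against each column name.

```r
# Hypothetical stand-in for the real data
car <- data.frame(
  Price = c(15000, 22000),
  Model = c("Civic", "Corolla"),
  Color = c("red", "blue"),
  stringsAsFactors = FALSE
)

i <- 2
cmp <- i == colnames(car)    # "2" is compared with "Price", "Model", "Color"
print(cmp)                   # no element matches

by_number <- which(cmp)                      # empty: a number never equals a name
by_name   <- which("Model" == colnames(car)) # comparing a name with names works
print(by_number)
print(by_name)
```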
A more R-like way to do this would be to use sapply instead of a loop. Something like this:
char_cols = sapply(car, is.character)
char_cols # named vector saying if each column is character or not
char_cols[char_cols] # look only at the character columns
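To get from that logical vector to the column indices the question asks for, one option is to pass it to which(). A minimal sketch, again on a hypothetical data frame (column names and values are assumptions; char_idx is a helper name not in the original answer):

```r
# Hypothetical stand-in for the real data
car <- data.frame(
  Price = c(15000, 22000, 18000),
  Model = c("Civic", "Corolla", "Focus"),
  Color = c("red", "blue", "white"),
  stringsAsFactors = FALSE
)

char_cols <- sapply(car, is.character)  # named logical vector, one entry per column
char_idx  <- which(char_cols)           # named integer vector of the TRUE positions

print(char_idx)          # indices, labelled with the column names
print(unname(char_idx))  # just the indices
print(names(char_idx))   # just the names
```

Because which() keeps the names, a single call gives both the column number and the column name.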
CodePudding user response:
The "which" function can still be used. Building on the response from Gregor Thomas, the for loop can be modified as follows:
for(i in 1:ncol(car)){
  if(is.character(car[,i])){
    print(names(car)[i])
    print(which(names(car)[i]==colnames(car)))
  }
}
- we first print the column name through print(names(car)[i])
- then we ask R for the position at which that name matches the column names of the "car" dataset, which is the column index
Once again, thank you to Mr. Gregor Thomas.
CodePudding user response:
A slight variation of Gregor Thomas' smart recommendation is to use sapply with the typeof function to get the type of every column, and then the which function to get the character column numbers:
x <- sapply(car, typeof)
y <- which(x == 'character')
Also note that you can see which columns are character by visually inspecting the data frame's structure with str(car).
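One caveat worth checking: in R versions before 4.0.0, read.csv converted strings to factors by default (stringsAsFactors = TRUE), and typeof() reports "integer" for a factor, so neither typeof nor is.character would flag those columns. The sketch below assumes that situation and uses a hypothetical data frame (car_f, types, and text_like are illustrative names, not from the original post).

```r
# Hypothetical data where text columns were read in as factors
car_f <- data.frame(
  Price = c(15000, 22000, 18000),
  Model = c("Civic", "Corolla", "Focus"),
  Color = c("red", "blue", "white"),
  stringsAsFactors = TRUE
)

types <- sapply(car_f, typeof)
print(types)  # the factor columns show up as "integer", not "character"

# Treat both character and factor columns as text-based categorical variables
text_like <- which(sapply(car_f, function(col) is.character(col) || is.factor(col)))
print(text_like)
```

If str(car) shows chr columns, this extra check is unnecessary and the character-only versions above are enough.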