How to formulate for loop here

I have a CSV file with information about cars (price, model, color, and more), which I have loaded into R with read.csv. Some variables are text-based categorical variables, such as Model, Color, and fuel type. I came up with a for loop to find these text-based categorical variables:

for(i in 1:dim(car)[2]){ 
  if(is.character(car[,i])){
  print(names(car)[i])
  }
}

(car is the name of the data set.) Now I want to extend the loop so that it also finds the index of each such column. For example, the Model column is column 2, but how should I integrate that into this loop? Below is what I have so far, but the output is integer(0).

for(i in 1:dim(car)[2]){ 
  if(is.character(car[,i])){ 
    print(which(i==colnames(car)))}
}

CodePudding user response：

dim(car)[2] is the number of columns of car. (ncol() is a more common way to get this number for a data frame).

1:dim(car)[2] is therefore 1, 2, 3, ... up to the number of columns.

So for(i in ...) means i will be 1, then i will be 2, .... up to the number of columns.

When your if statement is TRUE, the current value of i is the column number. So you want print(i) inside your if() statement.
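As a minimal sketch of that fix, assuming a small made-up data frame standing in for the real CSV (the column names and values here are hypothetical, as is the char_idx helper vector):

```r
# Hypothetical stand-in for the data loaded with read.csv()
car <- data.frame(
  Price = c(15000, 22000),
  Model = c("Civic", "Corolla"),
  Color = c("red", "blue"),
  stringsAsFactors = FALSE
)

char_idx <- integer(0)  # collects the column numbers as the loop runs
for (i in seq_len(ncol(car))) {
  if (is.character(car[, i])) {
    print(i)                    # i is already the column index
    char_idx <- c(char_idx, i)  # keep it for later use
  }
}
```

Collecting the indices in a vector is optional; it is only there so the result can be reused after the loop instead of just being printed.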

Your attempt, print(which(i == colnames(car))), fails because colnames(car) are the names of the columns, while i is the number of the column. Names and numbers are different, so the comparison never matches and which() returns integer(0).
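To see why nothing matches, here is a small sketch with hypothetical column names. R coerces the number to a string before comparing, so 2 becomes "2", which is not a column name:

```r
col_names <- c("Price", "Model", "Color")  # hypothetical column names
i <- 2

# The number 2 is coerced to the string "2", which equals no column name
matches <- i == col_names
idx_wrong <- which(matches)   # empty result: nothing matched

# Comparing a name against the names does find the position
idx_right <- which(col_names == "Model")
```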

A more R-like way to do this would be to use sapply instead of a loop. Something like this:

char_cols = sapply(car, is.character)
char_cols # named vector saying if each column is character or not
char_cols[char_cols] # look only at the character columns
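To get the column numbers themselves from that logical vector, one option is to pass it to which(). This is a sketch on a hypothetical data frame, not the asker's actual data:

```r
# Hypothetical stand-in for the data loaded with read.csv()
car <- data.frame(
  Price = c(15000, 22000),
  Model = c("Civic", "Corolla"),
  Color = c("red", "blue"),
  stringsAsFactors = FALSE
)

char_cols <- sapply(car, is.character)  # named logical vector, one entry per column
char_idx <- which(char_cols)            # positions of the TRUE entries, keeping the names

names(char_idx)   # names of the character columns
unname(char_idx)  # their column numbers
```

Because which() keeps the names, the result gives both the column name and its index in one object.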

CodePudding user response：

The which function can still be used. Building on the response from Gregor Thomas, the for loop can be modified like this:

for(i in 1:ncol(car)){ 
  if(is.character(car[,i])){ 
    print(names(car)[i])
    print(which(names(car)[i]==colnames(car)))
  }
}


We first print the column name with print(names(car)[i]).
Then we ask R for the position at which that name matches the column names of the car dataset, which is the column index.

Once again, thank you to Mr. Gregor Thomas.

CodePudding user response：

A slight variation on Gregor Thomas' smart recommendation is to use sapply with the typeof function to get the type of every column, and then the which function to get the character column numbers:

x <- sapply(car, typeof)
y <- which(x == 'character')

Also note that you can see which columns are character by visually inspecting the data frame's structure with str(car).
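One caveat worth checking with str(): if the text columns were read in as factors (the read.csv default before R 4.0.0, or when stringsAsFactors = TRUE is set), typeof reports "integer" for them and is.character is FALSE. The sketch below uses a hypothetical data frame, and the combined test is just one possible workaround:

```r
# Hypothetical data with the text column stored as a factor
car_f <- data.frame(
  Price = c(15000, 22000),
  Model = c("Civic", "Corolla"),
  stringsAsFactors = TRUE
)

str(car_f)                           # shows Model as a Factor, not chr
model_type <- typeof(car_f$Model)    # factors are stored as integer codes

# Treat both character and factor columns as text-based
text_like <- which(sapply(car_f, function(col) is.character(col) || is.factor(col)))
```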