R: How to get column names for columns that contain a certain word AND their associated index number

Time:04-22

I want to create a list of column names that contain the word "arrest" AND their associated index number. I do not want all the columns, so I DO NOT want to subset the arrest columns into a new data frame. I merely want to see the list of names and their index numbers so I can delete the ones I don't want from the original data frame.

I tried getting the column names and their associated index numbers with the code below, but each attempt gave only one or the other. This gives me their names only:

colnames(x2009_2014)[grepl("arrest",colnames(x2009_2014))]

 [1] "poss_cannabis_tot_arrests"    "poss_drug_total_tot_arrests" 
 [3] "poss_heroin_coke_tot_arrests" "poss_other_drug_tot_arrests" 
 [5] "poss_synth_narc_tot_arrests"  "sale_cannabis_tot_arrests"   
 [7] "sale_drug_total_tot_arrests"  "sale_heroin_coke_tot_arrests"
 [9] "sale_other_drug_tot_arrests"  "sale_synth_narc_tot_arrests" 
[11] "total_drug_tot_arrests"  

This gives me their index numbers only

grep("arrest", colnames(x2009_2014))

[1]  93 168 243 318 393 468 543 618 693 768 843

But I want their name AND index number so that it looks something like this

 [93] "poss_cannabis_tot_arrests"
[168] "poss_drug_total_tot_arrests"
[243] "poss_heroin_coke_tot_arrests"
[318] "poss_other_drug_tot_arrests"
[393] "poss_synth_narc_tot_arrests"
[468] "sale_cannabis_tot_arrests"
[543] "sale_drug_total_tot_arrests"
[618] "sale_heroin_coke_tot_arrests"
[693] "sale_other_drug_tot_arrests"
[768] "sale_synth_narc_tot_arrests"
[843] "total_drug_tot_arrests"

Lastly, following advice from a linked answer, I tried the code below, but it did not work:

K=sapply(x2009_2014,function(x)any(grepl("arrest",x)))

which(K)
named integer(0)

The person who provided the advice in the above link used

K=sapply(df,function(x)any(grepl("\\D ",x)))
names(df)[K]
    Zo.A Zo.B 

which(K)
   Zo.A Zo.B 
     2    4 

I'd prefer the list I showed in the third block of code, but the code this person used provides a structure I can work with. It just did not work for me when I tried using it.

CodePudding user response:

Hacky as a one-liner because I really dislike using <- inside a function call, but this should work:

setNames(
  nm = matches <- grep("arrest", colnames(x2009_2014)),
  colnames(x2009_2014)[matches]
)

Reproducible example:

setNames(nm = x <- grep("b|c", letters), letters[x])
#   2   3 
# "b" "c" 

Or write your own function that does it. Here I put it in a data frame, which seems nicer than a named vector:

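grep_ind_value = function(pattern, x, ...) {
  # grep() takes the pattern first, then the character vector to search
  index = grep(pattern, x, ...)
  value = x[index]
  # one row per match: its position in x and the matching element
  data.frame(index, value)
}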
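
A possible usage, again on an invented data frame (`df`, its columns, and the choice of which column to drop are all assumptions made for illustration). The helper is repeated here with `grep()`'s arguments in the order pattern, then vector, so the snippet runs on its own:

```r
# Helper: positions and values of the elements of x that match pattern
grep_ind_value <- function(pattern, x, ...) {
  index <- grep(pattern, x, ...)
  data.frame(index, value = x[index])
}

# Hypothetical toy data frame standing in for x2009_2014
df <- data.frame(
  county = c("A", "B"),
  poss_cannabis_tot_arrests = c(1, 2),
  population = c(100, 200),
  total_drug_tot_arrests = c(3, 4)
)

# List the "arrest" columns with their index numbers
found <- grep_ind_value("arrest", colnames(df))
found

# Drop one unwanted arrest column from the original data frame by position
drop_idx <- found$index[found$value == "total_drug_tot_arrests"]
df_trimmed <- df[-drop_idx]
colnames(df_trimmed)
```

Looking the index up by name, rather than typing the number, keeps the drop step correct if the column order changes later.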