Select specific columns of a csv file that contain specific letters in R

I am using R to read a csv file.

Can anyone show me how to select only the columns whose names end with _ct?

CodePudding user response：

You can specify which columns you want to read into R from a csv file using the vroom package, e.g.

install.packages("vroom")
library(vroom)

data <- vroom("yourfile.csv", col_select = ends_with("_ct"))

CodePudding user response：

With readr::read_csv() you can specify which columns to read using tidyselect syntax, the same as in dplyr::select(). For example, to select the columns of the mtcars.csv example data whose names start with the letter d, we can do the following:

library(readr)
library(dplyr)

read_csv(readr_example("mtcars.csv"), col_select = starts_with("d"))

#> Rows: 32 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (2): disp, drat
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 32 × 2
#>     disp  drat
#>    <dbl> <dbl>
#>  1  160   3.9 
#>  2  160   3.9 
#>  3  108   3.85
#>  4  258   3.08
#>  5  360   3.15
#>  6  225   2.76
#>  7  360   3.21
#>  8  147.  3.69
#>  9  141.  3.92
#> 10  168.  3.92
#> # … with 22 more rows

Created on 2022-07-07 by the reprex package (v2.0.1)
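
Adapting this to the original question, a minimal sketch might look like the following. The temporary file and its column names (id, height_ct, weight_ct, ct_note) are made up for illustration, and it assumes readr 2.0 or later, where col_select is available.

```r
library(readr)
library(dplyr)

# Write a small example CSV to a temporary file (hypothetical data).
path <- tempfile(fileext = ".csv")
writeLines(c("id,height_ct,weight_ct,ct_note",
             "1,10,20,a",
             "2,11,21,b"), path)

# Read only the columns whose names end with "_ct".
# ct_note is skipped because "_ct" is not at the end of its name.
dat <- read_csv(path, col_select = ends_with("_ct"), show_col_types = FALSE)

names(dat)
```

Setting show_col_types = FALSE only silences the column specification message; it does not change which columns are read.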

CodePudding user response：

For a base R solution, we can use grep here:

# drop = FALSE keeps the result a data frame even if only one column matches
df_select <- df[, grep("_ct$", names(df), value = TRUE), drop = FALSE]
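
For anyone who wants to try the base R approach end to end, here is a self-contained sketch. The file contents and column names (id, height_ct, weight_ct, ct_note) are hypothetical, and it uses grepl() with a logical index as one possible variant of the grep() call.

```r
# Write a small example CSV to a temporary file (hypothetical data).
path <- tempfile(fileext = ".csv")
writeLines(c("id,height_ct,weight_ct,ct_note",
             "1,10,20,a",
             "2,11,21,b"), path)

df <- read.csv(path)

# grepl() returns TRUE for each name ending in "_ct";
# "$" anchors the pattern to the end of the name.
keep <- grepl("_ct$", names(df))

# drop = FALSE keeps a data frame even if only one column matches.
df_select <- df[, keep, drop = FALSE]

names(df_select)
```

Note that this reads the whole file first and subsets afterwards, unlike the col_select approaches, which skip the unwanted columns while reading.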

CodePudding user response：

The ends_with() function in the dplyr package is what you want.

library(dplyr)

df %>% dplyr::select(ends_with("_ct"))