What would you set delim to in R if you had a line of text data?

Can't find the answer to this question.

What would you set delim to if you had a line of text data as follows in R?

Casey.Brook-Smith.”1200 Clover Lane, Hamden, CT”.8605555812.10-24-2001

Test your answer with read_<fill in here>.

I tried:

#create data frame
df <- data.frame(Casey.Brook-Smith.”1200 Clover Lane, Hamden, CT”.8605555812.10-24-2001)

but it returned an error.

CodePudding user response：

Something like this may be what you want:

text <- 'Casey.Brook-Smith.”1200 Clover Lane, Hamden, CT”.8605555812.10-24-2001'
vals <- stringr::str_split(text, "\\.")[[1]]
tibble::as_tibble_row(vals, .name_repair = ~LETTERS[1:5])

# A tibble: 1 × 5
  A     B           C                              D          E         
  <chr> <chr>       <chr>                          <chr>      <chr>     
1 Casey Brook-Smith ”1200 Clover Lane, Hamden, CT” 8605555812 10-24-2001
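The "\\." in the split pattern is needed because a bare . is a regular-expression metacharacter that matches any character. A minimal base R sketch of the same split, assuming the line is typed with straight quotes and using made-up column names, might look like this:

```r
# Same line of text as in the question, typed with straight quotes.
text <- 'Casey.Brook-Smith."1200 Clover Lane, Hamden, CT".8605555812.10-24-2001'

# fixed = TRUE treats "." as a literal character, so no "\\." escaping is needed.
vals <- strsplit(text, ".", fixed = TRUE)[[1]]

# Hypothetical column names, chosen only for illustration.
names(vals) <- c("name", "surname", "address", "number", "dob")

# Turn the named character vector into a one-row data frame.
df <- as.data.frame(as.list(vals))
df
```

Note that a plain split knows nothing about quoting: it keeps the quote characters in the address and would break the address apart if it ever contained a period.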

CodePudding user response：

Something like this? The file-reading function is read_delim from the readr package. Base R's read.table has no delim argument; its separator argument is named sep.
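For comparison, here is a rough base R sketch using sep. The column names are my own assumption, and colClasses = "character" is only there to keep the phone number as text:

```r
# The line from the question, typed with straight quotes.
x <- 'Casey.Brook-Smith."1200 Clover Lane, Hamden, CT".8605555812.10-24-2001'

# read.table() takes the separator through `sep`; `text =` reads from a string
# instead of a file. The default `quote` setting keeps the quoted address whole.
df_base <- read.table(
  text = x,
  sep = ".",
  header = FALSE,
  col.names = c("name", "surname", "address", "number", "dob"),  # assumed names
  colClasses = "character"
)
df_base
```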

First, create a file with the string. The column names are my own guess at what makes sense, joined with . as the separator so they form the header line.

flname <- tempfile()
x <- 'Casey.Brook-Smith."1200 Clover Lane, Hamden, CT".8605555812.10-24-2001'

df1 <- data.frame(`name.surname.address.number.dob` = x)
write.table(df1, flname, quote = FALSE, row.names = FALSE)

Now read the string from the file.

df1 <- readr::read_delim(flname, delim = ".")
#> Rows: 1 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "."
#> chr (4): name, surname, address, dob
#> dbl (1): number
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
str(df1)
#> spec_tbl_df [1 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
#>  $ name   : chr "Casey"
#>  $ surname: chr "Brook-Smith"
#>  $ address: chr "1200 Clover Lane, Hamden, CT"
#>  $ number : num 8.61e+09
#>  $ dob    : chr "10-24-2001"
#>  - attr(*, "spec")=
#>   .. cols(
#>   ..   name = col_character(),
#>   ..   surname = col_character(),
#>   ..   address = col_character(),
#>   ..   number = col_double(),
#>   ..   dob = col_character()
#>   .. )
#>  - attr(*, "problems")=<externalptr>

df1
#> # A tibble: 1 × 5
#>   name  surname     address                          number dob       
#>   <chr> <chr>       <chr>                             <dbl> <chr>     
#> 1 Casey Brook-Smith 1200 Clover Lane, Hamden, CT 8605555812 10-24-2001

unlink(flname)

Created on 2022-10-29 with reprex v2.0.2
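If you would rather skip the temporary file, a possible variant (assuming readr 2.0 or later, where I() marks a string as literal data) is to pass the text directly. The column names are again assumptions, and reading every column as character keeps the phone number from being parsed as a double:

```r
# The line from the question, typed with straight quotes.
x <- 'Casey.Brook-Smith."1200 Clover Lane, Hamden, CT".8605555812.10-24-2001'

# I() tells readr that `x` is the data itself, not a file path.
df2 <- readr::read_delim(
  I(x),
  delim = ".",
  col_names = c("name", "surname", "address", "number", "dob"),  # assumed names
  col_types = readr::cols(.default = readr::col_character())
)
df2
```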