Best way to break up fixed-width files in R

Time:12-10

Below is the sample data.

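 # Quoted so the values stay character strings; as bare numbers, 24 digits exceed double precision
 string1 <- c("320100000020220311020210")
 string2 <- c("320100000020220312120211")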

 testitem <- data.frame (string1,string2)

What is the best way to split each string at specified positions? For example, the breaks should fall after characters 2, 4, 10, 14, 16, 18, 19, and 23. For the first string, the result should look like this:

 32    01    000000    2022    03    11    0   2021    0

Answer:

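If those are character strings, use read.fwf (from the utils package that ships with base R). Its widths argument takes the length of each field rather than the break positions: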

read.fwf(textConnection(unlist(testitem)),
    widths = c(2, 2, 6, 4, 2, 2, 1, 4, 1), colClasses = "character")

-output

  V1 V2     V3   V4 V5 V6 V7   V8 V9
1 32 01 000000 2022 03 11  0 2021  0
2 32 01 000000 2022 03 12  1 2021  1
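
The widths above can also be derived from the break positions given in the question instead of being counted by hand. A minimal sketch in base R, assuming every string has the same length; the names breaks, x, widths and out are only illustrative:

```r
# Cut positions from the question: a break falls after each of these characters.
breaks <- c(2, 4, 10, 14, 16, 18, 19, 23)

x <- c("320100000020220311020210", "320100000020220312120211")

# Field widths are the gaps between consecutive cut positions,
# with 0 prepended and the full string length appended.
widths <- diff(c(0, breaks, nchar(x[1])))
widths
#> [1] 2 2 6 4 2 2 1 4 1

# Same call as above, with the computed widths.
out <- read.fwf(textConnection(x), widths = widths, colClasses = "character")
```

Computing the widths this way keeps the break positions as the single source of truth, which may be easier to maintain if the layout changes.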

Another option is separate() from tidyr, which accepts a vector of character positions in sep:

library(stringr)
library(tidyr)
library(dplyr)
tibble(col1 = unlist(testitem)) %>% 
  separate(col1, into = str_c("V", 1:9), sep = c(2, 4, 10, 14, 16, 18, 19, 23))
# A tibble: 2 × 9
  V1    V2    V3     V4    V5    V6    V7    V8    V9   
  <chr> <chr> <chr>  <chr> <chr> <chr> <chr> <chr> <chr>
1 32    01    000000 2022  03    11    0     2021  0    
2 32    01    000000 2022  03    12    1     2021  1 
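
If neither a temporary connection nor extra packages are wanted, the same split can probably be done with substring() alone. This is a rough sketch rather than part of the original answer; split_fixed is a hypothetical helper name, not a function from any package:

```r
# split_fixed() is a hypothetical helper, written here only for illustration.
split_fixed <- function(x, breaks) {
  starts <- c(1, breaks + 1)  # first character of each field
  # The last field ends at each string's own length.
  fields <- lapply(x, function(s) substring(s, starts, c(breaks, nchar(s))))
  out <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
  names(out) <- paste0("V", seq_along(starts))
  out
}

x <- c("320100000020220311020210", "320100000020220312120211")
res <- split_fixed(x, c(2, 4, 10, 14, 16, 18, 19, 23))
```

Here the break positions from the question are used directly, so no conversion to widths is needed; every column comes back as character.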

data

string1 <- c("320100000020220311020210")
string2 <- c("320100000020220312120211")
testitem <- data.frame (string1,string2)