How can I write or change some characters in my file in R?

Time:03-02

I have a PRM file, which is a kind of text file, and I want to change some characters in it: for example, "md = minf;" to "md = maxf;" and "ls = 1" to "ls = 3". How can I do this? I don't know how to use the writeLines function in this situation.

> setwd("C:/Users/Documents/CONN")
> fileName <- "Hmin.PRM"
> conn <- file(fileName,open="r")
> linn <-readLines(conn)
> for (i in 1:length(linn)){
    print(linn[i])
  }
[1] ""
[1] "begin_pop = \"p1\";"
[1] "   beginfounder;"
[1] "      male   [n =   20, pop = \"hp\"];"
[1] "      female [n = 400, pop = \"hp\"];"
[1] "   endfounder;"
[1] "   ls  = 1;           "
[1] "   pmp = 0.5;                 "
[1] "   ng  = 10;                  "
[1] "   md  = minf;                 "
[1] "   sr  = 0.5;                 "
[1] "   dr  = 0.3;                 "
[1] "   sd  = ebv /h;             "
[1] "   cd  = ebv /l;              "
[1] "   ebvest = true_av;"
[1] "   begpop;"
[1] "        ld /maft 0.1;"
[1] "\t   crossover;"
[1] "        data;"
[1] "        stat;"
[1] "        genotype/gen 8 9 10;"
[1] "   endpop;"
[1] "end_pop;"
[1] "  "
> close(conn)

CodePudding user response:

linn <- c("begin_pop = \\p1\\;", "   beginfounder;", "      male   [n =   20, pop = \\hp\\];", "      female [n = 400, pop = \\hp\\];", "   endfounder;", "   ls  = 1;", "   pmp = 0.5;", "   ng  = 10;", "   md  = minf;", "   sr  = 0.5;", "   dr  = 0.3;", "   sd  = ebv /h;", "   cd  = ebv /l;", "   ebvest = true_av;", "   begpop;", "        ld /maft 0.1;", "\t   crossover;", "        data;", "        stat;", "        genotype/gen 8 9 10;", "   endpop;", "end_pop;")

gsub("\\b(ls\\s*=\\s*)1", "\\1 3",
     gsub("\\b(md\\s*=\\s*)minf", "\\1maxf", linn))
#  [1] "begin_pop = \\p1\\;"                    "   beginfounder;"                      
#  [3] "      male   [n =   20, pop = \\hp\\];" "      female [n = 400, pop = \\hp\\];" 
#  [5] "   endfounder;"                         "   ls  =  3;"                          
#  [7] "   pmp = 0.5;"                          "   ng  = 10;"                          
#  [9] "   md  = maxf;"                         "   sr  = 0.5;"                         
# [11] "   dr  = 0.3;"                          "   sd  = ebv /h;"                      
# [13] "   cd  = ebv /l;"                       "   ebvest = true_av;"                  
# [15] "   begpop;"                             "        ld /maft 0.1;"                 
# [17] "\t   crossover;"                         "        data;"                         
# [19] "        stat;"                          "        genotype/gen 8 9 10;"          
# [21] "   endpop;"                             "end_pop;"                              
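
The call above only returns the modified vector; nothing is written to disk yet. As a minimal sketch of the writeLines step the question asks about, the same two substitutions can be wrapped in a small helper that reads a file and writes the result to a second file. The helper name update_prm and the temporary files used in the demonstration are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical helper (not from the original answer): read a text file,
# apply the two substitutions shown above, and write the result back out.
update_prm <- function(infile, outfile) {
  lines <- readLines(infile)                               # one element per line
  lines <- gsub("\\b(md\\s*=\\s*)minf", "\\1maxf", lines)  # md = minf -> md = maxf
  lines <- gsub("\\b(ls\\s*=\\s*)1", "\\1 3", lines)       # ls = 1 -> ls =  3
  writeLines(lines, outfile)                               # one line per element
  invisible(lines)
}

# Self-contained demonstration on temporary files (assumed names).
tmp_in  <- tempfile(fileext = ".PRM")
tmp_out <- tempfile(fileext = ".PRM")
writeLines(c("   ls  = 1;", "   md  = minf;"), tmp_in)
update_prm(tmp_in, tmp_out)
result <- readLines(tmp_out)
print(result)
```

For the file in the question, the call would look like update_prm("Hmin.PRM", "Hmax.PRM"), where "Hmax.PRM" is an assumed output name; writing to a new file leaves the original untouched until the result has been checked.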

Explanation of the regex:

  • \\b is a word boundary, so we match md but not amd;
  • \\s* is zero or more whitespace characters;
  • (...) is a capture group, which (in this case) we want to bring back in the replacement;
  • \\1 recalls the first capture group;
  • the only reason I introduced a space between ls = and 3 (whereas I did not add a space before maxf) is that I wanted no visual ambiguity: \\13 could be misread as a reference to a 13th capture group. R didn't have a problem with \\13 (it only supports backreferences \\1 through \\9, so it reads it as group 1 followed by a literal 3), but I thought I'd keep the regex clear.
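
The points above can be checked on a small vector. This is only an illustrative sketch; the vector x and the result names r1 to r3 are made up for the example.

```r
# Illustrative input: "amd" is there to show the effect of the word boundary.
x <- c("md = minf;", "amd = minf;", "ls=1;")

# With \\b, "md" must start at a word boundary, so "amd" is left alone.
r1 <- gsub("\\b(md\\s*=\\s*)minf", "\\1maxf", x)

# Without \\b, the "md" inside "amd" matches as well.
r2 <- gsub("(md\\s*=\\s*)minf", "\\1maxf", x)

# "\\13" is treated as group 1 followed by a literal 3, so no extra
# space is inserted; \\s* also matches zero spaces, as in "ls=1".
r3 <- gsub("\\b(ls\\s*=\\s*)1", "\\13", x)

print(r1)  # "md = maxf;" "amd = minf;" "ls=1;"
print(r2)  # "md = maxf;" "amd = maxf;" "ls=1;"
print(r3)  # "md = minf;" "amd = minf;" "ls=3;"
```

Note that the ls pattern has no boundary after the 1, so a value such as ls = 10 would also be changed; adding a trailing \\b would be one way to guard against that if such values can occur in the file.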