How do I take a random of sentence in R and count the number of characters per word and sorts the te

Time:12-28

First, I want to select 5 or 6 sentences at random. After that, I want to write a function that counts the letters in each word of a given text and sorts the words by those counts, from the words with the fewest letters to the words with the most. Words with the same number of letters should be sorted alphabetically.

[1] "We find joy in the simplest things. He wrote down a long list of items. The hail pattered on the burnt brown grass. Screen the porch with woven straw mats. The theft of the pearl pin was kept secret. Sweet words work better than fierce." 

The function should return a result like this:

[1] "a he in of of on we joy pin the the the the the the was down find hail kept list long mats than with work brown burnt grass items pearl porch straw sweet theft words woven wrote better fierce screen secret things pattered simplest" 

CodePudding user response:

A similar base R approach:

str <- "We find joy in the simplest things. He wrote down a long list of items.
        The hail pattered on the burnt brown grass. Screen the porch with woven
        straw mats. The theft of the pearl pin was kept secret. 
        Sweet words work better than fierce."

words <- strsplit(str, "[[:punct:]]?\\s+[[:punct:]]?")[[1]]

split(words, nchar(words)) |>
  lapply(sort) |>
  unlist() |>
  paste(collapse = " ")
  
#> [1] "a He in of of on We joy pin the the the the The The was down find hail
#> kept list long mats than with work brown burnt grass items pearl porch 
#> straw Sweet theft words woven wrote better Screen secret things fierce. 
#> pattered simplest"
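The output above keeps the original capitalization and the trailing period on "fierce.". To get the all-lowercase result the question asks for, one option is to lowercase the text and strip punctuation before splitting. A minimal sketch, where the function name sort_by_length and the cleaning steps are only illustrative assumptions:

```r
# Hypothetical helper; the name and the cleaning steps are assumptions,
# not part of the answer above.
sort_by_length <- function(text) {
  cleaned <- tolower(gsub("[[:punct:]]", "", text))  # drop punctuation, lowercase
  words <- strsplit(trimws(cleaned), "\\s+")[[1]]    # split on runs of whitespace
  # Order by length first, then alphabetically within each length;
  # method = "radix" keeps the tie-breaking independent of the locale.
  words <- words[order(nchar(words), words, method = "radix")]
  paste(words, collapse = " ")
}

sort_by_length("The hail pattered on the burnt brown grass.")
#> [1] "on the the hail brown burnt grass pattered"
```

Stripping all punctuation also removes apostrophes, so a word like "don't" would become "dont"; adjust the pattern if that matters for your text.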

CodePudding user response:

One approach with base R:

sentence <- 
"We find joy in the simplest things. He wrote down a long list of items. 
The hail pattered on the burnt brown grass. Screen the porch with woven straw mats.
The theft of the pearl pin was kept secret. Sweet words work better than fierce."

sentence |>
  strsplit('\\W ?') |> ## split at non-word characters
  unlist() |>
  (\(.) .[. != ""])() |> ## remove empty strings
  (\(.) .[order(nchar(.), .)])() |> ## sort by string length and alphabet
  paste(collapse = ' ')

output:

[1] "a He in of of on We joy pin the the the the The The was down find hail kept list long mats than with work brown burnt grass items pearl porch straw Sweet theft words woven wrote better fierce Screen secret things pattered simplest"

Note that there's some perhaps unfamiliar notation like (\(.) ...)(). This is a shorthand for defining and executing an anonymous function:

  • function(x){...} can be written as \(x){...}
  • (\(x){...})() defines and executes the function, where x is the incoming value if you put this construct in a ... |> ... |> pipeline
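As a small illustration of that shorthand, here is a sketch with made-up values (the names square and kept are only for the example). Both the \(x) syntax and the native pipe |> need R 4.1 or later:

```r
# Requires R >= 4.1 for the \(x) lambda syntax and the native pipe |>
square <- \(x) x^2            # same as function(x) x^2
square(4)
#> [1] 16

# Define and call an anonymous function in one step; the piped value
# becomes its first argument.
kept <- c("b", "", "a") |>
  (\(v) v[v != ""])()
kept
#> [1] "b" "a"
```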

CodePudding user response:

library(tokenizers)

text =  "We find joy in the simplest things. He wrote down a long list of items. The hail pattered on the burnt brown grass. Screen the porch with woven straw mats. The theft of the pearl pin was kept secret. Sweet words work better than fierce."

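sort_count <- function(s){
  # tokenize the argument s (not the global text); tokenize_words()
  # lowercases and strips punctuation by default
  words <- tokenize_words(s, simplify = TRUE)
  words[order(nchar(words), words)]
}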
 
sort_count(text)
#>  [1] "a"        "he"       "in"       "of"       "of"       "on"      
#>  [7] "we"       "joy"      "pin"      "the"      "the"      "the"     
#> [13] "the"      "the"      "the"      "was"      "down"     "find"    
#> [19] "hail"     "kept"     "list"     "long"     "mats"     "than"    
#> [25] "with"     "work"     "brown"    "burnt"    "grass"    "items"   
#> [31] "pearl"    "porch"    "straw"    "sweet"    "theft"    "words"   
#> [37] "woven"    "wrote"    "better"   "fierce"   "screen"   "secret"  
#> [43] "things"   "pattered" "simplest"
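
The question also asks to pick 5 or 6 sentences at random, which the snippets here do not cover. A minimal sketch in base R, assuming the candidate sentences are already in a character vector (the name pool and its seventh entry are made up for illustration):

```r
# Hypothetical pool of candidate sentences; replace with your own vector.
pool <- c(
  "We find joy in the simplest things.",
  "He wrote down a long list of items.",
  "The hail pattered on the burnt brown grass.",
  "Screen the porch with woven straw mats.",
  "The theft of the pearl pin was kept secret.",
  "Sweet words work better than fierce.",
  "A seventh sentence so that sampling has a choice to make."
)

set.seed(1)                            # only for reproducibility
n_pick <- sample(5:6, 1)               # choose 5 or 6 at random
picked <- sample(pool, n_pick)         # draw that many, without replacement
text <- paste(picked, collapse = " ")  # one string, ready for the sorting code
```

If the stringr package is installed, its built-in sentences vector could serve as a larger pool.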