How does one manipulate the characters of a Raku string based on case?

Time:06-17


In Raku, a given string is to be divided into its individual characters. Each character that is an uppercase letter is to be enclosed within angle brackets, and each other character is to be enclosed within single quotes. The resulting strings are to be joined together, separated by spaces.

Examples …

  • If the given string (without its delimiting quotation marks) were aSbTc, then the resulting string would be 'a' <S> 'b' <T> 'c' .
  • If the given string (without its delimiting quotation marks) were A BxyC$B, then the resulting string would be <A> ' ' <B> 'x' 'y' <C> '$' <B> .
  • If the given string (without its delimiting quotation marks) were XY12, then the resulting string would be <X> <Y> '1' '2' .

sub MAIN ( )
  {
  my $myString = 'aSbTc' ;
  # Desired output:  The string "'a' <S> 'b' <T> 'c'" (sans ").
  # Uppercase letters in angle brackets, each other character in single quotes.
  }


Update …

I have arrived at the following possible solution, but I suspect that there is a much more succinct (single-line?) solution …

sub delimited( $char )
  {
  if ( $char ~~ /<upper>/ )
    { '<' ~ $char ~ '>' }
  else
    { '\'' ~ $char ~ '\'' }
  }

sub toDelimitedString( $string )
  {
  my Seq $seq = $string.split( "", :skip-empty ) ;
  my Seq $delimitedSeq = map( &delimited, $seq ) ;
  my Str $result = $delimitedSeq.list.join: ' ' ;
  $result ;
  }

sub MAIN ( )
  {
  say toDelimitedString( 'aSbTc' ) ;     # OUTPUT: 'a' <S> 'b' <T> 'c'
  say toDelimitedString( 'A BxyC$B' ) ;  # OUTPUT: <A> ' ' <B> 'x' 'y' <C> '$' <B>
  say toDelimitedString( 'XY12' ) ;      # OUTPUT: <X> <Y> '1' '2'
  } # end sub MAIN

CodePudding user response:

My one-liner solution would be:

say "aSbTc".comb.map({ $_ ∈ "A".."Z" ?? "<$_>" !! "'$_'" }).join(" ")
# 'a' <S> 'b' <T> 'c'

Note that this only checks for the letters A through Z, which is not all capital letters. If you really want all capital letters:

say "aSbTc".comb.map({ / <:Lu> / ?? "<$_>" !! "'$_'" }).join(" ")

This uses a regular expression, which may or may not be more readable.
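To make the difference between the two checks concrete, here is a minimal sketch. The helper names wrap-ascii and wrap-unicode and the sample string "aÉb" are assumptions added for illustration; they are not part of the answer above.

```raku
# Hypothetical helpers, for illustration only.

# Treats only the ASCII letters A..Z as uppercase.
sub wrap-ascii(Str $s --> Str) {
    $s.comb.map({ $_ ∈ "A".."Z" ?? "<$_>" !! "'$_'" }).join(" ")
}

# Treats every character with the Unicode property Lu (uppercase letter) as uppercase.
sub wrap-unicode(Str $s --> Str) {
    $s.comb.map({ $_ ~~ / <:Lu> / ?? "<$_>" !! "'$_'" }).join(" ")
}

say wrap-ascii("aÉb");    # 'a' 'É' 'b'
say wrap-unicode("aÉb");  # 'a' <É> 'b'
```

For plain ASCII input such as aSbTc, both variants should give the same result.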

CodePudding user response:

The crux of the substitutions you need is as follows:

my @myString = $myString.comb;
for @myString {
    .=subst(:global, /(<:Lu>)/, {"<$0>"});
    .=subst(:global, /(<:Ll + :N + :Sc + :Zs>)/, {"\'$0\'"})
};
put @myString;

Done up as a Raku one-liner (taking care to obviate quoting problems):

~$ echo 'aSbTc' | raku -e 'my @str1 = lines.comb; for @str1 { .=subst(:global, /(<:Lu>)/, {"<$0>"}); .=subst(:global, /(<:Ll + :N + :Sc + :Zs>)/, {"\c[APOSTROPHE]$0\c[APOSTROPHE]"}) }; put @str1;'
'a' <S> 'b' <T> 'c'
~$ echo 'A BxyC$B' | raku -e 'my @str1 = lines.comb; for @str1 { .=subst(:global, /(<:Lu>)/, {"<$0>"}); .=subst(:global, /(<:Ll + :N + :Sc + :Zs>)/, {"\c[APOSTROPHE]$0\c[APOSTROPHE]"}) }; put @str1;'
<A> ' ' <B> 'x' 'y' <C> '$' <B>
~$ echo 'XY12' | raku -e 'my @str1 = lines.comb; for @str1 { .=subst(:global, /(<:Lu>)/, {"<$0>"}); .=subst(:global, /(<:Ll + :N + :Sc + :Zs>)/, {"\c[APOSTROPHE]$0\c[APOSTROPHE]"}) }; put @str1;'
<X> <Y> '1' '2'

ADDENDUM: You can try changing the second .=subst call to act on the negation of the first, i.e. /(<:!Lu>)/. But this accepts an enormously wide variety of characters (including control characters), which may not be what you want. In practice, I've found it finicky, and it requires an unless conditional (but see the working code below):

~$ echo -n 'aSbTcA BxyC$BXY12' |  raku -e 'my @str1 = $*IN.comb; for @str1 { .=subst(:global, /(<:Lu>)/, {"<$0>"}); .=subst(:global, / (<:!Lu>) /, {"\c[APOSTROPHE]$0\c[APOSTROPHE]"}) unless /<:Lu>/ }; @str1.put;'
'a' <S> 'b' <T> 'c' <A> ' ' <B> 'x' 'y' <C> '$' <B> <X> <Y> '1' '2'
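To illustrate how the negated property differs from an explicit list of categories, here is a small sketch. The helper classify and the four sample characters are assumptions chosen for illustration. It reports whether a single character matches <:!Lu> and whether it matches the class <:Ll + :N + :Sc + :Zs>.

```raku
# Hypothetical helper, for illustration only.
sub classify(Str $ch --> Str) {
    my $negated  = ?( $ch ~~ / <:!Lu> / );                  # anything that is not an uppercase letter
    my $explicit = ?( $ch ~~ / <:Ll + :N + :Sc + :Zs> / );  # lowercase, number, currency symbol, space
    "negated=$negated explicit=$explicit"
}

say classify("\t");  # negated=True explicit=False  (control character)
say classify("-");   # negated=True explicit=False  (punctuation)
say classify("x");   # negated=True explicit=True
say classify("S");   # negated=False explicit=False
```

The negated form matches control characters, as noted above. The explicit class, on the other hand, matches neither the tab nor a punctuation character such as -, so a substitution based on it would leave those characters unquoted.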

[Note: this answer doesn't attempt to write a full Raku script with sub MAIN () { … }. You still have to decide, for example, whether you want your script to take command-line input, issue a prompt, etc.]
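If command-line input is what you want, one possible shape for such a script is sketched below. The sub name to-delimited, the default value, and the single-pass approach are assumptions for illustration, not the substitutions used in this answer.

```raku
# Hypothetical sketch: wraps each character in one pass and takes the
# string as a positional command-line argument.
sub to-delimited(Str $s --> Str) {
    $s.comb.map({ $_ ~~ / <:Lu> / ?? "<$_>" !! "'$_'" }).join(" ")
}

# The default value is an assumption so the script also runs without arguments.
sub MAIN(Str $input = 'aSbTc') {
    put to-delimited($input);
}
```

Saved as, say, delimit.raku (a made-up file name), it could be run as raku delimit.raku 'XY12'.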

https://docs.raku.org/language/regexes#Unicode_properties

CodePudding user response:

My solution would be:

sub MAIN() {
    my token left-delimiter   {  <[ ' < ]>   }
    my token right-delimiter  {  <[ ' > ]>   }
    my token middle-character { <(<-[ ' < > ]> )> }
    my token quotation-string { <left-delimiter> ~ <right-delimiter> <middle-character> }
    my token target-string    { <quotation-string>+ % \s  }

    sub extract-characters-between-delimiter(Str $s) {
        if $s ~~ / <target-string> / {
            say [~] gather for $/<target-string>{'quotation-string'} {
                        take .{'middle-character'}
                    }
        }
    }

    my @target-strings = [ 
        Q「「'a' <S> 'b' <T> 'c'」」,
        Q「「<A> ' ' <B> 'x' 'y' <C> '$' <B>」」,
        Q「「<X> <Y> '1' '2'」」
    ];

    extract-characters-between-delimiter($_) for @target-strings;
}