replace some characters in a string with the next unicode character in Rust

Time:04-11

I have the following input text:

 inputtext = "This is a test";

I need to replace some of the characters (based on certain criteria) with the next Unicode character:

 let mut i = 0;
 for c in inputtext.chars() {
   if somecondition {
     // replace char here
     inputtext.replace_range(i..i + 1, newchar);
     // println!("{}", c);
   }
   i += 1;
 }

What is the best way to do this? Thanks.

CodePudding user response:

You can't easily update a string in-place because a Rust string is not just an array of characters, it's an array of bytes (in UTF-8 encoding), and different characters may use different numbers of bytes. For example, the character ߿ (U+07FF "Nko Taman Sign") uses two bytes, whereas the next Unicode character (U+0800 "Samaritan Letter Alaf") uses three.
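
To make the byte-length difference concrete, here is a small sketch that uses only the standard library (`char::len_utf8`, `char::from_u32`, `String::len`). The helper name `utf8_len_and_next` is made up for this illustration.

```rust
/// Hypothetical helper: the UTF-8 byte length of `ch`, and the byte length of
/// the following code point if that code point is a valid `char`.
fn utf8_len_and_next(ch: char) -> (usize, Option<usize>) {
    let next = char::from_u32(ch as u32 + 1);
    (ch.len_utf8(), next.map(|n| n.len_utf8()))
}

fn main() {
    // U+07FF is the last code point that fits in two bytes; U+0800 needs three.
    let (current, next) = utf8_len_and_next('\u{07FF}');
    println!("U+07FF: {} bytes, successor: {:?} bytes", current, next);

    // Replacing the character therefore changes the byte length of the string.
    let before = String::from("a\u{07FF}b");
    let after: String = before
        .chars()
        .map(|c| if c == '\u{07FF}' { '\u{0800}' } else { c })
        .collect();
    println!("{} bytes -> {} bytes", before.len(), after.len());
}
```

Because the replacement can be longer or shorter than the original, a fixed byte range such as `i..i + 1` will not reliably cover one character.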

It's therefore simplest to turn the string into an iterator of characters (using .chars()), manipulate that iterator as appropriate, and then construct a new string using .collect().

For example:

let old = "abcdef";

let new = old.chars()
    // note: there's an edge case when ch has no valid successor, which we
    //       must decide how to handle: ch == char::MAX, or ch == U+D7FF
    //       (the next code point is a surrogate, not a valid char). in
    //       this case I chose to not change the character, but this may
    //       be different from what you need.
    .map(|ch| {
        if somecondition {
            char::from_u32(ch as u32 + 1).unwrap_or(ch)
        } else {
            ch
        }
    })
    .collect::<String>();
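
For something you can compile and run as-is, here is a self-contained sketch of the same approach. The function names `next_char` and `bump_chars` are hypothetical, and the criterion used in `main` (lowercase ASCII vowels) is only a stand-in for `somecondition`; substitute your own.

```rust
/// Returns the code point after `ch`, or `ch` itself when there is no valid
/// successor (char::MAX, or U+D7FF, which is followed by the surrogate range).
fn next_char(ch: char) -> char {
    char::from_u32(ch as u32 + 1).unwrap_or(ch)
}

/// Builds a new string in which every character accepted by `should_replace`
/// is swapped for its successor; all other characters are copied unchanged.
fn bump_chars<F>(input: &str, should_replace: F) -> String
where
    F: Fn(char) -> bool,
{
    input
        .chars()
        .map(|ch| if should_replace(ch) { next_char(ch) } else { ch })
        .collect()
}

fn main() {
    let input = "This is a test";
    // Assumed criterion, for illustration only: lowercase ASCII vowels.
    let output = bump_chars(input, |ch| "aeiou".contains(ch));
    println!("{}", output); // prints "Thjs js b tfst"
}
```

Taking the condition as a closure keeps the character-stepping logic separate from the criterion, so the same function works for whatever rule you end up with.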