Is there a way to break a string into smaller substrings of a given length in Rust? For example, given the string "AAABBBCCC" and a specified length of 3, we should get back a vector or array of substrings: ["AAA", "BBB", "CCC"].
Any answers would be appreciated greatly.
CodePudding user response:
There are a couple of ways to approach this problem. Since you refer to substrings, I assume you want to split by characters and not bytes. Rust strings are UTF-8, so a single character can occupy anywhere from one to four bytes. We do not know in advance how many bytes each character occupies, so we need to iterate through the string and check. This makes the process a little more complicated, but not by much.
The easiest approach is likely to use the itertools crate, since it provides a simple .chunks() method we can use on top of .chars(). I suspect there is a standard library equivalent, but I do not know what it would be.
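use itertools::Itertools;

pub fn to_chunks(string: &str, chunk_size: usize) -> Vec<String> {
    let mut sections = Vec::new();
    // `chunks` comes from itertools; it panics if `chunk_size` is zero.
    for chunk in &string.chars().chunks(chunk_size) {
        // Each chunk is an iterator of chars; collect it into an owned String.
        sections.push(chunk.collect::<String>());
    }
    sections
}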
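For readers who would rather avoid the extra dependency, here is a minimal sketch of a standard-library-only variant. It assumes it is acceptable to collect the characters into a temporary Vec<char> first, so that the slice method chunks becomes available. The name to_chunks_std is made up for this illustration and does not come from the answer above.

```rust
/// Splits `string` into owned chunks of `chunk_size` characters using only std.
/// (`to_chunks_std` is a hypothetical name chosen for this sketch.)
pub fn to_chunks_std(string: &str, chunk_size: usize) -> Vec<String> {
    // `slice::chunks` panics on a chunk size of zero, so fail early with a clear message.
    assert!(chunk_size > 0, "chunk_size must be greater than zero");

    // Collect the characters first so that the slice method `chunks` can be used.
    let chars: Vec<char> = string.chars().collect();

    chars
        .chunks(chunk_size)
        // Each chunk is a `&[char]`; collect its characters into an owned String.
        .map(|chunk| chunk.iter().collect::<String>())
        .collect()
}
```

Calling to_chunks_std("AAABBBCCC", 3) gives ["AAA", "BBB", "CCC"]. The trade-off is one extra allocation for the intermediate Vec<char>, which is usually fine for short strings.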
Another, slightly more performant approach would be to create a Vec of references. Since we can reference the original string, we do not need to allocate new strings or copy their contents. This can help with performance; however, I suspect you are hoping for a Vec<String>, so this may or may not be the solution you are looking for. This is one way it could be done with the standard library.
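pub fn to_chunks(string: &str, chunk_size: usize) -> Vec<&str> {
    // A chunk size of zero would never consume any input and would loop forever.
    assert!(chunk_size > 0, "chunk_size must be greater than zero");
    let mut sections = Vec::new();
    let mut remaining = string;
    loop {
        // Get the byte offset of the nth character each time so we can split the string
        match remaining.char_indices().nth(chunk_size) {
            Some((offset, _)) => {
                let (a, b) = remaining.split_at(offset);
                sections.push(a);
                remaining = b;
            }
            None => {
                // At most chunk_size characters remain, so this is the last chunk.
                sections.push(remaining);
                return sections;
            }
        }
    }
}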
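To illustrate why the function above looks up the offset with char_indices instead of splitting at a fixed byte position, here is a small sketch of that single step in isolation. The helper names byte_offset_of_char and split_at_char are hypothetical and exist only for this example.

```rust
/// Returns the byte offset at which the `n`-th character (zero-based) starts,
/// or `None` if the string has `n` or fewer characters.
/// (Hypothetical helper, named for this sketch only.)
pub fn byte_offset_of_char(s: &str, n: usize) -> Option<usize> {
    s.char_indices().nth(n).map(|(offset, _)| offset)
}

/// Splits off the first `n` characters, returning `(head, tail)`.
/// If the string has `n` or fewer characters, the whole string is the head.
pub fn split_at_char(s: &str, n: usize) -> (&str, &str) {
    match byte_offset_of_char(s, n) {
        // The offset always lies on a character boundary, so `split_at` cannot panic here.
        Some(offset) => s.split_at(offset),
        None => (s, ""),
    }
}
```

For the string "h\u{e9}llo" (five characters, six bytes), the third character starts at byte offset 3, not 2, because the accented letter takes two bytes. Calling split_at(2) on that string directly would panic, since byte 2 falls in the middle of a character, while split_at_char with n = 2 returns the first two characters and "llo".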