Decomposing a large string into small parts without damaging word parts

Time:01-02

I have a large string that I need to send in small sections (between 200 and 400 characters each).

The problem is that I don't want to cut words in half; the string should only be split at spaces.

How can I divide the string into small sections like this?

As an example, here is some text from a text generator:

Unbowed, Unbent, Unbroken. It's ten thousand miles between Kings landing and the wall. The tourney of Ashford Meadows. Pay the iron price. The winds of Winter. King in the North. The War of the 5 kings. It's ten thousand miles between Kings landing and the wall. The winds of Winter. Words are like wind. Bastards are born of passion, aren't they? We don't despise them in Dorne. The night is dark and full of terrors. And now his watch is ended. May the Father judge him justly. The rains of castamere. What is dead may never die. A Lannister always pays his debts. Never Resting. A good act does not wash out the bad, nor a bad act the good. Each should have its own reward. Unbowed, Unbent, Unbroken.

I want a section to end like "Unbowed, Unbent, Unbroken. It's ten thousand" and not like "Unbowed, Unbent, Unbroken. It's ten thou".

Thank you

CodePudding user response:

Solution using loops:

Here I loop through every character of the string, building chunks of at least minLength characters: each chunk ends at the first space found after minLength is reached, or at the end of the string.

The whitespace is preserved at the beginning of the next chunk (it could be trimmed, but I preferred to keep it).

An error is thrown if a chunk exceeds the maximum length passed (if any).

const payload =
"Unbowed, Unbent, Unbroken. It's ten thousand miles between Kings landing and the wall. The tourney of Ashford Meadows. Pay the iron price. The winds of Winter. King in the North. The War of the 5 kings. It's ten thousand miles between Kings landing and the wall. The winds of Winter. Words are like wind. Bastards are born of passion, aren't they? We don't despise them in Dorne. The night is dark and full of terrors. And now his watch is ended. May the Father judge him justly. The rains of castamere. What is dead may never die. A Lannister always pays his debts. Never Resting. A good act does not wash out the bad, nor a bad act the good. Each should have its own reward. Unbowed, Unbent, Unbroken.";

console.log( splitByChunksOfMinLength(payload, 200, 400) );

function splitByChunksOfMinLength(subject, minLength, maxLength = null){
  
  const slices = [];
  let slice = "";
  let i = 0;

  while(i < subject.length){
    //grow the current slice until it has at least minLength characters and a space is reached
    for(;i < subject.length;i++){
      const currentChar = subject[i];
      if (slice.length >= minLength && currentChar === ' ')
        break;
      slice += currentChar;
    }
    if(maxLength && slice.length > maxLength)
       throw new Error('A slice in the string to split, exceeded max length');
    //add the current slice to returning slices and reset the current slice
    slices.push(slice);
    slice = '';
  }
  return slices;
}
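
As a quick sanity check on the output (not part of the original answer), a small helper along these lines could confirm that the chunks rejoin into the original string and that no cut falls inside a word. The name chunksAreWordSafe is hypothetical, and the sketch assumes a splitter that keeps all whitespace in the chunks, as the function above does.

```javascript
// Hypothetical checker, not part of the answer above.
function chunksAreWordSafe(original, chunks) {
  // The chunks must reproduce the input exactly when concatenated.
  if (chunks.join('') !== original) return false;
  // Every cut must touch whitespace on at least one side.
  for (let k = 0; k < chunks.length - 1; k++) {
    const endsWithSpace = /\s$/.test(chunks[k]);
    const nextStartsWithSpace = /^\s/.test(chunks[k + 1]);
    if (!endsWithSpace && !nextStartsWithSpace) return false;
  }
  return true;
}

console.log(chunksAreWordSafe('ten thousand miles', ['ten thousand', ' miles'])); // true
console.log(chunksAreWordSafe('ten thousand miles', ['ten thou', 'sand miles'])); // false
```

Passing it the payload and the array returned by the splitter should give true if the words were left intact.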

Solution using regular expressions:

Here I used a single regular expression to match the chunks in the string.

Whitespace is preserved here too, this time at the end of each chunk.

const payload =
"Unbowed, Unbent, Unbroken. It's ten thousand miles between Kings landing and the wall. The tourney of Ashford Meadows. Pay the iron price. The winds of Winter. King in the North. The War of the 5 kings. It's ten thousand miles between Kings landing and the wall. The winds of Winter. Words are like wind. Bastards are born of passion, aren't they? We don't despise them in Dorne. The night is dark and full of terrors. And now his watch is ended. May the Father judge him justly. The rains of castamere. What is dead may never die. A Lannister always pays his debts. Never Resting. A good act does not wash out the bad, nor a bad act the good. Each should have its own reward. Unbowed, Unbent, Unbroken.";

console.log( splitByChunksOfMinLength(payload, 200) )

function splitByChunksOfMinLength(string, minLength) {
  // up to minLength characters, then the rest of the current word, then one optional whitespace
  const regex = new RegExp(`.{1,${minLength}}\\S*\\s?`, 'g');
  return string.match(regex);
}
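
The regular expression above only guarantees a minimum length. If the upper bound matters more (the question asks for at most 400 characters), a variant like the following sketch could cap each chunk instead. The name splitByMaxLength is hypothetical, the whitespace character where a chunk is cut is left out of the result, and the sketch assumes no single word is longer than maxLength (it throws if one is).

```javascript
// Hypothetical helper, not from the answer above: caps each chunk at maxLength.
function splitByMaxLength(text, maxLength) {
  // A word longer than maxLength could not fit in any chunk, so fail loudly.
  if (text.split(/\s+/).some((word) => word.length > maxLength)) {
    throw new Error('A word is longer than maxLength');
  }
  // \S                      : a chunk starts on a non-whitespace character
  // [\s\S]{0,maxLength - 1} : then as many characters as still fit
  // (?=\s|$)                : but it must stop right before whitespace or the end
  const regex = new RegExp(`\\S[\\s\\S]{0,${maxLength - 1}}(?=\\s|$)`, 'g');
  return text.match(regex) || [];
}

console.log(splitByMaxLength('one two three', 8)); // [ 'one two', 'three' ]
```

Because the quantifier is greedy, each chunk is as long as it can be without passing maxLength, so with ordinary word lengths the chunks (except possibly the last) end up close to the cap.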

CodePudding user response:

var string = 'Unbowed, Unbent, Unbroken. It\'s ten thousand miles between Kings landing and the wall. The tourney of Ashford Meadows. Pay the iron price. The winds of Winter. King in the North. The War of the 5 kings. It\'s ten thousand miles between Kings landing and the wall. The winds of Winter. Words are like wind. Bastards are born of passion, aren\'t they? We don\'t despise them in Dorne. The night is dark and full of terrors. And now his watch is ended. May the Father judge him justly. The rains of castamere. What is dead may never die. A Lannister always pays his debts. Never Resting. A good act does not wash out the bad, nor a bad act the good. Each should have its own reward. Unbowed, Unbent, Unbroken.'

var temporaryString = '';
var words = string.split(' ');
var temporaryArray = [];
var dividedext = [];
for (var i = 0; i < words.length; i++) {
  temporaryArray.push(words[i]);
  // once the joined words exceed 200 characters, move the last word to the next chunk
  // (the length check keeps a single overlong word from producing an empty chunk)
  if (temporaryArray.length > 1 && temporaryArray.join(' ').length > 200) {
    temporaryString = temporaryArray.pop();
    dividedext.push(temporaryArray.join(' '));
    temporaryArray = [temporaryString];
  }
}
// push whatever is left as the final chunk
dividedext.push(temporaryArray.join(' '));
console.log(dividedext);
