Goal
I'm trying to break long sentences into smaller ones whose lengths fall within a given range.
Base Idea
Write a function to find the best number for this regex: .{1,20}(?:\s|$)
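For reference, here is a minimal sketch of how that regex behaves with a fixed limit. The helper name splitByLimit and the sample sentence are made up for illustration. Each match is at most `limit` characters followed by a whitespace character or the end of the string, so breaks land at or before the limit, never after it.

```typescript
// Hypothetical helper: split `text` into pieces of at most `limit` characters,
// breaking only at whitespace or at the end of the string.
function splitByLimit(text: string, limit: number): string[] {
  // Same pattern as above, with the number made configurable.
  const pattern = new RegExp(`.{1,${limit}}(?:\\s|$)`, "g");
  const matches = text.match(pattern) ?? [];
  // Each match may end with the whitespace it broke on, so trim it off.
  return matches.map((piece) => piece.trim());
}

const pieces = splitByLimit("the quick brown fox jumps over the lazy dog", 20);
console.log(pieces);
// → ["the quick brown fox", "jumps over the lazy", "dog"]
```

Note the short trailing piece ("dog"): a fixed limit says nothing about how evenly the text is divided.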
Issue
My sentences vary in length, so a static number does not work. I need to calculate the right number to use in place of "20" in the regex above.
Criteria
The number should produce the largest chunks possible, with 165 characters as the upper limit and 70 as the lower limit.
All sentences passed to this function will be longer than 165 characters.
Examples
So, let's say I pass it a sentence with 166 characters. This should return 2 new sentences of roughly 83 characters each.
I think the number found would get passed to this regex: .{1,83}(?:\s|$)
to produce 2 sentences with no remainder.
Result: ['first half of the sentence', 'second half of the sentence']
If I pass it a 400 character sentence, it will return 4 shortened sentences of around 100 characters each.
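One caveat worth checking before relying on "no remainder": because the regex only breaks at whitespace at or before the limit, using exactly half the length does not guarantee two pieces. The sketch below uses a deliberately tiny sentence and a hypothetical helper named splitAt to show the effect. The same thing can happen at 83 characters whenever no space falls exactly on the boundary.

```typescript
// Hypothetical helper: the regex from the question with a variable limit.
function splitAt(text: string, limit: number): string[] {
  const pattern = new RegExp(`.{1,${limit}}(?:\\s|$)`, "g");
  return (text.match(pattern) ?? []).map((piece) => piece.trim());
}

const sample = "one two three four"; // 18 characters
const limit = sample.length / 2; // 9, i.e. "half the sentence"
const parts = splitAt(sample, limit);
console.log(parts);
// → ["one two", "three", "four"] (three pieces, not two)
```

The first piece stops at the last space before the limit, so the leftover text no longer fits into a single second piece.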
Any thoughts would help.
CodePudding user response:
Okay, so here is what I went with using TypeScript:
const MAX_LENGTH = 165;
const MIN_LENGTH = 70;
function createChunks(
  remainingText: string,
  bestBreak: number,
  chunks: string[]
): void {
  // keep splitting while the remaining text is longer than the target length
  while (remainingText.length > bestBreak) {
    // find the first space at or after the bestBreak point
    // (so a chunk can run slightly past bestBreak)
    const breakIndex = remainingText.indexOf(" ", bestBreak);
    if (breakIndex === -1) {
      break; // no space left to break on
    }
    // once found, push chunk to array
    chunks.push(remainingText.substring(0, breakIndex));
    // get the remaining text, skipping the space itself
    remainingText = remainingText.substring(breakIndex + 1);
  }
  // whatever is left is the final chunk
  if (remainingText.length > 0) {
    chunks.push(remainingText);
  }
}
function getChunks(text: string, bestBreak: number): string[] {
  const chunks: string[] = [];
  createChunks(text, bestBreak, chunks);
  return chunks;
}
function createShorterSentences(text: string): string[] {
  const sentenceLength = text.length;
  let remainder = MAX_LENGTH,
    bestBreak = MAX_LENGTH;
  // find best breaking point: the length that leaves the smallest remainder,
  // preferring the larger length on ties
  for (let i = MIN_LENGTH; i < MAX_LENGTH; i++) {
    const currRemainder = sentenceLength % i;
    if (currRemainder <= remainder) {
      remainder = currRemainder;
      bestBreak = i;
    }
  }
  return getChunks(text, bestBreak);
}
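To see what the search loop in createShorterSentences picks, here is the same remainder logic pulled out into a standalone sketch. The names findBestBreak, LOWER and UPPER are hypothetical and used only for this illustration. It works on the length alone, so it can be checked against the two examples from the question.

```typescript
const LOWER = 70; // lower limit from the question
const UPPER = 165; // upper limit from the question

// Hypothetical helper: return the chunk length in [LOWER, UPPER) that leaves
// the smallest remainder when dividing the sentence length. Because the
// comparison uses <=, the largest such length wins on ties.
function findBestBreak(sentenceLength: number): number {
  let remainder = UPPER;
  let bestBreak = UPPER;
  for (let i = LOWER; i < UPPER; i++) {
    const currRemainder = sentenceLength % i;
    if (currRemainder <= remainder) {
      remainder = currRemainder;
      bestBreak = i;
    }
  }
  return bestBreak;
}

console.log(findBestBreak(166)); // → 83 (2 chunks of 83)
console.log(findBestBreak(400)); // → 100 (4 chunks of 100)
```

Both results match the numbers asked for in the question. Keep in mind that this is only the target length. The actual chunks depend on where the spaces fall.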