I am trying to split a string into multiple lines. Here is an example of the string:
Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars because this part have to be in the second line and it also needs to be seperated such as the string part before and this has to be in the third line with the end of the string
It has to be split like this:
Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars
because this part have to be in the second line and it also needs to be seperated such as the string part before and
this has to be in the third line with the end of the string
I'm trying to split the string at a length of 120 chars, but if a word would be cut, the line should end at the last whole word before that limit and the cut word should move to the next line. The rest of the text has to be treated the same way if it is longer than 2 lines.
Also, there is a part at the end of the string that has to stay on one line.
How do I do this dynamically? I tried some solutions like string[0:120], string.splitlines() and wrap. Maybe I could put the words in a list and loop through it, but how do I build the splitting logic?
Is there maybe a built-in solution for this?
CodePudding user response:
Using textwrap.wrap:
>>> text = "Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars because this part have to be in the second line and it also needs to be seperated such as the string part before and this has to be in the third line with the end of the string"
>>> import textwrap
>>> print(*textwrap.wrap(text, 120), sep='\n')
Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars
because this part have to be in the second line and it also needs to be seperated such as the string part before and
this has to be in the third line with the end of the string
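textwrap.wrap on its own does not cover the part of the question about a piece of the string that has to stay on one line at the end. A minimal sketch of one way to handle that, assuming the protected part is available as a separate string (the names wrap_with_tail, body and tail are made up for this example, they are not from the question):

```python
import textwrap

def wrap_with_tail(body, tail, width=120):
    """Wrap `body` to `width` columns, then add `tail` as one unbroken line.

    Assumption: the caller already knows which part must stay together
    and passes it separately as `tail`.
    """
    lines = textwrap.wrap(body, width)
    # The tail is appended untouched, so it is never split,
    # even if it happens to be longer than `width`.
    lines.append(tail)
    return lines

print(*wrap_with_tail("aaa bbb ccc ddd", "keep this together", width=7), sep="\n")
```

If the protected part is only known by position (say, the last N words), it would have to be cut off the string first and then passed in as tail.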
If you were doing it from scratch, using split and iteratively adding words to a list of lines would be a good way to start:
>>> words = text.split(" ")
>>> lines = [words[0]]
>>> for word in words[1:]:
...     if len(lines[-1]) + len(word) < 120:
...         lines[-1] += (" " + word)
...     else:
...         lines.append(word)
...
>>> print(*lines, sep='\n')
Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars
because this part have to be in the second line and it also needs to be seperated such as the string part before and
this has to be in the third line with the end of the string
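The same loop can be wrapped in a small reusable function. This is only a sketch: the name greedy_wrap and the width parameter are chosen for this example, and it uses split() without an argument, so runs of whitespace collapse into single spaces (which differs slightly from split(" ") above).

```python
def greedy_wrap(text, width=120):
    """Greedily pack whole words into lines of at most `width` characters."""
    lines = []
    for word in text.split():
        # +1 accounts for the space that joins the word to the current line.
        if lines and len(lines[-1]) + 1 + len(word) <= width:
            lines[-1] += " " + word
        else:
            # Start a new line; a single word longer than `width`
            # is kept whole rather than broken.
            lines.append(word)
    return lines

print(*greedy_wrap("the quick brown fox jumps over the lazy dog", 10), sep="\n")
```

Unlike textwrap.wrap with its default settings, this version never breaks a word that is longer than the limit.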
CodePudding user response:
You might also use regular expressions, matching anywhere from 1 to 120 characters followed by a word boundary.
re.findall(r'(.{1,120})(?=\b)', "Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars because this part have to be in the second line and it also needs to be seperated such as the string part before and this has to be in the third line with the end of the string")
Yields:
['Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars ',
'because this part have to be in the second line and it also needs to be seperated such as the string part before and ',
'this has to be in the third line with the end of the string']
Putting this into a function:
import re

def wrap(length, text):
    pat = re.compile(f'(.{{1,{length}}})(?=\\b)')
    return pat.findall(text)
wrap(120, "Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars because this part have to be in the second line and it also needs to be seperated such as the string part before and this has to be in the third line with the end of the string")
# ['Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars ',
# 'because this part have to be in the second line and it also needs to be seperated such as the string part before and ',
# 'this has to be in the third line with the end of the string']
wrap(80, "Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars because this part have to be in the second line and it also needs to be seperated such as the string part before and this has to be in the third line with the end of the string")
# ['Here the line that have to be under 120 chars and cut at the point in the string',
# ' where the last word is under 120 chars because this part have to be in the ',
# 'second line and it also needs to be seperated such as the string part before and',
# ' this has to be in the third line with the end of the string']
wrap(60, "Here the line that have to be under 120 chars and cut at the point in the string where the last word is under 120 chars because this part have to be in the second line and it also needs to be seperated such as the string part before and this has to be in the third line with the end of the string")
# ['Here the line that have to be under 120 chars and cut at the',
# ' point in the string where the last word is under 120 chars ',
# 'because this part have to be in the second line and it also ',
# 'needs to be seperated such as the string part before and ',
# 'this has to be in the third line with the end of the string']
A further exercise would be to strip leading and trailing whitespace from each line.
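One possible take on that exercise, as a sketch: the same pattern, with each chunk passed through str.strip(). The name wrap_stripped is made up for this example. Note that \b also matches next to punctuation, so text with apostrophes or hyphens may be split inside what reads as a single word.

```python
import re

def wrap_stripped(length, text):
    """Split `text` into chunks of at most `length` characters that end on
    a word boundary, then strip surrounding whitespace from each chunk."""
    pat = re.compile(f'(.{{1,{length}}})(?=\\b)')
    return [chunk.strip() for chunk in pat.findall(text)]

sample = "Here the line that have to be under 120 chars and cut at the point in the string"
print(*wrap_stripped(60, sample), sep="\n")
```

Because the space at a break counts toward the limit in the regex version, some lines can come out one character shorter than what textwrap.wrap would produce.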