Extracting a string containing a given number of characters from a text file

My text file contains numerous entries, each starting with the character <.

Using Python, I want to extract only the first five characters of each entry (in addition to the <). For example:

my file= [<1
acloclscloclxcccdddddddddddcccccddddddddddddweeeeeeeeeeeeeeeee
  <2
lsjfljljljljljljlsjdfojljljlholhowljljljljouopuljlj
  <3
ljlhohouojljljjouopuljljljhlhouljljlhh
  <4
hououojljljlhouojljljljlhouljljljljoukhklhkhkh......]

And the result I want is a file containing only the < and the first 5 characters of each entry, i.e.

 <1
aclo
  <2
lsjf
  <3
ljlh
  <4
houo

CodePudding user response：

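# text is assumed to hold the whole file's contents as one string
for x in text.split("<"):
    if x.strip():  # skip empty or whitespace-only pieces before the first "<"
        print(f'<{x[:6]}')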

This might help
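If the goal is to go from one file to another, the same split idea can be wrapped in a small function. This is only a sketch: `first_chars`, `extract_file`, and any paths passed to them are names made up for illustration, and `n=6` assumes that the entry number and the line break count toward the characters kept, as in the expected output in the question.

```python
from pathlib import Path


def first_chars(text, n=6):
    """Return '<' plus the next n characters of every entry in text."""
    # split("<") gives one piece per entry; [1:] drops whatever
    # precedes the first "<" (usually an empty string)
    return ["<" + piece[:n] for piece in text.split("<")[1:]]


def extract_file(src, dst, n=6):
    """Hypothetical helper: read src, write the shortened entries to dst."""
    text = Path(src).read_text()
    Path(dst).write_text("\n".join(first_chars(text, n)) + "\n")


# Shortened version of the data from the question
sample = "<1\nacloclscl\n  <2\nlsjfljlj\n  <3\nljlhohou\n  <4\nhououojl"
entries = first_chars(sample)
print("\n".join(entries))
```

Calling `extract_file` with an input and an output path of your choice would write the same lines to disk instead of printing them.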

CodePudding user response：

Using regex

import re

txt = '<1 acloclscloclxcccdddddddddddcccccddddddddddddweeeeeeeeeeeeeeeee <2 lsjfljljljljljljlsjdfojljljlholhowljljljljouopuljlj <3 ljlhohouojljljjouopuljljljhlhouljljlhh <4 hououojljljlhouojljljljlhouljljljljoukhklhkhkh'

# raw string so that \s and \S reach the regex engine unchanged
print(re.findall(r'(<[\s\S]{0,6})', txt))

Output:

['<1 aclo', '<2 lsjf', '<3 ljlh', '<4 houo']
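The example above uses spaces between the entry number and the text, while the file in the question has line breaks there. A rough sketch of the same pattern on newline-separated data follows; `data` is a shortened stand-in for the real file contents, not the actual file. Because `[\s\S]` matches any character, line breaks included, the pattern itself does not need to change.

```python
import re

# Shortened stand-in for the file contents, with line breaks as in the question
data = "<1\nacloclscloclx\n  <2\nlsjfljljljl\n  <3\nljlhohouojl\n  <4\nhououojljljl"

# "<" followed by up to six characters of any kind, newlines included
matches = re.findall(r"<[\s\S]{0,6}", data)

# Each match already contains its own line break after the entry number
print("\n".join(matches))
```

Each match here is the `<`, the entry number, the line break, and four letters, which lines up with the layout asked for in the question.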