Split a Python string where a line starts with a period

I am trying to split a text at every line that contains only a period.

txt = "\ra. skin lateral biopsy:\r -positive for disease \r.\rb. skin medial biopsy:\r -negative for disease \r. \rc. skin floor biopsy:\r -negative for disease"

The expected result would be:

["a. skin lateral biopsy: -positive for disease", "b. skin medial biopsy: -negative for disease", "c. skin floor biopsy: -negative for disease"]

I tried

re.split('^\.', txt)

but it does not work.

I don't understand why the regex is not picking up the lines that start with a period.

CodePudding user response：

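First I would split the text on "\r.", and then clean up each piece in a loop:

txt = "\ra. skin lateral biopsy:\r -positive for disease \r.\rb. skin medial biopsy:\r -negative for disease \r. \rc. skin floor biopsy:\r -negative for disease"
# Split wherever a carriage return is immediately followed by a period
arr = txt.split("\r.")
result = []
for j in arr:
    # Drop the remaining carriage returns, then trim surrounding whitespace
    result.append(j.replace("\r", "").strip())
print(result)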

CodePudding user response：

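import re

# Split before each "\r" + lowercase letter, then remove the remaining "\r"
# (plus a literal period right after it, if there is one)
[re.sub(r'\r\.?', '', e).strip() for e in re.split(r'\r(?=[a-z])', txt) if e]

['a. skin lateral biopsy: -positive for disease', 'b. skin medial biopsy: -negative for disease', 'c. skin floor biopsy: -negative for disease']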
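
As for why the original attempt finds nothing: by default ^ matches only at the very start of the string, and even with re.MULTILINE Python's re module treats only "\n" as a line boundary, so the "\r" separators in txt are never seen as line starts. Below is a minimal sketch of one way around that, assuming it is acceptable to normalize the line endings first. The helper name split_on_period_lines is made up for illustration and is not part of any library.

```python
import re

txt = "\ra. skin lateral biopsy:\r -positive for disease \r.\rb. skin medial biopsy:\r -negative for disease \r. \rc. skin floor biopsy:\r -negative for disease"


def split_on_period_lines(text):
    """Split text at lines that contain only a period (hypothetical helper)."""
    # str.splitlines() recognizes "\r", so this turns every line ending into "\n".
    normalized = "\n".join(text.splitlines())
    # With re.MULTILINE, ^ and $ match around each "\n"-delimited line.
    parts = re.split(r"^\.[ \t]*$", normalized, flags=re.MULTILINE)
    # Collapse leftover line breaks and runs of spaces into single spaces.
    cleaned = (" ".join(part.split()) for part in parts)
    # Skip any piece that ends up empty.
    return [c for c in cleaned if c]


# The original pattern never matches, so the string comes back unsplit.
print(re.split(r"^\.", txt) == [txt])
print(split_on_period_lines(txt))
```

For the txt in the question, the first print shows True and the second shows the three-item list from the expected result, with the space after each colon preserved. Because the pattern requires the whole line to be a period plus optional trailing spaces, a line such as ".5 cm margin" would not trigger a split.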