Remove certain pattern from text using regex

Time:11-02

text = "Page 1 of 28 Medical Policies Archived Policies - Radiology Print Percutaneous Balloon Kyphoplasty, Radiofrequency Kyphoplasty, and Mechanical  Page 2/3 Percutaneous radiofrequency kyphoplasty or percutaneous mechanical vertebral augmentation using any other device, including but not limited. Page 38 Percutaneous Balloon Kyphoplasty, Radiofrequency Kyphoplasty, and Me... While radiotherapy and chemotherapy are frequently "

adm = re.sub("(?:(?:Page" [0-9] "of" [0-9] | Page [0-9] |  Page [0-9] "/" [0-9] ))", text, re.IGNORECASE)

print(adm)

Is there a way to remove Page 1 of 28, Page 2/3, and Page 38 from the text?

CodePudding user response:

I would use this approach:

import re

text = "Page 1 of 28 Medical Policies Archived Policies - Radiology Print Percutaneous Balloon Kyphoplasty, Radiofrequency Kyphoplasty, and Mechanical  Page 2/3 Percutaneous radiofrequency kyphoplasty or percutaneous mechanical vertebral augmentation using any other device, including but not limited. Page 38 Percutaneous Balloon Kyphoplasty, Radiofrequency Kyphoplasty, and Me... While radiotherapy and chemotherapy are frequently "
# \d+ matches one or more digits; the "/N" and " of N" suffixes are both optional
output = re.sub(r'\s*Page \d+(?:/\d+)?(?: of \d+)?\s*', ' ', text).strip()
print(output)

This prints:

Medical Policies Archived Policies - Radiology Print Percutaneous Balloon Kyphoplasty, Radiofrequency Kyphoplasty, and Mechanical Percutaneous radiofrequency kyphoplasty or percutaneous mechanical vertebral augmentation using any other device, including but not limited. Percutaneous Balloon Kyphoplasty, Radiofrequency Kyphoplasty, and Me... While radiotherapy and chemotherapy are frequently

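The pattern matches the literal word Page, one or more digits, then an optional "/N" suffix and an optional " of N" suffix, so it covers all three variants in the text: Page 1 of 28, Page 2/3, and Page 38. The \s* on either side absorbs the whitespace around each match, so a single space is left in its place.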
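If the markers can vary in case or spacing, a slightly stricter variant may help. The following is only a sketch under those assumptions: the helper name strip_page_markers and the shortened sample string are hypothetical and do not come from the original post. It adds word boundaries so that words such as "Homepage" are left alone, and it passes re.IGNORECASE by keyword, since the third positional argument of re.sub is the input string and the fourth is count, not flags.

```python
import re

# Stricter variant of the pattern above (an assumption, not the answer's exact regex):
# \b keeps "page" inside longer words from matching, and \s+ / \s* tolerate extra spaces.
PAGE_MARKER = re.compile(
    r"\s*\bPage\s+\d+(?:\s*/\s*\d+|\s+of\s+\d+)?\b\s*",
    flags=re.IGNORECASE,
)


def strip_page_markers(text: str) -> str:
    """Replace each page marker and its surrounding whitespace with one space."""
    return PAGE_MARKER.sub(" ", text).strip()


# Shortened, made-up sample in the style of the question's text
sample = "Page 1 of 28 Medical Policies  Page 2/3 Percutaneous kyphoplasty. page 38 Balloon Kyphoplasty"
print(strip_page_markers(sample))
# prints: Medical Policies Percutaneous kyphoplasty. Balloon Kyphoplasty
```

Compiling the pattern once is optional here; it mainly keeps the flags next to the pattern and avoids repeating them at each call site.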
