Isolate all Substrings of a given structure in Python

Time:12-26

I'm currently trying to write a program that goes through a chat log generated by a site for online TTRPG playing. Currently, my output is as follows:

rolling 7d10(1 4 5 3 8 8 3)=32rolling 7d10(6 8 3 9 7 10 8)=51rolling 7d10(7 7 6 6 8 3 5)=42rolling 4d10(3 3 3 4)=13rolling 7d10(5 5 10 7 4 9 10)=50rolling 1d10 8(10) 8=18

I want every substring of this to be given as an independent string. In this case, a "substring" would be anything following the structure

rolling xdy(1 2 3... z)=a

I'm fairly certain that I'd need a Regex for this, but what this would look like (I'm not that good with Regex I'll admit) is beyond me.

CodePudding user response:

Starting from the structure you shared, rolling xdy(1 2 3... z)=a, replace every letter that represents a number with \d+ (one or more digits). With a few adjustments you obtain:

rolling \d+d\d+\((?:\d+\ )*\d+\)=\d+



import re

text = "rolling 7d10(1 4 5 3 8 8 3)=32rolling 7d10(6 8 3 9 7 10 8)=51rolling " \
       "7d10(7 7 6 6 8 3 5)=42rolling 4d10(3 3 3 4)=13rolling " \
       "7d10(5 5 10 7 4 9 10)=50rolling 1d10   8(10) 8=18"
results = re.findall(r"rolling \d+d\d+\((?:\d+\ )*\d+\)=\d+", text)
print(results)
['rolling 7d10(1 4 5 3 8 8 3)=32', 'rolling 7d10(6 8 3 9 7 10 8)=51', 
 'rolling 7d10(7 7 6 6 8 3 5)=42', 'rolling 4d10(3 3 3 4)=13', 
 'rolling 7d10(5 5 10 7 4 9 10)=50']

Note that the last roll isn't matched because it doesn't fit the structure: it has numbers and signs outside the parentheses.
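If the individual numbers are needed rather than just the matched text, the same pattern can carry named groups. This is only a sketch: the names `ROLL_RE` and `parse_rolls`, the group names, and the shortened sample string are illustrative choices, not part of the original answer.

```python
import re

# Same structure as the pattern above, with a named group for each number.
ROLL_RE = re.compile(
    r"rolling (?P<count>\d+)d(?P<sides>\d+)"
    r"\((?P<dice>\d+(?: \d+)*)\)=(?P<total>\d+)"
)

def parse_rolls(text):
    """Return one dict per roll found in text (hypothetical helper)."""
    rolls = []
    for m in ROLL_RE.finditer(text):
        rolls.append({
            "count": int(m.group("count")),
            "sides": int(m.group("sides")),
            "dice": [int(d) for d in m.group("dice").split()],
            "total": int(m.group("total")),
        })
    return rolls

# Shortened sample taken from the question's log.
sample = "rolling 7d10(1 4 5 3 8 8 3)=32rolling 4d10(3 3 3 4)=13"
for roll in parse_rolls(sample):
    print(roll)
```

Having the dice as a list of integers makes it easy to check, for example, that `sum(roll["dice"])` equals `roll["total"]`.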

CodePudding user response:

You can solve this without regex, like this:

data = "rolling 7d10(1 4 5 3 8 8 3)=32rolling 7d10(6 8 3 9 7 10 8)=51rolling 7d10(7 7 6 6 8 3 5)=42rolling 4d10(3 3 3 4)=13rolling 7d10(5 5 10 7 4 9 10)=50rolling 1d10   8(10) 8=18"

parts = data.split("rolling")[1:]
print(parts)
# [' 7d10(1 4 5 3 8 8 3)=32', ' 7d10(6 8 3 9 7 10 8)=51', ' 7d10(7 7 6 6 8 3 5)=42', ' 4d10(3 3 3 4)=13', ' 7d10(5 5 10 7 4 9 10)=50', ' 1d10   8(10) 8=18']

If needed, you can prepend the string "rolling" back onto each part:

parts = ["rolling" + p for p in parts]
print(parts)
# ['rolling 7d10(1 4 5 3 8 8 3)=32', 'rolling 7d10(6 8 3 9 7 10 8)=51', 'rolling 7d10(7 7 6 6 8 3 5)=42', 'rolling 4d10(3 3 3 4)=13', 'rolling 7d10(5 5 10 7 4 9 10)=50', 'rolling 1d10   8(10) 8=18']
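As a possible variation, the split and the re-attachment can be done in one step by splitting at the position just before each "rolling". Note that this does bring `re` back in, so it is a different technique from the plain `str.split` above; it also assumes Python 3.7 or later, where `re.split` can split on empty matches. The sample string is a shortened piece of the question's log.

```python
import re

data = "rolling 7d10(1 4 5 3 8 8 3)=32rolling 4d10(3 3 3 4)=13"

# (?=rolling) is a lookahead: it matches the empty position right before
# each "rolling" without consuming it, so the word stays on the piece
# that follows. The filter drops the empty string produced by the match
# at the very start of the text.
parts = [p for p in re.split(r"(?=rolling)", data) if p]
print(parts)
# ['rolling 7d10(1 4 5 3 8 8 3)=32', 'rolling 4d10(3 3 3 4)=13']
```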

CodePudding user response:

It would be much better to have the logger put a line terminator after each entry, but this will give you a list of rolls from your luxuriant purse of d10s.

import re

log_contents = 'rolling 7d10(1 4 5 3 8 8 3)=32rolling 7d10(6 8 3 9 7 10 8)=51rolling 7d10(7 7 6 6 8 3 5)=42rolling 4d10(3 3 3 4)=13rolling 7d10(5 5 10 7 4 9 10)=50rolling 1d10   8(10) 8=18'

pattern = re.compile('(=[0-9]*)')
rolls = pattern.sub(r'\1\n', log_contents).rstrip().split('\n')
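
The same idea can be wrapped in a small function. This is a sketch, not the answer's exact code: `split_rolls` is a made-up name, it uses `+` instead of `*` on the assumption that every entry ends in a numeric total, and it relies on `str.splitlines()` so no trailing empty entry needs stripping.

```python
import re

def split_rolls(log_contents):
    """Split a run-together roll log into one string per roll."""
    # Insert a newline after every "=<digits>" total, then split on the
    # newlines. "+" requires at least one digit after "=" (an assumption
    # about the log format).
    marked = re.sub(r"(=[0-9]+)", r"\1\n", log_contents)
    return marked.splitlines()

# Shortened sample taken from the question's log.
sample = "rolling 7d10(1 4 5 3 8 8 3)=32rolling 4d10(3 3 3 4)=13"
print(split_rolls(sample))
# ['rolling 7d10(1 4 5 3 8 8 3)=32', 'rolling 4d10(3 3 3 4)=13']
```

Unlike the stricter pattern in the first answer, this keeps entries that don't follow the xdy(...)=a structure, as long as they end in "=" followed by digits.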