I have a simple problem that has been bothering me for a while. I often read translated Chinese/Korean novels, but I'm a slow reader, so I use a TTS to help me read faster and to listen while I'm busy with other things. In the text the website sends to the TTS, there are sometimes mistakes from the translator/editor, or possibly just formatting problems: a sentence or paragraph starts with a stray number, for example "1There was...". Originally, foolish me set the number 1 to be replaced with a blank in the speech text processing, but as you would guess, 1 pops up quite often haha. So I looked up some RegEx and came up with [0-9][[:alpha:]] or [0-9][A-Za-z]. It does find instances of that specific problem, but I want it to remove only the number, not the letter. Right now "1There" becomes "here" instead of "There", because the whole match (number plus letter) is replaced with a blank, and I'm not knowledgeable enough to figure out how to make it work. Any help would be appreciated. I feel like it's a simple problem, but I'm just too dumb to figure it out haha. If nothing else, I'll take this as a learning experience.
CodePudding user response:
What you need is a so-called positive lookahead. A lookahead checks what comes next without making it part of the match, so the letter is not replaced. Your RegEx should only match a digit that is directly followed by a letter:
\d(?=[A-Za-z])
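To try the lookahead outside the TTS tool, here is a minimal sketch in Python. It assumes a regex engine that supports lookaheads, which the TTS app may or may not. The function name `strip_stray_digits` is made up for illustration. The pattern uses `\d+` rather than `\d`, on the assumption that multi-digit prefixes such as "12There" should also be removed.

```python
import re

# One or more digits that are immediately followed by an ASCII letter.
# The lookahead (?=...) only checks for the letter; it is not part of the
# match, so the replacement leaves the letter in place.
STRAY_DIGITS = re.compile(r"\d+(?=[A-Za-z])")


def strip_stray_digits(text: str) -> str:
    """Remove digits glued to the front of a letter (hypothetical helper)."""
    return STRAY_DIGITS.sub("", text)


print(strip_stray_digits("1There was a sword."))
print(strip_stray_digits("Chapter 12 begins in 1990."))
```

Digits followed by a space or punctuation are left alone, so ordinary numbers survive. Note that this would also strip the digit from words like "3rd" or "2nd", so the pattern may need tightening if those appear in the novels.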