Remove characters after matching two conditions

I have the Python code below and I would like the output to be the string "P-1888", that is, discarding everything after the 2nd "-" and removing the leading 0's after the 1st "-".

So far all I have been able to do in the following code is to remove the trailing 0's:

import re

docket_no = "P-01888-000"

doc_no_rgx1 = re.compile(r"^([^\-]+)\-(0+(.+))\-0[\d]+$")
massaged_dn1 = doc_no_rgx1.sub(r"\1-\2", docket_no)

print(massaged_dn1)

CodePudding user response：

You can use the split() method to split the string on the "-" character, then use the join() method to join the first and second elements of the resulting list with a "-" character. In between, you can use the lstrip() method to remove the leading 0's after the 1st "-". Try this:

docket_no = "P-01888-000"
docket_no_list = docket_no.split("-")
docket_no_list[1] = docket_no_list[1].lstrip("0")
massaged_dn1 = "-".join(docket_no_list[:2])

print(massaged_dn1)
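
One edge case worth knowing about: lstrip("0") turns an all-zero segment such as "000" into an empty string. The sketch below wraps the same steps in a function and falls back to a single "0" in that case. The function name and the choice of "0" as the fallback are assumptions, not something the question specifies:

```python
def massage_docket_no(docket_no):
    parts = docket_no.split("-")
    # lstrip("0") turns "000" into ""; fall back to a single "0" in that case
    # (assumed behaviour, the question does not say what should happen here)
    parts[1] = parts[1].lstrip("0") or "0"
    return "-".join(parts[:2])

print(massage_docket_no("P-01888-000"))  # P-1888
print(massage_docket_no("P-000-000"))    # P-0
```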

CodePudding user response：

The first way is to use capturing groups. You have already defined three of them using parentheses. In your example, the first capturing group gets "P" and the third capturing group gets the digits without leading zeros. You can read the captured data from the match object returned by match():

match = doc_no_rgx1.match(docket_no)
print(f'{match.group(1)}-{match.group(3)}')  # Outputs 'P-1888'
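
If some docket numbers might not fit the pattern, match() returns None and calling group() on it fails, so a guarded version may be safer. This is a minimal sketch that assumes the pattern from the question with `+` quantifiers; the helper name strip_docket is hypothetical. Note that this pattern requires at least one leading zero in the middle part:

```python
import re

# Pattern from the question: group 1 is the prefix, group 3 the digits
# after the leading zeros. The trailing segment must start with a 0.
doc_no_rgx1 = re.compile(r"^([^\-]+)\-(0+(.+))\-0[\d]+$")

def strip_docket(docket_no):
    """Return 'prefix-number' without leading zeros, or None if no match."""
    match = doc_no_rgx1.match(docket_no)
    if match is None:
        return None
    return f"{match.group(1)}-{match.group(3)}"

print(strip_docket("P-01888-000"))  # P-1888
print(strip_docket("P-1888"))       # None
```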

The second way is to skip regex entirely for such a simple task. You could split your string and reassemble it like this:

parts = docket_no.split('-')
print(f'{parts[0]}-{parts[1].lstrip("0")}')

CodePudding user response：

It seems like a sledgehammer/nut situation, but if you do want to use re then you could use:

doc_no_rgx1 = ''.join(re.findall(r'([A-Z]-)0+(\d+)-', docket_no)[0])
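
Since the question started from re.sub, a single substitution may also do the job. This is only a sketch: the pattern below is an assumption, not the asker's original, and it expects the three-part form shown in the question.

```python
import re

docket_no = "P-01888-000"

# Group 1: everything before the first "-".
# 0* consumes any leading zeros; group 2 is the rest of the middle part.
# The final "-.*" swallows the second "-" and everything after it.
pattern = re.compile(r"^([^-]+)-0*([^-]+)-.*$")
massaged_dn1 = pattern.sub(r"\1-\2", docket_no)

print(massaged_dn1)  # P-1888
```

A string that does not match the pattern is returned unchanged by sub().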

CodePudding user response：

I don't think I'd use a regular expression for this purpose. Your use case can be handled by standard string manipulation, so a regular expression would be overkill. Instead, consider doing this:

docket_nos = "P-01888-000".split('-')[:-1]
docket_nos[1] = docket_nos[1].lstrip('0')
docket_no = '-'.join(docket_nos)
print(docket_no) # P-1888

This might seem a little bit verbose but it does exactly what you're looking for. The first line splits docket_no by '-' characters, producing substrings P, 01888 and 000; and then discards the last substring. The second line strips leading zeros from the second substring. And the third line joins all these back together using '-' characters, producing your desired result of P-1888.

CodePudding user response：

Functionally this is no different than other answers suggesting that you split on '-' and lstrip the zero(s), but personally I find my code more readable when I use multiple assignment to clarify intent vs. using indexes:

def convert_docket_no(docket_no):
    letter, number, *_ = docket_no.split('-')
    return f'{letter}-{number.lstrip("0")}'

_ is used here for a "throwaway" variable, and the * makes it accept all elements of the split list past the first two.
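
To see how the starred target behaves, here is a quick check of the function above on a few inputs. The sample docket numbers other than the first are made up for illustration:

```python
def convert_docket_no(docket_no):
    # letter and number take the first two pieces; *_ collects the rest
    letter, number, *_ = docket_no.split('-')
    return f'{letter}-{number.lstrip("0")}'

print(convert_docket_no("P-01888-000"))      # P-1888
print(convert_docket_no("P-01888"))          # P-1888 (nothing to discard)
print(convert_docket_no("P-01888-000-001"))  # P-1888 (two pieces discarded)

# Fewer than two pieces cannot be unpacked and raises ValueError
try:
    convert_docket_no("P01888")
except ValueError as exc:
    print("bad input:", exc)
```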