My respects, colleagues. I need to write a function that determines the maximum number of consecutive BA or CA character pairs in a string.
print(f("BABABA125")) # -> 3
print(f("234CA4BACA")) # -> 2
print(f("BABACABACA56")) # -> 5
print(f("1BABA24CA")) # -> 2
Actually, I've written a function, but, to my mind, it's not very good.
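def f(s: str) -> int:
    res = 0
    if not s:
        return res
    cur = 0
    i = len(s) - 1
    # Scan from the end, consuming two characters per BA/CA pair.
    while i >= 0:
        # i > 0 keeps s[i-1] from wrapping around to s[-1] at the first character.
        if i > 0 and s[i] == "A" and (s[i-1] == "B" or s[i-1] == "C"):
            cur += 1
            i -= 2
        else:
            # The current run is broken: record it and start over.
            if cur > res:
                res = cur
            cur = 0
            i -= 1
    else:
        # Record a run that reaches the start of the string.
        if cur > res:
            res = cur
    return res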
In addition, I'm not allowed to use libraries or regular expressions (only string and list methods). Could you please help me or review my code in this context? I'd be very grateful.
CodePudding user response:
Here's a function f2 that performs this operation.
1. if not re.search('(BA|CA)', s): return 0
   First check whether the string contains any BA or CA at all (to prevent "ValueError: max() arg is an empty sequence" in step 3), and return 0 if it doesn't.
2. matches = re.finditer(r'(?:CA|BA)+', s)
   Find all runs of consecutive CA or BA. The group is non-capturing; re.finditer yields re.Match objects whose group(0) is always the full match, so this mainly matters for re.findall, which would otherwise return only the captured group.
3. res = max(matches, key=lambda m: len(m.group(0)))
   Then, among the matches (re.Match objects), fetch each matched substring with m.group(0) and compare their lengths to find the longest one.
4. return len(res.group(0))//2
   Divide the length of the longest run by 2 to get the number of BA or CA pairs in it. Floor division (//) keeps the result an int, since true division (/) would return a float.
import re

strings = [
    "BABABA125",             # 3
    "234CA4BACA",            # 2
    "BABACABACA56",          # 5
    "1BABA24CA",             # 2
    "NO_MATCH_TO_BE_FOUND",  # 0
]

def f2(s: str) -> int:
    # Guard: max() raises ValueError on an empty sequence.
    if not re.search('(BA|CA)', s): return 0
    # One match per run of consecutive CA/BA pairs.
    matches = re.finditer(r'(?:CA|BA)+', s)
    res = max(matches, key=lambda m: len(m.group(0)))
    # Each pair is two characters long.
    return len(res.group(0))//2

for s in strings:
    print(f2(s))
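The re.search guard above exists only so that max() never receives an empty sequence. As a hedged alternative, the built-in max() accepts a default keyword argument that is returned when the iterable is empty, which makes the guard unnecessary. The name f2_default is hypothetical and not part of the original answer; this is a minimal sketch, not a replacement for f2.

```python
import re

def f2_default(s: str) -> int:
    # Length of every run of consecutive CA/BA pairs (lazy generator).
    run_lengths = (len(m.group(0)) for m in re.finditer(r'(?:CA|BA)+', s))
    # default=0 covers strings with no BA/CA at all; each pair is 2 characters.
    return max(run_lengths, default=0) // 2

for s in ["BABABA125", "234CA4BACA", "BABACABACA56", "1BABA24CA", "NO_MATCH_TO_BE_FOUND"]:
    print(s, f2_default(s))
```

The pattern is scanned only once here, whereas the guarded version searches the string twice.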
UPDATE: Thanks to @StevenRumbalski for providing a simpler version of the above answer. (I split it into multiple lines for readability)
def f3(s):
    if not re.search('(BA|CA)', s): return 0
    matches = re.findall(r'(?:CA|BA)+', s)
    max_length = max(map(len, matches))
    return max_length // 2
1. if not re.search('(BA|CA)', s): return 0
   Same as above.
2. matches = re.findall(r'(?:CA|BA)+', s)
   Find all runs of consecutive CA or BA, but each value in matches is a str instead of a re.Match, which is easier to handle.
3. max_length = max(map(len, matches))
   Map each matched substring to its length and find the maximum length among them.
4. return max_length // 2
   Floor-divide the length of the longest matching substring by the length of one pair (BA or CA, i.e. 2) to get the number of consecutive occurrences of BA or CA in the string.
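Why the group has to be non-capturing when re.findall is used can be seen in a small experiment. When a pattern contains exactly one capturing group, re.findall returns the text of that group rather than the whole match, and for a repeated group that is only its last repetition. The variable names below are illustrative only.

```python
import re

s = "BABACABACA56"

# Capturing group: findall reports only the group's last repetition per match.
captured = re.findall(r'(CA|BA)+', s)
# Non-capturing group: findall reports each full run.
full_runs = re.findall(r'(?:CA|BA)+', s)

print(captured)   # ['CA']
print(full_runs)  # ['BABACABACA']
```

With the capturing version every match would have length 2, so the function would report 1 for any string that contains a pair.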
CodePudding user response:
Here's an alternative implementation without any imports. Do note however that it's quite slow compared to your C-style implementation.
The idea is simple: transform the input string into a string consisting of only two types of characters, c1 and c2, with c1 representing CA or BA, and c2 representing anything else. Then find the longest substring of consecutive c1s.
The implementation is as follows:
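- Pick a character that is guaranteed not to appear in the input string; the code here uses a space (" ").
- Replace each occurrence of CA and BA with that character.
- Replace everything else in the string (anything that is not the chosen character) with a - (this is why the chosen character must not appear in the original input). The string now consists solely of the chosen character and -s.
- Split the string with - as the delimiter, and map each resulting substring to its length.
- Return the maximum of these substring lengths.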
strings = [
    "BABABA125",             # 3
    "234CA4BACA",            # 2
    "BABACABACA56",          # 5
    "1BABA24CA",             # 2
    "NO_MATCH_TO_BE_FOUND",  # 0
]

def f4(string: str) -> int:
    # Mark every CA/BA pair with a single space (assumed absent from the input).
    string = string.replace("CA", " ")
    string = string.replace("BA", " ")
    # Turn every other character into "-".
    string = "".join([(c if c == " " else "-") for c in string])
    # Each piece between "-" delimiters is one run of pairs.
    str_list = string.split("-")
    str_lengths = map(len, str_list)
    return max(str_lengths)

for s in strings:
    print(f4(s))
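If reserving a marker character is undesirable, a single forward pass with str.startswith avoids it while still using only string methods. This is a hedged sketch rather than part of the answer above; the name longest_pair_run is hypothetical. It relies on str.startswith accepting a tuple of prefixes and a start index.

```python
def longest_pair_run(s: str) -> int:
    best = 0  # longest run of pairs seen so far
    cur = 0   # length of the run currently being extended
    i = 0
    while i < len(s):
        # True if s[i:i+2] is "BA" or "CA"; safe at the end of the string.
        if s.startswith(("BA", "CA"), i):
            cur += 1
            best = max(best, cur)
            i += 2  # consume the whole pair
        else:
            cur = 0  # any other character breaks the run
            i += 1
    return best

for s in ["BABABA125", "234CA4BACA", "BABACABACA56", "1BABA24CA", "NO_MATCH_TO_BE_FOUND"]:
    print(s, longest_pair_run(s))
```

Because the scan never indexes s[i-1], it cannot wrap around to the end of the string, and it makes no assumption about which characters the input contains.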