I want to write a generator s_generator(alphabet, length, start_s, end_s)
that generates strings of length n over a given alphabet in dictionary order starting with start_s
and ending at end_s
.
For example, s_generator('ab', 4, 'aaaa', 'bbbb')
generates ['aaaa', 'aaab', 'aaba', 'aabb', 'abaa', 'abab', 'abba', 'abbb', 'baaa', 'baab', 'baba', 'babb', 'bbaa', 'bbab', 'bbba', 'bbbb']. And s_generator('ab', 4, 'abaa', 'abaa')
generates ['abaa', 'abab', 'abba', 'abbb', 'baaa']
What is a good way to implement it?
I thought about assigning a number to each character in alphabet, treating the string as a base-n number (n is size of alphabet) and using addition to get the next number, and then convert the number back to string. For example, 'abab' is [0, 1, 0, 1] and the next number is [0, 1, 1, 0], which is 'abba'. This method seems complicated. Is there a simpler solution?
CodePudding user response:
use itertools and comprehension list
from itertools import product
def s_generator(alphabet, length, start_s, end_s):
products = product(alphabet, repeat=length)
return [''.join(x) for x in products if ''.join(x) >= start_s and ''.join(x) <= end_s]
print(s_generator('ab', 4, 'aaaa', 'bbbb'))
CodePudding user response:
It seems the counting method I mentioned in my question is the most efficient and cleanest solution.
def str_generator(start_s, end_s):
# assuming start_s < end_s and are valid strings
start_nums, end_nums = str2nums(start_s), str2nums(end_s)
nums = start_nums
while nums <= end_nums:
yield nums2str(nums)
# increment by 1
i = len(nums) - 1
while i >= 0:
nums[i] = nums[i] 1
if nums[i] < SIZE_OF_ALPHABET:
break
nums[i] = 0
i -= 1
if i < 0:
return
str2nums(s)
and nums2str(nums)
are just functions that handles the mapping between strings and numbers. They can be returning hash table entries. Or in my case of using a-zA-Z as my alphabet, simply shifting the ASCII values will do:
# ASCII value
# A-Z: 65-90, a-z: 97-122
def str2nums(s):
nums = []
for c in s:
if c.islower(): # 0-25 is a-z
nums.append(ord(c) - ord('a'))
else: # 26-51 is A-Z
nums.append(ord(c) 26 - ord('A'))
return nums
def nums2str(nums):
str_builder = []
for n in nums:
if n < 26: # 0-25 is a-z
str_builder.append(chr(n ord('a')))
else: # 26-51 is A-Z
str_builder.append(chr(n - 26 ord('A')))
return ''.join(str_builder)