A user suggested that, for my tokenizer state machine, I define a dictionary keyed by state, where each value is itself a dictionary keyed by an input to that state. I am struggling with the second part, as I do not know how to turn my current function into a lookup table of inputs.
My original function:
def data_state(cur_codepoint: str, reconsume: bool):
    if reconsume:
        codepoint = cur_codepoint
    else:
        codepoint = consume_next_input_char()
    match codepoint:
        case '&':
            return_to(data_state, codepoint)
            switch_to_state('character_reference')
        case '<':
            switch_to_state('tag_open')
        case None:
            emit_token(EOF)
        case _:
            emit_token(codepoint)
Outline of part 1:
States = {
    'data': data,
    'rcdata': rcdata,
    # Remaining states
    # ...
}
My attempt at part 2:
data = {
    '&': ( return_to(data_state, codepoint), switch_to_state('character_reference') ),
    '<': switch_to_state('tag_open'),
    None: emit_token(EOF),
    _: emit_token(codepoint)
}
For some context, the state machine receives one character (input) at a time, and I have to perform operations based on what that character is. The tricky bit is checking whether the input comes from a reconsume() function, which asks that I consume the same character again in a certain state rather than the next input. I also do not know how to represent an "anything else" branch ('case _') in a dictionary, or how to call multiple functions from a single entry.
Any help would be appreciated.
CodePudding user response:
In cases like this, if/else statements are arguably more readable than match (switch-style) statements, and match statements in Python have no performance benefit over them. The dict approach may have performance benefits, but only for a large number of cases, and it is generally not very readable.
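For comparison, here is a rough sketch of the data state written with if/elif. The helper functions are stand-ins that only record what was called, and `EOF` is a placeholder sentinel; both are assumptions for illustration, since the real tokenizer would supply its own. Reconsume handling is left out to keep the branches easy to compare.

```python
# Stand-in helpers (hypothetical): they only record what was called.
actions = []
EOF = 'EOF'  # placeholder sentinel, not the asker's real EOF token


def return_to(state, codepoint):
    actions.append(('return_to', state.__name__, codepoint))


def switch_to_state(name):
    actions.append(('switch_to_state', name))


def emit_token(token):
    actions.append(('emit_token', token))


def data_state(codepoint):
    # Same four branches as the match version, written as if/elif.
    if codepoint == '&':
        return_to(data_state, codepoint)
        switch_to_state('character_reference')
    elif codepoint == '<':
        switch_to_state('tag_open')
    elif codepoint is None:
        emit_token(EOF)
    else:
        emit_token(codepoint)


data_state('<')
data_state('x')
print(actions)
```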
But if you are committed to the dict approach, then I would suggest two things:
- For each case where multiple functions need to be called, write a new function that calls them, and put that function in the case dict
- Deal with the wildcard case first:
  - either with an if/else statement that calls the appropriate function directly and then skips the dict lookup
  - or with an if statement that changes your string to the appropriate key in your dict
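The first wildcard option (call the fallback directly and skip the dict) could be sketched roughly as follows. All names here (`handle_a`, `handle_other`, `dispatch`) are made up for illustration.

```python
def handle_a():
    return 'a handled'


def handle_other(key):
    # Fallback for any key that has no entry in the dict.
    return f'default for {key!r}'


cases = {'a': handle_a}


def dispatch(key):
    # Wildcard first: call the fallback directly and skip the dict lookup.
    if key not in cases:
        return handle_other(key)
    return cases[key]()


print(dispatch('a'))
print(dispatch('zzz'))
```

One side effect of this variant is that the fallback can receive the original key, which is lost if the key is overwritten before the lookup.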
As per your reply, here's the option that changes the string:
def call_many_funcs():
    func_x(...)
    func_y(...)
    func_z(...)
cases = {
    'a': call_many_funcs,
    'b': some_func,
    'c': some_other_func,
    'not_found': not_found_func,
    # ...
}
Then to execute:
my_case = ...  # whatever
if my_case not in cases:
    my_case = 'not_found'
cases[my_case]()
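Applied to the tokenizer from the question, the same idea might look like the sketch below. It is only one possible layout under several assumptions: the helpers just record calls, `EOF` and `DEFAULT` are placeholder sentinels, `return_to` takes a state name rather than a function, and `step` is a hypothetical driver. Note that the table stores functions rather than the results of calling them, and that the reconsume decision is made before the lookup, so the table itself never needs to know about it.

```python
actions = []        # records calls so the sketch runs standalone
EOF = 'EOF'         # placeholder sentinel (assumption)
DEFAULT = object()  # hypothetical key standing in for "case _"


def return_to(state_name, codepoint):
    actions.append(('return_to', state_name, codepoint))


def switch_to_state(name):
    actions.append(('switch_to_state', name))


def emit_token(token):
    actions.append(('emit_token', token))


def ampersand_in_data(codepoint):
    # Wrapper so that one table entry can run two actions.
    return_to('data', codepoint)
    switch_to_state('character_reference')


# Outer dict keyed by state, inner dict keyed by input.
# Every value is a callable taking the codepoint; nothing is called here.
STATES = {
    'data': {
        '&': ampersand_in_data,
        '<': lambda cp: switch_to_state('tag_open'),
        None: lambda cp: emit_token(EOF),
        DEFAULT: emit_token,
    },
}


def step(state_name, chars, cur_codepoint=None, reconsume=False):
    # Pick the codepoint first: reuse the current one or consume the next.
    codepoint = cur_codepoint if reconsume else next(chars, None)
    table = STATES[state_name]
    # dict.get supplies the "anything else" handler when there is no entry.
    handler = table.get(codepoint, table[DEFAULT])
    handler(codepoint)
    return codepoint


chars = iter('a<')
step('data', chars)
step('data', chars)
step('data', chars)  # input exhausted, so the codepoint is None
print(actions)
```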