how to recursively create nested list from string input-CodePudding

So, I would like to convert my string input

'f(g,h(a,b),a,b(g,h))'

into the following list

['f',['g','h',['a','b'],'a','b',['g','h']]]

Essentially, I would like to replace all '(' into [ and all ')' into ].

I have unsuccessfully tried to do this recursively. I thought I would iterate through all the variables through my word and then when I hit a '(' I would create a new list and start extending the values into that newest list. If I hit a ')', I would stop extending the values into the newest list and append the newest list to the closest outer list. But I am very new to recursion, so I am struggling to think of how to do it

word='f(a,f(a))'
empty=[]
def newlist(word):
    listy=[]
    for i, letter in enumerate(word):
        if letter=='(':
            return newlist([word[i 1:]])
        if letter==')':
            listy.append(newlist)
        else:
            listy.extend(letter)
        
    return empty.append(listy)

CodePudding user response：

Assuming your input is something like this:

a = 'f,(g,h,(a,b),a,b,(g,h))'

We start by splitting it into primitive parts ("tokens"). Since your tokens are always a single symbol, this is rather easy:

tokens = list(a)

Now we need two functions to work with the list of tokens: next_token tells us which token we're about to process and pop_token marks a token as processed and removes it from the list:

def next_token():
    return tokens[0] if tokens else None


def pop_token():
    tokens.pop(0)

Your input consist of "items", separated by a comma. Schematically, it can be expressed as

items = item ( ',' item )*

In the python code, we first read one item and then keep reading further items while the next token is a comma:

def items():
    result = [item()]
    while next_token() == ',':
        pop_token()
        result.append(item())
    return result

An "item" is either a sublist in parentheses or a letter:

def item():
    return sublist() or letter()

To read a sublist, we check if the token is a '(', the use items above the read the content and finally check for the ')' and panic if it is not there:

def sublist():
    if next_token() == '(':
        pop_token()
        result = items()
        if next_token() == ')':
            pop_token()
            return result
        raise SyntaxError()

letter simply returns the next token. You might want to add some checks here to make sure it's indeed a letter:

def letter():
    result = next_token()
    pop_token()
    return result

Let us know if you have questions.

CodePudding user response：

You can get the expected answer using the following code but it's still in string format and not a list.

import re

a='(f(g,h(a,b),a,b(g,h))' 
ans=[]
sub=''
def rec(i,sub):
    if i>=len(a):
        return sub
    if a[i]=='(':
        if i==0:
            
            sub=rec(i 1,sub '[')
        else:
            sub=rec(i 1,sub ',[')
            
    elif a[i]==')':
        sub=rec(i 1,sub ']')
    else:
        sub=rec(i 1,sub a[i])
    return sub


b=rec(0,'')
print(b)
b=re.sub(r"([a-z] )", r"'\1'", b)
print(b,type(b))

Output

[f,[g,h,[a,b],a,b,[g,h]]
['f',['g','h',['a','b'],'a','b',['g','h']] <class 'str'>

CodePudding user response：

Quite an interesting question, and one I originally misinterpreted. But now this solution works accordingly. Note that I have used list concatenation operator for this solution (which you usually want to avoid) so feel free to improve upon it however you see fit.

Good luck, and I hope this helps!

# set some global values, I prefer to keep it
# as a set incase you need to add functionality
# eg if you also want {{a},b} or [ab<c>ed] to work
OPEN_PARENTHESIS = set(["("])
CLOSE_PARENTHESIS = set([")"])
SPACER = set([","])

def recursive_solution(input_str, index):

    # base case A: when index exceeds or equals len(input_str)
    if index >= len(input_str):
        return [], index
    
    char = input_str[index]

    # base case B: when we reach a closed parenthesis stop this level of recursive depth
    if char in CLOSE_PARENTHESIS:
        return [], index

    # do the next recursion, return it's value and the index it stops at
    recur_val, recur_stop_i = recursive_solution(input_str, index   1)

    # with an open parenthesis, we want to continue the recursion after it's associated
    # closed parenthesis. and also the recur_val should be within a new dimension of the list
    if char in OPEN_PARENTHESIS:
        continued_recur_val, continued_recur_stop_i = recursive_solution(input_str, recur_stop_i   1)
        return [recur_val]   continued_recur_val, continued_recur_stop_i
    
    # for spacers eg "," we just ignore it
    if char in SPACER:
        return recur_val, recur_stop_i
    
    # and finally with normal characters, we just extent it 
    return [char]   recur_val, recur_stop_i