Optimized way of converting a list into a list of dictionaries and strings based on a specific condition

I have a list containing strings and numbers. I am trying to convert it into a list of dictionaries based on element type: each string becomes a key, and the numbers that follow it become that key's values.

Input:

lst = ['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]

output:

[{'A': [1, 3, 4]}, {'B': [5]}, {'C': [2]}, {'D': 4}]

Working Code:

main_array = []
small_array = []
se = {}
key = None
for i in range(len(lst)-1):
    print(i)
    if i == len(lst)-2:
        if type(lst[i]) == str and type(lst[i + 1]) == str:
            main_array.append(lst[i])
            main_array.append(lst[i + 1])
        elif type(lst[i]) == str and type(lst[i + 1]) != str:
            main_array.append({lst[i]: lst[i + 1]})
        elif type(lst[i]) != str and type(lst[i + 1]) == str:
            small_array.append(lst[i])
            se.update({key: small_array})
            main_array.append(se)
            se = {}
            small_array = []
            main_array.append(lst[i + 1])
        elif type(lst[i]) != str and type(lst[i + 1]) != str:
            small_array.append(lst[i])
            small_array.append(lst[i + 1])
            se.update({key: small_array})
            main_array.append(se)
            se = {}
            small_array = []
    else:
        if type(lst[i]) == str and i != len(lst)-1:
            if type(lst[i + 1]) == str:
                main_array.append(lst[i])
            elif type(lst[i + 1]) != str:
                key = lst[i]
        elif type(lst[i]) != str and i != len(lst)-1:
            if type(lst[i + 1]) == str:
                small_array.append(lst[i])
                se.update({key: small_array})
                main_array.append(se)
                se = {}
                small_array = []
            elif type(lst[i + 1]) != str:
                small_array.append(lst[i])
print(main_array)

Is there any way to optimize this code? I would like to avoid nested loops.

CodePudding user response：

It is much better to create an intermediate dictionary, then convert that dictionary to your desired result. Like this:

import collections

lst = ['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]

dct = collections.defaultdict(list)
key = None
for el in lst:
    if type(el) == str:
        key = el
        continue
    dct[key].append(el)

result = [{key: dct[key]} for key in dct]
print(result)
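
One property of the intermediate-dictionary approach worth being aware of: a key that appears more than once in the input is merged into a single entry. The sketch below illustrates this with a made-up input (the repeated 'A' is an assumption for illustration, not part of the question's data):

```python
from collections import defaultdict

# Hypothetical input: the key 'A' appears twice.
lst = ['A', 1, 'B', 2, 'A', 3]

dct = defaultdict(list)
key = None
for el in lst:
    if isinstance(el, str):
        key = el
        continue
    dct[key].append(el)

# Both runs of values under 'A' end up in the same list.
result = [{k: v} for k, v in dct.items()]
print(result)  # [{'A': [1, 3]}, {'B': [2]}]
```

If repeated keys should stay separate, one of the approaches that appends a new dict per string is probably a better fit.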

CodePudding user response：

Try using a temporary list variable to build up the list until you encounter a str value in the list:

L = ['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]

res = {}
curr: list

for e in L:
    if type(e) == str:
        res[e] = curr = []
    else:
        curr.append(e)

print(res)

Output:

{'A': [1, 3, 4], 'B': [5], 'C': [2], 'D': [4]}

In case you really want a list of single-key dicts:

L = ['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]

res = []

add = list.append
curr: list

for e in L:
    if type(e) == str:
        curr = []
        add(res, {e: curr})
    else:
        add(curr, e)

print(res)

Output:

[{'A': [1, 3, 4]}, {'B': [5]}, {'C': [2]}, {'D': [4]}]
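
The same idea can be wrapped in a reusable function. This is only a sketch: the name group_by_str_keys is made up for illustration, and it assumes that a value appearing before the first string should raise an error, which the question does not specify. It also uses isinstance rather than type(e) == str, so subclasses of str count as keys too.

```python
def group_by_str_keys(items):
    """Group each run of non-string values under the string that precedes it."""
    result = []
    current = None  # list of values for the most recent string key
    for item in items:
        if isinstance(item, str):
            current = []
            result.append({item: current})
        elif current is None:
            # Assumption: a value with no preceding key is an error.
            raise ValueError(f"value {item!r} appears before any string key")
        else:
            current.append(item)
    return result


print(group_by_str_keys(['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]))
# [{'A': [1, 3, 4]}, {'B': [5]}, {'C': [2]}, {'D': [4]}]
```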

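I was actually curious, so I timed it. With timeit's default of 1,000,000 runs, the temporary-list version took about 35% less time than the defaultdict approach (1.385 s vs. 2.150 s; the timings are in the comments below).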

from timeit import timeit
from collections import defaultdict
from itertools import groupby

L = ['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]

# 1.385
print('dict:          ', timeit("""
res = []

add = list.append
curr: list

for e in L:
    if type(e) == str:
        curr = []
        add(res, {e: curr})
    else:
        add(curr, e)
""", globals=globals()))

# 1.619
print('no temp list:  ', timeit("""
res = []
key = None
add = list.append

for el in L:
    if type(el) == str:
        key = el
        add(res, {el: []})
        continue
    add(res[-1][key], el)
""", globals=globals()))

# 2.150
print('defaultdict:   ', timeit("""
dct = defaultdict(list)
key = None

for el in L:
    if type(el) == str:
        key = el
    else:
        dct[key].append(el)

result = [{key: dct[key]} for key in dct]
""", globals=globals()))

# 2.578
print('groupby:       ', timeit("""
groups = groupby(L, type)
result = []
try:
    while groups:
        result.append({ list(next(groups)[1])[0] : list(next(groups)[1]) })
except StopIteration:
    pass
""", globals=globals()))

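To reproduce a comparison like this, one option is to time callables with timeit.repeat and take the minimum of several runs, which tends to be less noisy than a single measurement. The sketch below is only illustrative: the function names, the run counts, and the choice of which two approaches to compare are assumptions, and absolute numbers will differ by machine and Python version.

```python
from collections import defaultdict
from timeit import repeat

L = ['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]


def with_temp_list(items):
    res = []
    for e in items:
        if type(e) == str:
            curr = []
            res.append({e: curr})
        else:
            curr.append(e)
    return res


def with_defaultdict(items):
    dct = defaultdict(list)
    key = None
    for el in items:
        if type(el) == str:
            key = el
        else:
            dct[key].append(el)
    return [{k: v} for k, v in dct.items()]


# Both approaches should agree before their timings are compared.
assert with_temp_list(L) == with_defaultdict(L)

for func in (with_temp_list, with_defaultdict):
    # Best of 5 runs of 10,000 calls each (counts chosen arbitrarily).
    best = min(repeat(lambda: func(L), number=10_000, repeat=5))
    print(f'{func.__name__:<18}{best:.4f}')
```
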
CodePudding user response：

Just for fun, you could use itertools.groupby to group your list by the type of each value, then iterate over the result, taking a key group and a value group in turn to build each dict that is appended to the result list:

import itertools

lst = ['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]
groups = itertools.groupby(lst, type)
result = []
try:
    while groups:
        result.append({ list(next(groups)[1])[0] : list(next(groups)[1]) })
except StopIteration:
    pass

Output:

[
 {'A': [1, 3, 4]},
 {'B': [5]},
 {'C': [2]},
 {'D': [4]}
]
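
To see why this works, it may help to look at what groupby actually yields: consecutive runs of elements that share the same type. The sketch below materializes those runs and pairs them up. It assumes the input strictly alternates between a single string and one or more values of a single type; mixed value types (for example int and float) would be split into separate runs when grouping by type.

```python
from itertools import groupby

lst = ['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]

# Materialize each run so it can be inspected.
runs = [(kind.__name__, list(group)) for kind, group in groupby(lst, type)]
print(runs[:4])
# [('str', ['A']), ('int', [1, 3, 4]), ('str', ['B']), ('int', [5])]

# Pair each key run (even positions) with the value run that follows it.
result = [{keys[0]: values} for (_, keys), (_, values) in zip(runs[::2], runs[1::2])]
print(result)
# [{'A': [1, 3, 4]}, {'B': [5]}, {'C': [2]}, {'D': [4]}]
```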

CodePudding user response：

One way to do this is to simply add a new dictionary every time a string comes up, then append to that list until the next string. Like this:

lst = ['A', 1, 3, 4, 'B', 5, 'C', 2, 'D', 4]

res = []
key = None
for el in lst:
    if type(el) == str:
        key = el
        res.append({el: []})
        continue
    res[-1][key].append(el)

print(res)

I believe this is much faster than my other answer. All you need is one result list and one variable to keep track of the key.

When I tested it, this method was slightly slower than rv.kvetch's answer; however, there is no need to store a temporary list. This method is cleaner (no need to add the last list) and uses less memory in theory.
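
On the memory point, it may be worth noting that in the temporary-variable version the extra variable is only a second name for the list already stored in the result, not a copy, so any difference is likely to be very small. A minimal sketch (with a shortened input chosen for illustration):

```python
L = ['A', 1, 3, 4, 'B', 5]

res = []
for e in L:
    if isinstance(e, str):
        curr = []
        res.append({e: curr})
    else:
        curr.append(e)

# `curr` and the last dict's value are the same list object, not two copies.
print(curr is res[-1]['B'])  # True
```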