EDIT: Thanks to @arzo for the catch
I have a nested dictionary structured as:
dict_example = {
    "key1": "string",
    "key2": [{"nestedkey": "nestedvalue"}],
    "key3": [1, 2, 3],
    "key4": [{"nestedlevel1key": [{"nestedlevel2key": "nestedlevel2value"}]}],
    "key5": {},
    "key6": {"regularkey": "regularvalue"},
    "key7": 15
}
Here the value stored under a key can be any of the following:
- strings
- ints
- lists of dictionaries
- lists of dictionaries that contain lists of dictionaries
- an empty dictionary
- a regular dictionary
The goal is to optimize a function that returns all of the keys in the dictionary, including the nested ones. I can write something like this:
def get_keys(dict_example):
    keys = []
    for k, v in dict_example.items():
        keys.append(k)
        # one level down: a plain nested dictionary
        if isinstance(v, dict):
            for k in v.keys():
                keys.append(k)
        # one level down: a list whose first item is a dictionary
        if isinstance(v, list):
            if isinstance(v[0], dict):
                for k, v in v[0].items():
                    keys.append(k)
        # two levels down: v now holds the last nested value from the loop above
        if isinstance(v, list) and isinstance(v[0], dict):
            for k in v[0].keys():
                keys.append(k)
    return keys

keys = get_keys(dict_example)
print(keys)
Which will get me (in no particular order) a list of the keys:
['key1', 'key2', 'nestedkey', 'key3', 'key4', 'nestedlevel1key', 'nestedlevel2key', 'key5', 'key6', 'regularkey', 'key7']
But I am not sure how to write an optimized, simpler function that covers all 6 cases and also traverses the structure regardless of how many levels there are. For now I have hard-coded the number of levels, but there could be deeper levels of nesting that I need to account for.
CodePudding user response:
There are only 2 cases to handle, list and dict, because they are the only types that contain other things. Then use recursion:
- for a dict: keep the keys, and search through the values
- for a list: search through the values
def get_keys(item):
    keys = []
    if isinstance(item, dict):
        for k, v in item.items():
            keys.append(k)
            keys.extend(get_keys(v))
    elif isinstance(item, (list, tuple)):
        for x in item:
            keys.extend(get_keys(x))
    return keys

values = {
    "key1": "string", "key2": [{"nestedkey": "nestedvalue"}], "key3": [1, 2, 3],
    "key4": [{"nestedlevel1key": [{"nestedlevel2key": "nestedlevel2value"}]}],
    "key5": {}, "key6": {"regularkey": "regularvalue"},
}
keys = get_keys(values)
print(keys)
# ['key1', 'key2', 'nestedkey', 'key3', 'key4', 'nestedlevel1key', 'nestedlevel2key', 'key5', 'key6', 'regularkey']
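If you would rather not build intermediate lists at every level, the same recursion can probably be written as a generator. This is only a sketch of that idea: the name iter_keys and the sample data are made up for illustration, and it assumes you only care about dict, list and tuple containers.

```python
def iter_keys(item):
    """Yield every key found in nested dicts, lists and tuples, depth-first."""
    if isinstance(item, dict):
        for k, v in item.items():
            yield k                   # report the key itself
            yield from iter_keys(v)   # then anything nested under it
    elif isinstance(item, (list, tuple)):
        for x in item:
            yield from iter_keys(x)
    # any other type (str, int, ...) holds no keys, so nothing is yielded


# hypothetical sample data, not the dictionary from the question
sample = {"a": [{"b": 1}, {"c": {"d": []}}], "e": "text"}
all_keys = list(iter_keys(sample))
print(all_keys)
```

Wrapping the call in list() gives you the same kind of result as get_keys, and you can also loop over iter_keys(sample) directly if you want to stop early.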
CodePudding user response:
Rather than a high number of nested loops, you can solve this with a recursive function:
origDict = {
    "key1": "string",
    "key2": [{"nestedkey": "nestedvalue"}],
    "key3": [1, 2, 3],
    "key4": [{"nestedlevel1key": [{"nestedlevel2key": "nestedlevel2value"}]}],
    "key5": {},
    "key6": {"regularkey": "regularvalue"},
    "key7": 15
}
keyList = []

def get_keys(inDict, listOfKeys):
    for k, v in inDict.items():
        listOfKeys.append(k)
        if isinstance(v, list):
            for item in v:
                if isinstance(item, dict):
                    get_keys(item, listOfKeys)
        elif isinstance(v, dict):
            get_keys(v, listOfKeys)

get_keys(origDict, keyList)
print(keyList)
This prints:
['key1', 'key2', 'nestedkey', 'key3', 'key4', 'nestedlevel1key',
'nestedlevel2key', 'key5', 'key6', 'regularkey', 'key7']
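One thing to watch with the accumulator style: the caller has to create the list first, and reusing the same list for a second call keeps the keys from the first one. A possible variation, sketched below, makes the accumulator optional and returns it. The name collect_keys and the found parameter are my own, not part of the code above.

```python
def collect_keys(in_dict, found=None):
    """Return all keys in in_dict, descending into dicts and lists of dicts."""
    if found is None:
        # create a fresh list per top-level call; a mutable default such as
        # found=[] would be shared between calls
        found = []
    for k, v in in_dict.items():
        found.append(k)
        if isinstance(v, dict):
            collect_keys(v, found)
        elif isinstance(v, list):
            for item in v:
                if isinstance(item, dict):
                    collect_keys(item, found)
    return found


# two independent calls on hypothetical data; neither result leaks into the other
first = collect_keys({"a": {"b": 1}})
second = collect_keys({"c": [{"d": 2}, 3]})
print(first)
print(second)
```

The recursion is the same as before, only the bookkeeping around the list changes.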
CodePudding user response:
Just for fun, you can also use a non-recursive function (this also sidesteps Python's recursion limit on very deep nesting):
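def get_keys(d):
    out = []
    inside = dict(d)  # shallow copy, so the caller's dict is not emptied
    while inside:
        # take the first pending key and remove it from the work dict
        out.append(next(iter(inside)))
        v = inside.pop(out[-1])
        if isinstance(v, dict):
            inside.update(v)
        elif isinstance(v, list):
            # only dict items carry keys; this also copes with empty lists
            for x in v:
                if isinstance(x, dict):
                    inside.update(x)
    # note: update() overwrites a pending key of the same name,
    # so repeated key names at different levels can be missed
    return out

# d is the nested dictionary from the question
out = get_keys(d)
print(out)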
Output:
['key1', 'key2', 'key3', 'key4', 'key5', 'key6', 'key7', 'nestedkey', 'nestedlevel1key', 'regularkey', 'nestedlevel2key']
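Because the loop above merges nested dictionaries into one working dictionary with update(), two levels that share a key name can collide. If that matters for your data, a queue of containers is one way around it. The sketch below uses collections.deque; the name get_keys_iterative and the sample are hypothetical, and the order of the nested keys may differ from the output shown above.

```python
from collections import deque


def get_keys_iterative(data):
    """Collect keys level by level without recursion and without mutating data."""
    out = []
    queue = deque([data])  # containers still waiting to be scanned
    while queue:
        current = queue.popleft()
        if isinstance(current, dict):
            for k, v in current.items():
                out.append(k)
                if isinstance(v, (dict, list)):
                    queue.append(v)  # scan nested containers later
        elif isinstance(current, list):
            # only nested containers can hold more keys
            queue.extend(x for x in current if isinstance(x, (dict, list)))
    return out


# hypothetical sample where the same key names appear at two levels
sample = {"a": {"a": 1, "b": 2}, "b": [{"c": 3}], "c": []}
result = get_keys_iterative(sample)
print(result)
```

Repeated names are reported once per occurrence here, and the input dictionary is left untouched.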