I find myself doing a lot of data transformation like this:
import pandas as pd

rows = []
for i in my_array:
    data = {}
    try:
        data["foo"] = i["foo"]  # simple lookup
    except Exception:
        data["foo"] = ""
    ...
    try:
        data["baz"] = i["foo"]["baz"]  # nested lookup
    except Exception:
        data["baz"] = ""
    rows.append(data)
# DataFrame.append was removed in pandas 2.0, so build the frame once from the rows.
mydf = pd.DataFrame(rows)
It seems repetitive to write so many try/except statements this way. How would I write a function to manage this scenario, or is what I'm doing already the best practice?
CodePudding user response:
I don't see a reason why dict.get() wouldn't do the job:
data[key] = i.get(key, val)  # data[key] = i[key] if key is in the dictionary i, otherwise val
Example:
data["foo"] = i.get("foo", "")
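For the nested lookup in the question (i["foo"]["baz"]), the .get() calls can be chained, provided the intermediate default is an empty dict. This is only a minimal sketch: it assumes every element of my_array is a plain dict, and the sample records are made up for illustration.

```python
# Hypothetical sample records; the real my_array is not shown in the question.
my_array = [
    {"foo": {"baz": 1}},
    {"foo": {}},
    {},
]

rows = []
for i in my_array:
    data = {}
    data["foo"] = i.get("foo", "")
    # Fall back to an empty dict so the second .get() always has a dict to call.
    data["baz"] = i.get("foo", {}).get("baz", "")
    rows.append(data)
```

Note that this breaks with AttributeError if "foo" is present but holds something that is not a dict (a string or None, say), so it only fits data whose shape is predictable.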
CodePudding user response:
First of all: in a scenario like this, just use source.get("key", "").
But if you want to do more complicated assignments, it can be done as in the code snippet below.
You may wonder why a lambda is used. It wraps the source[key] lookup so that it is evaluated inside the function responsible for the safe assignment. We can't evaluate it directly in the for loop, because then we would have to catch any exceptions there. Using a lambda postpones evaluation until the moment the lambda is called.
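The difference between evaluating a lookup on the spot and deferring it can be seen in isolation. The sketch below is only an illustration; call_safely is a hypothetical helper name, not part of any library.

```python
source = {"a": 1}

# Eager: the lookup runs right here, so the caller has to catch the error.
try:
    value = source["missing"]
    eager_raised = False
except KeyError:
    eager_raised = True

# Deferred: the lambda only builds a function object; nothing is looked up yet.
lookup = lambda: source["missing"]


def call_safely(func, default=""):
    # Hypothetical helper: the lookup actually runs here, inside the try block.
    try:
        return func()
    except KeyError:
        return default


result = call_safely(lookup)
```

Here the eager lookup raises KeyError at the call site, while the deferred one raises only inside call_safely, where it is caught and replaced by the default.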
You may also wonder why destination_dict and destination_key are passed separately. Since they are used on the left side of the = sign, the assignment target has to be built this way :). We can't pass destination_dict[destination_key] to the function, because that expression would be evaluated outside the function and would raise KeyError.
from typing import Any, Callable


def assign_safely(
    destination_dict: dict,
    destination_key: str,
    source_func: Callable[[], Any],
    exceptions=(KeyError,),
):
    # Run the deferred lookup; fall back to "" if it raises one of `exceptions`.
    try:
        destination_dict[destination_key] = source_func()
    except exceptions:
        destination_dict[destination_key] = ""


source = {"a": 1, "b": {}}
destination = {}
for key in ["a", "b", "c"]:
    assign_safely(
        destination_dict=destination,
        destination_key=key,
        source_func=lambda: source[key],
    )
assert destination == {"a": 1, "b": {}, "c": ""}
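Applied to the loop from the question, the same helper might be used roughly as follows. This is a sketch under assumptions: the records are invented, and TypeError is added to the caught exceptions because indexing a non-dict value such as a string raises TypeError rather than KeyError.

```python
from typing import Any, Callable


def assign_safely(
    destination_dict: dict,
    destination_key: str,
    source_func: Callable[[], Any],
    exceptions=(KeyError,),
):
    # The helper from the answer, repeated so this snippet runs on its own.
    try:
        destination_dict[destination_key] = source_func()
    except exceptions:
        destination_dict[destination_key] = ""


# Hypothetical records standing in for my_array from the question.
my_array = [{"foo": {"baz": 1}}, {"foo": "not a dict"}, {}]

rows = []
for i in my_array:
    data = {}
    assign_safely(data, "foo", lambda: i["foo"])
    # "not a dict"["baz"] raises TypeError, so catch that as well as KeyError.
    assign_safely(
        data, "baz", lambda: i["foo"]["baz"], exceptions=(KeyError, TypeError)
    )
    rows.append(data)
```

The resulting list of dicts can then be passed to pd.DataFrame(rows) in one go instead of growing the frame row by row.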