How to check & supply missing data in a dict?

Time:08-17

I am pulling data from an API and outputting it in Influx line format. However, sometimes data is missing: rather than supplying a null value, the API omits that field entirely. That causes this code to die.

    for serial in SERIALS[site]:
        r = requests.get(f"{BASE_API_URL}/equipment/{site}/{serial}/data", {
            'startTime': format_datetime_url(startTime),
            'endTime': format_datetime_url(endTime),
            'api_key': SETTING_API_KEY
        },
            timeout=REQUEST_TIMEOUT)
            
        # Parse request
        for value in r.json()['data']['telemetries']:
            date = value['date']
            print(
                f'data,site={site},sn={serial} I_Temp={value["temperature"]},I_AC_Energy_WH={value["totalEnergy"]},I_AC_Power={value["totalActivePower"]} {to_unix_timestamp(date)}',
                flush=False)
    return True

This code dies with a KeyError if any of the fields in value are missing. I can add in something like this just before the print statement:

            if 'temperature' not in value:
                value["temperature"] = ""
            if 'totalEnergy' not in value:
                value["totalEnergy"] = ""
            if 'totalActivePower' not in value:
                value["totalActivePower"] = ""

but is there a more elegant way?

CodePudding user response:

You want stable output that always contains the following metrics, falling back to a default value such as the empty string '' when one is missing from the retrieved JSON telemetries:

  1. temperature
  2. totalEnergy
  3. totalActivePower
  4. date

There are two simple ways to achieve this. Choose whichever you find more elegant or readable.

Default dict merged with present values

Create a data_points dict holding the defaults, then use the dict.update(other) method to overwrite them with whatever keys and values are present in the retrieved telemetries:

# Parse request and print data points with defaults
for telemetries in r.json()['data']['telemetries']:
    # default data points to be shown if not present in retrieved telemetries
    data_points = {'temperature': 'N/A', 'totalEnergy': 'N/A', 'totalActivePower': 'N/A', 'date': 'N/A'}
    # overwrite defaults with retrieved
    data_points.update(telemetries)

    text = f"data,site={site},sn={serial} I_Temp={data_points['temperature']},I_AC_Energy_WH={data_points['totalEnergy']},I_AC_Power={data_points['totalActivePower']} {to_unix_timestamp(data_points['date'])}"
    print(text, flush=False)

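Note: I used 'N/A' instead of an empty string as the default value, to signal that the telemetry data point is not available. A defaulted 'date' is still passed to to_unix_timestamp, which probably cannot parse 'N/A', so a missing date may need separate handling.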

See How do I merge dictionaries together in Python?
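The same merge can also be done without mutating anything, which lets the defaults live in one place outside the loop. A minimal sketch, assuming the defaults are shared by every record; the names DEFAULTS and with_defaults and the sample telemetry are invented for this illustration:

```python
# Hypothetical defaults for the fields the output line needs.
DEFAULTS = {"temperature": "N/A", "totalEnergy": "N/A", "totalActivePower": "N/A"}


def with_defaults(telemetry: dict) -> dict:
    """Return a new dict: defaults first, then whatever the API actually sent."""
    # Later keys win, so values present in `telemetry` override the defaults.
    return {**DEFAULTS, **telemetry}


# Invented sample record: totalEnergy and totalActivePower are missing.
sample = {"temperature": 41.5, "date": "2022-08-17 10:00:00"}
merged = with_defaults(sample)
print(merged)
```

Neither DEFAULTS nor the record from the API is modified; each call builds a fresh dict.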

Recipe to prevent key-error and return default

As suggested in Daniel Hao's comment:

The dict method dict.get(key, default) supplies a default value if the key is missing. From the Python documentation:

Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.

You can use it directly in your f-string:

print(f'data,site={site},sn={serial} I_Temp={value.get("temperature","")},I_AC_Energy_WH={value.get("totalEnergy","")},I_AC_Power={value.get("totalActivePower","")} {to_unix_timestamp(date)}', flush=False)

This hides the defaulting behavior inside the f-string, which can make the code and its intent harder to read.
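One way to keep dict.get while making the defaulting easier to see is to pull the values into named variables before formatting. This is only a sketch: format_line is a hypothetical helper, the sample values are invented, and the timestamp is passed in already converted because to_unix_timestamp is not shown in the question.

```python
def format_line(site: str, serial: str, value: dict, timestamp: int) -> str:
    """Build one output line, substituting "" for any missing field."""
    temperature = value.get("temperature", "")
    energy = value.get("totalEnergy", "")
    power = value.get("totalActivePower", "")
    return (
        f"data,site={site},sn={serial} "
        f"I_Temp={temperature},I_AC_Energy_WH={energy},I_AC_Power={power} "
        f"{timestamp}"
    )


# Invented sample: totalEnergy and totalActivePower are missing.
line = format_line("siteA", "SN1", {"temperature": 41.5}, 1660730400)
print(line)
```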

CodePudding user response:

You could use a dataclass with default values for its fields, give it a method that renders the Influx line format, create instances from your dict objects, and then just call that method. This is only an example that doesn't use all the fields you showed, but it should work for demo purposes:

from dataclasses import dataclass

@dataclass
class TelemetryPoint:
    temperature: str = ""
    totalEnergy: str = ""
    totalActivePower: str = ""

    def influx_line_format(self) -> str:
        parts = [
            "data",
            f"I_Temp={self.temperature}",
            f"I_AC_Energy_WH={self.totalEnergy}",
            f"I_AC_Power={self.totalActivePower}"
        ]
   
        return ",".join(parts)

# sample data is missing totalEnergy and totalActivePower
sample_data = { "temperature": "192" }
data_point = TelemetryPoint(**sample_data)
print(data_point.influx_line_format(), flush=False)

And the output is...

data,I_Temp=192,I_AC_Energy_WH=,I_AC_Power=
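One caveat: TelemetryPoint(**sample_data) raises a TypeError if the dict contains a key the dataclass does not declare, such as "date". A possible workaround, sketched here with a hypothetical from_dict classmethod and an invented sample record, is to filter the dict against the declared fields first:

```python
from dataclasses import dataclass, fields


@dataclass
class TelemetryPoint:
    temperature: str = ""
    totalEnergy: str = ""
    totalActivePower: str = ""

    @classmethod
    def from_dict(cls, raw: dict) -> "TelemetryPoint":
        """Build a point from an API dict, ignoring keys the class does not declare."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})


# Invented sample with an extra "date" key that TelemetryPoint(**raw) would reject.
raw = {"temperature": "192", "date": "2022-08-17 10:00:00"}
point = TelemetryPoint.from_dict(raw)
print(point)
```

Missing fields still fall back to the dataclass defaults, and unexpected ones are silently dropped.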

You could also pre-process the dictionaries you get back so that any missing values are populated.

def hydrate_datapoint(data_point) -> None:
    expected_keys = [ "totalEnergy", "temperature", "totalActivePower" ]
    for k in expected_keys:
        if k not in data_point:
            data_point[k] = ""

sample_data = {"temperature": "192"}
hydrate_datapoint(sample_data)
print(sample_data)

This populates any missing fields:

{'temperature': '192', 'totalEnergy': '', 'totalActivePower': ''}
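The check-then-assign loop can be shortened with dict.setdefault, which only writes a key when it is absent. A minimal sketch of the same idea; the EXPECTED_KEYS constant and the default parameter are additions made for this illustration:

```python
EXPECTED_KEYS = ("totalEnergy", "temperature", "totalActivePower")


def hydrate_datapoint(data_point: dict, default: str = "") -> None:
    """Fill in any expected key that is missing; existing values are kept."""
    for key in EXPECTED_KEYS:
        data_point.setdefault(key, default)


sample_data = {"temperature": "192"}
hydrate_datapoint(sample_data)
print(sample_data)
```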