Why does my pandas list contain duplicates?

Time:12-21

I have a function that takes api_response and checks whether a condition is met: if "meta" not in api_response:. If it is, I extract the percent_complete value from the response and print it to the console. This value is a percentage and appears only once in api_response.

My issue is that when it prints to the console, the result (which should contain a single value, e.g. 0.19) shows the value twice.

E.g., if percent_complete == 0.19, the console will print Your data requested, associated with ID: 2219040 is (0.19, 0.19) complete!.

Is there anything wrong with my code that might be causing this?

Function -

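import datetime as dt
import json
import time

import requests
from requests.auth import HTTPBasicAuth

def api_call():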

#   Calling function that returns API authentication details for use in endpoint_initializer()
    key, secret, url = ini_reader()
#   Calling function that makes initial API POST call and returns endpoint_url to call, until data is returned.  
    endpoint_url = endpoint_initializer()
    
#   saving current date in a variable, for use when printing user message
    date = dt.datetime.today().strftime("%Y-%m-%d")
#   Printing endpoint_url and current date.
    print("-------------------------------------\n","API URL constructed for:", date, "\n-------------------------------------")
    print("-------------------------------------------------------------\n","Endpoint:", endpoint_url, "\n-------------------------------------------------------------")

#   Loop will continuously call the endpoint URL until data is returned. When data is not returned, the `percent_complete` key value is extracted from the API response.
#   This will inform the user of the status of data aggregation.
    while True:
        response = requests.get(url = endpoint_url, auth = HTTPBasicAuth(key, secret), headers = {"Vendor-firm": "343"})
        api_response = json.loads(response.text)
#       Test condition to see if "meta" is in api_response. Meta only in response, when data is ready.
        if "meta" not in api_response:
            id_value = "id"
            res1 = [val[id_value] for key, val in api_response.items() if id_value in val]
            id_value = "".join(res1)
            percent_value = "percent_complete"
            res2 = (tuple(api_response["data"]["attributes"].get("percent_complete", '') for key, val in api_response.items()))
            print(f' Your data requested, associated with ID: {id_value} is {res2} complete!')
            time.sleep(5)
#       Condition to allow API response to be returned, if condition is not met.
        elif "meta" in api_response:
            return api_response

Example API response -

{
    "data": {
        "id": "2219040",
        "type": "jobs",
        "attributes": {
            "job_type": "PORTFOLIO_VIEW_RESULTS",
            "started_at": "2021-12-18T17:40:17Z",
            "parameters": {
                "end_date": "2021-12-14",
                "output_type": "json",
                "view_id": 304078,
                "portfolio_id": 1,
                "portfolio_type": "firm",
                "start_date": "2021-12-14"
            },
            "percent_complete": 0.19,
            "status": "In Progress"
        },
        "relationships": {
            "creator": {
                "links": {
                    "self": "/v1/jobs/2219040/relationships/creator",
                    "related": "/v1/jobs/2219040/creator"
                },
                "data": {
                    "type": "users",
                    "id": "731221"
                }
            }
        },
        "links": {
            "self": "/v1/jobs/2219040"
        }
    },
    "included": []
}

Answer:

The dictionary in your response contains two top-level items ('data' and 'included'). The generator expression that creates res2 iterates over all of those items, even though it never uses key or val:

            res2 = (tuple(api_response["data"]["attributes"].get("percent_complete", '') for key, val in api_response.items()))

so the same lookup runs once per item and you get the value twice. Since you only read from the 'data' key, there is no need to iterate over the items at all. Just do:

            res2 = api_response["data"]["attributes"].get("percent_complete", '') 
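
To see the effect in isolation, here is a minimal sketch that uses a trimmed copy of the example response from the question. The names duplicated, job, job_id, percent and message are illustrative and not part of the original code; the id is read directly from api_response["data"]["id"], on the assumption that the response always has the shape shown in the question.

```python
# Trimmed copy of the example API response from the question.
api_response = {
    "data": {
        "id": "2219040",
        "attributes": {"percent_complete": 0.19, "status": "In Progress"},
    },
    "included": [],
}

# Original expression: key and val are never used, so the same lookup is
# evaluated once per top-level item ("data" and "included").
duplicated = tuple(
    api_response["data"]["attributes"].get("percent_complete", "")
    for key, val in api_response.items()
)
print(duplicated)  # (0.19, 0.19)

# Direct lookup: no iteration, so the value appears exactly once.
job = api_response["data"]
job_id = job["id"]
percent = job["attributes"].get("percent_complete", "")

message = f"Your data requested, associated with ID: {job_id} is {percent} complete!"
print(message)  # Your data requested, associated with ID: 2219040 is 0.19 complete!
```

If the response ever gained a third top-level key, the original expression would produce three copies, which is another sign that the loop is not doing anything useful.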