How to convert few JSON attributes values to string-CodePudding

I have a dictionary returned by a response in AWS Lambdas which looks like below:

    data = {
        "id": 1,
        "name": "Hey",
        "category": null,
        "salary": 1222,
        "Roles": [
            {
                "attributes1": {
                "key": "sum_assured",
                "value": 192000
            },
                "attributes2": {
                    "key": "cash_value",
                    "value": 48576
                },
                "attributes3": {
                    "key": "expiry_date",
                    "value": null
                }
            }
        ]
    
    }

Now, I want to format a few numeric columns like salary to 1222.00 and convert other non-string columns to strings for "ID" I should get "1" as a string and not an Int. Here, null is the actual NaN returned by the response.

Below is the code which I tried:

new_dict = {}
for i, j in data.items():
    print(i)
    if (i == "salary"):
        new_dict[i] = "%0.2f" % j
    if isinstance(j, str):
        new_dict[i] = j
    elif j is None:
        new_dict[i] = ""
    else:
        new_dict[i] = str(j)

I know that I could maintain a list for such amount related columns and then I can do some kind of comparisons and replace it. When I test the value for salary inside the loop it gives me decimal places but in the response, it doesn't show up.

Output:

{
  "id": "1",
  "name": "Hey",
  "category": "",
  "salary": "1222",
  "Roles": "[{'attributes4': {'key': 'sum_assured', 'value': 192000}, 'attributes5': {'key': 'cash_value', 'value': 48576}, 'attributes6': {'key': 'expiry_date', 'value': 'null'}}]"
}

Expected:

{
  "id": "1",
  "name": "Hey",
  "category": "",
  "salary": "1222",
  "Roles": "[{"attributes4": {"key": "sum_assured", "value": "192000.00"}, "attributes5": {"key": "cash_value", "value": "48576.00"}, "attributes6": {"key": "expiry_date", "value": ""}}]"
}

Also, how could I format the records of the Roles array with some kind of indent in it? I wasn't able to do that.

Update 1:

I tried the below code for accessing the nested attributes as per my JSON but it gives me a key error:

for i, j in formatted_json.items():
    print(type(j))
    if isinstance(j, list):
        for k in j:
            for key, value in k.items():
                for l, m in value.items():
                    if (m == "salary" or m == 'value'):
                        new_dict[key][l] = "%0.2f" % m
                    elif isinstance(m, str):
                        new_dict[key][l] = m
                    elif m is None:
                        new_dict[key][l] = ""
                    else:
                        new_dict[key][l] = str(m)

    elif (i == "salary" or i == 'value'):
        new_dict[i] = "%0.2f" % j
    elif isinstance(j, str):
        new_dict[i] = j
    elif j is None:
        new_dict[i] = ""
    else:
        new_dict[i] = str(j)



Traceback (most recent call last):
  File "/home/.../Downloads/pyspark/src/sample/json_to_string.py", line 392, in <module>
    new_dict[key][l] = m
KeyError: 'attributes4'

Update 2:

The actual dataset is as below:

json_data = '''{
    "policy_no": "",
    "paid_to_date": "2016-09-28",
    "proposal_no": null,
    "agent_code": "1234",
    "policy_status": "Death",
    "policy_term": "15",
    "policy_premium_payment_term": "7",
    "policy_effective_date": "2013-09-28",
    "last_premium_paid": 13790.85,
    "last_premium_paid_date": "2016-06-17",
    "pivc_flag_status": null,
    "product_code": "",
    "product_name": "Monthly Income Plan",
    "application_date": "2013-09-16",
    "underwriting_decesion_date": "2013-09-26",
    "issuance_date": "2013-09-30",
    "ols_policy": "N",
    "next_premium_due_date": "2016-09-28",
    "policy_year": null,
    "premium_holiday_from_date": null,
    "premium_holiday_to_date": null,
    "product_type": "Non-Linked",
    "proposer_client_code": "",
    "life_insured_client_code": "",
    "ppt_completion_date": "2020-09-28",
    "ppt_left": 4,
    "auto_termination_date": null,
    "fund_value": null,
    "loan_amt": 0,
    "welcome_kit_current_status": null,
    "dispatch_date": "2013-10-25",
    "status_date": "2013-10-25",
    "updated_on": "2013-10-26 19:51:51.983",
    "surrender_value": null,
    "loan_eligiability": null,
    "courier_name": null,
    "awb_no": "",
    "status_id": 5,
    "claim_registration_date": "2021-12-15",
    "claim_approval_reject_date": "2021-12-15",
    "claim_requirement_document": "HEART ATTACK",
    "claim_requirement_status": "P",
    "requirement_raised_on": "2021-12-15",
    "requirement_closed_on": "2021-12-15",
    "claim_status": "P",
    "status_for_death_claims": "P",
    "date_of_transaction": "2021-12-15",
    "ecs_si_registration_date": null,
    "ecs_si_status": "inactive",
    "ecs_si_failure_reasons": null,
    "ecs_si_status_as_per_rls": null,
    "ecs_day": "0",
    "payment_mode": "S",
    "payment_mode_desc": "Semi Anually",
    "payment_method": "S",
    "payment_method_desc": "Cash/Cheque",
    "payment_date": "2021-12-15",
    "payment_amount": 13790.85,
    "premium_base_amt": null,
    "premium_tax_amt": null,
    "due_premium": null,
    "total_premium": 13790.85,
    "advance_premium": null,
    "assignee": [
        {
            "assignee_name": "",
            "attributes1": {
                "key": "assignee_client_id",
                "value": ""
            },
            "attributes2": {
                "key": "assignee_type",
                "value": ""
            }
        }
    ],
    "agents": [
        {
            "attributes1": {
                "key": "policy_no",
                "value": ""
            },
            "agent_name": "",
            "attributes2": {
                "key": "agent_code",
                "value": "1234"
            },
            "attributes3": {
                "key": "email",
                "value": ""
            },
            "attributes4": {
                "key": "channel",
                "value": "AGENCY"
            },
            "attributes5": {
                "key": "subchannel",
                "value": null
            },
            "attributes6": {
                "key": "branch_name",
                "value": ""
            },
            "attributes7": {
                "key": "contact_no",
                "value": ""
            },
            "attributes8": {
                "key": "status",
                "value": "T"
            }
        }
    ],
    "bank_accounts": {
        "attributes1": {
            "key": "accountnumber",
            "value": ""
        },
        "attributes2": {
            "key": "accountholdername",
            "value": ""
        },
        "attributes3": {
            "key": "ifsccode",
            "value": ""
        },
        "attributes4": {
            "key": "micrcode",
            "value": ""
        },
        "bankname": "",
        "attributes5": {
            "key": "branchname",
            "value": ""
        },
        "attributes6": {
            "key": "bank_address",
            "value": ""
        }
    },
    "Funds": [
        {
            "attributes1": {
                "key": "policy_no",
                "value": ""
            },
            "attributes2": {
                "key": "allocation_mode",
                "value": "percentage"
            },
            "attributes3": {
                "key": "allocation_value",
                "value": null
            },
            "attributes4": {
                "key": "fund_code",
                "value": null
            },
            "fund_name": null,
            "investment_strategy_name": null
        }
    ],
    "Investment_Strategies": [
        {
            "attributes1": {
                "key": "policy_no",
                "value": ""
            },
            "attributes2": {
                "key": "allocation_mode",
                "value": "percentage"
            },
            "attributes3": {
                "key": "allocation_value",
                "value": null
            },
            "attributes4": {
                "key": "fund_code",
                "value": null
            },
            "fund_name": null,
            "investment_strategy_name": null
        }
    ],
    "Roles": [
        {
            "attributes1": {
                "key": "policy_no",
                "value": ""
            },
            "attributes2": {
                "key": "beneficiary_percentage",
                "value": "100.00"
            },
            "attributes3": {
                "key": "relationship",
                "value": "SPOUSE"
            },
            "role_name": "Nominee",
            "attributes4": {
                "key": "age",
                "value": "NaN"
            },
            "attributes5": {
                "key": "party_name",
                "value": ""
            },
            "attributes6": {
                "key": "gender",
                "value": "F"
            },
            "attributes7": {
                "key": "date_of_birth",
                "value": "1953-03-20"
            },
            "attributes8": {
                "key": "email",
                "value": null
            },
            "attributes9": {
                "key": "registred_mobile",
                "value": null
            }
        },
        {
            "attributes1": {
                "key": "policy_no",
                "value": ""
            },
            "attributes2": {
                "key": "beneficiary_percentage",
                "value": null
            },
            "attributes3": {
                "key": "relationship",
                "value": null
            },
            "role_name": "Insured",
            "attributes4": {
                "key": "age",
                "value": 63
            },
            "attributes5": {
                "key": "party_name",
                "value": ""
            },
            "attributes6": {
                "key": "gender",
                "value": "M"
            },
            "attributes7": {
                "key": "date_of_birth",
                "value": "1950-06-20"
            },
            "attributes8": {
                "key": "email",
                "value": null
            },
            "attributes9": {
                "key": "registred_mobile",
                "value": null
            }
        },
        {
            "attributes1": {
                "key": "policy_no",
                "value": ""
            },
            "attributes2": {
                "key": "beneficiary_percentage",
                "value": null
            },
            "attributes3": {
                "key": "relationship",
                "value": null
            },
            "role_name": "Owner",
            "attributes4": {
                "key": "age",
                "value": 63
            },
            "attributes5": {
                "key": "party_name",
                "value": ""
            },
            "attributes6": {
                "key": "gender",
                "value": "M"
            },
            "attributes7": {
                "key": "date_of_birth",
                "value": "1950-06-20"
            },
            "attributes8": {
                "key": "email",
                "value": ""
            },
            "attributes9": {
                "key": "registred_mobile",
                "value": ""
            }
        }
    ],
    "Benefits": [
        {
            "attributes1": {
                "key": "policy_no",
                "value": ""
            },
            "benefit_name": "Miii",
            "attributes2": {
                "key": "benefit_term",
                "value": "7"
            },
            "attributes3": {
                "key": "modal_premium",
                "value": null
            },
            "attributes4": {
                "key": "sum_assured",
                "value": 192000
            },
            "attributes5": {
                "key": "cash_value",
                "value": 48576
            },
            "attributes6": {
                "key": "expiry_date",
                "value": null
            },
            "attributes7": {
                "key": "status",
                "value": "Death"
            },
            "attributes8": {
                "key": "benefit_code",
                "value": ""
            },
            "attributes9": {
                "key": "benefit_type",
                "value": "base"
            }
        }
    ]
}
'''

CodePudding user response：

When you have a strong model in your data, and you want to process/validate the data you receive, its better to be done in a object oriented fashion. If you can consider your data value as an object with all logic written inside for validating or correcting, it becomes simple.

See an example approach below:

(Assumed your null as None)

import json

# Your data object..
data = '''{
        "id": 1,
        "name": "Hey",
        "category": null,
        "salary": 1222,
        "Roles": [
            {
                "attributes1": {
                "key": "sum_assured",
                "value": 192000
            },
                "attributes2": {
                    "key": "cash_value",
                    "value": 48576
                },
                "attributes3": {
                    "key": "expiry_date",
                    "value": null
                }
            }
        ]
    
    }'''


# Create  simple object models to represent the data.
class Role:
    def __init__(self, key, value):
        self.key = key
        self.value = value or ""  # fallback to empty string


class Policy:
    def __init__(self, id, name, category, salary, Roles):
        # write all your pre/post processing here
        self.id = id
        self.name = name
        self.category = category or ""
        self.salary = format(salary, '0.2f') if salary else ""
        self.Roles = [Role(**role) for role in Roles[0].values()] if Roles else []


# a json serialization helper function
def serial_default(obj):
    if isinstance(obj, (Policy, Role)):
        return obj.__dict__
    else:
        return obj


# You just create the object with your data. Now all data is in the object.
data = json.loads(data)
p = Policy(**data)

# Serialize the object to json in the proper format
print(json.dumps(p, default=serial_default))

The print generates a valid version of your object as JSON like

{"id": 1, "name": "Hey", "category": "", "salary": "1222.00", "Roles": [{"key": "sum_assured", "value": 192000}, {"key": "cash_value", "value": 48576}, {"key": "expiry_date", "value": ""}]}

This is just an example to the approach. You can modify or improvise a similar approach and make things simpler.

CodePudding user response：

Firstly i have to agree with Kris answer, this kind of conversion looks now a bit ugly and with every added special case Object-Oriented-style makes more sense.

A few points:

null, None, ""

It seems that null won't be seen as None, so your not getting into the if clause. Check JSON to Dict for that

How can JSON data with null value be converted to a dictionary

Salary

According to your question you would expect "salary": "1222.00" correct? If so you are currently overwriting your previously formatted number with

if isinstance(j, str):
   new_dict[i] = j

Just change it to elif so you wont overwrite your conversion.

Roles

The objects within the array of Roles are currently not iterated over. Hence the number are not being formatted. For that you would need another loop on the Roles array.

Following a recursion attempt to cover multiple nested collections.

import json
from collections import abc

dataRaw = '''
{
    "id": 1,
    "name": "Hey",
    "category": null,
    "salary": 1222,
    "Roles": [
        {
            "attributes1": {
            "key": "sum_assured",
            "value": 192000
        },
            "attributes2": {
                "key": "cash_value",
                "value": 48576
            },
            "attributes3": {
                "key": "expiry_date",
                "value": null
            }
        }
    ]

}
'''

data = json.loads(dataRaw)

def processNested(nested):
    if (isinstance(nested, list)):
        for index in range(len(nested)):
            nested[index] = processNested(nested[index]) 
    else:
        for key, value in nested.items(): 
            if isinstance(value, abc.Mapping):
                if "key" in value and "value" in value: 
                    if (value["key"] == "cash_value"):
                        nested[key]["value"] = float(value["value"])
                    else:
                        nested[key] = processNested(value)
            else:
                if (key == "value" and value != None):
                    nested[key] = "%0.2f" % value
                
    return nested

new_dict = data
for i, j in new_dict.items():
    if (i == "salary"):
        new_dict[i] = "%0.2f" % j
    elif isinstance(j, str):
        new_dict[i] = j
    elif j is None:
        new_dict[i] = ""
    elif i == "Roles":
        new_dict[i] = processNested(j)
    else:
        new_dict[i] = str(j)

print(new_dict)