Dynamically update pandas NaN based on field type in django model

Time:04-14

I need to save data from a CSV file to Django models. The data comes from an external API, so I have no control over its structure. In my schema, I made all fields nullable.

This is my script:

        text = f"{path}/report.csv"
        df = pd.read_csv(text)
        row_iter = df.iterrows()

        for index, row in row_iter:
            # Rename CSV headers such as "amazon-order-id" to model field names.
            rows = {key.replace("-", "_"): value for key, value in row.items()}
            # print(f"rows {rows}")
            # default_values = {
            #     "amazon-order-id",merchant-order-id,purchase-date,last-updated-date,order-status,fulfillment-channel,sales-channel,order-channel,ship-service-level,product-name,sku,asin,item-status,quantity,currency,item-price,item-tax,shipping-price,shipping-tax,gift-wrap-price,gift-wrap-tax,item-promotion-discount,ship-promotion-discount,ship-city,ship-state,ship-postal-code,ship-country,promotion-ids,is-business-order,purchase-order-number,price-designation,is-iba,order-invoice-type
            # }

            sb, created = Order.objects.update_or_create(
                sellingpartner_customer=c,
                amazon_order_id=rows["amazon_order_id"],
                sku=rows["sku"],
                asin=rows["asin"],
                defaults={**rows},
            )

However, some of the CSV fields are empty, and pandas replaces them with NaN. This is where Django raises an error:

   django.db.utils.OperationalError: (1054, "Unknown column 'NaN' in 'field list'")

I tried replacing the empty values with an empty string (""):

   df.fillna("", inplace=True)

But then Django raises an error for fields that are not CharFields:

   django.core.exceptions.ValidationError: ['“” value must be a decimal number.']

My question is: how do I handle empty values from the CSV file in pandas according to the Django field type? For example, an empty value should become False for boolean fields, 0 for decimal fields, and an empty string for CharFields.

CodePudding user response:

Nullable Django model fields won't accept np.nan or other pandas not-a-number objects. They expect None, as in stock Python. Before you save the values to Django, replace any NaN values with None to avoid the error.
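
A minimal sketch of that replacement, applied to each row before it is passed to update_or_create. The helper names (is_missing, clean_row) and the FALLBACKS mapping are hypothetical and not part of pandas or Django. The sketch assumes missing cells arrive as float NaN, which is what pd.read_csv produces by default. Fields listed in FALLBACKS get a type-appropriate value picked by hand; all others become None.

```python
import math
from decimal import Decimal


def is_missing(value):
    """Return True for None or a float NaN (how empty CSV cells show up)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def clean_row(row, fallbacks=None):
    """Rename keys from dashes to underscores and replace missing values.

    `fallbacks` is a hypothetical mapping of field name -> replacement value.
    Fields not listed there get None, which nullable Django fields accept.
    """
    fallbacks = fallbacks or {}
    cleaned = {}
    for key, value in row.items():
        name = key.replace("-", "_")
        cleaned[name] = fallbacks.get(name) if is_missing(value) else value
    return cleaned


# Hypothetical per-field fallbacks, chosen by hand to match each field's type.
FALLBACKS = {
    "item_price": Decimal("0"),   # decimal field
    "is_business_order": False,   # boolean field
    "ship_city": "",              # char field
}

# A plain dict stands in for one CSV row here; a pandas Series also has .items().
row = {
    "amazon-order-id": "111-222",
    "item-price": float("nan"),
    "is-business-order": float("nan"),
    "ship-city": float("nan"),
    "item-tax": float("nan"),     # no fallback listed, so this becomes None
}
cleaned = clean_row(row, FALLBACKS)
```

In the loop from the question, this would stand in for the dict comprehension, for example rows = clean_row(row, FALLBACKS). If None is acceptable for every field, the fallbacks argument can be left out.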
