How to resolve "TypeError: int() argument must be a string, a bytes-like object or a number, no

Time:07-28

So, I'm wanting to do some visualization on EPA environmental media sampling data for PFAS. I'm using pandas and matplotlib for this. I've got the following code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import csv

pd.set_option('display.max_columns', 500)

inputpath="CHI"

col_for_analysis=["Environmental Media Name", "Year", "Result Measure Value (ppt)"]

dataset=pd.read_csv(inputpath,sep=',', dtype={'a': str}, usecols= col_for_analysis, low_memory=False)


dataset.sort_values(by=["Year"], ascending=True, inplace=True)
print(dataset)
dataset["Result Measure Value (ppt)"] = dataset["Result Measure Value (ppt)"].fillna(0, inplace=True)
dataset["Result Measure Value (ppt)"] = dataset["Result Measure Value (ppt)"].astype(int)

The end goal here, at least for now, is to sort everything by Year and then plot the "Year" column on the x-axis and the "Result Measure Value (ppt)" column on the y-axis. When I tried it initially, I was getting error messages indicating that the "Result Measure Value (ppt)" column contained NoneType values, so matplotlib couldn't plot it. No big deal, I think to myself, I'll just use dataset["Result Measure Value (ppt)"] = dataset["Result Measure Value (ppt)"].fillna(0, inplace=True) to remove those NoneType values and replace them with a nice, hopefully plottable 0.

That seemed to work. So I went on to try to change all the values in that column to int values, so they could all be plotted by matplotlib. I tried to do this by adding the line:

dataset["Result Measure Value (ppt)"] = dataset["Result Measure Value (ppt)"].astype(int)

That line of code throws the following, rather lengthy error message:

Traceback (most recent call last):
  File "main.py", line 18, in <module>
    dataset["Result Measure Value (ppt)"] = dataset["Result Measure Value (ppt)"].astype(int)
  File "/home/runner/Fun-Public-Health-Project/venv/lib/python3.8/site-packages/pandas/core/generic.py", line 5912, in astype
    new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
  File "/home/runner/Fun-Public-Health-Project/venv/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 419, in astype
    return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
  File "/home/runner/Fun-Public-Health-Project/venv/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 304, in apply
    applied = getattr(b, f)(**kwargs)
  File "/home/runner/Fun-Public-Health-Project/venv/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 580, in astype
    new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
  File "/home/runner/Fun-Public-Health-Project/venv/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1292, in astype_array_safe
    new_values = astype_array(values, dtype, copy=copy)
  File "/home/runner/Fun-Public-Health-Project/venv/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1237, in astype_array
    values = astype_nansafe(values, dtype, copy=copy)
  File "/home/runner/Fun-Public-Health-Project/venv/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1154, in astype_nansafe
    return lib.astype_intsafe(arr, dtype)
  File "pandas/_libs/lib.pyx", line 668, in pandas._libs.lib.astype_intsafe
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

Now, I thought that the line

dataset["Result Measure Value (ppt)"] = dataset["Result Measure Value (ppt)"].fillna(0, inplace=True)

would get rid of all NoneType values in the "Result Measure Value (ppt)" column by filling in any NoneType values with a 0. Am I wrong in thinking this? If so, how do I rid the column of NoneType values or otherwise get all the values in that column converted into something that I can work with to plot along with Year? Otherwise, how can I fix the code so that all values in this column can be converted to int and then plotted? Thanks!

CodePudding user response:

You should either fill the column in place:

dataset["Result Measure Value (ppt)"].fillna(0, inplace=True)

Or assign the result without the inplace argument:

dataset["Result Measure Value (ppt)"] = dataset["Result Measure Value (ppt)"].fillna(0)

But not both at the same time: with inplace=True, fillna returns None, so the assignment replaces the whole column with None, and that is what astype(int) then fails on.
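To see the difference, here is a minimal sketch. The small DataFrame is a made-up stand-in for the EPA file (the values are invented for illustration), and it assumes a pandas version like the one in the traceback, where fillna with inplace=True returns None. It uses the assignment form, which is probably the safer of the two on newer pandas versions, since in-place calls on a selected column may not propagate back to the DataFrame there.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the EPA data; the values are invented.
dataset = pd.DataFrame({
    "Year": [2019, 2020, 2021],
    "Result Measure Value (ppt)": [4.0, np.nan, 12.0],
})
col = "Result Measure Value (ppt)"

# What the original line assigned back to the column: the return value
# of an in-place call. On the pandas version in the traceback this is
# None; either way, it should not be assigned to the column.
scratch = dataset[col].copy()
returned = scratch.fillna(0, inplace=True)
print(returned)

# The fix: drop inplace, assign the returned Series, then cast.
dataset[col] = dataset[col].fillna(0).astype(int)
print(dataset[col].tolist())
```

Once the column holds real integers, dataset.plot(x="Year", y="Result Measure Value (ppt)") or the matplotlib equivalent should accept it.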
