TypeError: argument of type 'float' is not iterable (Python)

Time:11-08

I would like to correct values in a df column based on results from another one. This first line gives me the correct version of km based on another df named 'correction'.

df['km_correct'] = df['task_object'].map(correction.set_index('task_object')['km_correct'])

Then I want to replace the current value of "km" with the corrected one only if the year is 2022 and the contact is 'A', 'B' or 'C'. So I'm using the following function, called correction_km:

def correction_km(row):
    if '2022' in row["year"] and ("A" in row["Contacts"] or "B" in row["Contacts"] or "C" in row["Contacts"]):
        return row['km_correct']
    else:
        return row['km']

However, when I try to apply the function to my df to update the km column, like this:

df['km'] = df.apply(correction_km, axis=1)

I'm getting the error message:

TypeError: argument of type 'float' is not iterable

Can anyone help? Thank you!

CodePudding user response:

Are you applying correction_km to the whole dataset?

df['km'] = df.apply(correction_km, axis=1)

Or do you only need to apply it to the km series, like this?

df['km'] = df['km'].apply(correction_km, axis=1)

Sorry, I have run into this type of error before as well. I don't have the dataset at hand, so I am only guessing that this might be the cause of the error.

CodePudding user response:

Cause of the error

tl;dr: '2022' in 2022.0 raises that exact error.

Your usage of 'in' is strange, and I suspect that, combined with another problem, it is what makes it fail.

A in B means that value A is found in the iterable B.

We don't know how your dataframe is built, but I am pretty sure that neither df['year'] nor df['Contacts'] is a column of arrays of things.

Now, '2022' in row['year'] would still be valid if row['year'] were a string (as we can expect from your usage), in which case it would mean "'2022' is a substring of row.year". For example, that would be true if row.year were '320221'.

Likewise, "A" in row.Contacts means that row.Contacts (if it is a string) contains the letter A, maybe among many other letters.

So, on its own, that strange usage of in instead of == cannot be the source of your error. It would just cause unwanted behaviour, not raise an error.

But if, in addition, row['year'] contains not strings but numbers (floats), then you get exactly the error you got.

See

>>> '2022' in 2022.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument of type 'float' is not iterable
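To make the difference concrete, here is a minimal sketch in plain Python, no dataframe needed. The helper contains_2022 is a hypothetical name used only for this illustration. It also shows that NaN, which is how pandas usually represents a missing value, is a float too, so a missing year or contact could trigger the same error; whether your data actually has missing values is an assumption, since the question does not show it.

```python
import math

# On strings, `in` is a substring test, not an equality test.
substring_hit = "2022" in "320221"   # "2022" occurs inside "320221"
exact_match = "320221" == "2022"     # the two strings are not equal

def contains_2022(value):
    """Hypothetical helper: evaluate `'2022' in value`, or return the error text."""
    try:
        return "2022" in value
    except TypeError as exc:
        # A float is not a container, so `in` cannot search it.
        return str(exc)

float_result = contains_2022(2022.0)
nan_result = contains_2022(math.nan)  # NaN is a float as well

print(substring_hit, exact_match)
print(float_result)
print(nan_result)
```

The exact wording of the TypeError message can differ between Python versions, so the sketch prints it rather than assuming it.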

Correction

So, solution

  1. Ensure that the type of the 'year' column is set correctly. It should not be float (comparing floats with exact == is rarely a good idea; you want a discrete type to hold a discrete value). The best type, I think, is int, but you may prefer string if that column can also hold special non-numeric values (such as "year III of French Republic"). In any case, not float.

  2. Do not use in as a strange alias of ==, unless you really want to check whether "2022" is a substring of the string row['year']. Just say if row['year'] == 2022.

  3. Same for Contacts. If, as I surmise, it is just a string whose value may be "A", "B", "C", or something else, either replace in with == there too or, this time, use in as it is intended and say if row['Contacts'] in ['A','B','C'].
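Put together, the three points could look like the sketch below. The sample data is made up, since the real df is not shown. It assumes that year arrives as float because of missing values, and therefore converts it to the pandas nullable integer dtype "Int64" rather than plain int, which cannot hold a missing value.

```python
import pandas as pd

# Hypothetical sample data; the real df is not shown in the question.
df = pd.DataFrame({
    "year": [2022.0, 2022.0, 2021.0, None],
    "Contacts": ["A", "D", "B", "C"],
    "km": [10.0, 20.0, 30.0, 40.0],
    "km_correct": [11.0, 21.0, 31.0, 41.0],
})

# Point 1: store the year as a nullable integer instead of a float.
df["year"] = df["year"].astype("Int64")

def correction_km(row):
    # Points 2 and 3: compare with ==, and use `in` against a list of allowed values.
    # pd.notna skips rows whose year is missing before the comparison is made.
    if pd.notna(row["year"]) and row["year"] == 2022 and row["Contacts"] in ["A", "B", "C"]:
        return row["km_correct"]
    return row["km"]

df["km"] = df.apply(correction_km, axis=1)
print(df["km"].tolist())
```

Only the first row is both from 2022 and has contact A, B or C, so only that row takes its km_correct value.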

Appendix: other method

Your usage of dataframes in general, with apply, is highly inefficient. When using dataframes, the one rule to keep in mind is "If I am using loops or apply, I am probably doing something wrong". Not that there can't be a valid reason to use them (they exist because sometimes you need them), but you should be sure that no other solution is possible before using them.

In your case

df.loc[df.Contacts.isin(['A','B','C']) & (df.year==2022), 'km'] = df.km_correct

is a far more efficient way to do what you want to do (once you have corrected the type of df.year to be int).
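Here is the same idea as a runnable sketch on made-up data, with the mask built in a separate step so it can be inspected. The extra notna() condition is my own addition, not something the question asks for: map() leaves NaN wherever a task_object has no match in correction, and without the guard those rows would have km overwritten with NaN.

```python
import pandas as pd

# Hypothetical sample data; the last row has no corrected value available.
df = pd.DataFrame({
    "year": [2022, 2022, 2021, 2022],
    "Contacts": ["A", "D", "B", "C"],
    "km": [10.0, 20.0, 30.0, 40.0],
    "km_correct": [11.0, 21.0, 31.0, None],
})

# Rows to correct: contact is A, B or C, and the year is 2022.
mask = df["Contacts"].isin(["A", "B", "C"]) & (df["year"] == 2022)

# Assumed safeguard: only overwrite where a corrected value actually exists.
mask &= df["km_correct"].notna()

# Assign the corrected values to the selected rows in a single vectorized step.
df.loc[mask, "km"] = df.loc[mask, "km_correct"]
print(df["km"].tolist())
```

If you would rather have missing corrections show up as NaN in km, drop the notna() line.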
