I have a dictionary:
matches = {282: 285,
266: 277,
276: 293,
263: 264,
286: 280,
356: 1371,
373: 262,
314: 327,
294: 290,
285: 282,
277: 266,
293: 276,
264: 263,
280: 286,
1371: 356,
262: 373,
327: 314,
290: 294}
And a df, like so:
team_id
0 327
1 293
2 373
3 282
4 314
5 263
6 280
7 354
8 264
9 294
10 1371
11 262
12 266
13 356
14 290
15 285
16 286
17 275
18 277
19 276
Now I'm trying to create an 'adversary_id' column, mapped from the dict, like so:
df['adversary_id'] = df['team_id'].map(matches)
But this new column adversary_id is being converted to type float, and two rows are ending up with NaN.
Why, if all the data is of type int?
How do I fix this, that is, how do I avoid generating NaNs and map one column onto the other without errors?
CodePudding user response:
This is because two of your team_id values (354 and 275) do not appear as keys in matches, so map produces NaN for them, and np.nan / NaN values (they are not exactly the same thing) are of type float. This cast is a limitation that unfortunately can't be avoided as long as a plain integer column contains NaN values.
You can read more in pandas' documentation here.
Because NaN is a float, a column of integers with even one missing values is cast to floating-point dtype (see Support for integer NA for more). pandas provides a nullable integer array, which can be used by explicitly requesting the dtype:
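To illustrate the quoted behavior, a minimal sketch: a regular Series with a missing value silently falls back to float64, while explicitly requesting the nullable Int64 dtype keeps integers.

```python
import pandas as pd

# a regular Series with a missing value silently becomes float64
s_float = pd.Series([1, 2, None])

# explicitly requesting the nullable Int64 dtype keeps integers,
# with the missing entry stored as pd.NA instead of float NaN
s_int = pd.Series([1, 2, None], dtype='Int64')
```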
The proposed solution is to force the type of the mapped column to pandas' nullable Int64 dtype:
df['adversary_id'] = df['team_id'].map(matches).astype('Int64')
Returning:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20 entries, 0 to 19
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   team_id       20 non-null     int64
 1   adversary_id  18 non-null     Int64
dtypes: Int64(1), int64(1)
The two unmatched rows (team_ids 354 and 275) still show <NA>, but the column stays integer-typed.
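Putting it together with your data, a runnable sketch of the fix:

```python
import pandas as pd

matches = {282: 285, 266: 277, 276: 293, 263: 264, 286: 280,
           356: 1371, 373: 262, 314: 327, 294: 290, 285: 282,
           277: 266, 293: 276, 264: 263, 280: 286, 1371: 356,
           262: 373, 327: 314, 290: 294}

df = pd.DataFrame({'team_id': [327, 293, 373, 282, 314, 263, 280, 354,
                               264, 294, 1371, 262, 266, 356, 290, 285,
                               286, 275, 277, 276]})

# map, then convert the result to the nullable Int64 dtype:
# the missing keys (354 and 275 here) become pd.NA instead of float NaN
df['adversary_id'] = df['team_id'].map(matches).astype('Int64')
```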
CodePudding user response:
If I understand correctly, you could use python's built-in int() function on the values, but only after the NaN rows are removed or filled, since int() raises an error on NaN.
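Another way to avoid the problem entirely is to drop the unmatched rows before mapping, so no NaN is ever produced and the column stays plain int64. A sketch, assuming rows without a match can be discarded (the small matches dict and df2 name here are just for illustration):

```python
import pandas as pd

matches = {282: 285, 285: 282, 327: 314, 314: 327}
df = pd.DataFrame({'team_id': [282, 285, 354]})  # 354 has no match

# keep only the rows whose team_id is a key of the dict, then map:
# since no NaN is ever produced, the result stays plain int64
df2 = df[df['team_id'].isin(matches.keys())].copy()
df2['adversary_id'] = df2['team_id'].map(matches)
```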