How to convert np.int32 into pd object?

Time:01-11

My dataset has a column named "size" whose dtype is object according to df.info(), but when I check its dtype or call any other pandas function on it, it shows np.int32 instead. How can I fix this? I want to use pandas functions on that column.

   ID  area_type            availability   location                  size       society  total_sqft  bath  balcony  price
0   0  Super built-up Area  19-Dec         Electronic City Phase II  2 BHK      Coomee   1056        2.0   1.0      39.07
1   1  Plot Area            Ready To Move  Chikka Tirupathi          4 Bedroom  Theanmp  2600        5.0   3.0      120.00
2   2  Built-up Area        Ready To Move  Uttarahalli               3 BHK      NaN      1440        2.0   3.0      62.00
3   3  Super built-up Area  Ready To Move  Lingadheeranahalli        3 BHK      Soiewre  1521        3.0   1.0      95.00
4   4  Super built-up Area  Ready To Move  Kothanur                  2 BHK      NaN      1200        2.0   1.0      51.00
df.info()
RangeIndex: 10656 entries, 0 to 10655
Data columns (total 10 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   ID            10656 non-null  int64  
 1   area_type     10656 non-null  object 
 2   availability  10656 non-null  object 
 3   location      10655 non-null  object 
 4   size          10642 non-null  object 
 5   society       6228 non-null   object 
 6   total_sqft    10656 non-null  object 
 7   bath          10591 non-null  float64
 8   balcony       10152 non-null  float64
 9   price         10656 non-null  float64
dtypes: float64(3), int64(1), object(6)
memory usage: 832.6  KB
df.size.dtype
dtype('int32')
df.size.unique()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10132/3869038666.py in <module>
----> 1 df.size.unique()

AttributeError: 'numpy.int32' object has no attribute 'unique'

CodePudding user response:

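You can't use dot notation here because size is already an attribute of DataFrame: df.size returns the total number of elements (rows times columns) as a single integer, not your column. That integer is why you see int32 and why .unique() fails. Use bracket notation [] instead: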

>>> df['size'].unique()
array([1, 2, 3])

Reproducible error:

import pandas as pd

df = pd.DataFrame({'size': [1, 2, 3]})
df.size.unique()  # df.size is the element count, not the 'size' column

# Output
AttributeError: 'numpy.int64' object has no attribute 'unique'
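
To make the difference concrete, here is a minimal sketch with a made-up three-row frame (the data and the variable names element_count, size_column and unique_sizes are only for illustration, not taken from the original dataset). It compares what dot access and bracket access return for a column called "size":

```python
import pandas as pd

# Hypothetical sample data that mimics the question's "size" column.
df = pd.DataFrame({
    "size": ["2 BHK", "4 Bedroom", "3 BHK"],
    "bath": [2.0, 5.0, 2.0],
})

# Dot access resolves to the DataFrame.size property: rows * columns.
element_count = df.size

# Bracket access returns the actual column as a Series.
size_column = df["size"]
unique_sizes = size_column.unique()

print(element_count)
print(list(unique_sizes))
```

The same applies to any column whose name collides with an existing DataFrame attribute or method, so bracket access is the safer habit for such names.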

CodePudding user response:

You could also try converting the column to the data type you need:

df['size'] = df['size'].astype(str)

Your dataset is about housing, and size here means BHK, right? If so, I think a string type is the better fit.
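
If the goal is to work with the number in front of "BHK" or "Bedroom", one possible approach is sketched below. It assumes every label starts with a count, as in the sample rows of the question; the rooms column name and the sample frame are hypothetical. The sketch uses the .str accessor on the column directly instead of astype(str), so that missing values stay missing:

```python
import pandas as pd

# Hypothetical sample data; None stands in for a missing "size" entry.
df = pd.DataFrame({"size": ["2 BHK", "4 Bedroom", "3 BHK", None]})

# Pull the first run of digits out of each label.
# Rows without a match (including missing values) come back as NaN.
rooms = df["size"].str.extract(r"(\d+)", expand=False)

# Convert the extracted digit strings to numbers.
df["rooms"] = pd.to_numeric(rooms)

print(df)
```

Note that bracket notation is still required here: df.size would again return the element count rather than the column.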
