Why is this float conversion made

Time:12-17

I have this dataframe:

Python 3.9.0 (v3.9.0:9cf6752276, Oct  5 2020, 11:29:23) 
[Clang 6.0 (clang-600.0.57)] on darwin
>>> import pandas as pd  
>>> import datetime
>>> pd.__version__
'1.3.5'
>>> dates = [datetime.datetime(2012, 2, 3) , datetime.datetime(2012, 2, 4)]
>>> x = pd.DataFrame({'Time': dates, 'Selected': [0, 0], 'Nr': [123.4, 25.2]})
>>> x.set_index('Time', inplace=True)
>>> x
            Selected     Nr
Time                       
2012-02-03         0  123.4
2012-02-04         0   25.2

In the example below, an integer value from an integer column comes back as a float, and I do not see the reason for the conversion. In both cases I assume I am picking the value in the 'Selected' column of the first row. What is going on?

>>> x['Selected'].iloc[0]
0
>>> x.iloc[0]['Selected']
0.0
>>> x['Selected'].dtype 
dtype('int64')

CodePudding user response:

x.iloc[0] selects a single "row", and pandas creates a new pd.Series object to hold it. A Series has exactly one dtype, so pandas has to pick one that can hold every value in the row. It uses float64, since that can represent the int64 value from "Selected" without losing the fractional part of the value in the "Nr" column.
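A minimal sketch of that upcast, rebuilding the frame from the question (the variable name `row` is only for illustration):

```python
import datetime

import pandas as pd

dates = [datetime.datetime(2012, 2, 3), datetime.datetime(2012, 2, 4)]
x = pd.DataFrame({'Time': dates, 'Selected': [0, 0], 'Nr': [123.4, 25.2]})
x = x.set_index('Time')

# Each column keeps its own dtype: int64 for Selected, float64 for Nr.
print(x.dtypes)

# A row cuts across both columns and becomes one Series with one dtype.
row = x.iloc[0]
print(row.dtype)        # float64
print(row['Selected'])  # 0.0
```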

On the other hand, x['Selected'].iloc[0] first selects a column, which always preserves the column's dtype, and only then picks the first element of that column.
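If you want a single value and need it in the column's dtype, a sketch of a few ways to get it is shown here. The names `by_column`, `by_position` and `by_label` are made up for this example; the frame is a trimmed copy of the one in the question.

```python
import pandas as pd

x = pd.DataFrame(
    {'Selected': [0, 0], 'Nr': [123.4, 25.2]},
    index=pd.DatetimeIndex(['2012-02-03', '2012-02-04'], name='Time'),
)

# Column first, then position: the value comes out of an int64 Series.
by_column = x['Selected'].iloc[0]

# Row and column given in one indexing call: the scalar is read from
# the 'Selected' column, so it stays an integer.
by_position = x.iloc[0, x.columns.get_loc('Selected')]
by_label = x.at[x.index[0], 'Selected']

print(type(by_column), type(by_position), type(by_label))
```

The point in all three cases is to avoid building the intermediate row Series that x.iloc[0] creates.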

pandas is fundamentally "column oriented". You can think of a dataframe as a dictionary of columns. It isn't literally one: I believe it used to have essentially that under the hood, but it now uses a more complex "block manager" approach. Those are internal implementation details, though.
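The dictionary analogy shows up in the public interface too. As a small sketch (the names `names` and `dtypes_by_column` are only for illustration), a dataframe can be iterated and queried much like a dict whose keys are column names and whose values are Series:

```python
import pandas as pd

x = pd.DataFrame({'Selected': [0, 0], 'Nr': [123.4, 25.2]})

# Iterating a frame yields its column names, like iterating a dict's keys.
names = list(x)

# Membership tests check column names, not values.
print('Selected' in x)

# items() yields (column name, column Series) pairs, each with its own dtype.
dtypes_by_column = {name: str(column.dtype) for name, column in x.items()}
print(names, dtypes_by_column)
```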
