In some circumstances the type (int, float, etc.) of a cell's value is lost when the cell is accessed via its row.
In the following example the first column holds integers and the second holds floats, but the 111 is converted into 111.0.
import pandas
dfA = pandas.DataFrame({
'A': [111, 222, 333],
'B': [1.3, 2.4, 3.5],
})
print(dfA.loc[0])
# A 111.0
# B 1.3
# Name: 0, dtype: float64
print(type(dfA.loc[0].A))
# <class 'numpy.float64'>
The output I would expect is this:
A 111
B 1.3
<class 'numpy.int64'>
I have an idea why this happens, but IMHO this isn't user-friendly. Can I solve this somehow? The goal is to access (e.g. read) each cell's value without losing its type.
In the full code below you can also see that the integer is preserved when one of the columns is of type string. Weird.
Minimal Working Example
#!/usr/bin/env python3
import pandas
dfA = pandas.DataFrame({
'A': [111, 222, 333],
'B': [1.3, 2.4, 3.5],
})
print(dfA)
dfB = pandas.DataFrame({
'A': [111, 222, 333],
'B': [1.3, 2.4, 3.5],
'C': ['one', 'two', 'three']
})
print(dfB)
print(dfA.loc[0])
print(type(dfA.loc[0].A))
print(dfB.loc[0])
print(type(dfB.loc[0].A))
Output
     A    B
0  111  1.3
1  222  2.4
2  333  3.5
     A    B      C
0  111  1.3    one
1  222  2.4    two
2  333  3.5  three
A    111.0
B      1.3
Name: 0, dtype: float64
<class 'numpy.float64'>
A    111
B    1.3
C    one
Name: 0, dtype: object
<class 'numpy.int64'>
CodePudding user response:
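If you want to access a specific value in the DataFrame without losing its data type, you can use the at accessor instead of loc. dfA.loc[0] returns the whole row as a Series, and a Series has a single dtype, so the integer from column A is upcast to float64 to match column B. In dfB the string column forces the row's dtype to object, which is why the integer survives there. The at accessor returns a single scalar read from its column, so it preserves the data type of the value. See: print(type(dfA.at[0, 'A']))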
In this example, at is used to access the value in the first row and first column of the DataFrame. This value is an integer, so at returns it as an integer, preserving its data type.
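As a minimal sketch of the difference, reusing dfA from the question: the variable names row_dtype, by_label, by_position and via_column are only illustrative, and iat and selecting the column first are shown as alternatives the answer above does not mention, not as the one recommended way.

```python
import numpy as np
import pandas as pd

dfA = pd.DataFrame({
    'A': [111, 222, 333],
    'B': [1.3, 2.4, 3.5],
})

# Selecting a whole row builds a Series, and a Series has one dtype,
# so the integer from column A is upcast to float64.
row_dtype = dfA.loc[0].dtype

# Scalar access reads the value from its own column, so the column's
# dtype is kept.
by_label = dfA.at[0, 'A']       # label-based scalar access
by_position = dfA.iat[0, 0]     # position-based scalar access
via_column = dfA['A'].iloc[0]   # pick the column first, then the row

print(row_dtype)
print(type(by_label), type(by_position), type(via_column))
```

If a whole row is needed with each value keeping its own type, one option is to go column by column with at rather than materialising the row as a Series.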