I'm running into the following issue.
I have pulled some summary statistics from a dataframe using df.describe(). Now I'm trying to convert the number of observations (the count row) into an integer. I've tried the following, but it does not work:
summary_stats = df.describe()
summary_stats = summary_stats.round(2)
summary_stats.iloc[0] = summary_stats.iloc[0].astype(int)
Then, when I print out the summary statistics table, the number of observations is not an integer. Thanks a lot for your insights!
CodePudding user response:
The problem is that each column of the describe() output holds floats (mean, std, ...) alongside the counts. A column has a single dtype, so the integers you assign are converted back to floats.
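To see the upcasting directly, here is a minimal sketch. It assumes a small made-up dataframe, and the variable names count_as_int, dtypes_before and dtypes_after are only for illustration:

```python
import pandas as pd

# Hypothetical sample data, used only to illustrate the dtype behaviour.
df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [2, 2, 2, 2, 2]})
summary_stats = df.describe().round(2)

# Every column of the describe() output is float64, because each column
# holds count, mean, std, ... together.
dtypes_before = summary_stats.dtypes

# The cast itself does produce integers ...
count_as_int = summary_stats.iloc[0].astype(int)

# ... but writing them back into float64 columns turns them into floats again.
summary_stats.iloc[0] = count_as_int
dtypes_after = summary_stats.dtypes

print(count_as_int.dtype)
print(dtypes_after)
```

So the astype(int) call works as expected; it is the assignment back into the float columns that undoes it.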
One possible solution is to transpose the table, so that count becomes a column of its own and can have an integer dtype:
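import pandas as pd

d = {'A':[1,2,3,4,5], 'B':[2,2,2,2,2], 'C':[3,3,3,3,3]}
df = pd.DataFrame(data=d)
# transpose so that each statistic (count, mean, ...) is a column
summary_stats = df.describe().T
summary_stats = summary_stats.round(2)
# count is now a single column, so it can be cast to int on its own
summary_stats['count'] = summary_stats['count'].astype(int)
print(summary_stats)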
   count  mean   std  min  25%  50%  75%  max
A      5   3.0  1.58  1.0  2.0  3.0  4.0  5.0
B      5   2.0  0.00  2.0  2.0  2.0  2.0  2.0
C      5   3.0  0.00  3.0  3.0  3.0  3.0  3.0
If you only need to change how the values are displayed, here is a hack: convert the values to object dtype, so each cell keeps its own type:
summary_stats = df.describe()
# object dtype lets ints and floats sit in the same column
summary_stats = summary_stats.round(2).astype(object)
summary_stats.iloc[0] = summary_stats.iloc[0].astype(int)
print(summary_stats)
          A    B    C
count     5    5    5
mean    3.0  2.0  3.0
std    1.58  0.0  0.0
min     1.0  2.0  3.0
25%     2.0  2.0  3.0
50%     3.0  2.0  3.0
75%     4.0  2.0  3.0
max     5.0  2.0  3.0
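Another display-only option, not part of the answer above, is to leave summary_stats numeric and build a separate table of formatted strings for printing. This is just a sketch under that assumption; display_stats is a made-up name:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [2, 2, 2, 2, 2], 'C': [3, 3, 3, 3, 3]})
summary_stats = df.describe()  # stays float64, still usable for calculations

# Format every cell as text with two decimals ...
display_stats = summary_stats.apply(lambda col: col.map('{:.2f}'.format))
# ... then overwrite the count row with a zero-decimal format.
display_stats.loc['count'] = summary_stats.loc['count'].map('{:.0f}'.format)

print(display_stats)
```

The trade-off is that display_stats contains strings only, so it is suitable for printing or exporting but not for further arithmetic; keep using summary_stats for that.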