I need to create a table of summary statistics using Python's prettytable. I have a dataset of n columns and I need to compute the mean, median, standard deviation and variance of each column. I can use numpy to calculate the statistics as follows:
import numpy
means=df.mean()
medians=df.median()
standard_deviations=df.std()
variances=df.var()
However, when filling the table, I am not sure how to pass a list of column names and values to field_names and add_row. The code below works if I know the number of columns in advance and can spell out their names in a list:
from prettytable import PrettyTable
x = PrettyTable()
x.title = 'Dataset Summary Statistics'
x.field_names = ['Metric','Var(1)','Var(2)',...,'Var(n)']
x.add_row(['Mean',means[0],means[1],means[2],...,means[n]])
x.add_row(['Median',medians[0],medians[1],medians[2],...,medians[n]])
x.add_row(['Standard Deviation',standard_deviations[0],standard_deviations[1],standard_deviations[2],..., standard_deviations[n]])
x.add_row(['Variance',variances[0],variances[1],variances[2],...,variances[n]])
print(x)
+-----------------------------------------------------------------------------------+
|                            Dataset Summary Statistics                             |
+--------------------+--------------------+--------------------+--------------------+
|       Metric       |       Var(1)       |       Var(2)       |       Var(n)       |
+--------------------+--------------------+--------------------+--------------------+
|        Mean        | 1774.723516111245  | 1784.5797186405343 | 1764.1535926315878 |
|       Median       | 1413.0899658203125 | 1419.4949951171875 | 1406.0249633789062 |
| Standard Deviation |  831.055540944934  | 833.9177417328348  | 827.9240611593201  |
|      Variance      |  690653.312135277  | 695418.7999767909  | 685458.2510465416  |
+--------------------+--------------------+--------------------+--------------------+
However, if the DataFrame has too many columns to index each element of the means, medians, standard_deviations and variances lists by hand, how can I fill the table from a list of values without specifying each position?
Answer:
If I understand correctly, convert each Series to a list and unpack it with * inside the row list. The same pattern works for medians, standard_deviations and variances:
x.add_row(['Mean', *means.tolist()])