Python list print statement - CodePudding

I've found the variance and the mean of the columns that contain floats. The code below prints just that, but I want the output to look like the example below. The float columns of train_1 are ['Age', 'RestingBP'].

variance of Age is 0.03595686694781368 mean of age is: 0.4641151293344085
variance of RestingBP 0.006797712487953147 mean of RestingBP: 0.6499696048632219

How do I do that, and can I save the results as a set/list to use later in a formula?

I want to be able to identify which mean and variance go with each column, so that later I can potentially multiply the values.

numerical = [var for var in train_1.columns if train_1[var].dtype=='float64']
for var in numerical: 
    variance1 = variance(train_1[var])
    mean1 =  statistics.mean(train_1[var])
    print(f"variance {numerical}" , variance1)
    print("mean:",mean1)

CodePudding user response：

One approach would be to build a list of dictionaries, one per column, each holding the values you are interested in. You could achieve this with something like:

numerical = [var for var in train_1.columns if train_1[var].dtype=='float64']
# One dictionary per float column; 'column' records which column the values belong to
results = [{'column': var,
            'variance': statistics.variance(train_1[var]),
            'mean': statistics.mean(train_1[var])} for var in numerical]

for result in results:
    print(result['column'])
    print(f'    mean:     {result["mean"]}')
    print(f'    variance: {result["variance"]}')

Note that you could also build the results directly in the initial list comprehension over train_1.columns, but the example minimises changes to your code.

CodePudding user response：

You already have everything you need. I have only slightly changed the f-string used for printing and added a list that collects the results:

import pandas as pd
import statistics
train_1 = pd.DataFrame({'Age':[30.0, 40.0, 20.0, 15.0], 'RestingBP': [60.0, 70.0, 50.0, 80.0]})
numerical = [var for var in train_1.columns if train_1[var].dtype=='float64']
lst_results = []
for var in numerical: 
    variance =  statistics.variance(train_1[var])
    mean     =  statistics.mean(train_1[var])
    lst_results.append( (var, variance, mean ) )
    print(f"variance of {var} is: {variance} and mean of {var} is: {mean}")
print(f'{lst_results=}')

gives:

variance of Age is: 122.91666666666667 and mean of Age is: 26.25
variance of RestingBP is: 166.66666666666666 and mean of RestingBP is: 65.0
lst_results=[('Age', 122.91666666666667, 26.25), ('RestingBP', 166.66666666666666, 65.0)]
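
To use the stored values later in a formula, the tuples can be re-keyed by column name. The following is only a minimal sketch: the plain dictionary `columns` stands in for the float columns of train_1, and the names `stats_by_column` and `product_of_means` are made up for illustration.

```python
import statistics

# Stand-in for the float columns of train_1 (same sample values as above)
columns = {'Age': [30.0, 40.0, 20.0, 15.0],
           'RestingBP': [60.0, 70.0, 50.0, 80.0]}

# Same (column, variance, mean) tuples as lst_results above
lst_results = [(name, statistics.variance(values), statistics.mean(values))
               for name, values in columns.items()]

# Re-key the tuples by column name so each statistic can be looked up directly
stats_by_column = {name: {'variance': variance, 'mean': mean}
                   for name, variance, mean in lst_results}

# Hypothetical formula: multiply the mean of one column by the mean of another
product_of_means = (stats_by_column['Age']['mean']
                    * stats_by_column['RestingBP']['mean'])
print(product_of_means)  # 1706.25
```

Looking values up by name avoids having to remember the position of each column in the list.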

And if you want a nice dictionary for storing the results along with nicely formatted output, here is a debugged and improved version of the code from the other answer:

results = [{var: {'variance': statistics.variance(train_1[var]),
                      'mean': statistics.mean(train_1[var]) }}  
    for var in train_1.columns if train_1[var].dtype=='float64']
for result in results:
    for column, calc in result.items(): 
        print(column)
        print(f'    mean:     {calc["mean"]}')
        print(f'    variance: {calc["variance"]}')
print(f'{results=}')

giving:

Age
    mean:     26.25
    variance: 122.91666666666667
RestingBP
    mean:     65.0
    variance: 166.66666666666666
results=[{'Age': {'variance': 122.91666666666667, 'mean': 26.25}}, {'RestingBP': {'variance': 166.66666666666666, 'mean': 65.0}}]
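
Because results is a list of single-key dictionaries, looking up one column still means walking the list. A possible follow-up, shown here only as a sketch, merges them into one dictionary and uses it in a formula. The `columns` dictionary again stands in for train_1, and `stats` and `z_score` are hypothetical names that do not appear in the code above.

```python
import math
import statistics

# Stand-in for the float columns of train_1 (same sample values as above)
columns = {'Age': [30.0, 40.0, 20.0, 15.0],
           'RestingBP': [60.0, 70.0, 50.0, 80.0]}

# Same shape as results above: a list of {column: {'variance': ..., 'mean': ...}}
results = [{name: {'variance': statistics.variance(values),
                   'mean': statistics.mean(values)}}
           for name, values in columns.items()]

# Merge the single-key dictionaries into one dictionary keyed by column name
stats = {column: calc for result in results for column, calc in result.items()}

def z_score(column, value):
    """Hypothetical formula: distance of value from the column mean,
    measured in sample standard deviations."""
    calc = stats[column]
    return (value - calc['mean']) / math.sqrt(calc['variance'])

print(z_score('RestingBP', 65.0))  # 0.0, because 65.0 is the column mean
```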