Get mean from row values

Time:02-21

I have a dataframe of a set of voltages at different points in time

edit: thanks for your help! This df was created from a google sheet file where everything was a string. Today was my first time using pandas, I learned:

  • setting axis
  • filter
  • astype
  • select_dtypes & head().to_dict()

//

  Measurements Release 5 Release 6 Release 7
0           V0        48        48        50
1           V1        49        51        53
2           V2        50        52        54
3           V3        51        53        55

All voltages are measured at the same point for each new firmware release. I want to calculate the mean for each of these 4 points, but I can't seem to make it work, even though the documentation seems simple enough:

print(df.mean())
Release 5    12123762.75
Release 6    12128813.25
Release 7    12633863.75
dtype: float64

Not sure where it gets those numbers. I tried using df.loc to get a row and then get the mean

print(df.loc['V1'].mean())


ValueError: No axis named V1 for object type DataFrame

And then

print(df.iloc[1].mean())
TypeError: Could not convert V1495153 to numeric

CodePudding user response:

Your solution works well for me; you may need the numeric_only=True parameter:

print(df.mean(numeric_only=True))
Release 5    49.5
Release 6    51.0
Release 7    53.0
dtype: float64

For the latest pandas versions, use DataFrame.select_dtypes:

df.select_dtypes('number').mean()

If you need the mean per row:

df.set_index('Measurements').select_dtypes('number').mean(axis=1)

EDIT: To convert the columns to numeric before taking the mean, use:

df.drop('Measurements', axis=1).astype('float').mean(axis=1)

Or, if the conversion to float fails because of non-numeric values:

(df.drop('Measurements', axis=1)
   .apply(pd.to_numeric, errors='coerce')
   .mean(axis=1))
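
As a minimal sketch of what errors='coerce' does, the frame below rebuilds the question's data as strings and adds one bad cell. The "n/a" value is a hypothetical example, not something from the original sheet. The bad cell becomes NaN, and mean skips NaN by default:

```python
import pandas as pd

# Question's data as strings; the "n/a" cell is a hypothetical bad value.
df = pd.DataFrame({
    "Measurements": ["V0", "V1", "V2", "V3"],
    "Release 5": ["48", "49", "50", "51"],
    "Release 6": ["48", "51", "n/a", "53"],
    "Release 7": ["50", "53", "54", "55"],
})

# Convert every release column; unparseable cells become NaN.
numeric = df.drop("Measurements", axis=1).apply(pd.to_numeric, errors="coerce")

# mean() ignores NaN by default, so row V2 averages only 50 and 54.
row_means = numeric.mean(axis=1)
print(row_means)
```

Note that a row with a coerced cell is averaged over fewer values, so it may be worth checking numeric.isna() before trusting the result.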

CodePudding user response:

Update

All your columns have the object dtype:

>>> df.mean()
Release 5    12123762.75
Release 6    12128813.25
Release 7    12633863.75
dtype: float64

>>> df.dtypes
Measurements    object
Release 5       object
Release 6       object
Release 7       object
dtype: object
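
The odd numbers above are consistent with the strings being concatenated rather than added before the division by the row count. A small plain-Python sketch of that arithmetic for the Release 5 column (no pandas involved; the variable names are only for illustration):

```python
# Values of the "Release 5" column as they came from the sheet: strings, not numbers.
release_5 = ["48", "49", "50", "51"]

# "Summing" strings concatenates them instead of adding them.
concatenated = "".join(release_5)
print(concatenated)  # 48495051

# Dividing the concatenated value by the number of rows gives the odd "mean".
odd_mean = int(concatenated) / len(release_5)
print(odd_mean)  # 12123762.75
```

The same arithmetic reproduces the other two columns: 48515253 / 4 = 12128813.25 and 50535455 / 4 = 12633863.75.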

Convert your columns to numeric first:

df['mean'] = df.filter(like='Release').astype(float).mean(axis=1)
print(df)

# Output
  Measurements  Release 5  Release 6  Release 7       mean
0           V0         48         48         50  48.666667
1           V1         49         51         53  51.000000
2           V2         50         52         54  52.000000
3           V3         51         53         55  53.000000

Old answer

You can also set the Measurements column as the index of your dataframe. This makes sense if only numeric columns remain after that:

out = df.set_index('Measurements').mean(axis=1)
print(out)

# Output
Measurements
V0    48.666667
V1    51.000000
V2    52.000000
V3    53.000000
dtype: float64
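
Setting the index is also what makes the df.loc['V1'] attempt from the question work, since loc looks up row labels and the default labels are 0 to 3. A short sketch, assuming the columns are still strings as in the question and therefore need astype(float) first:

```python
import pandas as pd

# Question's data, with every value stored as a string.
df = pd.DataFrame({
    "Measurements": ["V0", "V1", "V2", "V3"],
    "Release 5": ["48", "49", "50", "51"],
    "Release 6": ["48", "51", "52", "53"],
    "Release 7": ["50", "53", "54", "55"],
})

# Use the measurement names as row labels and convert the rest to floats.
indexed = df.set_index("Measurements").astype(float)

# Now 'V1' is a valid row label for loc.
v1_mean = indexed.loc["V1"].mean()
print(v1_mean)  # 51.0
```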