Below is my data set.
I want to calculate the average temperature for each station. Ideally, I would also like to exclude the zero readings, which are noise.
How can I do this?
I have no idea where to start.
CodePudding user response:
I think you can use the standard csv reader to iterate over the rows, collect the temperatures for each station into a list t,
and then compute sum(t) / len(t) for each list.
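The idea above could be sketched roughly as follows. This is only a minimal illustration: the column names ID and temp are assumptions borrowed from the sample file in the other answer, the helper name station_means is made up for this example, and exact zeros are treated as noise as the question asks.

```python
import csv
import io
from collections import defaultdict


def station_means(csv_file):
    """Return {station ID: mean temperature}, ignoring zero readings."""
    temps = defaultdict(list)
    for row in csv.DictReader(csv_file):
        temp = float(row["temp"])  # assumed column name
        if temp != 0:  # skip zero (noise) readings
            temps[row["ID"]].append(temp)  # assumed column name
    # Stations whose readings are all zero never get a list, so len(t) > 0 here.
    return {station: sum(t) / len(t) for station, t in temps.items()}


# In-memory stand-in for a real file; with a file on disk you would use
# open(path, newline="") instead.
sample = io.StringIO(
    "ID,stuff,temp\n"
    "1,3,20\n1,6,20.1\n1,7,21.4\n"
    "2,1,30.2\n2,3,0\n2,2,34\n"
    "3,7,0\n3,6,0\n3,2,14.4\n"
)
means = station_means(sample)
for station, mean in means.items():
    print(station, round(mean, 2))
```

Note that csv.DictReader yields every field as a string, so the station IDs in the result are strings such as "1" rather than integers.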
CodePudding user response:
Just load the data into a pandas.DataFrame and use the pandas.DataFrame.groupby method:
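import pandas as pd

file = "your_file.csv"  # placeholder: path to your CSV file
df = pd.read_csv(file)
df = df[df['temp'] != 0]  # keep only non-zero readings (zeros are noise)
print(df.groupby('ID')[['temp']].mean())  # mean temperature per station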
Gives:
    temp
ID
1   20.5
2   32.1
3   14.4
Note: the file I used looks like this:
ID,stuff,temp
1,3,20
1,6,20.1
1,7,21.4
2,1,30.2
2,3,0
2,2,34
3,7,0
3,6,0
3,2,14.4