Home > Back-end >  Averaging of several values
Averaging of several values

Time:11-22

I have a dataset (df3) with five columns x, y, r, g and b, although I only need to work with x, y and r. I want to find the average of all the consecutive rows in which the value of r is equal and store it in a database (df_final). To do this, I have generated a code that stores all the values in which r is equal to the one in previous row in a temporary database (df_inter), to later store the average of all the values in the final database (df_final). The code is this one:

for i in range(len(df3)):
  if df3.iloc[i,3] == df3.iloc[i-1,3]:
    df_inter = pd.DataFrame(columns=['x','y', 'r'])
    df_inter.append(df3.iloc[i,1],df3.iloc[i,2],df3.iloc[i,3])
    df_inter.to_csv(f'Resultados/df_inter.csv', index=False, sep=',')
  else:
    df_final.append(df_inter['x'].mean(),df_inter['y'].mean(),df_inter['r'].mean())
    del [[df_inter]]
    gc.collect()
    df_inter=pd.DataFrame()
    df_inter = pd.DataFrame(columns=['x','y', 'r'])
    df_inter.append(df3.iloc[i,1],df3.iloc[i,2],df3.iloc[i,3])
    df_final.to_csv(f'Resultados/df_final.csv', index=False, sep=',')

Nevertheless, when I execute the code I get this error message:

TypeError: cannot concatenate object of type '<class 'numpy.int64'>'; only Series and DataFrame objs are valid

I'm not sure what the problem is or even if there is a code more efficient for the purpose. Please, I would be grateful if you could help me. Thank you in advance.

Irene

CodePudding user response:

If you have some knowledge of SQL, it can be intuitively done using sqldf and pandas:

import sqldf
import pandas as pd

df = pd.DataFrame({"class":[1,1,1,2,2,2,1,2,2,1],"value":[10,10,12,11,15,17,98,23,22,0]})

averages = sqldf.run("""
                     SELECT class,AVG(value)
                     FROM df
                     GROUP BY class
""")

The output being:

    class  AVG(value)

0      1        26.0
1      2        17.6

Is that what you are looking for ?

CodePudding user response:

You may want to append to the end of the dataframe using


df_inter = df_inter.append({'x':df3.iloc[i,1],'y':df3.iloc[i,2],'r':df3.iloc[i,3]}, ignore_index=True)

  • Related