Averaging of several values-CodePudding

I have a dataset (df3) with five columns x, y, r, g and b, although I only need to work with x, y and r. I want to find the average of all the consecutive rows in which the value of r is equal and store it in a database (df_final). To do this, I have generated a code that stores all the values in which r is equal to the one in previous row in a temporary database (df_inter), to later store the average of all the values in the final database (df_final). The code is this one:

for i in range(len(df3)):
  if df3.iloc[i,3] == df3.iloc[i-1,3]:
    df_inter = pd.DataFrame(columns=['x','y', 'r'])
    df_inter.append(df3.iloc[i,1],df3.iloc[i,2],df3.iloc[i,3])
    df_inter.to_csv(f'Resultados/df_inter.csv', index=False, sep=',')
  else:
    df_final.append(df_inter['x'].mean(),df_inter['y'].mean(),df_inter['r'].mean())
    del [[df_inter]]
    gc.collect()
    df_inter=pd.DataFrame()
    df_inter = pd.DataFrame(columns=['x','y', 'r'])
    df_inter.append(df3.iloc[i,1],df3.iloc[i,2],df3.iloc[i,3])
    df_final.to_csv(f'Resultados/df_final.csv', index=False, sep=',')

Nevertheless, when I execute the code I get this error message:

TypeError: cannot concatenate object of type '<class 'numpy.int64'>'; only Series and DataFrame objs are valid

I'm not sure what the problem is or even if there is a code more efficient for the purpose. Please, I would be grateful if you could help me. Thank you in advance.

Irene

CodePudding user response：

If you have some knowledge of SQL, it can be intuitively done using sqldf and pandas:

import sqldf
import pandas as pd

df = pd.DataFrame({"class":[1,1,1,2,2,2,1,2,2,1],"value":[10,10,12,11,15,17,98,23,22,0]})

averages = sqldf.run("""
                     SELECT class,AVG(value)
                     FROM df
                     GROUP BY class
""")

The output being:

    class  AVG(value)

0      1        26.0
1      2        17.6

Is that what you are looking for ?

CodePudding user response：

You may want to append to the end of the dataframe using


df_inter = df_inter.append({'x':df3.iloc[i,1],'y':df3.iloc[i,2],'r':df3.iloc[i,3]}, ignore_index=True)