I have the following dataframe:
Country Qty
Belgium 54
Belgium 8
Belgium 67
France 12
France 3
France 34
Italy 25
Italy 45
Italy 9
Is it possible to group this dataframe by the "Country" column, aggregate the average of "Qty", and output the average Qty for Belgium only? I am using PySpark.
CodePudding user response:
This has been solved!
from pyspark.sql.functions import avg, col
df.filter(col("Country") == "Belgium").agg(avg(col("Qty")))
CodePudding user response:
from pyspark.sql import functions as F
(
df
.groupBy("Country")
.agg(F.mean("Qty").alias("avg"))
.filter(F.col("Country") == "Belgium")
.show()
)
# output
+-------+----+
|Country| avg|
+-------+----+
|Belgium|43.0|
+-------+----+
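As a quick sanity check on the 43.0 figure, the same grouping can be reproduced in plain Python without Spark. This is only a sketch for verifying the arithmetic: the `rows` list and the `average_by_country` helper are hypothetical names introduced here, not part of either answer or of the PySpark API.

```python
from collections import defaultdict

# Sample rows copied from the question as (Country, Qty) pairs.
rows = [
    ("Belgium", 54), ("Belgium", 8), ("Belgium", 67),
    ("France", 12), ("France", 3), ("France", 34),
    ("Italy", 25), ("Italy", 45), ("Italy", 9),
]


def average_by_country(rows):
    """Group the rows by country and return {country: mean Qty}."""
    groups = defaultdict(list)
    for country, qty in rows:
        groups[country].append(qty)
    return {country: sum(qtys) / len(qtys) for country, qtys in groups.items()}


averages = average_by_country(rows)
print(averages["Belgium"])  # (54 + 8 + 67) / 3 = 43.0
```

This mirrors the groupBy-then-filter answer: every country gets an average, and Belgium is picked out afterwards. Filtering first, as in the other answer, gives the same number for Belgium while skipping the other groups.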