I am learning to use SQL in R. I want to query a database about the mountains whose height is less than 100 meter difference from the average height of all mountains.
My code so far looks like this:
sql6 = "SELECT Name, Height, AVG(Height) FROM mountain WHERE AVG(Height) > 100 AND AVG(Height) < 100"
result6 = dbSendQuery(con, sql6)
df6 = dbFetch(result6)
head(df6)
But I get an error saying
Error: Invalid use of group function [1111]
I have tried substituting the AVG(Height) > 100
with Height > 100
and the problem seems to be somewhere around there. Any ideas how to solve this?
CodePudding user response:
I think you can use HAVING
. Using sqldf
and mtcars
...
The base query:
sqldf::sqldf("
select cyl, avg(disp) as avgdisp
from mtcars
group by cyl")
# cyl avgdisp
# 1 4 105.1364
# 2 6 183.3143
# 3 8 353.1000
If we just want the higher two, then
sqldf::sqldf("
select cyl, avg(disp) as avgdisp
from mtcars
group by cyl
having avg(disp) > 150")
# cyl avgdisp
# 1 6 183.3143
# 2 8 353.1000
An alternative:
sqldf::sqldf("
with cte as (
select cyl, avg(disp) as avgdisp
from mtcars
group by cyl)
select *
from cte
where avgdisp > 150")
# cyl avgdisp
# 1 6 183.3143
# 2 8 353.1000
I think this will apply just as well to other DBMSes, though un-tested.
CodePudding user response:
you need to use HAVING
as not WHERE
For your math. Use the difference between the height and the average and use the absolute to check if it is lower than 100
sql6 = "SELECT Name, Height, AVG(Height) FROM mountain HAVING ABS(Height - AVG(Height)) < 100"