Home > Software design >  How to resolve sql query In pandas
How to resolve sql query In pandas

Time:02-12

I am new to sql query and having a small problem , here i want to find "top 3 venues which hosted most number of eliminator matches" but I think my sql is wrong how to get this done? (I'm using jupiter notebook and pyspark)

import pandas as pd
from pandasql import sqldf

pysqldf = lambda q: sqldf(q, globals())

from matplotlib import pyplot as plt

data2 = pd.read_csv('ipl_matches.csv')

data2

#Most number of eliminator matches
eliminator = data2[data2.eliminator == 'Y']
eliminator

##I'm having problem in this query##
q = """SELECT venue_id, eliminator 
       FROM eliminator """

names = pysqldf(q)
names

Here is the eliminator data

CodePudding user response:

See if this query works for you:

q=""" SELECT venue_id
    FROM eliminator group by venue_id order by count(venue_id) desc LIMIT 3"""

CodePudding user response:

use this query :

q = """SELECT venue_id, eliminator FROM eliminator order by venue_id desc limit 3 """

this will give you last 3 biggest venue_id in descending mode

  • Related