I have the below PostgreSQL table:
cust prod day month year state quant
0 Bloom Pepsi 2 12 2011 NY 4232
1 Bloom Bread 23 5 2015 PA 4167
2 Bloom Pepsi 22 1 2016 CT 4404
3 Bloom Fruits 11 1 2010 NJ 4369
4 Bloom Milk 7 11 2016 CT 210
I have to find and display the average sale of Bloom per state and display like this:
CUST AVG_NY AVG_CT AVG_NJ
Bloom 28923 3241 1873
I converted the data to the below form:
[('Bloom', 'Pepsi', 2, 12, 2011, 'NY', 4232), ('Bloom', 'Eggs', 30, 11, 2010, 'NJ', 559), ('Bloom', 'Yogurt', 25, 7, 2014, 'PA', 17), ('Bloom', 'Yogurt', 3, 4, 2011, 'NJ', 1203), ('Bloom', 'Coke', 7, 2, 2010, 'NY', 1229), ('Bloom', 'Coke', 6, 10, 2018, 'PA', 2867), ('Bloom', 'Soap', 6, 1, 2015, 'CT', 4623), ('Bloom', 'Milk', 8, 9, 2010, 'NJ', 1106), ('Bloom', 'Milk', 19, 4, 2013, 'NY', 3516), ('Bloom', 'Soap', 7, 6,
2015, 'PA', 3404)]
Below is my code and is probably the worsts way to do so:
connection = psycopg2.connect(user="postgres",
password="ss",
host="127.0.0.1",
port="8800",
database="postgres")
cursor = connection.cursor()
postgreSQL_select_Query = "select * from sales"
cursor.execute(postgreSQL_select_Query)
mobile_records = cursor.fetchall()
def takeSecond(elem):
return elem[0][0]
mobile_records.sort(key=takeSecond)
Bloom1 = []
for i in mobile_records:
if i[5] == 'NY' and i[0] == 'Bloom':
Bloom1.append(i)
s1 = 0
for j in Bloom1:
s1 = j[6]
avg1 = s1/len(Bloom1)
Bloom2 = []
for i in mobile_records:
if i[5] == 'CT' and i[0] == 'Bloom':
Bloom2.append(i)
s2 = 0
for j in Bloom2:
s2 = j[6]
avg2 = s2/len(Bloom2)
Bloom3 = []
for i in mobile_records:
if i[5] == 'NJ' and i[0] == 'Bloom':
Bloom3.append(i)
s3 = 0
for j in Bloom3:
s3 = j[6]
avg3 = s3/len(Bloom3)
How do I even start to achieve this?
CodePudding user response:
You should take a deeper look into SQL. It's totally not necessary to do it like that. Just use a group by statement.
statement = "SELECT state, AVG(quant) FROM sales WHERE cust = Bloom GROUP BY state"
After executing it you can simply loop through the returned list and check each state.
data = cursor.fetchall()
for dataset in data:
if dataset[0] == 'NJ':
# do something with dataset[1]
Note: dataset[0]
stores the state and dataset[1]
the average.