Home > Mobile >  MS-Access Select 1 row from GROUP BY query
MS-Access Select 1 row from GROUP BY query

Time:09-23

Always had a hard time wrapping my head around GROUP BY functionality, and this one is no exception.

I have a simple Join query as such

Select t1.g1, t1.g2, t2.id, t2.datetime, t3.name 
From ((table1 t1 Inner Join table2 t2 on t1.fld1=t2.fld1)
Inner Join table3 t3 on t1.fld2=t3.fld2)
Order By t2.datetime, t2.id

This returns my data as expected. Here are some sample rows that illustrate what I am trying to retrieve with Group By...

t1.g1 t2.g2 t2.id t2.datetime t3.name
726 4506 32 9/12/2021 nameA
726 4506 33 9/12/2021 nameB
726 4506 30 9/13/2021 nameC

I want to grab ONLY the first row in each Group of t1.g1, t1.g2.

So, I try the following:

Select t1.g1, t1.g2, FIRST(t2.id), FIRST(t2.datetime), FIRST(t3.name) 
From ((table1 t1 Inner Join table2 t2 on t1.fld1=t2.fld1)
Inner Join table3 t3 on t1.fld2=t3.fld2)
Group By t1.g1, t1.g2
Order By FIRST(t2.datetime), FIRST(t2.id)

For the example Group above, this returns the following record...

t1.g1 t2.g2 t2.id t2.datetime t3.name
726 4506 30 9/13/2021 nameC

So, Order By operates after the Grouping is done, not before. Or so it seems. Perhaps the reason for the order of the SQL keywords (Select, From, Where, Group By, Order By). Ok, makes sense if my assumption is correct. I think it finds t2.id=30 ahead of the other 726/4506 records because t2.id is a primary key on table2.

So, now I try a nested Query, wherein my first query above returns the data in the correct order and the outside query groups and grabs the first record.

Select t1.g1, t1.g2, FIRST(t2.id), FIRST(t2.datetime), FIRST(t3.name) 
FROM (
Select t1.g1, t1.g2, t2.id, t2.datetime, t3.name 
From ((table1 t1 Inner Join table2 t2 on t1.fld1=t2.fld1)
Inner Join table3 t3 on t1.fld2=t3.fld2)
Order By t2.datetime, t2.id
)
Group By t1.g1, t1.g2
Order By FIRST(t2.datetime), FIRST(t2.id)

Same results! I am at a loss to understand how this is happening. So, if anyone can shed light on the order of functioning under-the-covers for Access SQL in this instance I would love to know. On my 2nd query (nested Select), it seems as though I am ordering the target data such that after Grouping the FIRST() aggregate function should select the first row found in the inner result set. But that is not happening.

And of course, if anyone can tell me how to return the row I am after ...

t1.g1 t2.g2 t2.id t2.datetime t3.name
726 4506 32 9/12/2021 nameA

That is all I really need.

CodePudding user response:

I want to grab ONLY the first row in each Group of t1.g1, t1.g2.

You don't want aggregation. You want to filter the data. In this case, a correlated subquery does what you want:

Select t1.g1, t1.g2, t2.id, t2.datetime, t3.name 
From (table1 t1 Inner Join
      table2 t2
      on t1.fld1 = t2.fld1
     ) Inner Join
     table3 t3
     on t1.fld2 = t3.fld2
where t2.id = (select top 1 tt2.id
               from (table1 tt1 Inner Join
                     table2 tt2
                     on tt1.fld1 = tt2.fld1
                    ) Inner Join
                    table3 tt3
                    on tt1.fld2 = tt3.fld2
               where tt1.g1 = t1.g1 and tt1.g2 = t1.g2
               order by tt2.datetime, tt2.id
              );

CodePudding user response:

Here is a solution that scales well (6s on 250k recs in t2) and does what I am asking for.

I could not get Gordon's answer to work in Access. Seems like it should have however. And I have my doubts about how well it would perform with 250k recs in t2. I would love to test a solution like Gordon's, if I could figure out how to get Access to take it.

See problem description for an example on exactly which record I am after. I only need t2.id from the result set. This was not stated originally, but I don't see how that changes the problem statement or solution. I could be wrong there. I still need t3.name, but it can be retrieved later using t2.id.

But I still need to pick the record GROUP'd BY t1.g1, t1.g2 that comes first when all records are sorted by t2.dateandtime, t2.id. Or stated another way, amongst all records with the same t1.g1 t1.g2, I need exactly the first record when the group is sorted by "t2.dateandtime, t2.id".

Perhaps I am thinking about this solution to my problem all wrong, and there are better ways to resolve this with SQL; if so, I would love to hear it.

I seem to have learned that GROUP BY does group records together based on this SQL clause, but this grouping loses any concept of individual records at this point; e.g. you can only extract other fields by using an Aggregate Function (MIN, MAX, SUM, etc), but - and importantly - FIRST does not get the value of the record that you can predict, as the ORDER BY clause has not been performed yet.

With all that said, here is my solution that works.

  1. I removed reference to the Join on t3 as with t2.id I can retrieve all the other info I need from t3 after the fact, using t2.id.
  2. Don't need to select 't1.g1, t1.g2', that is superfluous. I originally thought that any Group By fields had to be specified in the Select clause also.
  3. I combine t2.dateandtime and t2.id into a Text field and use MIN to Select the data/record I am after once it is GROUP'd BY. No need to Sort my result set, as the record with the MIN value of t2.dateandtime, then t2.id has been chosen! Thus satisfying my condition and selection of the correct record.
  4. Since all I need is t2.id returned for further processing, I extract t2.id from the String built in #3 and convert back to Long data type.

Here is the brief and simple query:

Select 
 MIN(Format(t2.dateandtime, "yyyymmddhhmmss") & '_' & t2.id) as dt_id,  
 CLNG(MID(dt_id, INSTR(dt_id, '_')   1)) as id
From 
 (table1 t1 Inner Join table2 t2 on t1.fld1=t2.fld1)
Group By
 t1.g1, t1.g2
  • Related