MS Access pass-through query performance issues

My company uses an MRP system for things such as inventory, generating sales orders, invoicing, and so on. The system runs on Visual FoxPro and lets us export table data into Excel spreadsheets, which I use as linked tables for my Access databases. Most of the databases I've created are for other departments, and the end users don't all have great computer skills, so rather than require them to export the MRP tables to Excel with a specific file name (to maintain the link), I've started experimenting with ODBC connections directly to our MRP system's .dbf tables so that the end users no longer have to export data themselves. I've read in multiple forums that pass-through queries typically perform better than linking the tables into Access and running the queries locally. I have tested this and confirmed it's true, but these pass-through queries still run very slowly. Please see sample code below:

SELECT sales.Accountno, sales.sono, sales.itemno, sales.datereq, sales.shipvia, sales.orqtyreq, sales.qtyship, sales.custpono, sales.partno, sales.terms, sales.complete, sales.confirmed
FROM sales
WHERE complete = "N" AND confirmed = .T.
ORDER BY sales.Accountno;

This comes out to about 2,000 records. However, it runs much more slowly than selecting all records from the sales table, even though that returns about 100,000 records.

I guess my questions are as follows:

  • How is selecting all records faster than doing a filter which results in 98,000 fewer records?
  • How can I increase performance of these pass-through queries? Or are there other approaches to extracting data straight from the MRP tables that would perform better?
  • Is it better to run the query through VBA vs the SQL view of the query designer? (A sketch of the VBA route follows this list.)
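
For reference, here is roughly how the VBA route looks in my tests (a minimal sketch; the DSN name "MRP_VFP" is a placeholder for whatever the ODBC connection is actually called):

Sub RunPassThrough()
    Dim qdf As DAO.QueryDef
    Dim rs As DAO.Recordset
    Set qdf = CurrentDb.CreateQueryDef("")   ' temporary, unsaved pass-through
    qdf.Connect = "ODBC;DSN=MRP_VFP"         ' placeholder DSN name
    qdf.ReturnsRecords = True
    qdf.SQL = "SELECT accountno, sono, itemno, datereq, shipvia, orqtyreq, " & _
              "qtyship, custpono, partno, terms, complete, confirmed " & _
              "FROM sales WHERE complete = 'N' AND confirmed = .T. " & _
              "ORDER BY accountno"
    Set rs = qdf.OpenRecordset(dbOpenSnapshot)
    If Not rs.EOF Then rs.MoveLast           ' fully fetch before counting
    Debug.Print rs.RecordCount; "rows"
    rs.Close
End Sub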

Edit: Sometimes the queries run at a still-slow but acceptable 5 seconds or so. Other times they lock up my database and can take a few minutes. Could this have something to do with other people in the MRP software using the table I'm trying to run a PT query against?

Any advice/suggestions are much appreciated

CodePudding user response:

Well, first up, assuming these are FoxPro "dbf" files?

Then a pass-through means ZERO in this context, and I don't believe ANY performance gains can be had here.

Next up:

How is selecting all records faster than doing a filter which results in 98,000 fewer records?

VERY common in database land. The reason, of course, is the sort order. Furthermore, using an index is often MUCH slower.

Think of it this way:

You are in a motor boat, and you put your hand in the water to scoop out water. Your hand scoops out water as FAST as the boat is going.

Now, let's decide to do random-access lake water grabbing. In place of putting your hand in the water and scooping up water as fast as possible, take a small bucket, scoop out one bucket of water, and then dump that bucket into the boat. Each time you do this is VERY slow.

So, the query processor/plan will OFTEN determine that it is MUCH faster to read the data sequentially at VERY high speed than to jump around in a huge file pulling one record at a time. Worse yet, the disk drive and the "page size" of the database often can't pull just one record, but ONLY the spot where that one record exists (say, a 4,000-byte chunk, to get a 200-character record).

So, each random seek pulls 4,000 bytes just to get that one 200-character record.

So, what do you think is faster:

Reading 1,000 records in a row as the disk streams past the read head (like your hand in the water)?

Or doing 1,000 record seeks, each one pulling a larger chunk of the file to read one record, stopping, extracting the record, then seeking again, reading the next page chunk, and extracting the next record?

In fact, you often find that single-record seeks (which occur when you use an index) are MUCH slower. Often the database engine will decide that's a REALLY bad idea, ignore the index, skip the 1,000 seeks, and simply start reading the whole file from the start - the result is MUCH faster. Toss in disk read-ahead buffering and DMA controllers, and it's often no contest: a full-speed sequential read will beat index seeks pulling one record at a time by MANY factors.
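
Some rough, illustrative numbers (assumed for the sake of the argument, not measured on your system): a spinning disk manages on the order of 100 random seeks per second, so 2,000 index seeks is roughly 20 seconds of pure seek time. Sequentially scanning 100,000 records of 200 characters each is only about 20 MB, which a drive streaming at 100 MB/s reads in about 0.2 seconds. That lopsided ratio is why the query plan will happily scan everything.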

Of course, the above is not always true!

Next up:

You state/suggest/hint/imply that a query against the 100k rows is faster. But, is it really?

It can depend!!!

Often FoxPro (or Access), when pulling data - say 100k rows?

It will return the first page of data while it continues to read in the background. So it FOOLS you into thinking the query was instant when in fact it was not.

And even more interesting? In most cases a PT query does not do this trick!
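
You can expose this trick with a quick timing test (a hedged sketch; "sales_linked" is a placeholder table name). MoveLast forces DAO to fetch every row, so the "instant" full-table select suddenly shows its true cost:

Sub TimeFullFetch()
    Dim rs As DAO.Recordset
    Dim t As Single
    t = Timer
    Set rs = CurrentDb.OpenRecordset( _
        "SELECT * FROM sales_linked", dbOpenSnapshot)
    Debug.Print "First page returned after"; Timer - t; "seconds"
    If Not rs.EOF Then rs.MoveLast   ' force Access to fetch EVERY row
    Debug.Print rs.RecordCount; "rows fully fetched after"; Timer - t; "seconds"
    rs.Close
End Sub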

However, if I am reading your post correctly?

A PT query will NOT help against a FoxPro/dbf table. And in fact, if I had to guess, I suspect in MANY cases it might be slower.

If this is really a PT query?

I STRONGLY suggest you try a linked table. Try that, and see if it runs faster (it probably will). Why? Well, when you link a table from Access to FoxPro, Access can then build a better query plan, and ALSO use indexing.

So, check if the accountNo column has an index (it probably does, but you REALLY want to be 100%, if not 200%, sure). A sketch of both steps - linking the table and listing its indexes - follows.
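
A minimal sketch, assuming an ODBC DSN named "MRP_VFP" already points at the folder of .dbf files (the DSN name and the "sales_linked" alias are placeholders):

Sub LinkAndInspectSales()
    ' Link the FoxPro table through the ODBC driver (DSN name is assumed).
    DoCmd.TransferDatabase acLink, "ODBC Database", _
        "ODBC;DSN=MRP_VFP", acTable, "sales", "sales_linked"

    ' List the indexes Access can see on the linked table. (For some ODBC
    ' links this collection comes back empty - check in FoxPro itself then.)
    Dim tdf As DAO.TableDef, idx As DAO.Index, fld As DAO.Field
    Set tdf = CurrentDb.TableDefs("sales_linked")
    For Each idx In tdf.Indexes
        For Each fld In idx.Fields
            Debug.Print idx.Name; "->"; fld.Name
        Next fld
    Next idx
End Sub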

Last but not least:

You leave out a MASSIVE detail - a Mount Everest-sized one!!!

Are the dbf (FoxPro) files local to the current workstation, or is there a network involved? (I mean, you can't possibly leave out this whopper of a detail when talking about performance.)

But, even 100x worse? Is there a VPN involved? Because that is going to be even MUCH slower!!!

So, using a FoxPro client or an Access one should not make a difference here. But of course, if a network is involved, then that information has to be taken into account, and if a VPN is involved, then you are in for heaps of pain here.

So, if you don't know ahead of time the table(s) in question, then I would suggest you use the TransferDatabase command to create the required linked table in Access (as in the sketch above), and test that performance. I am betting it should do as well as the PT query.

However, as noted, if you are leaving out that huge/large/massive/Mount Everest detail about a network between Access and the dbf file? Well, then that information is required here for further discussion.

And if, say, you have workers at home, so that not only a network but also a VPN connection sits between Access and the dbf file, then we have an issue even larger than Mount Everest here!

But, as to why a condition, and especially a sort, on a file is slower than a full table scan? No, often not a surprise at all; as noted, the database engine will often make this choice to improve performance (i.e., a sequential table scan vs. hopping around with index seeks).

Now, EVEN with an SSD? Well, ok, now we do NOT have a disk drive head hopping around seeking the data, but we STILL have the index doing this!!!!

Based on this lower performance? I can already guess/determine that you have a network involved here!!! - since how computers work tells me so, and thus, like Dr. House or Sherlock Holmes?

You have a potentially slow network here, and be it Access or FoxPro? They are BOTH file-based systems, and they can tolerate a network connection, but a VPN is another matter.

Edit: Those VPNs - and file-share databases - not good!

Ok, so somewhat like "Dr. House", most of my spider sense was 100% correct here.

As for dynaset vs snapshot? Probably not going to see much of a difference to write home about.

Top of your list is that VPN - that is a bottleneck.

You will MOST certainly have to try/test typical software operation over that VPN (else you might set up some software, try it out, and "then" deploy to users on that VPN, and they will find it rather slow).

You might be able to throw more hardware at this - maybe run a fiber cable down the road to that other site.

So, you don't have a lot of options here.

But some are:

Make sure the query limits the data pull by a column that has an index - this is critical, else Access (or even FoxPro) will do a full table scan and pull the whole table over the network. A sketch of this idea follows.
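
For instance, a minimal sketch against the linked table (assuming "sono" is indexed in the FoxPro table - verify that! - and "sales_linked" is the placeholder link name from above):

Sub RunIndexedQuery()
    Dim rs As DAO.Recordset
    ' Filtering on an indexed column lets the ODBC/ISAM layer seek
    ' instead of dragging the whole dbf across the network.
    Set rs = CurrentDb.OpenRecordset( _
        "SELECT sono, itemno, qtyship FROM sales_linked " & _
        "WHERE sono BETWEEN '10000' AND '10999'", dbOpenSnapshot)
    If Not rs.EOF Then rs.MoveLast   ' fully fetch before counting
    Debug.Print rs.RecordCount; "rows pulled"
    rs.Close
End Sub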

Does the vendor offer a SQL Server-based edition? That could/would help, as Access could then hit SQL Server - and that can/does work VERY well over a VPN.

However, the other stop-gap?

You have those users over the VPN use Remote Desktop. That way, the FoxPro (and Access) software runs on a remote desktop; only the screen, keyboard, and mouse clicks travel over the VPN, and the data stays on the one "real" local area network (no VPN between Access and the dbf file).

If you have a Remote Desktop server running now, then this would be ideal. However, if you don't, then it will cost $$ to set up, but it is often your only practical way to get decent performance.

As noted, the other possibility is to use SQL Server, but given that the application runs on FoxPro/dbf files, that's not a likely choice.

So, you suffer the performance hit unless you can run the application from that "down the street" location by adopting Remote Desktop. The fact that FoxPro dbf is a file-based system (like MS Access) means the options for achieving performance here are limited.

You can read this 20-year-old article of mine - and it is still 100% true today:

http://www.kallal.ca/Wan/Wans.html
