I see how these methods translates to sql:
FirstOrDefault()
->Select TOP(1) ...
SingleOrDefault()
->Select TOP(2) ...
So I have two questions:
- Can we assume that in only one existing result scenario
SingleOrDefault()
scans the whole table and is ~2x slower thanFirstOrDefault()
? - Is there any recommendation which method should be used? For instance, when we are querying by
PK
and are sure that there's only one result?
CodePudding user response:
Can we assume that in only one existing result scenario
SingleOrDefault()
scans the whole table and is ~2x slower thanFirstOrDefault()
?
It depends, so generally no.
For unique indexed column criteria (like PK), it doesn't matter - the SQL optimizers are smart enough to use the information about value cardinality from the index definition.
For others it's similar to LINQ to Objects. In general (in case there are no value distribution statistics) database has to perform full table scan. So the question is if it can stop the search earlier. If no matching item exists, then both operations are the same O(N). If item exists, FirstOrDefault
can stop earlier while SingleOrDefault
should complete the full scan. Hence it depends of how close the matching item is to the "beginning" of the scan and if second matching item exists and how close it is. In case only one matching value can exist, we could expect the later method to be average 2 times slower.
Is there any recommendation which method should be used? For instance, when we are querying by PK and are sure that there's only one result?
This is considered opinionated, but I would say FirstOrDefault
- not only because of the performance. SingleOrDefault
is strange method in my opinion since it serves two different purposes - finding a match and validating that there are no duplicates. The second part usually is ensured by the property data structure in memory and unique constraint/index in database.
Finally, since there are some comments/discussions regarding Find
method vs the aforementioned two methods when searching for PK. The main purpose of Find
method is that it first searches the local cache and returns the object from there if exists. If it doesn't, then it executes internally what - FirstOrDefault
. This should answer what EF Core team think is preferable.
CodePudding user response:
Can we assume that in only one existing result scenario SingleOrDefault() scans the whole table and is ~2x slower than FirstOrDefault()?
Well, SingleOrDefault
may scan whole table, if there is no another similar record in the table. So if you use this method - you want to ensure that there is no another record in database.
- If you are looking not by indexed columns, it can be really slower
than
FirstOrDefault
, more than two times. - If you are looking by indexed columns, it should be non-remarkable slower.
Is there any recommendation which method should be used? For instance, when we are querying by PK and are sure that there's only one result?
If you do not need check that another similar record may exists database, avoid using SingleOrDefault
.