C# Best Practices when using !=null, Count > 0, and .Any()

Time:05-19

Out of curiosity, what are the underlying differences between !=null, Count > 0, and .Any(), and when is each the best choice, in terms of both architecture and performance?

I know that .Any() is for IEnumerables, not lists, but I find myself using them (!=null and Count > 0) interchangeably when allowed to. I don't want to get into the habit if it is bad practice.

CodePudding user response:

Enumerable.Any works as follows:

public static bool Any<TSource>(this IEnumerable<TSource> source)
{
    if (source == null)
        throw Error.ArgumentNull(nameof (source));
    using (IEnumerator<TSource> enumerator = source.GetEnumerator())
    {
        // Returns very early if there are any items
        if (enumerator.MoveNext())
            return true;
    }
    return false;
}

There is some overhead, but not much, so I'd expect Any() to perform comparably to List<T>.Count (the Count() extension method on IEnumerable<T> tries to fall back to that property, so it should be comparable, too).

Clarity-wise, it eventually boils down to a matter of taste. Personally I like the expressiveness of Any(), for it's a bit closer to natural language to me.

rows.Any()

makes me think less than

rows.Count > 0

But not that much.

null comparison works in a totally different way. If this is not clear to you, you should read up on value and reference types in C# (warning: things have changed somewhat lately, since current versions of C# can treat reference types as non-nullable by default through nullable reference types; value types have never allowed null unless wrapped in Nullable<T>).

In summary: reference types are stored differently, and the actual variable is (kind of) just a pointer to where the data is stored (not 100% accurate). Classically, pointers in C could take the value 0 to indicate that there is no data they point to, and this was carried over (at least as a concept) to C# and other C-like languages such as Java. Since pointers are not accessed directly in C#, null was introduced to indicate that the variable does not point to actual data ("there is no data associated with that variable"). Hence comparing a List<int> variable with null asks whether the list instance has been created at all, which is a necessary but not a sufficient condition for the list actually containing items.
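To make the distinction concrete, here is a minimal sketch showing that a null check and an emptiness check answer different questions. The class name NullVersusEmptyDemo and the helper Describe are made up for this illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class NullVersusEmptyDemo
{
    // Hypothetical helper: reports which of the three states a list is in.
    public static string Describe(List<int> list)
    {
        if (list == null)
            return "no list instance";   // the variable refers to nothing
        if (list.Count == 0)
            return "empty list";         // an instance exists but holds no items
        return "list with items";
    }

    public static void Main()
    {
        Console.WriteLine(Describe(null));
        Console.WriteLine(Describe(new List<int>()));
        Console.WriteLine(Describe(new List<int> { 1, 2, 3 }));
    }
}
```

Note that the null check has to come first: reading Count on a null reference throws a NullReferenceException.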

CodePudding user response:

It depends on your goals. A List can be non-null and still have Count == 0. As for Any(), use it LINQ-style with a predicate, like:

if(MyEnumerable.Any(x => x.Id == myId))
{
    // do something
}

CodePudding user response:

  !=null

Checks if the list is null. This may or may not represent the same thing as "no items". I would recommend avoiding null for lists altogether, since the meaning is unclear. Use an empty collection to represent "no items". If you need something else to represent "does not exist", use a maybe/optional type, or a nullable reference if references are non-nullable by default. That should be a clearer signal to the reader that "no items" and "does not exist" are different states that may need to be handled differently.

.Any()

This is part of LINQ, so it works for any enumerable collection. It is very clear and convenient, but slightly slower than checking a property. It also accepts a predicate, which allows inline filtering if you want to check for the presence of a specific value.

.Count > 0 or .Length > 0

This will only work for collections with a Count/Length property, but should be fastest since it is just a comparison, at least assuming the property is simple and does not do a bunch of work.

But note that performance concerns are not very relevant unless you are running the code very frequently. So start by measuring the performance before trying to do any optimizations.
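The three checks can be put side by side in a short sketch. The class EmptinessChecks and its method names are hypothetical, and treating null as "no items" here is just one possible policy, not a recommendation from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptinessChecks
{
    // Works for any sequence; this sketch treats null as "no items".
    public static bool HasItemsViaAny<T>(IEnumerable<T> source)
    {
        return source != null && source.Any();
    }

    // Works only for types exposing Count; a plain integer comparison.
    public static bool HasItemsViaCount<T>(ICollection<T> collection)
    {
        return collection != null && collection.Count > 0;
    }

    // Any() with a predicate checks for a specific value in one call.
    public static bool ContainsValue(IEnumerable<int> source, int wanted)
    {
        return source != null && source.Any(x => x == wanted);
    }

    public static void Main()
    {
        var rows = new List<int> { 4, 8, 15 };
        Console.WriteLine(HasItemsViaAny(rows));
        Console.WriteLine(HasItemsViaCount(rows));
        Console.WriteLine(ContainsValue(rows, 8));
    }
}
```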

CodePudding user response:

TL;DR: Any() will in fact call ICollection.Count, if the collection type implements it. Otherwise it will try and pick the "best" way of checking if the collection is non-empty.

Checking for null is quite different to checking for empty, and it doesn't really make sense to compare it with empty checks.


Paul Kertscher has already answered the meat of your question, but to help with future queries on this sort of thing it is well worth remembering that the source code for dotnet is, in fact, available online.

The code for the LINQ Any() method, for example, may be found here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Linq/src/System/Linq/AnyAll.cs (and there's plenty more in the rest of the runtime repo, with the libraries here). It is a little more complex than the version of the method that Paul included in his answer... possibly he's referencing a different version of the runtime. Here's the latest implementation though (comments present in original source):

public static bool Any<TSource>(this IEnumerable<TSource> source)
{
    if (source == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.source);
    }

    if (source is ICollection<TSource> collectionoft)
    {
        return collectionoft.Count != 0;
    }
    else if (source is IIListProvider<TSource> listProv)
    {
        // Note that this check differs from the corresponding check in
        // Count (whereas otherwise this method parallels it).  If the count
        // can't be retrieved cheaply, that likely means we'd need to iterate
        // through the entire sequence in order to get the count, and in that
        // case, we'll generally be better off falling through to the logic
        // below that only enumerates at most a single element.
        int count = listProv.GetCount(onlyIfCheap: true);
        if (count >= 0)
        {
            return count != 0;
        }
    }
    else if (source is ICollection collection)
    {
        return collection.Count != 0;
    }

    using (IEnumerator<TSource> e = source.GetEnumerator())
    {
        return e.MoveNext();
    }
}

ICollection is declared in System.Collections (and ICollection<T> in System.Collections.Generic), while IIListProvider<T> is an internal interface in System.Linq.

You can see that there are multiple ways in which Any() can test whether the underlying collection has any items in it, and you can reasonably assume that they're tried in order of speed/efficiency. Obviously, you could write a terrible implementation of ICollection.Count { get }, but that would be a silly thing to do, and it would be nice to think that Microsoft haven't done that in any of the collection types that ship with the dotnet runtime. (Incidentally, arrays implement ICollection.Count via explicit interface implementation, which maps it to their Length property, which is why Any() doesn't need to care about checking for Length.)
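The remark about arrays can be checked with a small sketch. The class ArrayCountDemo and its method names are invented for this example; only the interfaces and the Length property come from the framework.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ArrayCountDemo
{
    public static int CountViaNonGenericInterface(int[] values)
    {
        // values.Count does not compile: the property is only reachable
        // through the interface, because arrays implement it explicitly.
        ICollection collection = values;
        return collection.Count;
    }

    public static int CountViaGenericInterface(int[] values)
    {
        // Arrays can also be viewed as ICollection<T>, the first case Any() tests.
        ICollection<int> collection = values;
        return collection.Count;
    }

    public static void Main()
    {
        int[] values = { 10, 20, 30 };
        Console.WriteLine(values.Length);
        Console.WriteLine(CountViaNonGenericInterface(values));
        Console.WriteLine(CountViaGenericInterface(values));
    }
}
```

All three lines print the same number, the array's length.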

I haven't looked up any implementations of IIListProvider<T>.GetCount. The docs for the interface method say this:

If true then the count should only be calculated if doing so is quick (sure or likely to be constant time), otherwise -1 should be returned.

Again, terrible implementations could exist, but probably don't.

There's generally nothing wrong with using Any() instead of Count. The additional overhead of calling a method and running through a few ifs and casts is generally negligible. Just make sure not to confuse Count with Count(), as the latter may require traversing the whole of a collection, which can be comparatively expensive and time-consuming, and is quite pointless if all you want to do is check for emptiness.
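The difference between Any() and Count() on a lazily produced sequence can be observed with a small sketch. LazyCountDemo, Numbers and the Produced counter are hypothetical names used only for this illustration. The iterator counts how many items it has been asked to produce.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyCountDemo
{
    // Number of items the iterator below has produced since the last reset.
    public static int Produced;

    // A lazy sequence: each item is produced only when it is requested.
    public static IEnumerable<int> Numbers(int howMany)
    {
        for (int i = 0; i < howMany; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Returns how many items had to be produced to answer Any().
    public static int ItemsProducedByAny(int howMany)
    {
        Produced = 0;
        Numbers(howMany).Any();
        return Produced;
    }

    // Returns how many items had to be produced to answer Count().
    public static int ItemsProducedByCount(int howMany)
    {
        Produced = 0;
        Numbers(howMany).Count();
        return Produced;
    }

    public static void Main()
    {
        Console.WriteLine(ItemsProducedByAny(1000));
        Console.WriteLine(ItemsProducedByCount(1000));
    }
}
```

Because an iterator method exposes no Count property, Any() stops after the first item, while Count() has to walk the entire sequence.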
