Home > OS >  Get first (or any) value from HashSet
Get first (or any) value from HashSet

Time:11-07

Currently, my code looks like this:

    private List<Node> dirtyNodes = new List<Node> dirtyNodes();

    public void UpdateDirtyNodes()
    {
        while(dirtyNodes.Count > 0)
        {
            Node nodeToUpdate = dirtyNodes[0];
            nodeToUpdate.UpdateNode();
            dirtyNodes.Remove(nodeToUpdate);
        }
    }

    public void MarkNodeDirty(Node node)
    {
        if(!dirtyNodes.Contains(node))
        {
            dirtyNodes.Add(node);
        }
    }

    public void MarkNodeClean(Node node)
    {
        dirtyNodes.Remove(node);
    }

This is a performance-critical part of the code, and it's slower than I'd like because dirtyNodes.Contains has to iterate over the entire array in most cases. I'd like to replace the List with a HashSet because it should be faster, but I can't figure out how to make that work with UpdateDirtyNodes().

The difficulty is that UpdateNode() can add or remove nodes from dirtyNodes at any time, hence the slightly awkward while loop. Is there a way I can get the "first" value from a HashSet? Order doesn't matter, I just need to stay in the while loop until dirtyNodes is empty, updating whatever node comes next.

I would prefer to avoid using Linq since this code will be part of a library and I don't want to force them to include Linq.

How can I do this?

CodePudding user response:

Add a bool dirty field inside the Node class. This is in addition to keeping a hash set. Then MarkNodeClean() doesn't need to remove nodes from the HashSet, shaving off some CPU cycles.

If you feel that adding a field inside Node class is too "dirty" (pun intended), then just make a HashSet<(Node, bool)> instead of HashSet<Node>, but you are loosing performance on allocating and garbage-collecting extra objects, which is not ideal since your code is performance-critical.

UpdateDirtyNodes() will take nodes one at a time until the HashSet is empty. After taking each node, it will look at the boolean flag to decide whether the node is actually dirty.

P.S. You should remove dirtyNodes.Clear(); from UpdateDirtyNodes(). This is a race condition. If a node is added after the while loop finds that dirtyNodes.Count is 0, then dirtyNodes.Clear(); clears out that node without processing it. This is a separate bug, not related to your question.

CodePudding user response:

Turns out it's very easy to just use the enumerator directly:

    public void UpdateDirtyNodes()
    {
        while(dirtyNodes.Count > 0)
        {
            using(HashSet<Node>.Enumerator enumerator = dirtyNodes.GetEnumerator())
            {
                if(enumerator.MoveNext())
                {
                    Node nodeToUpdate = enumerator.Current;
                    nodeToUpdate.UpdateNode();
                    dirtyNodes.Remove(nodeToUpdate);
                }
            }
        }
    }

    public void MarkNodeDirty(Node node)
    {
        dirtyNodes.Add(node);
    }

I originally tried something similar but didn't fully understand how to manually use enumerators and it didn't work.

It's significantly faster than the List (overall frame time is ~25-50% faster depending on the situation just from that one change) so I'm very happy. (Don't panic about that 30MB allocation in the screenshot below - I'm working on it.)

enter image description here

  • Related