How can I create a set containing both int and float numbers?

I want to create a set containing both integers and floats. Something like this:

s = {4, 6.7, 2.12, 9}

However, I ran into something unexpected (at least to me): I cannot add both 9 (an integer) and 9.0 (a float). Here is an example:

>>> s = {4, 6.7, 2.12, 9}
>>> s
{9, 2.12, 4, 6.7}
>>> s.add(9.0)
>>> s
{9, 2.12, 4, 6.7}

Why is this happening?
How can I add both numbers to my set?

I don't want the check 9.0 in s to be true unless I actually added the float to my set. I really can't figure out how to do it.

As a side note, I noticed that the same is true of dictionary keys, so I can't map 3 and 3.0 to different values.

CodePudding user response：

In Python, 9.0 == 9 is True (and the two also hash to the same value), and equality is the basis for uniqueness in sets.
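A quick check makes this concrete. This is only a minimal sketch; the variable names are illustrative:

```python
# int and float compare by numeric value, not by type
print(9 == 9.0)      # True

s = {9}
s.add(9.0)           # 9.0 is treated as already present
print(s)             # {9}

d = {3: "int"}
d[3.0] = "float"     # updates the value, but the original key 3 is kept
print(d)             # {3: 'float'}
```

Note that in both cases the container keeps the object it saw first (the int) and never stores the float.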

If you want equal float and int numbers not to collide in your set, you actually have two properties to compare: the value and the type.

The simplest solution is to store both in the set, using a tuple. For example:

s = set()
s.add( (float, 1.0) )
s.add( (float, 5.0) )
s.add( (int, 5) )
s.add( (float, 5.0) )

This is simple but a bit awkward, and it relies on you to set the correct type: nothing stops you from tagging a float value with int. Alternatively, you could implement a set subclass that handles the magic for you:


class TypedSet(set):
    # Store (type, value) pairs so that 5 and 5.0 are kept as distinct members.

    def add(self, v):
        vtype = type(v)
        super().add((vtype, v))

    def remove(self, v):
        vtype = type(v)
        super().remove((vtype, v))


s = TypedSet()
s.add(1.0)
s.add(5.0)
s.add(5)
s.add(5.0)

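Note: a real implementation would need a lot more work. For example, membership tests (in), discard, iteration and the set operators are not overridden here, so they still see the raw (type, value) tuples.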

The above will produce the following set (note that the duplicate 5.0 is stored only once, as expected; the order of the elements may vary).

TypedSet({(<class 'float'>, 1.0), (<class 'int'>, 5), (<class 'float'>, 5.0)})
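For a slightly more complete variant, one option is to build on collections.abc.MutableSet, which derives remove, the comparison operators and the set operators from five core methods. The following is only a sketch under that assumption; the class name StrictTypedSet is hypothetical and not part of the answer above:

```python
from collections.abc import MutableSet


class StrictTypedSet(MutableSet):
    """Sketch of a set that treats 5 and 5.0 as different members."""

    def __init__(self, iterable=()):
        self._items = set()          # holds (type, value) pairs
        for v in iterable:
            self.add(v)

    def __contains__(self, v):
        return (type(v), v) in self._items

    def __iter__(self):
        # hand back the plain values, not the internal pairs
        return (v for _, v in self._items)

    def __len__(self):
        return len(self._items)

    def add(self, v):
        self._items.add((type(v), v))

    def discard(self, v):
        self._items.discard((type(v), v))

    def __repr__(self):
        return f"{type(self).__name__}({list(self)!r})"


s = StrictTypedSet([1.0, 5.0, 5, 5.0])
print(len(s))                       # 3
print(5 in s, 5.0 in s, 9.0 in s)   # True True False
s.remove(5)                         # remove() is provided by MutableSet
print(5 in s, 5.0 in s)             # False True
```

Because the type is part of the stored key, True and 1 are also kept apart here, which may or may not be what you want.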

So this is possible, but if I were you I'd stop and check whether you really want to do this at all. If you've got two kinds of things that you want to track, maybe what you want is two sets?

You can always combine the sets into a tuple when you need to iterate over or work with them, e.g.


my_floats = set()
my_ints = set()

my_floats.add(1.0)
my_floats.add(5.0)
my_ints.add(5)
my_floats.add(5.0)

combined = (*my_floats, *my_ints)  # combine to a tuple

This gives you the following (without any magic):

(1.0, 5.0, 5)
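One caveat with two sets: a plain membership test such as 9.0 in my_ints is still True, because 9.0 == 9, so the lookup has to choose the set by type. A minimal sketch, assuming a hypothetical helper named contains_exact and slightly different example values than above:

```python
my_floats = {1.0, 5.0}
my_ints = {5, 9}


def contains_exact(v):
    """Hypothetical helper: look only in the set matching v's exact type."""
    if type(v) is float:
        return v in my_floats
    if type(v) is int:
        return v in my_ints
    return False


print(9.0 in my_ints)        # True, since 9.0 == 9
print(contains_exact(9.0))   # False, no float 9.0 was added
print(contains_exact(9))     # True
print(contains_exact(5.0))   # True
```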

CodePudding user response：

A set is fundamentally a hash table. When you add an object, the set looks up its hash; if an existing element has the same hash and also compares equal, the set treats the new object as already present and keeps the element it already has.

The problem is that floats with integer values have the same hash as the equal integers. Incidentally, in CPython the hash of any sufficiently small integer is equal to that integer (with the exception of -1, whose hash is -2).
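This is easy to verify directly. The snippet below is just an illustrative check; the -1 case is specific to CPython:

```python
print(hash(9), hash(9.0))     # 9 9
print(hash(9) == hash(9.0))   # True: equal numbers must hash equally
print(hash(2.5) == hash(2))   # False: 2.5 has no equal integer
print(hash(-1))               # -2 in CPython
```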

Instead of re-implementing or subclassing set, it's easier to subclass float so that its hash differs from that of the equal integer:

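class hfloat(float):
    def __hash__(self):
        # is_integer() is False for inf and nan, where int(self) would raise
        if self.is_integer():
            # mix the type into the hash so it differs from hash(int(self))
            return hash((float, float.__hash__(self)))
        else:
            return float.__hash__(self)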

s = {4, 6.7, 2.12, 9}
s.add(hfloat(9.0))
print(s)
# {2.12, 4, 6.7, 9, 9.0}  (element order may vary)

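Disclaimer: hfloat(9.0) == 9 is still True, so this breaks the usual rule that equal objects have equal hashes. In particular, an hfloat and a float with the same integer value no longer have the same hash, so this can also happen: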

s = {9.0, hfloat(9.0)}
print(s)
# {9.0, 9.0}

As the comments suggest, storing (value, type) 2-tuples in the set might be better for your use case.
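A minimal sketch of that approach, which also covers the dictionary case from the question. The helper name typed_key is hypothetical, chosen only for this example:

```python
def typed_key(v):
    """Hypothetical helper: pair a value with its exact type."""
    return (v, type(v))


s = {typed_key(v) for v in (4, 6.7, 2.12, 9)}
print(typed_key(9) in s)      # True
print(typed_key(9.0) in s)    # False: the float was never added
s.add(typed_key(9.0))
print(typed_key(9.0) in s)    # True
print(len(s))                 # 5

# The same keys let a dict map 3 and 3.0 to different values.
d = {typed_key(3): "int three", typed_key(3.0): "float three"}
print(d[typed_key(3)])        # int three
print(d[typed_key(3.0)])      # float three
```

The tuples (9, int) and (9.0, float) compare unequal because their second elements differ, so the set keeps both.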