Why does CPython return a new pointer to the True and False singletons and increment its reference count?

A Python Boolean is described as follows in the documentation:

Booleans in Python are implemented as a subclass of integers. There are only two booleans, Py_False and Py_True.

The Py_False and Py_True are, as I understand it, singletons corresponding to False and True respectively.

Indeed, the following returns True in my Jupyter notebook:

a = True
b = True
a is b

False works the same way.
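
The same identity check can be pushed a little further. The following is a minimal sketch (the variable names are illustrative only) showing that several different ways of producing a boolean all end at the same two objects, and that bool really is a subclass of int:

```python
# Several different routes to a boolean value; each should yield the same object.
truthy_results = [True, bool(1), 1 == 1, not 0, bool("non-empty")]
all_same_true = all(v is True for v in truthy_results)

falsy_results = [False, bool(0), 1 == 2, not 1, bool("")]
all_same_false = all(v is False for v in falsy_results)

# bool is a subclass of int, so the two singletons also behave as 1 and 0.
is_int_subclass = issubclass(bool, int)
true_plus_true = True + True

print(all_same_true, all_same_false, is_int_subclass, true_plus_true)
```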

The PyBool_FromLong function (code here) converts a long into a Boolean object. However, rather than simply handing back the singleton, it returns a new reference to it and increments the reference count first:

PyObject *PyBool_FromLong(long ok)
{
    PyObject *result;

    if (ok)
        result = Py_True;
    else
        result = Py_False;
    return Py_NewRef(result);
}

The Py_True and Py_False are defined as follows:

/* Py_False and Py_True are the only two bools in existence.
Don't forget to apply Py_INCREF() when returning either!!! */

/* Don't use these directly */
PyAPI_DATA(PyLongObject) _Py_FalseStruct;
PyAPI_DATA(PyLongObject) _Py_TrueStruct;

/* Use these macros */
#define Py_False _PyObject_CAST(&_Py_FalseStruct)
#define Py_True _PyObject_CAST(&_Py_TrueStruct)

The comments above are quite insistent that you increment the reference count when returning either, and that's exactly what the function shown above does. I'm somewhat confused as to why this is necessary, though, since (as I understand it) these are just singletons that will never be garbage collected.

I was able to find this Q&A about whether incrementing the ref count is always necessary, but it doesn't explain why the increment is needed in the first place.

I'm not sure if I'm missing something obvious, but can someone explain why it's necessary to increment the reference count when returning a reference to Py_False or Py_True? Or is this to prevent the object from ever being garbage collected?

Answer:

In CPython, memory is managed primarily by reference counting. Every object carries a count of the references that currently point to it, and the object is deallocated as soon as that count reaches zero.
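
The bookkeeping can be observed from Python itself. This is a minimal sketch, assuming CPython, where sys.getrefcount is available; Payload is a hypothetical placeholder class. An ordinary object is used here because the exact numbers reported for True and False vary between CPython versions, so only the relative change is worth looking at:

```python
import sys

class Payload:
    """Hypothetical placeholder class; any ordinary object would do."""

obj = Payload()
before = sys.getrefcount(obj)   # baseline count while only `obj` refers to it

alias = obj                     # a second name now refers to the same object
during = sys.getrefcount(obj)   # one higher than the baseline

del alias                       # dropping the name releases that reference
after = sys.getrefcount(obj)    # back to the baseline

print(before, during, after)
```

The absolute values depend on the interpreter version; the point is that creating a reference raises the count by one and releasing it lowers the count by one.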

The Py_False and Py_True objects are special in that they are singletons: there is exactly one instance of each, and the interpreter itself keeps them alive. As long as every piece of code follows the reference-counting rules, their counts never reach zero, so they are never deallocated.

However, when a new reference to Py_False or Py_True is created, the reference count for that object needs to be incremented. This is because the reference count is used to track the number of references to an object, and it needs to be accurate in order for Python to manage its memory properly.

In the case of PyBool_FromLong, the function selects Py_False or Py_True depending on the input and returns a new reference to it, which is why it increments the reference count first (Py_NewRef does both in one step). The caller owns that reference and will eventually release it with Py_DECREF, just as it would for any other object. Without the matching increment, each such call would lower the count by one, and the count could eventually reach zero even though the object is still in use.
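
The ownership rule can be illustrated with a toy model. The sketch below is not CPython's implementation; ToyObject, incref, decref, bool_from_long, bool_from_long_buggy and caller are all hypothetical names invented for this example. It assumes only that the runtime holds one permanent reference to each singleton and that a well-behaved caller releases whatever reference it is given:

```python
class ToyObject:
    """Hypothetical stand-in for a C-level object with a reference count."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1      # the runtime's own permanent reference
        self.freed = False


def incref(obj):
    obj.refcnt += 1


def decref(obj):
    obj.refcnt -= 1
    if obj.refcnt == 0:
        obj.freed = True     # a real runtime would run the deallocator here


def bool_from_long(toy_true, toy_false, ok):
    """Mirrors the shape of PyBool_FromLong: hands out a new, owned reference."""
    result = toy_true if ok else toy_false
    incref(result)
    return result


def bool_from_long_buggy(toy_true, toy_false, ok):
    """Forgets the increment, so the caller gets a reference it does not own."""
    return toy_true if ok else toy_false


def caller(make_bool, toy_true, toy_false):
    """A well-behaved caller: obtains the value, then releases its reference."""
    value = make_bool(toy_true, toy_false, 1)
    decref(value)


good_true, good_false = ToyObject("True"), ToyObject("False")
caller(bool_from_long, good_true, good_false)

bad_true, bad_false = ToyObject("True"), ToyObject("False")
caller(bool_from_long_buggy, bad_true, bad_false)

print(good_true.refcnt, good_true.freed)   # count is back where it started
print(bad_true.refcnt, bad_true.freed)     # count dropped to zero after one call
```

In the model, the caller behaves identically in both cases. Only the missing increment in the buggy producer makes the singleton's count hit zero, which is the situation the header comment warns against.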