I ran into what seems to be a collision between two QString
objects with three-character strings:
const QString one = "haç";
const QString two = "hek";
qDebug() << one << qHash(one);
qDebug() << two << qHash(two);
Here is the output (which I redirected from the console to a file so that the special character would come out correctly):
Debug: "haç" 103182
Debug: "hek" 103182
I'm accustomed to think of hash collisions as a theoretical possibility rather than something that would happen in practice. Is qHash
is more likely to produce collisions for short strings?
(I'm getting this behavior in Qt 5.15.1.)
Here is a MWE if anyone else wants to test it:
main.cpp
#include <QCoreApplication>
#include <QtDebug>
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
const QString one = "haç";
const QString two = "hek";
qDebug() << one << qHash(one);
qDebug() << two << qHash(two);
return a.exec();
}
untitled.pro
QT -= gui
CONFIG = c 11 console
CONFIG -= app_bundle
# You can make your code fail to compile if it uses deprecated APIs.
# In order to do so, uncomment the following line.
#DEFINES = QT_DISABLE_DEPRECATED_BEFORE=0x060000 # disables all the APIs deprecated before Qt 6.0.0
SOURCES = \
main.cpp
# Default rules for deployment.
qnx: target.path = /tmp/$${TARGET}/bin
else: unix:!android: target.path = /opt/$${TARGET}/bin
!isEmpty(target.path): INSTALLS = target
CodePudding user response:
Looking at the source code of qHash
, one could inspect the used algorithm:
uint h = seed;
if (seed && hasFastCrc32())
return crc32(p, len, h);
for (size_t i = 0; i < len; i)
h = 31 * h p[i].unicode();
return h;
CRC32 is only used when a seed is specified and the cpu supports it. Otherwise a much simpler algorithm is used.
If you specify a seed (like QHash and other Qt classes would do by default), you would obtain a different result, i.e. no collision occurs.