I'm trying to convert an HDR image float array I load into 10-bit DWORDs with WIC.
The loaded file's pixel format is GUID_WICPixelFormat128bppPRGBAFloat, so I get an array of 4 floats per pixel.
When I try to convert these to 10 bit as follows:
struct RGBX
{
    unsigned int b : 10;
    unsigned int g : 10;
    unsigned int r : 10;
    int a : 2;
} rgbx;
(which is the format requested by the NVIDIA encoding library for 10-bit RGB),
then I assume I have to multiply each of the floats by 1023.0f in order to fit them into the 10 bits of a DWORD.
However, I notice that some of the floats are > 1, which means their range is not [0,1] as it is for an 8-bit image.
What would their range be? How do I store a floating-point color in a 10-bit integer?
I'm trying to use NVIDIA's HDR encoder, which requires an ARGB10 layout like the above structure.
How is the 10-bit information of a color stored as a floating-point number?
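To show what I mean, this is the naive packing I would do if the floats were guaranteed to be in [0,1] (PackRGBA10 is just a throwaway name for this sketch); the values > 1 are exactly what breaks it:
#include <algorithm>
#include <cstdint>

// Naive packing that assumes each channel is already in [0,1]; HDR values > 1 simply clamp.
// Bit layout matches the RGBX struct above: b in bits 0-9, g in 10-19, r in 20-29, a in 30-31.
uint32_t PackRGBA10(float r, float g, float b, float a)
{
    auto q10 = [](float v) { return (uint32_t)(std::clamp(v, 0.0f, 1.0f) * 1023.0f + 0.5f); };
    auto q2  = [](float v) { return (uint32_t)(std::clamp(v, 0.0f, 1.0f) * 3.0f + 0.5f); };
    return (q2(a) << 30) | (q10(r) << 20) | (q10(g) << 10) | q10(b);
}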
By the way, I tried to convert with WIC, but the conversion from GUID_WICPixelFormat128bppPRGBAFloat to GUID_WICPixelFormat32bppR10G10B10A2 fails:
HRESULT ConvertFloatTo10(const float* f, int wi, int he, std::vector<DWORD>& out)
{
    CComPtr<IWICBitmap> b;
    HRESULT hr = wbfact->CreateBitmapFromMemory(wi, he, GUID_WICPixelFormat128bppPRGBAFloat,
        wi * 16, wi * he * 16, (BYTE*)f, &b);   // stride = 16 bytes per pixel (4 floats)
    if (FAILED(hr)) return hr;

    CComPtr<IWICFormatConverter> wf;
    hr = wbfact->CreateFormatConverter(&wf);
    if (FAILED(hr)) return hr;

    hr = wf->Initialize(b, GUID_WICPixelFormat32bppR10G10B10A2,
        WICBitmapDitherTypeNone, 0, 0, WICBitmapPaletteTypeCustom);
    // This last call fails with 0x88982F50 (WINCODEC_ERR_COMPONENTNOTFOUND):
    // "The component cannot be found."
    // 'out' would be filled via CopyPixels here once the conversion works.
    return hr;
}
Edit: I found a paper (https://hal.archives-ouvertes.fr/hal-01704278/document); is it relevant to this question?
CodePudding user response:
Floating-point color content that is greater than the 0..1 range is High Dynamic Range (HDR) content. If you trivially convert it to 10:10:10:2 UNORM then you are using 'clipping' for values over 1. This doesn't give good results.
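For illustration, the 'trivial' conversion is just a saturate-and-scale per channel, which is what DirectXMath's XMStoreUDecN4 helper does; every value above 1.0 ends up at the same 1023 code (a minimal sketch, ClipTo10Bit is just an illustrative name):
#include <DirectXPackedVector.h>

using namespace DirectX;
using namespace DirectX::PackedVector;

// Clipping conversion: XMStoreUDecN4 clamps each channel to [0,1], then scales RGB by 1023 and alpha by 3.
// Any HDR value > 1.0 collapses to 1023, so all highlight detail is lost.
XMUDECN4 ClipTo10Bit(FXMVECTOR hdrPixel)
{
    XMUDECN4 packed;
    XMStoreUDecN4(&packed, hdrPixel);
    return packed;
}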
You should instead use tone-mapping, which converts the HDR signal to an SDR (Standard Dynamic Range, i.e. 0..1) signal before or as part of the conversion to 10:10:10:2.
There are many different approaches to tone-mapping, but a common 'generic' solution is the Reinhard tone-mapping operator. Here's an implementation using DirectXTex.
// Assumes 'image' is the source ScratchImage holding the float data, 'hr' is an HRESULT, and
// GetErrorDesc() is a helper that turns an HRESULT into a message (as in DirectXTex's texconv tool).
std::unique_ptr<ScratchImage> timage(new (std::nothrow) ScratchImage);
if (!timage)
{
    wprintf(L"\nERROR: Memory allocation failed\n");
    return 1;
}

// Compute max luminosity across all images
XMVECTOR maxLum = XMVectorZero();
hr = EvaluateImage(image->GetImages(), image->GetImageCount(), image->GetMetadata(),
    [&](const XMVECTOR* pixels, size_t w, size_t y)
    {
        UNREFERENCED_PARAMETER(y);

        for (size_t j = 0; j < w; ++j)
        {
            static const XMVECTORF32 s_luminance = { { { 0.3f, 0.59f, 0.11f, 0.f } } };

            XMVECTOR v = *pixels++;
            v = XMVector3Dot(v, s_luminance);
            maxLum = XMVectorMax(v, maxLum);
        }
    });
if (FAILED(hr))
{
    wprintf(L" FAILED [tonemap maxlum] (%08X%ls)\n", static_cast<unsigned int>(hr), GetErrorDesc(hr));
    return 1;
}

// Reinhard uses the square of the maximum luminance as the 'white point'
maxLum = XMVectorMultiply(maxLum, maxLum);

hr = TransformImage(image->GetImages(), image->GetImageCount(), image->GetMetadata(),
    [&](XMVECTOR* outPixels, const XMVECTOR* inPixels, size_t w, size_t y)
    {
        UNREFERENCED_PARAMETER(y);

        for (size_t j = 0; j < w; ++j)
        {
            XMVECTOR value = inPixels[j];

            const XMVECTOR scale = XMVectorDivide(
                XMVectorAdd(g_XMOne, XMVectorDivide(value, maxLum)),
                XMVectorAdd(g_XMOne, value));
            const XMVECTOR nvalue = XMVectorMultiply(value, scale);

            // Tone-map RGB only; keep the original alpha in the w component.
            value = XMVectorSelect(value, nvalue, g_XMSelect1110);

            outPixels[j] = value;
        }
    }, *timage);
if (FAILED(hr))
{
    wprintf(L" FAILED [tonemap apply] (%08X%ls)\n", static_cast<unsigned int>(hr), GetErrorDesc(hr));
    return 1;
}
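Once the data is tone-mapped into the 0..1 range, the 10:10:10:2 packing itself can also be left to DirectXTex's Convert function. A sketch, continuing with the timage from above:
// Sketch: pack the tone-mapped float image into 10:10:10:2 UNORM with DirectXTex.
auto packed = std::make_unique<ScratchImage>();
hr = Convert(timage->GetImages(), timage->GetImageCount(), timage->GetMetadata(),
    DXGI_FORMAT_R10G10B10A2_UNORM, TEX_FILTER_DEFAULT, TEX_THRESHOLD_DEFAULT, *packed);
if (FAILED(hr))
{
    wprintf(L" FAILED [convert to R10G10B10A2] (%08X%ls)\n", static_cast<unsigned int>(hr), GetErrorDesc(hr));
    return 1;
}
Note that DXGI's R10G10B10A2 layout puts red in the low 10 bits, while the RGBX struct in the question has blue there, so a channel swizzle may still be needed for the encoder.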
Reference
Reinhard et al., "Photographic tone reproduction for digital images", ACM Transactions on Graphics, Volume 21, Issue 3 (July 2002). ACM DL.
CodePudding user response:
@ChuckWalbourn's answer is helpful; however, I don't want to tone-map to [0,1], as there is no point in tone-mapping to SDR and then going back to 10-bit HDR.
What I think is correct is to scale to [0,4] instead, by first using g_XMFour:
const XMVECTOR scale = XMVectorDivide(
    XMVectorAdd(g_XMFour, XMVectorDivide(value, maxLum)),
    XMVectorAdd(g_XMFour, value));
then using a specialized 10-bit store which scales by 255 instead of 1023:
#include <DirectXPackedVector.h>
#include <cassert>

// Store: clamps to [0,4] and scales RGB by 255 (so 4.0f maps to 1020, still inside 10 bits) and alpha by 3.
void XMStoreUDecN4a(DirectX::PackedVector::XMUDECN4* pDestination, DirectX::FXMVECTOR V)
{
    using namespace DirectX;
    assert(pDestination);

    static const XMVECTORF32 Scale = { { { 255.0f, 255.0f, 255.0f, 3.0f } } };

    XMVECTOR N = XMVectorClamp(V, XMVectorZero(), g_XMFour);
    N = XMVectorMultiply(N, Scale);

    pDestination->v = ((uint32_t)XMVectorGetW(N) << 30) |
                      (((uint32_t)XMVectorGetZ(N) & 0x3FF) << 20) |
                      (((uint32_t)XMVectorGetY(N) & 0x3FF) << 10) |
                      (((uint32_t)XMVectorGetX(N) & 0x3FF));
}
And then a specialized 10-bit load which divides by 255 instead of 1023:
// Load: the inverse of the store above, dividing RGB by 255 so the returned values are back in [0,4].
DirectX::XMVECTOR XMLoadUDecN4a(const DirectX::PackedVector::XMUDECN4* pSource)
{
    using namespace DirectX;

    uint32_t Element = pSource->v & 0x3FF;
    const float r = (float)Element / 255.f;

    Element = (pSource->v >> 10) & 0x3FF;
    const float g = (float)Element / 255.f;

    Element = (pSource->v >> 20) & 0x3FF;
    const float b = (float)Element / 255.f;

    const float a = (float)(pSource->v >> 30) / 3.f;

    const XMVECTORF32 j = { { { r, g, b, a } } };
    return j;
}
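A quick round-trip sketch of how these would be used (the values are just illustrative): because the scale is 255, an input of 2.0f survives the trip instead of being clamped, as a standard 1023-based store would do.
DirectX::PackedVector::XMUDECN4 packed;
XMStoreUDecN4a(&packed, DirectX::XMVectorSet(0.5f, 1.0f, 2.0f, 1.0f));

DirectX::XMVECTOR v = XMLoadUDecN4a(&packed);
// v is roughly (0.498f, 1.0f, 2.0f, 1.0f): the 2.0f channel kept its HDR headroom.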