Absolute achievable minimum of GC with string concat in Lua?

Time:11-10

Runtime: Lua 5.1.x compiled for ARM64, no C modules allowed

Example code, ready to run: https://paste.gg/p/anonymous/08f364480a5f470e9da610ab565e11c0

I need to concatenate a bunch of strings every X ms in a loop. As I understand it, Lua interns strings, which means that string literals are "cached" and not allocated each time. Therefore, only direct calls to tostring() (or the .. operator) will allocate. All other existing string values are passed by reference.
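The difference between reusing an existing string and creating a new one can be observed with collectgarbage("count"). The snippet below is only a rough sketch: the table `lut` and the sample number are made up for illustration, and the exact byte count depends on the Lua build.

```lua
-- Sketch: compare heap growth for a LUT lookup vs. a fresh tostring() call.
-- `lut` and the sample number are illustrative, not taken from the original code.
local lut = { [7] = "07" }

collectgarbage("collect")
collectgarbage("stop")            -- keep the collector from running mid-measurement

local count = collectgarbage
local a, b, c, s1, s2             -- declared up front, outside the measured region

a  = count("count")
s1 = lut[7]                       -- existing string: only a reference is copied
b  = count("count")
s2 = tostring(987654.321)         -- builds a string that did not exist before
c  = count("count")

collectgarbage("restart")

local lookupBytes   = (b - a) * 1024
local tostringBytes = (c - b) * 1024
print("LUT lookup bytes:", lookupBytes)
print("tostring bytes:  ", tostringBytes)
```

The lookup should report no growth, while the tostring() call should report a small positive number (string header plus characters).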

What I've done so far:

  • eliminated all integer->string allocations (via a LUT)
  • although tostring(bool) does return an interned string from the cache, I eliminated that too
  • created a pseudo-stringbuilder: a table that is filled by index (~16 B per entry)
  • "pre-sized" that table to avoid the cost of associative insertion, and made it global so that it is not collected and re-created each time
  • used table.concat() for the final big string concatenation
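For reference, a reusable builder along these lines might look like the sketch below. The names `buf`, `n`, `reset`, `append` and `build` are hypothetical and not taken from the linked paste. It tracks the length explicitly instead of using `#`, and relies on the optional start/end arguments of table.concat.

```lua
-- Hypothetical reusable builder; all names here are illustrative.
local buf = {}   -- allocated once, reused on every tick
local n = 0      -- explicit length, so no `#buf` evaluation per append

local function reset()
  n = 0          -- stale slots are simply overwritten on the next pass
end

local function append(s)
  n = n + 1
  buf[n] = s
end

local function build()
  -- table.concat(list, sep, i, j) joins only slots i..j, so leftovers
  -- beyond n from a longer previous pass are ignored.
  return table.concat(buf, "", 1, n)
end

reset(); append("Pos:("); append("1"); append(","); append("2"); append(")")
local first = build()

reset(); append("ok")
local second = build()
print(first, second)
```

One trade-off of not clearing the table: slots beyond `n` keep their old strings reachable until they are overwritten, so those strings are not collected in the meantime.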

The final results still make me sad:

Allocated pre-concat: 2.486328125 KB
Allocated post-concat: 39.7451171875 KB
Total table meta bytes: 1544 B
Total tostring meta bytes: 273 B

Is there something I'm missing, or am I at the limit of Lua here?

CodePudding user response:

I assume that the problem you mention is linked to the memory consumption of the function CONTAINER.PopulateState. I think your code is OK, but you are not measuring the right things. I removed all the collectgarbage calls from the function and gathered them in a single part of the code:

print("Allocated PRE-concat:                " ..  tostring(collectgarbage("count")))

-- First time
CONTAINER.PopulateState()
print("Allocated POST-concat BEFORE-COLLECT:" ..  tostring(collectgarbage("count")))
collectgarbage("collect") 
print("Allocated POST-concat  AFTER-COLLECT:" ..  tostring(collectgarbage("count")))

-- One more try
CONTAINER.PopulateState()
print("Allocated POST-concat BEFORE-COLLECT:" ..  tostring(collectgarbage("count")))
collectgarbage("collect") 
print("Allocated POST-concat  AFTER-COLLECT:" ..  tostring(collectgarbage("count")))

-- One more try
CONTAINER.PopulateState()
print("Allocated POST-concat BEFORE-COLLECT:" ..  tostring(collectgarbage("count")))
collectgarbage("collect") 
print("Allocated POST-concat  AFTER-COLLECT:" ..  tostring(collectgarbage("count")))

The results are very different and make more sense:

Allocated PRE-concat:                48.70703125
Allocated POST-concat BEFORE-COLLECT:54.3232421875
Allocated POST-concat  AFTER-COLLECT:51.8515625
Allocated POST-concat BEFORE-COLLECT:54.5576171875
Allocated POST-concat  AFTER-COLLECT:51.8515625
Allocated POST-concat BEFORE-COLLECT:54.5576171875
Allocated POST-concat  AFTER-COLLECT:51.8515625

After the program is initialized and before CONTAINER.PopulateState() is called, the program already uses 48.7 KB.

The first call to CONTAINER.PopulateState() adds about 3 KB of memory that seems to be persistent: it does not appear to be freed during the rest of the program's execution. This might be due to bytecode compilation, caching, or internal use.

The following calls to CONTAINER.PopulateState() each use about 2.7 KB, and this memory is released every time. The program's behavior is consistent: calling CONTAINER.PopulateState() repeatedly does not make the program use more memory. In fact, the memory temporarily used by CONTAINER.PopulateState() (2.7 KB) is negligible compared to the rest of the program (48 KB).
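The same before/after bookkeeping can be wrapped in a small helper. This is only a sketch: `measure` and `work` are hypothetical names, and `work` stands in for CONTAINER.PopulateState().

```lua
-- Hypothetical helper: returns (transientKB, retainedKB) for one call of fn.
local function measure(fn)
  collectgarbage("collect")
  local before = collectgarbage("count")
  fn()
  local peak = collectgarbage("count")   -- includes garbage not yet collected
  collectgarbage("collect")
  local after = collectgarbage("count")
  return peak - after, after - before
end

-- Stand-in workload that creates some short-lived strings.
local function work()
  local parts = {}
  for i = 1, 100 do parts[i] = tostring(i * 1.5) end
  return table.concat(parts, ",")
end

measure(work)                            -- warm-up: the first call may keep some memory
local transient, retained = measure(work)
print(string.format("transient: %.2f KB, retained: %.2f KB", transient, retained))
```

The first value is the garbage produced by one call, and the second is what stays allocated after a full collection.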

If you want better control over what is happening, you could implement this part in C and provide an interface to Lua.

Full code:

CONTAINER =
{
      Ver = "0.3",
      --- integer lookup for the DateTime
      timeLUT = {[0]="00",[1]="01",[2]="02",[3]="03",[4]="04",[5]="05",[6]="06",[7]="07",[8]="08",[9]="09"},
      strCACHE = { [100] = ""},
      SubStrA  = "Unknown",
      SubAPrst = "ASjdasda",
}
     
    
for i = 10,99,1 do
  CONTAINER.timeLUT[i] = tostring(i)
end
        
DataBlob = {
  vAng = { x = 1.0, y = 2.0, z = 3.0},
  vPos = { x = 2131.0, y = 42.0, z = -433.0},
    
  Composite =
  {
        VARIANT1 = { isFirst = true, isMiddle = false, isLast = true },
        VARIANT2 = { isIgnored = true},
        VARIANT3 = { isAccurate = false },
        VARIANT4 = { bEnabled = false },
        VARIANT5 = { isLocked = false, ImpactV = 1.8 },
        VARIANT6 = { troCoWal = true },
        VARIANT7 = { isBroCal = false }
  } 
} 
    
Global = {
  isLocked = function(x)return false end,
  GetTimeStamp = function(x)return math.random() + math.random(1, 99) end,
  GetLocalTimeStamp = function(x)return math.random() + math.random(1, 99) end,
  GetTotalPTime = function(x)return math.random() + math.random(1, 99) end,
  GetDataBlob = function(x)return DataBlob end,
  GetName = function(x)return "AThing" end
}
    
function CONTAINER.PopulateState()
 
    local gcInit = 0
    local gcLast = 0
    
  -- Caching globals
    
  local floor, mod, tostring = math.floor, math.mod, tostring
  local G = Global
  local intCache = CONTAINER.timeLUT
  local strBuilder = CONTAINER.strCACHE
    
  -- Fetching & caching data
  local locDB, Name = G.GetDataBlob(), G.GetName()
  local ts = G.GetTimeStamp()
  local lag = math.random() + math.random(1, 2)
    
  -- Local helpers
  local function sBool(bool)
    return bool and "1" or "0"
  end
    
  local t = 0
    
  function cAppend(cTbl, ...)
    for i=0, arg.n do
      cTbl[#cTbl + 1] = arg[i]
      t = t + 1
    end
  end

  function cClear(cTbl)
    for _=0, #cTbl do
      cTbl[#cTbl] = nil
    end
  end
        
  -- Populating table
  cClear(strBuilder)
        
  if locDB ~= nil then
    locDB = G.GetDataBlob()
    local PC = locDB.Composite
    local tp = G.GetTotalPTime()
    local d, h, m, s = floor(tp/86400), floor(mod(tp, 86400)/3600), floor(mod(tp,3600)/60), floor(mod(tp,60))
    
    cAppend(strBuilder,  "[", Name, "]:\n",
            "Ang :",      "(", tostring(locDB.vAng.x),",",tostring(locDB.vAng.y),",",tostring(locDB.vAng.z), ")\n",
            "Pos :",      "(", tostring(locDB.vPos.x),",",tostring(locDB.vPos.y),",",tostring(locDB.vPos.z), ")\n",
            "isLocked: ", sBool(G.isLocked()),  "\n")
        
    if (locDB.Composite["VARIANT1"] ~= nil) then
      cAppend(strBuilder, "isFirst / isLast: ", sBool(PC.VARIANT1.isFirst)," / ",sBool(PC.VARIANT1.isLast), "\n",   
              "isMiddle: ",         sBool(PC.VARIANT1.isMiddle), "\n")
    end
    
    if (locDB.Composite["VARIANT2"] ~= nil) then
      cAppend(strBuilder, "isIgnored: ",  sBool(PC.VARIANT2.isIgnored),  "\n")
    end
    
    if (locDB.Composite["VARIANT4"] ~= nil) then
      cAppend(strBuilder, "bEnabled: ",   sBool(PC.VARIANT4.bEnabled),   "\n")
    end
    
    if (locDB.Composite["VARIANT3"] ~= nil) then
      cAppend(strBuilder, "isAccurate: ", sBool(PC.VARIANT3.isAccurate), "\n")
    end
    
    if (locDB.Composite["VARIANT5"] ~= nil) then
      cAppend(strBuilder, "isLocked: ",   sBool(PC.VARIANT5.isLocked),   "\n",
              "ImpactV: ",    tostring(PC.VARIANT5.ImpactV), "\n")
    end
    
    if (locDB.Composite["VARIANT6"]) then
      cAppend(strBuilder, "troCoWal: ",   sBool(PC.VARIANT6.troCoWal),   "\n")
    end
    
    if (locDB.Composite["VARIANT7"]) then
          cAppend(strBuilder, "isBroCal: ",   sBool(PC.VARIANT7.isBroCal),   "\n")
    end
    
    cAppend(strBuilder, "Time taken: ",intCache[d],":",intCache[h],":",intCache[m],":",intCache[s], "\n",
    
                        "TS: ",        tostring(ts),                   "\n",    
                        "local TS: ",  tostring(G.GetLocalTimeStamp()),"\n",    
                        "Lag: ",       string.format("%.5f", lag) , " ms\n",    
                        "Heap: ",      tostring(gcLast),             "KB\n")
    
 
    
    cAppend(strBuilder, "Alloc: ",     tostring(gcLast-gcInit),"KB"," (v", CONTAINER.Ver, ")","\n",
    
                        "Extra: ",     CONTAINER.SubStrA, "_", CONTAINER.SubAPrst,             "\n")    
  end
end
    
 
print("Allocated PRE-concat:                " ..  tostring(collectgarbage("count")))

-- First time
CONTAINER.PopulateState()
print("Allocated POST-concat BEFORE-COLLECT:" ..  tostring(collectgarbage("count")))
collectgarbage("collect") 
print("Allocated POST-concat  AFTER-COLLECT:" ..  tostring(collectgarbage("count")))

-- One more try
CONTAINER.PopulateState()
print("Allocated POST-concat BEFORE-COLLECT:" ..  tostring(collectgarbage("count")))
collectgarbage("collect") 
print("Allocated POST-concat  AFTER-COLLECT:" ..  tostring(collectgarbage("count")))

-- One more try
CONTAINER.PopulateState()
print("Allocated POST-concat BEFORE-COLLECT:" ..  tostring(collectgarbage("count")))
collectgarbage("collect") 
print("Allocated POST-concat  AFTER-COLLECT:" ..  tostring(collectgarbage("count")))

CodePudding user response:

You want to minimize the number of intermediate string allocations in order to reduce GC pressure and make collections less frequent. In this case, I suggest limiting yourself to a single call to string.format with the string you want to format:

  • The format string can be declared globally so that it is interned once.
  • The string.format code can be read in the Lua source (str_format in lstrlib.c). What we can see from this code is that the intermediate string transformations are done on the C stack, in a buffer of LUAL_BUFFERSIZE bytes. This size is declared in luaconf.h and can be customized according to your needs. This approach should be the most efficient for your use case, as it drops all the intermediate steps (table insertions, table.concat, etc.).
local MY_STRING_FORMAT = [[My Very Big String
param-string-1 %d
param-string-2 %x
param-string-3 %f
param-string-4 %d
param-string-5 %d
]]

string.format(MY_STRING_FORMAT,
              Param1,
              Param2,
              Param3,
              Param4,
              Param5,
              etc...)
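Applied to a few of the fields from the question, this could look roughly like the sketch below. `STATE_FORMAT` and `formatState` are hypothetical names, only a subset of the fields is shown, and `%02d` is used in place of the integer LUT.

```lua
-- Format string hoisted to the top level: it is created once, at load time.
-- The names and the selection of fields are illustrative.
local STATE_FORMAT = "[%s]:\nPos :(%.1f,%.1f,%.1f)\nisLocked: %s\nTime taken: %02d:%02d:%02d:%02d\n"

local function formatState(name, pos, locked, d, h, m, s)
  return string.format(STATE_FORMAT,
                       name, pos.x, pos.y, pos.z,
                       locked and "1" or "0",
                       d, h, m, s)
end

local out = formatState("AThing", { x = 2131.0, y = 42.0, z = -433.0 }, false, 0, 1, 2, 3)
io.write(out)
```

The resulting string itself is still allocated on every call; what goes away is the per-field tostring() results and the builder table traffic.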