Using an image-generation AI I'm getting centered objects on a dark background. My goal is to convert all pixels outside this object to transparent. I figured a good-enough approach would be to flood-fill from all 4 corners using a fuzzy threshold, so that similar colors are erased too. But using e.g. the following recursive approach causes a StackOverflow:
static void FillPixels(Color[][] pixels, int x, int y, Color originColor, Color fillColor, float threshold)
{
int width = pixels.Length;
int height = pixels[0].Length;
bool isInLimits = x >= 0 && x < width && y >= 0 && y < height;
if (isInLimits && ColorDistance(pixels[x][y], originColor) <= threshold)
{
pixels[x][y] = fillColor;
FillPixels(pixels, x - 1, y, originColor, fillColor, threshold);
FillPixels(pixels, x + 1, y, originColor, fillColor, threshold);
FillPixels(pixels, x, y - 1, originColor, fillColor, threshold);
FillPixels(pixels, x, y + 1, originColor, fillColor, threshold);
}
}
The images are up to 1024x1024 pixels in size. The specific background color is unknown -- I can instruct the image AI to make it black, but it will usually not be a precise rgb(0,0,0) -- so I'm initially color-picking dynamically on each corner. What can be done to flood fill with a threshold, or otherwise find a good mask for the object to erase its background? Thanks!
CodePudding user response:
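The first thing to check is that the distance from fillColor to originColor is larger than the threshold. Otherwise a pixel that has already been filled still matches originColor, gets visited again, and the recursion never terminates.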
An alternative would be to keep an explicit record of visited nodes, either with a bool[][] or a HashSet<(int x, int y)>.
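As a rough illustration of the first check, here is a minimal sketch. The FillGuards class and the tuple-based colors (channels as floats in the 0..1 range) are assumptions made for this example, not part of the question's code.

```csharp
using System;

public static class FillGuards
{
    // Euclidean RGB distance; colors are hypothetical (r, g, b) tuples with 0..1 channels.
    public static float ColorDistance((float r, float g, float b) c1, (float r, float g, float b) c2)
    {
        float dr = c1.r - c2.r;
        float dg = c1.g - c2.g;
        float db = c1.b - c2.b;
        return (float)Math.Sqrt(dr * dr + dg * dg + db * db);
    }

    // True when a filled pixel can never match the origin color again,
    // i.e. the fill terminates even without a record of visited pixels.
    public static bool TerminatesWithoutVisitedRecord(
        (float r, float g, float b) originColor,
        (float r, float g, float b) fillColor,
        float threshold)
    {
        return ColorDistance(fillColor, originColor) > threshold;
    }
}
```

If this returns false, for example a near-black background filled with black, the fill needs a visited record to terminate.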
Next thing would be to move to an iterative algorithm. Since these are images, the worst-case stack depth would be width*height, which is over a million frames for a 1024x1024 image. This is unlikely to occur, but the actual depth might get large enough for a stack overflow. Changing to an explicit stack should be very easy, something like:
var stack = new Stack<(int x, int y)>();
stack.Push((x, y));
while (stack.Count > 0)
{
// use new names here: x and y are already the method's parameters
var (cx, cy) = stack.Pop();
// insert logic
if(...){
stack.Push((cx + 1, cy));
...
}
}
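For reference, a filled-in version of that skeleton might look like the sketch below. It is an assumption-laden simplification: the grid holds plain int brightness values instead of colors, and the IterativeFloodFill class, its Fill method and the returned count are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class IterativeFloodFill
{
    // Fills the region connected to (startX, startY) whose values are within
    // `threshold` of the start value. The grid is indexed [x, y].
    // Returns the number of cells that were filled.
    public static int Fill(int[,] grid, int startX, int startY, int fillValue, int threshold)
    {
        int width = grid.GetLength(0);
        int height = grid.GetLength(1);
        int origin = grid[startX, startY];
        var visited = new bool[width, height];
        var stack = new Stack<(int x, int y)>();
        stack.Push((startX, startY));
        int filled = 0;

        while (stack.Count > 0)
        {
            var (x, y) = stack.Pop();
            if (x < 0 || x >= width || y < 0 || y >= height) continue; // outside the image
            if (visited[x, y]) continue;                               // already handled
            visited[x, y] = true;
            if (Math.Abs(grid[x, y] - origin) > threshold) continue;   // too different

            grid[x, y] = fillValue;
            filled++;
            stack.Push((x - 1, y));
            stack.Push((x + 1, y));
            stack.Push((x, y - 1));
            stack.Push((x, y + 1));
        }
        return filled;
    }
}
```

The pending work lives in a heap-allocated Stack<T>, so a full 1024x1024 fill no longer depends on the call stack size.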
I would also consider using some better data types. A multidimensional array, i.e. Color[,], would be better, since then at least you know all rows have the same length. But in image processing it is fairly common to use raw data, i.e. byte[] or Span<byte>, and calculate pixel indices by hand: var indexToFirstPixelByte = y * span + x * bytesPerPixel, where span is the number of bytes in a row (often called the stride). Then you can fetch the bytes for your color directly. This should save time/memory since a Color-struct is much larger than the 4 bytes required for ARGB. Using a Point or other type to represent a pair of x,y coordinates is probably also a good idea.
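A minimal sketch of that index calculation, assuming 32-bit pixels (4 bytes each) in a row-major buffer. The RawPixels class and its method names are hypothetical, and the channel order of the returned bytes depends on the pixel format of your buffer.

```csharp
using System;

public static class RawPixels
{
    // Offset of the first byte of pixel (x, y).
    // `stride` is the number of bytes per row (called `span` above).
    public static int PixelOffset(int x, int y, int stride, int bytesPerPixel)
    {
        return y * stride + x * bytesPerPixel;
    }

    // Copies the 4 bytes of one pixel; the channel order is whatever the buffer uses.
    public static byte[] ReadPixel(ReadOnlySpan<byte> data, int x, int y, int stride)
    {
        const int bytesPerPixel = 4; // assumption: 32-bit pixels
        int offset = PixelOffset(x, y, stride, bytesPerPixel);
        return data.Slice(offset, bytesPerPixel).ToArray();
    }
}
```

Note that the stride can be larger than width * bytesPerPixel when rows are padded, which is why it is passed in separately rather than derived from the width.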
CodePudding user response:
JonasH deserves the credit for the right answer that helped solve this (also thanks Andrey Ischencko), but in case anyone arriving here is looking for the complete flood fill result class, here it is:
using System.Collections.Generic;
using UnityEngine;
public static class ImageFloodFill
{
public static void FillFromPoint(Texture2D texture, Color color, Vector2Int point, float threshold = 0f)
{
var points = new Vector2Int[] { point };
FillFromPoints(texture, color, points, threshold);
}
public static void FillFromCorners(Texture2D texture, Color color, float threshold = 0f)
{
var points = new Vector2Int[]
{
new Vector2Int(0, 0),
new Vector2Int(texture.width - 1, 0),
new Vector2Int(0, texture.height - 1),
new Vector2Int(texture.width - 1, texture.height - 1)
};
FillFromPoints(texture, color, points, threshold);
}
public static void FillFromPoints(Texture2D texture, Color color, Vector2Int[] points, float threshold = 0f)
{
Color[,] pixelsGrid = GetPixelsGrid(texture);
foreach (Vector2Int point in points)
{
FillPixels(pixelsGrid, point, color, threshold);
}
texture.SetPixels(GetPixelsLinearFromGrid(pixelsGrid));
texture.Apply();
}
static void FillPixels(Color[,] pixels, Vector2Int startPoint, Color color, float threshold)
{
int width = pixels.GetLength(0);
int height = pixels.GetLength(1);
bool[,] pixelsHandled = new bool[width, height];
Color originColor = pixels[startPoint.x, startPoint.y];
var size = new RectInt(0, 0, width, height);
var stack = new Stack<Vector2Int>();
stack.Push(startPoint);
while (stack.Count > 0)
{
Vector2Int point = stack.Pop();
if (size.Contains(point) && !pixelsHandled[point.x, point.y])
{
pixelsHandled[point.x, point.y] = true;
if (ColorDistance(pixels[point.x, point.y], originColor) <= threshold)
{
pixels[point.x, point.y] = color;
stack.Push(new Vector2Int(point.x - 1, point.y));
stack.Push(new Vector2Int(point.x + 1, point.y));
stack.Push(new Vector2Int(point.x, point.y - 1));
stack.Push(new Vector2Int(point.x, point.y + 1));
}
}
}
}
static Color[,] GetPixelsGrid(Texture2D texture)
{
int width = texture.width;
int height = texture.height;
Color[] pixelsLinear = texture.GetPixels();
Color[,] pixels = new Color[width, height];
for (int x = 0; x < width; x++)
{
for (int y = 0; y < height; y++)
{
// GetPixels lays rows out left to right, bottom to top, so each row is `width` pixels long
pixels[x, y] = pixelsLinear[y * width + x];
}
}
return pixels;
}
static Color[] GetPixelsLinearFromGrid(Color[,] pixelsGrid)
{
int width = pixelsGrid.GetLength(0);
int height = pixelsGrid.GetLength(1);
Color[] pixelsLinear = new Color[width * height];
for (int x = 0; x < width; x++)
{
for (int y = 0; y < height; y++)
{
// same row-major layout as above, so non-square textures map back correctly
pixelsLinear[y * width + x] = pixelsGrid[x, y];
}
}
return pixelsLinear;
}
static float ColorDistance(Color color1, Color color2)
{
return Mathf.Sqrt(
Mathf.Pow(color1.r - color2.r, 2) +
Mathf.Pow(color1.g - color2.g, 2) +
Mathf.Pow(color1.b - color2.b, 2)
);
}
}
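One detail worth checking in the grid conversion is the linear index: in a row-major array such as the one Texture2D.GetPixels returns, the pixel at (x, y) sits at y * width + x. Here is a minimal, Unity-free sketch that round-trips a non-square grid. The GridIndex class and the int payload are assumptions for illustration only.

```csharp
public static class GridIndex
{
    // Converts a row-major linear array into a grid indexed [x, y].
    public static int[,] ToGrid(int[] linear, int width, int height)
    {
        var grid = new int[width, height];
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                grid[x, y] = linear[y * width + x];
            }
        }
        return grid;
    }

    // Converts the grid back to the same row-major layout.
    public static int[] ToLinear(int[,] grid)
    {
        int width = grid.GetLength(0);
        int height = grid.GetLength(1);
        var linear = new int[width * height];
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                linear[y * width + x] = grid[x, y];
            }
        }
        return linear;
    }
}
```

Multiplying by the height instead only gives the same result for square images. On a 3x2 grid, for instance, it would map (2, 1) to index 4 rather than 5.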