Image processing technique for image segmentation

I'm trying to create a model that segments the various parts of an aerial image.

I'm using a dataset found on Kaggle: https://www.kaggle.com/datasets/bulentsiyah/semantic-drone-dataset

My question is about the right way to preprocess images for semantic segmentation.

In this case, is it better to simply resize the images (e.g. 6000x4000 to 256x256 pixels), or is it better to resize them less and then cut them into patches (e.g. 6000x4000 to 1024x1024 pixels, and then patches of 256x256 pixels)?
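To put numbers on the two options, here is a small sketch of the arithmetic. The helper names `scale_factors` and `patch_count` are made up for illustration, and it assumes non-overlapping patches unless a stride is given.

```python
def scale_factors(src_w, src_h, dst_w, dst_h):
    """Return the (horizontal, vertical) downscale factors of a resize."""
    return src_w / dst_w, src_h / dst_h


def patch_count(img_w, img_h, patch, stride=None):
    """Count the full patches that fit in the image without padding."""
    stride = stride or patch  # default: non-overlapping patches
    nx = (img_w - patch) // stride + 1
    ny = (img_h - patch) // stride + 1
    return nx * ny


# Option 1: resize 6000x4000 straight to 256x256
direct = scale_factors(6000, 4000, 256, 256)    # (23.4375, 15.625)

# Option 2: resize to 1024x1024 first, then cut 256x256 patches
mild = scale_factors(6000, 4000, 1024, 1024)    # (5.859375, 3.90625)
n_patches = patch_count(1024, 1024, 256)        # 16 patches per image
```

Both options squash the 3:2 aspect ratio into a square, so the horizontal and vertical factors differ. The second option keeps roughly four times more detail per axis, at the cost of 16 patches per image.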

I think that resizing an image too much may cause a loss of information, but at the same time patching may not give the model a full view of the image. I also found a notebook that got 96% accuracy just by resizing, so I'm not sure how to proceed: https://www.kaggle.com/code/yesa911/aerial-semantic-segmentation-96-acc/notebook

CodePudding user response：

I don't think there is one correct answer to this. Depending on the number and size of the areas you want to segment, it seems unlikely that you will get an accurate segmentation with images of your size. However, if the image contains only large, easily detectable areas, I would definitely go for the approach without patches, since the patch approach is far more complex: it has more variables to consider (patch size, patch overlap, edge treatment). Skipping it would save you a lot of implementation time for preprocessing and for stitching afterwards.
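To give an idea of what the patch approach involves, here is a minimal NumPy sketch, not a definitive implementation. The function names, the 256-pixel patch size, the 192-pixel stride, reflect padding at the edges, and averaging in the overlaps are all assumptions chosen for illustration. In practice you would stitch the model's per-patch predictions rather than the input patches themselves.

```python
import numpy as np


def split_into_patches(image, patch=256, stride=192):
    """Reflect-pad the image so patches tile it exactly, then cut overlapping patches."""
    h, w = image.shape[:2]

    def pad_amount(size):
        # Pad up to one patch, or until (size - patch) is a multiple of stride.
        return patch - size if size <= patch else (-(size - patch)) % stride

    pad = [(0, pad_amount(h)), (0, pad_amount(w))] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad, mode="reflect")  # edge treatment (assumed choice)

    # Top-left corner of every patch, row by row.
    coords = [(y, x)
              for y in range(0, padded.shape[0] - patch + 1, stride)
              for x in range(0, padded.shape[1] - patch + 1, stride)]
    patches = np.stack([padded[y:y + patch, x:x + patch] for y, x in coords])
    return patches, coords, padded.shape[:2]


def stitch_patches(patches, coords, padded_hw, out_hw):
    """Average overlapping patches and crop back to the original size."""
    patch = patches.shape[1]
    acc = np.zeros(tuple(padded_hw) + patches.shape[3:], dtype=np.float64)
    weight = np.zeros(tuple(padded_hw) + (1,) * (patches.ndim - 3), dtype=np.float64)
    for p, (y, x) in zip(patches, coords):
        acc[y:y + patch, x:x + patch] += p
        weight[y:y + patch, x:x + patch] += 1  # how many patches cover each pixel
    return (acc / weight)[:out_hw[0], :out_hw[1]]


# Round trip on a random stand-in for an already-resized RGB image.
image = np.random.default_rng(0).random((400, 600, 3))
patches, coords, padded_hw = split_into_patches(image)
restored = stitch_patches(patches, coords, padded_hw, image.shape[:2])
```

Even this small version has to decide on patch size, stride, padding mode, and how to merge overlaps, which is the extra complexity you avoid by only resizing.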

TLDR: I would start without patching and, if the result is sufficient, stop there. Otherwise, try the patching approach afterwards.