Nudity detection in python

Time:10-17

I'm classifying a dataset of 10,000 images as nude / not nude in Google Colab (Python).

I'm using the NudeClassifier from nudenet, which works roughly like this:

from nudenet import NudeClassifier

# initialize classifier (downloads the checkpoint file automatically the first time)
classifier = NudeClassifier()

# A. Classify single image
print(classifier.classify('./image1.jpg'))

# This would print something like:
# {
#   './image1.jpg': {
#      'safe': 0.00015856953, 
#      'unsafe': 0.99984145
#   }
# }

# B. Classify multiple images (batch prediction)
# Returns {'path_to_image_1': {'safe': PROBABILITY, 'unsafe': PROBABILITY}, ...}
# batch_size is optional; defaults to 4
print(
    classifier.classify(
        ['./image1.jpg', './image2.jpg', './image3.jpg', './image4.jpg'],
        batch_size=4
    )
)

The problem is that looping over the images and classifying each one individually takes a lot of time (approx. 1 s per image).

Using the second option (batch prediction), would a bigger batch_size make the classification run faster?

If so, what would be the ideal batch_size for this problem?

Thank you very much.

CodePudding user response:

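For inference, the batch size only determines how many images are loaded into memory and processed at once. A larger batch size generally makes inference faster, because the fixed overhead of each model call is spread over more images, although the gain levels off once the hardware is fully used.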

The limit is the amount of memory you have, so you can use that to estimate how far you can raise the batch size.

If it is still too slow, consider running inference on a GPU.
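The memory limit mentioned above can be respected by feeding the 10,000 paths to the classifier in moderate chunks instead of in one call. The following is only a minimal sketch: `chunks`, `classify_all`, and `chunk_size` are hypothetical names introduced here, and `classify_fn` is assumed to behave like `classifier.classify` from the question (a list of paths plus `batch_size`, returning a dict keyed by path).

```python
from typing import Callable, Dict, Iterator, List, Sequence


def chunks(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def classify_all(
    classify_fn: Callable[..., Dict[str, dict]],
    paths: Sequence[str],
    chunk_size: int = 256,   # hypothetical default: images handed over per call
    batch_size: int = 32,    # forwarded to classify_fn unchanged
) -> Dict[str, dict]:
    """Classify `paths` chunk by chunk and merge the per-image results.

    Assumption: classify_fn(list_of_paths, batch_size=...) returns
    {path: {'safe': p, 'unsafe': p}}, as shown in the question.
    """
    results: Dict[str, dict] = {}
    for chunk in chunks(paths, chunk_size):
        # Only one chunk of paths is passed to the classifier at a time.
        results.update(classify_fn(chunk, batch_size=batch_size))
    return results
```

With the classifier from the question, this might be called as classify_all(classifier.classify, image_paths), where image_paths is your own list of file paths. The chunk size and batch size shown are placeholders to tune, not recommended values.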

CodePudding user response:

Increasing the batch size definitely improves inference time when classifying multiple images. The best batch size depends on the GPU memory available. Try to set a batch size that fully utilises the available GPU.
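Since the best batch size depends on the hardware, one way to pick it is to measure a few candidates on a small sample of images. This is a rough sketch rather than a definitive benchmark: `time_batch_sizes` and `fastest` are hypothetical helpers, the candidate sizes are arbitrary, and `classify_fn` is again assumed to accept a list of paths and a `batch_size` keyword like `classifier.classify` in the question.

```python
import time
from typing import Callable, Dict, Sequence


def time_batch_sizes(
    classify_fn: Callable[..., dict],
    paths: Sequence[str],
    batch_sizes: Sequence[int] = (1, 4, 16, 32, 64),  # arbitrary candidates
) -> Dict[int, float]:
    """Return {batch_size: seconds per image} measured on `paths`."""
    if not paths:
        raise ValueError("paths must not be empty")
    sample = list(paths)
    # Warm-up call so one-time setup cost does not distort the first timing.
    classify_fn(sample[:1], batch_size=1)
    timings: Dict[int, float] = {}
    for batch_size in batch_sizes:
        start = time.perf_counter()
        classify_fn(sample, batch_size=batch_size)
        timings[batch_size] = (time.perf_counter() - start) / len(sample)
    return timings


def fastest(timings: Dict[int, float]) -> int:
    """Return the batch size with the lowest time per image."""
    return min(timings, key=timings.get)
```

For example, running time_batch_sizes(classifier.classify, sample_paths) on a few dozen images and then using fastest(...) for the full run would give a data-driven choice. If a candidate size exhausts memory, drop it from the list and try smaller values.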
