Fast "max entropy" thresholding for 16 bit images with CImg

In this article I show a C++ implementation of the "max entropy" threshold algorithm using the CImg library. The implementation also performs well for 16 bit, 32 bit and float images.

First, a little bit of context: some time ago I implemented the Otsu threshold algorithm as a pre-processing step for image binarization. I used that threshold algorithm to distinguish the noise pixels from the potential "star" pixels. This worked quite well for the high contrast input images at that time. However, for weak stars it unfortunately failed badly.

The Max Entropy Threshold Algorithm

So I needed a threshold function which was able to address this problem! Again, I came across ImageJ. ImageJ offers many different threshold algorithms which you can apply directly to your image. This way I found the max entropy threshold algorithm, which turned out to work perfectly for both the old and the new input images.

The algorithm was first proposed by Kapur J.N., Sahoo P.K. and Wong A.K.C. (1985): "A New Method for Gray-Level Picture Thresholding Using the Entropy of the Histogram". In June 2007 this algorithm was ported to ImageJ by G. Landini.

Dealing with the histogram

I found an implementation of this algorithm on the web. I translated it into a simple C++ program and... it turned out to be terribly slow! It was easy to find out why: the max entropy threshold algorithm works on histograms. For my tests I was using a 16 bit image. A 16 bit image can have 65536 different brightness levels, i.e. the corresponding histogram will have 65536 entries. The bigger the histogram, the higher the runtime of the max entropy algorithm - more precisely, the runtime grows roughly as O(N^2) with N as the histogram size. For N = 65536 that means on the order of four billion inner operations, while a 256-entry histogram needs only about 65000.

However, running the ImageJ Java implementation on my 16 bit image worked perfectly well - even for float images. In fact there was no difference in runtime between images of different bit depths. So I had a closer look at the code. It turned out that it does a simple "trick": the histogram is "crushed" to a small, fixed number of bins before the actual max entropy algorithm runs. Then, finally, the calculated threshold value needs to be transformed back so that it fits the value range of the original input image.

This was just not obvious from the code snippet. The ImageJ plugin code uses additional histogram classes which do the job under the hood, which did not really help in understanding what was going on. However, after a while it became clear what needed to be done. The resulting C++ code is shown below; the comments indicate the additional steps which were necessary.

Code example
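
What follows is a minimal sketch of this approach, assuming CImg.h is available; the file name max_entropy_threshold.cpp, the helper max_entropy_bin and the choice of 256 bins are illustrative:

// max_entropy_threshold.cpp - sketch of a max entropy threshold with CImg
#include <CImg.h>
#include <cmath>
#include <cstdio>
#include <vector>

using namespace cimg_library;

// Find the bin which maximizes the sum of background and foreground
// entropies (Kapur, Sahoo & Wong, 1985). O(N^2) in the number of bins,
// which is cheap for a crushed 256-bin histogram.
static int max_entropy_bin(const std::vector<double> &hist) {
  const int n = (int)hist.size();
  double total = 0;
  for (int i = 0; i < n; ++i) total += hist[i];

  std::vector<double> p(n), cum(n); // normalized histogram + cumulative sum
  double c = 0;
  for (int i = 0; i < n; ++i) {
    p[i] = hist[i] / total;
    c += p[i];
    cum[i] = c;
  }

  int best = 0;
  double bestH = -1;
  for (int t = 0; t < n; ++t) {
    const double pB = cum[t], pF = 1.0 - cum[t];
    if (pB < 1e-12 || pF < 1e-12) continue; // one class would be empty

    double hB = 0, hF = 0; // background / foreground entropies
    for (int i = 0; i <= t; ++i)
      if (p[i] > 1e-12) hB -= (p[i] / pB) * std::log(p[i] / pB);
    for (int i = t + 1; i < n; ++i)
      if (p[i] > 1e-12) hF -= (p[i] / pF) * std::log(p[i] / pF);

    if (hB + hF > bestH) { bestH = hB + hF; best = t; }
  }
  return best;
}

int main(int argc, char **argv) {
  if (argc < 2) { std::fprintf(stderr, "Usage: %s <image>\n", argv[0]); return 1; }
  CImg<float> img(argv[1]);

  // Step 1: "crush" the histogram - 256 bins, independent of the bit depth.
  const int bins = 256;
  float minVal = 0, maxVal = 0;
  minVal = img.min_max(maxVal);
  CImg<float> hist = img.get_histogram(bins, minVal, maxVal);

  std::vector<double> h(bins);
  for (int i = 0; i < bins; ++i) h[i] = hist[i];

  // Step 2: run the max entropy algorithm on the small histogram.
  const int binIdx = max_entropy_bin(h);

  // Step 3: transform the bin index back to the value range of the input image.
  const float threshold = minVal + (maxVal - minVal) * binIdx / (float)(bins - 1);
  std::printf("Max entropy threshold: %f (bin %d)\n", threshold, binIdx);
  return 0;
}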

The following line compiles the code above (assuming the source file is saved as max_entropy_threshold.cpp and CImg.h is on the include path; cimg_display=0 drops the X11 display dependency, which is not needed here):
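
g++ -O2 -Dcimg_display=0 -o max_entropy_threshold max_entropy_threshold.cpp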

The result

The resulting binary can be executed with:
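
For example, with one of the test images (the file name is just a placeholder):

./max_entropy_threshold weak_star_image.tif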

The output is the calculated threshold (plus some debug output):
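
With the sketch above it takes the following form (the actual numbers depend on the input image):

Max entropy threshold: <threshold value> (bin <bin index>)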

I used the following three images as input for this test:

The first image shows the result for a weak star image.

The second image shows the effect of the max entropy threshold algorithm on a noise image. Obviously the max entropy threshold algorithm alone is not enough to decide if there is a star or not.

The last image shows a high contrast star image and the corresponding max entropy threshold image. Note the dot in the upper left corner which also meets the threshold criterion.

Summary

In short, one can see that the algorithm does what it is supposed to do. It works for high contrast stars as well as for weak stars, and it behaves as expected for noise. The results also show that the max entropy threshold algorithm alone is not sufficient to decide whether there is a star in the area or not. However, it can be combined with, for example, the signal-to-noise ratio (SNR): if the SNR is reasonable, the max entropy threshold algorithm can be applied to decide if there are one or more stars in the selected area.
