Image Feature Extraction

Extracting data from technical images means solving an inverse problem. The pixels in the image and their x,y locations are essentially a set of observations, from which we would like to infer the underlying inputs that produced them. An important first step in processing technical documents is therefore image feature extraction. The features need to be normalized, applicable across as wide an array of image types as possible, and computationally tractable. A good candidate among the myriad possibilities is Gaussian mixture modeling of the pixel x,y locations. The free parameters of the mixture can be optimized rapidly by one or more algorithms, and features relevant to rescaling the data, i.e., tick marks, are captured at the boundaries in the same step. In the figure below, the top panel is an image of two data series, one solid and one dashed. The second panel is the result of fitting 120 Gaussians to the pixel locations.
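To make the idea of fitting a mixture to pixel locations concrete, here is a minimal sketch in plain Python. It is not the application's optimizer, which is not specified here: it assumes isotropic (circular) Gaussians, a plain EM loop, and a toy image given as a list of strings. The function name `fit_spherical_gmm` and all of its parameters are hypothetical. In practice a library implementation such as scikit-learn's `GaussianMixture` would be the more sensible choice, especially for something like 120 components.

```python
import math

def fit_spherical_gmm(points, k, iterations=50, min_var=1e-3):
    """Fit k isotropic 2D Gaussians to (x, y) points with plain EM (illustrative only)."""
    n = len(points)
    # Deterministic initialisation: spread the starting means across the sorted points.
    ordered = sorted(points)
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    spread = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in points) / (2 * n)
    variances = [max(spread, min_var)] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each pixel location.
        resp = []
        for x, y in points:
            logs = []
            for (mx, my), var, w in zip(means, variances, weights):
                d2 = (x - mx) ** 2 + (y - my) ** 2
                logs.append(math.log(w) - math.log(2 * math.pi * var) - d2 / (2 * var))
            top = max(logs)  # subtract the maximum for numerical stability
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])

        # M-step: update weight, mean, and variance of each component.
        for j in range(k):
            rj = sum(r[j] for r in resp)
            if rj < 1e-12:
                continue  # component owns no pixels; leave it unchanged
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / rj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / rj
            d2 = sum(r[j] * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                     for r, p in zip(resp, points)) / rj
            means[j] = (mx, my)
            variances[j] = max(d2 / 2, min_var)  # per-axis variance, floored
            weights[j] = rj / n
    return weights, means, variances

# Toy binary image: '#' marks a dark pixel, '.' marks background.
image = [
    "..........",
    ".##.......",
    ".##.......",
    "..........",
    "..........",
    "......##..",
    "......##..",
    "..........",
]
# The observations are the (x, y) locations of the dark pixels.
points = [(float(x), float(y))
          for y, row in enumerate(image)
          for x, ch in enumerate(row) if ch == "#"]

weights, means, variances = fit_spherical_gmm(points, k=2)
```

On this toy image the two fitted means settle on the centers of the two blobs. A real plot would need many more components, and probably full covariance matrices so that elongated strokes such as line segments and tick marks are represented well.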

The simple application outputs the Gaussian model parameters and also writes a quality-control (QC) image, gauss_out.png, from which you can determine how to rescale your data. For example, consider the output image below:

As you can see, the Gaussians with indices 74 and 113 line up with x-axis labels; likewise, the Gaussians with indices 98 and 65 line up with y-axis labels. The math for rescaling the Gaussian mixture parameters is left as an exercise 🙂
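For readers who want a starting point on that exercise, here is a minimal sketch of the rescaling, assuming both axes are linear. Two Gaussians per axis whose means sit on labeled ticks define a pixel-to-data mapping, which is then applied to every component's mean and covariance. The helper names `linear_axis` and `rescale_gaussian` are hypothetical, and every pixel position and label value below is made up for illustration; none of them is read from gauss_out.png.

```python
def linear_axis(pixel_a, value_a, pixel_b, value_b):
    """Return (scale, offset) such that value = scale * pixel + offset."""
    if pixel_a == pixel_b:
        raise ValueError("reference ticks must sit at distinct pixel positions")
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    offset = value_a - scale * pixel_a
    return scale, offset

def rescale_gaussian(mean, cov, x_axis, y_axis):
    """Map a 2D Gaussian (mean, 2x2 covariance) from pixel units to data units."""
    (sx, ox), (sy, oy) = x_axis, y_axis
    mx, my = mean
    new_mean = (sx * mx + ox, sy * my + oy)
    # Under a linear map the covariance transforms as S * cov * S^T, S = diag(sx, sy).
    new_cov = [
        [sx * sx * cov[0][0], sx * sy * cov[0][1]],
        [sx * sy * cov[1][0], sy * sy * cov[1][1]],
    ]
    return new_mean, new_cov

# Hypothetical reference ticks: pixel position of a Gaussian mean and the label it sits on.
x_axis = linear_axis(100.0, 0.0, 500.0, 10.0)  # two x ticks, labeled 0 and 10
y_axis = linear_axis(400.0, 0.0, 50.0, 7.0)    # two y ticks, labeled 0 and 7

# Hypothetical fitted component in pixel units.
pixel_mean = (300.0, 225.0)
pixel_cov = [[16.0, 0.0], [0.0, 4.0]]

data_mean, data_cov = rescale_gaussian(pixel_mean, pixel_cov, x_axis, y_axis)
```

Note that the y scale comes out negative here, because image rows count downward while plot values usually increase upward. With these made-up numbers the component's mean lands at roughly (5.0, 3.5) in data units. A logarithmic axis would need the same mapping applied to the logarithm of the label values instead.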