Top 3 prediction models used in Near Infrared Spectroscopy

When applying near-infrared (NIR) spectroscopy, we use prediction models to answer business questions. Since no two business problems are the same, different types of models apply to different cases. In this article we explain the three main models.


Near Infrared Spectroscopy (NIRS) explained

Near-infrared spectroscopy uses light to produce a fingerprint that is specific to the material under study. The light interacts with the material at the molecular level: part of its energy is absorbed and scattered, causing the molecular bonds to vibrate. This interaction depends on the composition of the material, so it carries information that can be used to identify the material or determine its composition. The detector captures the light after it has interacted with the sample, resulting in a spectral fingerprint. For near-infrared, the detector captures light in a range that is not visible to the human eye.


That’s the power of NIR spectroscopy: revealing things that are invisible to the human eye. Using advanced data science technologies, Xpectrum offers three approaches to implementing NIR in an industrial setting.


Quantification model

Quantification is a typical application of NIR spectroscopy. This model predicts the content of a particular molecule or product in a sample. It is built from samples with different, known concentrations of the molecule or product of interest. Using regression analysis, the model returns a concentration as its prediction.
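As a rough illustration of the regression idea, the sketch below fits a straight line between the absorbance at a single band and the known concentration, then predicts the concentration of a new sample. It is a deliberate simplification: real NIR quantification usually relies on multivariate regression over many wavelengths. The function names and all numbers are made up for the example and are not taken from the Xpectrum platform.

```python
from statistics import mean


def fit_calibration(absorbances, concentrations):
    """Least-squares line: concentration = slope * absorbance + intercept."""
    x_bar, y_bar = mean(absorbances), mean(concentrations)
    sxx = sum((x - x_bar) ** 2 for x in absorbances)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(absorbances, concentrations))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict_concentration(model, absorbance):
    """Apply the fitted line to a new absorbance reading."""
    slope, intercept = model
    return slope * absorbance + intercept


# Hypothetical calibration set: absorbance at one band versus the
# reference concentration (%) measured in the lab for the same samples.
calibration_absorbance = [0.10, 0.20, 0.30, 0.40, 0.50]
calibration_concentration = [2.0, 4.0, 6.0, 8.0, 10.0]

model = fit_calibration(calibration_absorbance, calibration_concentration)
predicted = predict_concentration(model, 0.35)
print(f"Predicted concentration: {predicted:.2f} %")
```

The model is only as good as the calibration samples: they should span the range of concentrations expected in practice.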


Quantification models are mostly used to indicate the quality of a product. If you want to find out whether the product you use is high quality - as you were probably promised - this NIR prediction model is the one to use.

However, quantification is usually not the right tool for detecting food fraud. Perhaps you trade in fish and buy high-quality fish. That doesn't mean you bought the fish you thought you did. Check out our case on fish and shellfish fraud to find out about common cases of fish fraud and how to put an end to them. The classification model, on the other hand, is suitable for detecting food fraud.


Classification model

To tell different products apart with NIRS, a classification model is often applied. Trained on a dataset containing, for example, two different species, it can distinguish similar-looking products from each other.


The mislabeling of fish is one of the most common forms of food fraud. The classification model makes sure you get the product you paid for, and not a cheaper variant.


Another variant of the classification model is based on multiple look-alikes. In this case, the model predicts whether the product is the correct one or one of the look-alikes. A multiclass identification model can even tell which look-alike product it is.
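To make the multiclass idea concrete, here is a minimal sketch of a nearest-centroid classifier: it averages the training spectra of each class and assigns a new spectrum to the class with the closest average. This is only one simple way to classify spectra, chosen because it fits in a few lines; the class labels, the four-point "spectra" and the function names are all invented for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def fit_centroids(spectra, labels):
    """Average the training spectra of each class into one centroid."""
    grouped = {}
    for spectrum, label in zip(spectra, labels):
        grouped.setdefault(label, []).append(spectrum)
    return {
        label: [sum(values) / len(values) for values in zip(*rows)]
        for label, rows in grouped.items()
    }


def classify(centroids, spectrum):
    """Return the label whose centroid lies closest to the spectrum."""
    return min(centroids, key=lambda label: dist(centroids[label], spectrum))


# Hypothetical training data: the genuine product and two look-alikes.
training_spectra = [
    [0.50, 0.62, 0.40, 0.30], [0.52, 0.60, 0.42, 0.31],  # target
    [0.45, 0.70, 0.35, 0.28], [0.44, 0.72, 0.36, 0.27],  # lookalike_a
    [0.60, 0.55, 0.48, 0.35], [0.61, 0.54, 0.47, 0.36],  # lookalike_b
]
training_labels = ["target", "target",
                   "lookalike_a", "lookalike_a",
                   "lookalike_b", "lookalike_b"]

centroids = fit_centroids(training_spectra, training_labels)
print(classify(centroids, [0.51, 0.61, 0.41, 0.30]))
print(classify(centroids, [0.44, 0.71, 0.36, 0.28]))
```

With only two classes the same code acts as a binary classifier; adding more look-alike classes turns it into the multiclass variant without further changes.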


Consistent quality model

Finally, we highlight the consistent quality model. This model is used when a product can be adulterated with multiple adulterants. Instead of creating a classification model for each adulterant, we create one model for consistent quality based on a dataset of pure product. The prediction indicates whether the sample is consistent with the pure product or not.
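One simple way to picture a model trained on pure product only is a distance check: learn the average pure spectrum, measure how far the pure training samples typically lie from it, and flag any new sample that lies much further away. The sketch below does exactly that. It is an assumption-laden illustration, not the method used by Xpectrum; the three-sigma threshold, the function names and the data are all hypothetical.

```python
from math import dist
from statistics import mean, stdev


def fit_consistency_model(pure_spectra, n_sigma=3.0):
    """Learn the mean pure spectrum and a distance threshold around it."""
    centre = [mean(values) for values in zip(*pure_spectra)]
    distances = [dist(centre, spectrum) for spectrum in pure_spectra]
    # Assumed rule of thumb: mean distance plus n_sigma standard deviations.
    threshold = mean(distances) + n_sigma * stdev(distances)
    return centre, threshold


def is_consistent(model, spectrum):
    """True when the spectrum lies within the learned threshold."""
    centre, threshold = model
    return dist(centre, spectrum) <= threshold


# Hypothetical spectra of pure product only; no adulterated samples needed.
pure_spectra = [
    [0.50, 0.62, 0.40],
    [0.51, 0.61, 0.41],
    [0.49, 0.63, 0.39],
    [0.50, 0.61, 0.40],
    [0.51, 0.62, 0.41],
]

consistency_model = fit_consistency_model(pure_spectra)
print(is_consistent(consistency_model, [0.50, 0.62, 0.41]))  # close to pure
print(is_consistent(consistency_model, [0.58, 0.55, 0.47]))  # far from pure
```

Because the model never sees adulterated samples, it does not say which adulterant is present, only that the sample deviates from the pure product.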


Note that consistent quality models generally require more data than classification models. In addition, it is generally assumed that an adulteration must amount to at least 1% to cause a detectable difference in the NIR spectrum. In food fraud, adulterations are often greater than 1%, since smaller amounts would not make the adulteration worthwhile.



With Xpectrum, we have developed an automated data science pipeline tailored to creating models for food fraud. We can create quantification, classification or consistent quality models for any customer in no time using our Xpectrum platform.

For a dataset of 300 samples, our pipeline typically takes about 15 minutes to find the best-performing model after comparing thousands of combinations. Further manual tuning of the model parameters is always possible to improve performance even more.
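The idea of comparing many combinations can be sketched as a small grid search: every pairing of a preprocessing step and a model setting is scored with cross-validation, and the best-scoring pairing wins. The example below is a toy version with two preprocessing options (none, or standard normal variate scaling) and a k-nearest-neighbour classifier. It is not the Xpectrum pipeline; the search space, function names and data are assumptions made for illustration.

```python
from itertools import product
from math import dist
from statistics import mean, pstdev


def identity(spectrum):
    """No preprocessing."""
    return list(spectrum)


def snv(spectrum):
    """Standard normal variate: centre and scale each spectrum on its own."""
    mu, sigma = mean(spectrum), pstdev(spectrum)
    return [(value - mu) / sigma for value in spectrum]


def knn_predict(train_x, train_y, sample, k):
    """Majority vote among the k nearest training spectra."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: dist(pair[0], sample))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def loo_accuracy(spectra, labels, preprocess, k):
    """Leave-one-out cross-validation accuracy for one combination."""
    x = [preprocess(spectrum) for spectrum in spectra]
    hits = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += knn_predict(train_x, train_y, x[i], k) == labels[i]
    return hits / len(x)


def search(spectra, labels, preprocessors, ks):
    """Score every (preprocessing, k) pairing and return the best one."""
    scored = [
        (loo_accuracy(spectra, labels, fn, k), name, k)
        for (name, fn), k in product(preprocessors.items(), ks)
    ]
    return max(scored, key=lambda row: row[0])


# Hypothetical two-class dataset with four-point spectra.
spectra = [
    [0.10, 0.20, 0.30, 0.40], [0.12, 0.21, 0.33, 0.41], [0.09, 0.19, 0.31, 0.42],
    [0.40, 0.30, 0.20, 0.10], [0.41, 0.32, 0.22, 0.12], [0.39, 0.29, 0.18, 0.11],
]
labels = ["a", "a", "a", "b", "b", "b"]

best = search(spectra, labels, {"none": identity, "snv": snv}, ks=[1, 3])
print(f"accuracy={best[0]:.2f} preprocessing={best[1]} k={best[2]}")
```

A real search space grows quickly, since each extra preprocessing option or model parameter multiplies the number of combinations, which is how a pipeline ends up evaluating thousands of candidates.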


Find out more about NIR and the Pure & Sure spectrometer and download our white paper.