Let's look back at the EMSLIBS 2019 conference in Brno for a moment. As part of the program, we prepared a challenging benchmark dataset, which was used for the classification contest. An extensive report can be found in our recent paper, but for the highlights, you are in the right place.
Unsurprisingly, the spectroscopic community has not been left out of the recent boom in modern machine learning approaches to data processing. We therefore felt it was an ideal time to push the limits, compare performance, and share knowledge across the community.
Out-of-sample classification
The cornerstone of the contest task is so-called out-of-sample classification. The basic concept is rather simple and well known in general machine learning, but it is worth looking more closely at its spectroscopic context. Unfortunately, some published papers violate even the simplest out-of-sample principle (test spectra are used for model training). In such a case, the reported classification performance carries no valid information, and the model is completely useless.
However, to translate the "true" out-of-sample task to spectroscopy, even spectra previously unseen by the model, but measured from the same (homogeneous) sample as the spectra used for training, should be excluded from the test dataset. The reasoning behind this idea is simple. With a robust experimental setup, only the intensity of the spectra fluctuates slightly, while the positions of the spectral features remain unchanged. So, to achieve "true" out-of-sample performance, the model has to be able to generalize to spectra from new samples. Of course, those samples have to be somehow related to the "training" ones. In our case, one category (or class) was composed of several physical samples of mineral powders (e.g., hematite) with slightly varying compositions allowed by their geological classification. Thus, out-of-sample classification in this task can be thought of as classification "according to the elemental composition". Only this kind of classification should be considered and poses a real challenge, which IS NOT TRUE for the two previously mentioned kinds (1. test spectra used for model training, 2. test spectra from the same homogeneous sample as the training ones).
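In practice, such a "true" out-of-sample split can be enforced by grouping spectra by their physical sample before splitting. A minimal sketch with scikit-learn's `GroupShuffleSplit` follows; the spectra, class labels, and sample IDs here are synthetic placeholders, not the actual benchmark data.

```python
# Sketch of a "true" out-of-sample split: all spectra from one physical
# sample stay together, so no sample contributes to both train and test.
# Data shapes, labels, and sample IDs below are synthetic placeholders.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

n_spectra, n_channels = 120, 500
X = rng.normal(size=(n_spectra, n_channels))   # synthetic spectra
y = rng.integers(0, 3, size=n_spectra)         # mineral class labels
samples = rng.integers(0, 12, size=n_spectra)  # physical sample IDs

# GroupShuffleSplit assigns whole groups (samples) to either side,
# so the test set contains only samples the model has never seen.
gss = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(gss.split(X, y, groups=samples))

# No physical sample appears on both sides of the split
assert set(samples[train_idx]).isdisjoint(samples[test_idx])
```

A plain random split over spectra (e.g., `train_test_split`) would instead leak spectra from the same sample into the test set, which is exactly the second pitfall described above.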
What is the best model for this task?
Well, this is a tough question... Seemingly simpler models (PLS-DA, LDA) achieved slightly better performance than more up-to-date ones (ANN-based models), but only with considerable data preprocessing. In particular, spectroscopic expertise and handcrafted feature selection were key to finding the important information in the data. Leaving all the work to the model (feature selection by autoencoders, UMAP, and other techniques) is a highly elegant approach, but some further improvements are still needed for it to overcome the spectroscopic approaches. A detailed description of all the approaches is provided in the publication, so you are welcome to try them at home.
What will be next?
Currently, we are working on small improvements to the original contest/benchmark engine, where you will be able to check your performance and obtain useful feedback (it will be accessible here shortly). In the long term, we are preparing something new. You may expect a completely new challenge aimed at an even broader audience.