[5] Py:Visualization and PCA

Welcome back to our hands-on section on processing spectroscopic data in Python. At this point, we assume you have already gone through the previous post, which covered data formats and importing. You should have imported the benchmark dataset and loaded the following variables:

  • trainData
  • trainClass
  • wavelengths
  • testData

Today, we will use all of these variables except testData. We start with simple plotting and visualization, then demonstrate the PCA algorithm.
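
As a rough preview of the PCA step, here is a minimal sketch using only NumPy. It is not the script from the full post: the random arrays are placeholders for the real trainData, trainClass, and wavelengths, the array shapes (one spectrum per row) are an assumption, and pca_scores is a hypothetical helper name.

```python
import numpy as np

# Placeholder data. The real variables come from the import post; the shapes
# and values used here are assumptions made only for illustration.
rng = np.random.default_rng(0)
wavelengths = np.linspace(200.0, 1000.0, 500)    # hypothetical wavelength axis
trainClass = np.repeat([1, 2, 3], 20)            # hypothetical class labels
trainData = rng.random((trainClass.size, wavelengths.size))  # one spectrum per row


def pca_scores(data, n_components=2):
    """Project the rows of `data` onto the leading principal components."""
    # PCA works on mean-centered data (the mean spectrum is subtracted).
    centered = data - data.mean(axis=0)
    # SVD of the centered matrix: rows of vt are the principal directions.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    # Fraction of the total variance carried by each component.
    explained = s**2 / np.sum(s**2)
    return scores, vt[:n_components], explained[:n_components]


scores, loadings, explained = pca_scores(trainData)
print(scores.shape, loadings.shape)
```

The two columns of scores could then be drawn as a scatter plot colored by trainClass, and each row of loadings plotted against wavelengths to see which spectral regions drive a component.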

more ...

[3] Py:Loading data; Exploring the benchmark dataset

Welcome back.

In our previous post, we introduced a spectroscopic dataset aimed at benchmarking classification models. In this post, we will load that dataset. We will first go through the script provided in the repository alongside the dataset, and then go a step further by making that script faster. This marks the beginning of a series of posts on processing spectroscopic data in Python.

more ...

[2] Benchmark dataset

In our previous post, we described the common characteristics of spectroscopic data (sparsity, redundancy, and high dimensionality). In this post, we describe a dataset that you can use to experiment with spectroscopic data and to get familiar with the challenges posed by these characteristics. Below, we explain the motivation behind creating the dataset and summarize its properties.

more ...