A new issue of this journal has just been published. To see abstracts of the papers it contains (with links through to the full papers) click here:
Selected papers from the latest issue:
Achieving bilinearity in non-bilinear augmented first order kinetic data applying calibration transfer
22 May 2012,
16:05:15
Publication year:
2012
Source:Chemometrics and Intelligent Laboratory Systems, Volume 115
Maryam Khoshkam, Frans van den Berg, Mohsen Kompany-Zareh
In this paper a calibration transfer method is used to achieve bilinearity for augmented first order kinetic data. First, the proposed method is investigated using simulated data, and next the concept is applied to experimental data. The experimental data consist of spectroscopic monitoring of the first order degradation reaction of carbaryl. This compound is used for control of pests in fruits, vegetables, forages, cotton and other crops. It is highly toxic, a likely human carcinogen, and lethal to many non-target beneficial insects. The kinetic experiment is performed at different pH values and emission wavelengths using an excitation wavelength of 275 nm. Rate constants of different data matrices at different pH values were calculated based on a hard modeling method. Analysis of simulated and experimental data shows that if there is a deviation from bilinearity, applying model-based methods to augmented datasets leads to inaccurate results. The application of a calibration transfer method as an additional step in the hard modeling procedure improves the results, and accurate estimates of the reaction rate constants are obtained. The proposed method was compared to Local Spectra Mode of Analysis (LSMA), which was proposed by Puxty et al. A comparison of the results shows that the proposed method is more efficient than LSMA and leads to less uncertainty in the estimated rate constants and a lower percent error in the relative residuals.
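As a rough illustration of what estimating a first order rate constant involves, the sketch below fits k to a single decaying signal by log-linear least squares. It is a deliberately simplified stand-in: the paper's hard modeling works on augmented multi-wavelength data and adds a calibration transfer step, neither of which is reproduced here. The function name fit_first_order and the values k = 0.05 and A0 = 2.0 are hypothetical.

```python
import math

def fit_first_order(times, signal):
    """Estimate k and A0 for signal(t) = A0 * exp(-k * t).

    Taking logarithms gives ln(signal) = ln(A0) - k * t, so an ordinary
    least-squares line through (t, ln(signal)) yields both parameters.
    """
    logs = [math.log(s) for s in signal]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    slope = sxy / sxx
    k = -slope
    a0 = math.exp(l_mean - slope * t_mean)
    return k, a0

# Simulated noise-free decay (hypothetical values, not from the paper)
times = [float(t) for t in range(0, 60, 5)]
signal = [2.0 * math.exp(-0.05 * t) for t in times]
k_est, a0_est = fit_first_order(times, signal)
print(round(k_est, 4), round(a0_est, 4))
```

With noisy or multi-wavelength data a nonlinear fit of the full kinetic model would normally replace the log transform; the sketch only shows the parameter being estimated.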
Highlights
► Calibration transfer method is used to achieve bilinearity in augmented first order kinetic data for the first time.
► The data were analyzed based on hard modelling methods.
► Calibration transfer was used as an extra step inside the procedure.
► It is shown that using calibration transfer in hard modelling methods improves the results.
► A general and simple method is proposed for correction of non-bilinearity in full rank systems.
Again about partial least squares and feature selection
22 May 2012,
16:05:15
Publication year:
2012
Source:Chemometrics and Intelligent Laboratory Systems, Volume 115
Piotr Zerzucha, Beata Walczak
Permutation (randomization) tests are often used to establish the significance of experimental features in classification or regression PLS models. The standard approach assumes that permutations are performed for the data objects, so the data correlation structure is preserved. In our study, this approach was compared with the UVE-PLS method and its modification, RUVE-PLS. Results of the intensive simulation study give evidence that permutation of objects is not a proper approach in the case of PLS models and that it should be replaced by permutation performed for individual features, in which case the performance of all the compared methods is very similar. Performance of UVE-PLS is never worse than performance of R-PLS and it allows fast computation of the statistics of interest (stability of regression coefficients of the PLS model).
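The idea of permuting an individual feature can be sketched as follows. This is a minimal illustration, not the authors' procedure: the absolute Pearson correlation with the response is swapped in as the test statistic in place of a PLS regression coefficient, and the function names and data are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def feature_permutation_p(feature, y, n_perm=200, seed=0):
    """Permutation p-value for one feature: shuffle that feature's values
    across objects and count how often the shuffled statistic is at least
    as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson(feature, y))
    shuffled = list(feature)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(shuffled, y)) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: a feature that is perfectly related to the response
y = [2.0 * i + 1.0 for i in range(20)]
informative = [float(i) for i in range(20)]
p_value = feature_permutation_p(informative, y)
print(p_value)
```

With a single feature, shuffling the feature is equivalent to shuffling the response; the distinction drawn in the abstract only matters in a multivariate model, where shuffling one column leaves the other columns and their correlations intact.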
Highlights
► New method for feature selection.
► RUVE-PLS preserves distribution of the experimental variables.
► RUVE-PLS preserves correlation structure of the experimental variables.
► UVE-PLS and its modification, RUVE-PLS, perform equally well.
Assessment of the chemical composition of waters associated with oil production using PARAFAC
22 May 2012,
16:05:15
Publication year:
2012
Source:Chemometrics and Intelligent Laboratory Systems, Volume 115
Fabiana Alves de Lima Ribeiro, Francisca Ferreira do Rosário, Maria Carmen Moreira Bezerra, André Luis Mathias Bastos, Vera Lúcia Alves de Melo, Ronei Jesus Poppi
In this work, Parallel Factor Analysis (PARAFAC) was used to assess the composition of produced water in 8 oil wells, using their levels of salinity, calcium, magnesium, strontium, barium and sulphate (mg/L), collected during the years 2004 and 2005. This method allowed the identification of tracers for seawater and formation water, as well as the identification of patterns related to seasonality. The method indicates that the variables salinity, calcium and strontium are associated with formation water, while magnesium and sulphate are associated with water injection. These variables may be used as tracers to distinguish seawater, used as injection water, from formation water, and can be very useful to evaluate the produced water composition. Seasonality aspects are associated with the variation in the levels of sulphate and magnesium, which tend to increase over time while the levels of barium usually decrease. Chemical patterns related to the original reservoirs of each oil well, called A, B and C, were also observed. Samples collected in reservoir B presented the lowest salinity, calcium, strontium and barium levels and the highest magnesium and sulphate levels, while samples from reservoir A showed intermediate levels for the same variables. Reservoir C samples presented the highest values for salinity, calcium, strontium and barium, and the lowest levels of sulphate.
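For readers unfamiliar with PARAFAC, the general trilinear model is written out below. The assignment of modes (i for oil wells, j for the six chemical variables, k for sampling times) is an assumption made for illustration; the abstract does not state how the data array was arranged.

```latex
x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk}
```

Here F is the number of factors, a, b and c are the loadings in the three modes, and e is the residual. Under the assumed arrangement, the variable-mode loadings b would be where tracers such as salinity or sulphate show up, and the time-mode loadings c where seasonal patterns appear.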
Highlights
► Petroleum exploitation study.
► Assessment of the chemical patterns in produced water from oil wells.
► Parallel Factor Analysis (PARAFAC).
► Pattern recognition related to original reservoir and seasonality.
Classifying cultivars of rice (Oryza sativa L.) based on corrected canopy reflectance spectra data using the orthogonal projections to latent structures (O-PLS) method
22 May 2012,
16:05:15
Publication year:
2012
Source:Chemometrics and Intelligent Laboratory Systems, Volume 115
Wen-Shin Lin, Chwen-Ming Yang, Bo-Jein Kuo
To improve the accuracy in discriminating plant species or genotypes in the field with canopy spectral data, a number of statistical methods incorporating measurement techniques have been developed. This study analyzed canopy reflectance spectra collected at the booting stage by using partial least squares regression in combination with discriminant analysis (PLS-DA) to establish a classification model for the discrimination of three mega rice cultivars. To improve the model's interpretability and sharpen the separation between cultivars, PLS-DA was combined with orthogonal projection to the latent structure (O-PLS) to derive the OPLS-DA models by removing noise and the Y-orthogonal variation. The ground-based high-resolution reflectance spectra (330–1030 nm) were acquired from paddy field experiments during the growing periods, and were recalculated at intervals of 10 nm. With the PLS-DA approach, the total accuracy for discriminating three cultivars in the calibration datasets was 90% and was above 80% for individual cultivars. In the validation datasets, a similar capability for cultivar discrimination was obtained for both pooled and individual cultivars. However, the Y-orthogonal variation might be embedded within the PLS-DA model. Using the OPLS-DA approach, the large variation within rice cultivars (the intra variation) was effectively removed to improve the performance of both group separation and model establishment. The overall accuracy reached 100% in the calibration datasets, and discrimination in the validation datasets was better than with the PLS-DA model. Therefore, the OPLS-DA method is recommended for establishing a classification model for the cultivar discrimination of rice in the vegetative phase using remotely sensed canopy reflectance spectra.
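The abstract does not say how the spectra were recalculated at 10 nm intervals. One plausible reading, shown here purely as an assumption, is averaging the high-resolution reflectance values within consecutive 10 nm bins. The function name rebin and the toy values are hypothetical.

```python
def rebin(wavelengths, reflectance, start, stop, step):
    """Average reflectance over consecutive bins [lo, lo + step)
    covering the range from start up to stop."""
    centers, means = [], []
    lo = start
    while lo < stop:
        values = [r for w, r in zip(wavelengths, reflectance)
                  if lo <= w < lo + step]
        if values:  # skip bins that contain no measurements
            centers.append(lo + step / 2)
            means.append(sum(values) / len(values))
        lo += step
    return centers, means

# Toy spectrum sampled every 5 nm (hypothetical values)
wavelengths = [330, 335, 340, 345, 350, 355]
reflectance = [0.10, 0.20, 0.30, 0.50, 0.40, 0.60]
centers, means = rebin(wavelengths, reflectance, 330, 360, 10)
print(centers, means)
```

Interpolating onto a 10 nm grid would be an equally reasonable reading; averaging has the side effect of reducing noise before the PLS-DA step.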
Highlights
► Classifying rice cultivars based on corrected canopy reflectance spectra data.
► Using PLS to construct a classification model (PLS-DA) for discriminant analysis.
► PLS-DA combined with OPLS (OPLS-DA) to remove the intra variation.
► The OPLS-DA models could improve group separation and model establishment.
Combining bootstrap and uninformative variable elimination: Chemometric identification of metabonomic biomarkers by nonparametric analysis of discriminant partial least squares
22 May 2012,
16:05:15
Publication year:
2012
Source:Chemometrics and Intelligent Laboratory Systems, Volume 115
Xiao-Ming Sun, Xiao-Ping Yu, Yun Liu, Lu Xu, Duo-Long Di
Interpretation and mining of complex metabonomic data depend heavily on proper use of chemometric methods. Due to the “small n” paradigm and the absence of sufficient information concerning the distribution of data, the classical parametric methods based on known theoretical distributions are sometimes unsuitable or unreliable for treating such data. Therefore, nonparametric methods requiring no or very limited assumptions provide useful alternative tools in many practical applications. In this paper, a new discriminant partial least squares combined with bootstrap and uninformative variable elimination (DPLS–BS–UVE) method is proposed for biomarker discovery in metabonomics. The method was tested on two real chromatographic data sets containing plasma metabolic profiles for S180 and H22 tumor-bearing mice. A robust version of c_j was used as the cutoff criterion. The results of biomarker discovery were compared with those obtained using variable importance in the projection (VIP) as well as BS. It is demonstrated that similar results are obtained using the three methods and that DPLS–BS–UVE could provide easy interpretation of raw data. When the resampling unit was increased to 500, the results were not significantly affected. In conclusion, DPLS–BS–UVE is a reliable alternative method for biomarker discovery, especially when the sample size is small.
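The stability criterion behind uninformative variable elimination can be sketched with a bootstrap, as below. Two simplifications are assumptions of this sketch: a per-variable least-squares slope stands in for the DPLS regression coefficient, and the plain mean/standard deviation form of c_j is used rather than the robust version mentioned in the abstract. All names and data are hypothetical.

```python
import random
import statistics

def slope(x, y):
    """Least-squares slope of y regressed on a single variable x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def bootstrap_stability(X, y, n_boot=200, seed=0):
    """Return c_j = mean(b_j) / std(b_j) for each variable j, where the
    b_j are coefficients obtained on bootstrap resamples of the objects."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    coefs = [[] for _ in range(p)]
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        yb = [y[i] for i in idx]
        for j in range(p):
            xb = [X[i][j] for i in idx]
            coefs[j].append(slope(xb, yb))
    return [statistics.mean(c) / statistics.stdev(c) for c in coefs]

# Hypothetical two-class data: variable 0 tracks the class, variable 1 is noise
data_rng = random.Random(1)
y = [0.0] * 15 + [1.0] * 15
X = [[yi + data_rng.gauss(0, 0.1), data_rng.gauss(0, 1.0)] for yi in y]
stability = bootstrap_stability(X, y)
print(stability)
```

Variables whose absolute c_j falls below a chosen cutoff would be treated as uninformative; how that cutoff is set is the part the paper addresses and is not reproduced here.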
Highlights
► The DPLS-BS-UVE method was used for the first time for biomarker discovery in metabonomics.
► The method is especially effective for small datasets.
► The results were not significantly affected when the BS unit was increased to 500.
A new dissimilarity method integrating multidimensional mutual information and independent component analysis for non-Gaussian dynamic process monitoring
22 May 2012,
16:05:15
Publication year:
2012
Source:Chemometrics and Intelligent Laboratory Systems, Volume 115
Mudassir M. Rashid, Jie Yu
Traditional multivariate statistical process monitoring (MSPM) techniques like principal component analysis (PCA) and partial least squares (PLS) are not well suited to monitoring non-Gaussian processes because the derivation of the T² and SPE indices requires an approximately multivariate Gaussian distribution of the process data. In this paper, a novel pattern analysis driven dissimilarity approach is developed by integrating multidimensional mutual information (MMI) with independent component analysis (ICA) in order to quantitatively evaluate the statistical dependency between the independent component subspaces of the normal benchmark and monitored data sets. The new MMI based ICA dissimilarity index is derived from higher-order statistics so that the non-Gaussian process features can be extracted efficiently. Moreover, a moving-window strategy is used to deal with process dynamics. The multidimensional mutual information based ICA dissimilarity method is applied to the Tennessee Eastman Chemical process. The process monitoring results of the proposed method are demonstrated to be superior to those of the regular PCA, PCA dissimilarity, regular ICA and angle based ICA dissimilarity approaches.
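Mutual information, the quantity the dissimilarity index builds on, can be illustrated in its simplest pairwise, discrete form. This sketch is not the multidimensional estimator used in the paper, which works on continuous independent component subspaces; the function name and toy sequences are hypothetical.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences,
    estimated from their empirical joint and marginal frequencies."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), count in pxy.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi

# Independent sequences share no information; identical ones share it all
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
identical = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
print(independent, identical)
```

The value is zero when the two sequences are statistically independent and grows with their dependency, which is what makes it usable as a measure of how closely monitored data resemble a benchmark.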
Highlights
► A new dissimilarity method is developed for process monitoring.
► Integrating multidimensional mutual information with independent component analysis.
► Higher-order statistics to handle process non-Gaussianity.
► Better fault detection performance than ICA and angle based dissimilarity methods.
An Iterative Hyperspectral Image Segmentation Method Using a Cross Analysis of Spectral and Spatial Information
22 May 2012,
16:05:15
Publication year:
2012
Source:Chemometrics and Intelligent Laboratory Systems
N. Gorretta, G. Rabatel, C. Fiorio, C. Lelong, J.M. Roger
The combined use of available spectral and spatial information for object detection, which has been promoted by the advent of high spatial resolution hyperspectral imaging devices, now seems essential for many application domains (characterization of urban areas, agriculture, etc.). The proposed approach, called "butterfly", focuses on this issue and implements a spectral-spatial cooperation scheme to split images into spectrally homogeneous adjoining regions (segmentation). The main idea of the method is to extract spatial and spectral features simultaneously. To achieve this goal, it establishes correspondences between the spatial and the spectral concepts, in order to run alternately in the two spaces. Thus, the notion of partition specific to the spatial space is associated with the notion of classes in the spectral space. In parallel, the concept of latent variable specific to the spectral space is associated with the notion of image planes in the spatial space. The proposed scheme is therefore to update, recursively, the features specific to each space (i.e. partition, classes, latent variables and planes) using knowledge of the features in the complementary space. An implementation of this generic scheme using a split and merge strategy is given. Experimental results are presented for a synthetic image and two real hyperspectral images with two different spatial resolutions. Results on the set of real images are also compared to those obtained with conventional approaches.
Linking GC-MS and PTR-TOF-MS fingerprints of food samples
22 May 2012,
16:05:15
Publication year:
2012
Source:Chemometrics and Intelligent Laboratory Systems
Luca Cappellin, Eugenio Aprea, Pablo Granitto, Ron Wehrens, Christos Soukoulis, Roberto Viola, Tilmann D. Märk, Flavia Gasperi, Franco Biasioli
Recently the first applications in food science and technology of the newly available volatile organic compounds (VOCs) detection technique proton transfer reaction mass spectrometry, coupled with a time of flight mass analyzer (PTR-TOF-MS), have been published. In comparison with standard techniques such as GC-MS, PTR-TOF-MS has the remarkable advantage of being extremely fast but has the drawback that compound identification is more challenging and often not possible without further information. In order to better exploit and understand the analytical information entangled in the PTR-TOF-MS fingerprint and to link it with GC/SPME-MS analyses, we employed two multivariate calibration methods, PLS and the more recent LASSO. We show that, while in some cases it is sufficient to consider a single PTR-TOF-MS peak in order to predict the intensity of a GC/SPME-MS peak, in general a multivariate approach is needed. We compare the performances of PLS and LASSO in terms of prediction capabilities and interpretability of the model coefficients and conclude that LASSO is more suitable for this problem. As a case study, we compared GC and PTR-MS data for different matrices, namely olive oil and grana cheese.
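The interpretability attributed to LASSO comes from its tendency to set the coefficients of unhelpful predictors exactly to zero. The sketch below shows this with a small cyclic coordinate descent solver. It is an illustration under stated assumptions, not the authors' implementation: the objective is scaled as (1/(2n)) * ||y - Xb||^2 + penalty * ||b||_1, there is no intercept (the toy data are centered), and the names lasso_cd and soft_threshold are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty, clipping at zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso_cd(X, y, penalty, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xb||^2 + penalty * ||b||_1
    by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, penalty) / (norm / n)
    return b

# Toy data: two orthogonal predictors, only the first drives the response
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]] * 2
y = [3.0 * row[0] for row in X]
coefs = lasso_cd(X, y, penalty=0.5)
print(coefs)  # [2.5, 0.0]
```

The relevant coefficient is shrunk from 3.0 to 2.5 by the penalty, while the irrelevant one stays exactly zero. In the setting of the abstract, a zero coefficient would mean that a given PTR-TOF-MS peak plays no role in predicting the GC/SPME-MS peak.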
Highlights
► Two different non-invasive headspace techniques are compared: GC-MS and PTR-ToF-MS.
► Two different food matrices (olive oil and grana cheese) are analyzed.
► LASSO and PLS have been used to compare GC-MS and PTR-ToF-MS fingerprints.
► LASSO and PLS have similar performances but LASSO results are more interpretable.
► It is possible to set reliable prediction models for many compounds.
Comments on Multiple Self Organising Maps (mSOMs) for simultaneous classification and prediction: Illustrated by spoilage in apples using volatile organic profiles by S.F. Sim and V. Sági-Kiss
22 May 2012,
16:05:15
Publication year:
2012
Source:Chemometrics and Intelligent Laboratory Systems
Richard G. Brereton, Virág Sági-Kiss
This paper comments on the article “Multiple Self Organising Maps (mSOMs) for simultaneous classification and prediction: Illustrated by spoilage in apples using volatile organic profiles by S.F. Sim and V. Sági-Kiss, Chemometrics and Intelligent Laboratory Systems 57–64 (2011)”. It describes the origin of most of the methods and software, from the Bristol group, which are unattributed in the original paper. The article comments on conventions for citing software and on authorship of articles, and puts the work into context.