A new issue of this journal, Chemometrics and Intelligent Laboratory Systems, has just been published. To see abstracts of the papers it contains (with links through to the full papers) click here:
Selected papers from the latest issue:
Gravitational search algorithm: A new feature selection method for QSAR study of anticancer potency of imidazo[4,5-b]pyridine derivatives
15 March 2013
Publication year: 2013
Source:Chemometrics and Intelligent Laboratory Systems, Volume 122
Choosing the most suitable subset of descriptors from among a large number of structural parameters is one of the most important and challenging steps in quantitative structure–activity relationship (QSAR) studies. Many feature selection algorithms have been applied in such studies, but none of them is universally applicable. In this study, a binary version of the gravitational search algorithm (GSA) is developed and coded as a novel feature selection method for QSAR studies. The GSA is applied as a descriptor selection tool for anticancer potency modeling of a set of 65 imidazo[4,5-b]pyridine derivatives. The GSA-selected descriptors were subjected to Bayesian regularized artificial neural networks to model the anticancer potency. The generated model satisfactorily describes the experimental variation in the biological activity of the data set compounds. The results of external validation (R_v² = 0.98) and internal cross-validation tests (Q_LOO² = 0.94, R_L4O² = 0.93, R_L8O² = 0.92), in conjunction with Y-randomization, confirm the predictive ability, robustness and effectiveness of the generated model. A comparison between the GSA and a genetic algorithm (GA) also indicates that the GSA has certain advantages over the GA.
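For readers curious how a binary GSA can drive descriptor selection, the following minimal Python sketch illustrates the idea: agents are binary descriptor masks, their fitness is a cross-validated R² (a simple ridge regressor stands in for the Bayesian regularized neural network used in the paper), and masses, gravitational forces and a tanh transfer function drive the bit flips. All parameter values and the fitness choice are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of binary gravitational search for descriptor selection.
# Hypothetical fitness: cross-validated R^2 of a ridge regressor on the
# selected descriptors (the paper uses Bayesian-regularized ANNs instead).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

def fitness(mask, X, y):
    if mask.sum() == 0:
        return -1e6                        # penalise empty descriptor sets
    return cross_val_score(Ridge(), X[:, mask.astype(bool)], y,
                           cv=5, scoring="r2").mean()

def binary_gsa(X, y, n_agents=20, n_iter=50, g0=100.0, seed=0):
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    pos = rng.integers(0, 2, size=(n_agents, n_feat)).astype(float)
    vel = np.zeros_like(pos)
    best_mask, best_fit = None, -np.inf
    for t in range(n_iter):
        fit = np.array([fitness(p, X, y) for p in pos])
        if fit.max() > best_fit:
            best_fit, best_mask = fit.max(), pos[fit.argmax()].copy()
        # Masses: normalised fitness (better agents are heavier).
        m = (fit - fit.min()) / (fit.max() - fit.min() + 1e-12)
        M = m / (m.sum() + 1e-12)
        G = g0 * np.exp(-20.0 * t / n_iter)   # decaying gravity constant
        acc = np.zeros_like(pos)
        for i in range(n_agents):
            for j in range(n_agents):
                if i == j:
                    continue
                diff = pos[j] - pos[i]
                dist = np.linalg.norm(diff) + 1e-12
                acc[i] += rng.random() * G * M[j] * diff / dist
        vel = rng.random(size=pos.shape) * vel + acc
        # Transfer function: probability of flipping each bit.
        flip = np.abs(np.tanh(vel)) > rng.random(size=pos.shape)
        pos = np.where(flip, 1.0 - pos, pos)
    return best_mask.astype(bool), best_fit
```

The decaying gravity constant shifts the search from exploration towards exploitation as iterations proceed, which is the usual rationale behind GSA-type optimizers.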
Highlights
► Gravitational search algorithm (GSA) is developed and coded for QSAR studies.
► Anticancer potency of 65 imidazo[4,5-b]pyridine derivatives is investigated.
► The GSA is applied as a descriptor selection tool for anticancer potency modeling.
► BR-ANN is used to model the anticancer potency using GSA-selected descriptors.
► Comparison between GSA and GA indicates that GSA has certain merit over the GA.

Use of multivariate chemometric algorithms on ¹H NMR data to assess a soluble fiber (Plantago ovata husk) nutritional intervention
15 February 2013
Publication year: 2013
Source:Chemometrics and Intelligent Laboratory Systems, Volume 121
Nutritional interventions in humans are difficult to assess because the induced metabolic changes are smaller than the natural biological variability between subjects. Owing to its holistic approach, ¹H NMR is one of the preferred technologies for this type of study, even though it has a very low sensitivity. This work shows how applying several chemometric algorithms to the measured data compensates for these drawbacks and allows the effects of the nutritional intervention to be studied in isolation from the natural variability inherent to human studies. Mildly to moderately hypercholesterolemic patients received either placebo or soluble fiber as part of a low-saturated-fat diet. Plasma samples were collected at week 0 and week 8. Spectra obtained with NMR equipment were processed with ANOVA simultaneous component analysis (ASCA). The application of clustering techniques revealed different responses based on each patient's basal state, which allowed responders to be distinguished from non-responders. Results showed a triglyceride level reduction of up to 15% (p = 0.0032), with a greater reduction for those patients with a higher initial lipid profile. Moreover, line-shape fitting techniques applied to the NMR spectra allowed the conclusion that LDL (and VLDL) lipoprotein particles, and more noticeably triglycerides, moved to a profile configuration associated with lower cardiovascular risk. The results shed light on some of the metabolic modifications that husk fiber induces in humans, which could not be seen with more conventional data analysis approaches. Our conclusion is that, by using the right chemometric techniques, it is possible to assess nutritional intervention effects in human NMR studies despite the low sensitivity and selectivity the technique offers today.
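As a rough illustration of the ASCA step described above, the sketch below splits a centred data matrix into an effect matrix for a single hypothetical design factor (e.g., treatment) plus residuals, and then runs PCA on each part; the real study handles more factors and the full NMR preprocessing, so treat this only as a schematic.

```python
# Minimal sketch of the ASCA idea (ANOVA simultaneous component analysis):
# split the centred data matrix into an effect matrix defined by a single
# hypothetical design factor, then run PCA on the effect and residual parts.
import numpy as np
from sklearn.decomposition import PCA

def asca_single_factor(X, factor, n_components=2):
    """X: samples x NMR variables; factor: 1-D array of group labels."""
    X = np.asarray(X, dtype=float)
    factor = np.asarray(factor)
    grand_mean = X.mean(axis=0)
    Xc = X - grand_mean                       # overall centring
    X_effect = np.zeros_like(Xc)
    for level in np.unique(factor):
        idx = factor == level
        X_effect[idx] = Xc[idx].mean(axis=0)  # level mean replaces each row
    X_resid = Xc - X_effect                   # within-group variation
    pca_effect = PCA(n_components).fit(X_effect)
    pca_resid = PCA(n_components).fit(X_resid)
    return pca_effect, pca_resid, X_effect, X_resid
```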
Highlights
► Mild to moderate hypercholesterolemic patients received either placebo or Po-husk.
► ASCA discerned induced metabolic changes from natural variability between subjects.
► Our study revealed different responses that depended on the patient's basal state.
► Spectral line shape fitting algorithms helped diagnose metabolic syndrome.
► Results showed a triglyceride level reduction of up to 15% (p=0.0032).

An investigation on hydrogen bonding between 3-methylindole and ethanol using trilinear decomposition of fluorescence excitation–emission matrices
15 February 2013
Publication year: 2013
Source:Chemometrics and Intelligent Laboratory Systems, Volume 121
The multi-state fluorescence characteristics of 3-methylindole (MI) make its spectra rich in chemical information and their interpretation rather challenging. The trilinear decomposition method could be appropriate for this task and could provide deeper insight into hydrogen bonding to MI. Combining the excitation fluorescence spectra with their emission counterparts to form a three-way data array, and solving the array with the Alternating Trilinear Decomposition (ATLD) algorithm, benefits the study of hydrogen bonding to MI in several respects. Firstly, making full use of the excitation spectra ensures that the experimentally collected data contain sufficient information for investigating signals originating from the weak interactions buried in the strong-interaction background. Secondly, the resolution of a three-way data array can theoretically guarantee the uniqueness of the resolved component spectra, which have actual physical meaning. Thirdly, the ATLD algorithm resolves the spectra of complex mixtures and determines the spectra of the corresponding individual components of different states without disturbing the complex chemical equilibrium involved. The hydrogen bonding interaction of MI with other molecules has been studied using the ATLD algorithm. A detailed investigation has been undertaken of the ¹La and ¹Lb states, the lowest excited singlet states, which dominate the fluorescence emission of MI depending on the effect of other molecules and the surrounding microenvironment. The hydrogen bonding between indole derivatives and other molecules has been examined, and some association constants for hydrogen bond formation have been estimated and compared with theoretical simulations or the experimental observations of previous researchers.
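The trilinear model underlying ATLD can be illustrated with a generic alternating least-squares decomposition of a samples × emission × excitation array, as sketched below; note that ATLD itself uses truncated pseudoinverse updates for noise resistance, so this is a simplified stand-in rather than the authors' algorithm.

```python
# Minimal sketch of a trilinear (PARAFAC-type) decomposition of an
# excitation-emission three-way array by alternating least squares.
import numpy as np

def khatri_rao(A, B):
    # Column-wise Khatri-Rao product.
    return np.einsum("ir,jr->ijr", A, B).reshape(-1, A.shape[1])

def trilinear_als(X, rank, n_iter=200, seed=0):
    """X: three-way array (samples x emission x excitation)."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank))
    B = rng.random((J, rank))
    C = rng.random((K, rank))
    X1 = X.reshape(I, J * K)                     # mode-1 unfolding
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)  # mode-2 unfolding
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)  # mode-3 unfolding
    for _ in range(n_iter):
        A = X1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C   # scores, emission profiles, excitation profiles
```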
Highlights
► This paper implemented an in-situ interpretation of hydrogen bonding.
► Three types of hydrogen bond interaction were analyzed simultaneously.
► The hydrogen bond interactions were quantitatively detected for the first time.
► The trilinear decomposition method makes effective use of multi-state fluorescence.

Statistical process monitoring via generalized non-negative matrix projection
15 February 2013
Publication year: 2013
Source:Chemometrics and Intelligent Laboratory Systems, Volume 121
As a well-known dimension reduction technique, non-negative matrix factorization (NMF) has been used in diverse scientific fields since its introduction. In this work, we propose a new statistical monitoring method based on the NMF framework. Since projection methods are standard in conventional approaches such as principal component analysis (PCA), a new variant of NMF based on positively constrained projections is presented here. This algorithm also relaxes the non-negativity restriction on the original data, and it is therefore called generalized non-negative matrix projection (GNMP). GNMP is used to extract the latent variables that drive a process and is combined with process monitoring techniques for fault detection. Kernel density estimation (KDE) is adopted to calculate the confidence limits of the defined statistical metrics. In addition, corresponding contribution plots are defined for fault isolation. The proposed method is then applied to the Tennessee Eastman process to evaluate its monitoring performance. The experimental results clearly illustrate the feasibility of the proposed method.
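To make the monitoring workflow concrete, here is a minimal sketch that substitutes scikit-learn's standard NMF for GNMP (so it assumes non-negative data), defines illustrative SPE and T²-type statistics, and sets their control limits from a kernel density estimate; the statistic definitions and the 99% quantile are assumptions for illustration, not the paper's exact metrics.

```python
# Minimal sketch of an NMF-based monitoring scheme with KDE control limits.
# Standard scikit-learn NMF stands in for GNMP, so the training data must
# be non-negative; GNMP itself relaxes that restriction.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.decomposition import NMF

def fit_monitor(X_normal, n_components=5, quantile=99.0):
    model = NMF(n_components=n_components, init="nndsvda", max_iter=500)
    W = model.fit_transform(X_normal)                 # latent scores
    mu, var = W.mean(axis=0), W.var(axis=0) + 1e-12
    spe = ((X_normal - W @ model.components_) ** 2).sum(axis=1)
    t2 = (((W - mu) ** 2) / var).sum(axis=1)          # T2-like score statistic
    limits = {name: np.percentile(gaussian_kde(s).resample(20000).ravel(),
                                  quantile)
              for name, s in (("SPE", spe), ("T2", t2))}
    return {"model": model, "mu": mu, "var": var, "limits": limits}

def monitor(state, X_new):
    W = state["model"].transform(X_new)
    spe = ((X_new - W @ state["model"].components_) ** 2).sum(axis=1)
    t2 = (((W - state["mu"]) ** 2) / state["var"]).sum(axis=1)
    return (spe > state["limits"]["SPE"]) | (t2 > state["limits"]["T2"])
```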
Highlights
► Propose a new variant named generalized non-negative matrix projection (GNMP).
► Define the monitoring metrics and adopt KDE to calculate the confidence limits.
► Define contribution plots for each of the monitoring indices.
► Apply the TE process to evaluate the monitoring performance.

Nonlinear regression method with variable region selection and application to soft sensors
15 February 2013
Publication year: 2013
Source:Chemometrics and Intelligent Laboratory Systems, Volume 121
In many fields, such as spectral analysis and process control, one attempts to select regions of explanatory variables, X. The genetic algorithm-based wavelength selection (GAWLS) method is one of the methods used to select combinations of important variables from the X-variables using regions as the unit of selection. However, because partial least squares is used as the regression method, GAWLS cannot handle nonlinear relationships between X and an objective variable, y. We therefore proposed a region selection method based on GAWLS and support vector regression (SVR), one of the nonlinear regression methods. The proposed method is named GAWLS–SVR. We applied GAWLS–SVR to simulation data and to industrial polymer process data, and confirmed that predictive, easy-to-interpret and appropriate models were constructed with the proposed method.
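A minimal sketch of the GAWLS–SVR idea is given below: a small genetic algorithm evolves binary masks over contiguous wavelength regions, and each mask is scored by the cross-validated R² of an RBF-kernel SVR built on the selected variables. Region width, population size and the SVR settings are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of region selection with a GA and SVR fitness.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

def region_fitness(region_mask, X, y, width):
    cols = np.repeat(region_mask.astype(bool), width)[: X.shape[1]]
    if not cols.any():
        return -1e6                               # penalise empty selections
    return cross_val_score(SVR(kernel="rbf", C=10.0), X[:, cols], y,
                           cv=5, scoring="r2").mean()

def gawls_svr(X, y, width=10, pop=30, gens=40, p_mut=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n_regions = int(np.ceil(X.shape[1] / width))
    P = rng.integers(0, 2, size=(pop, n_regions))
    for _ in range(gens):
        fit = np.array([region_fitness(ind, X, y, width) for ind in P])
        order = np.argsort(fit)[::-1]
        parents = P[order[: pop // 2]]                  # truncation selection
        children = []
        while len(children) < pop - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_regions)
            child = np.concatenate([a[:cut], b[cut:]])  # one-point crossover
            flip = rng.random(n_regions) < p_mut        # bit-flip mutation
            children.append(np.where(flip, 1 - child, child))
        P = np.vstack([parents, children])
    fit = np.array([region_fitness(ind, X, y, width) for ind in P])
    best = P[fit.argmax()]
    return np.repeat(best.astype(bool), width)[: X.shape[1]], fit.max()
```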
Highlights
► Regions of explanatory variables (X) need to be selected in many fields.
► A traditional method cannot handle nonlinear relationships between variables.
► Our goal is to select appropriate X-variable regions and construct a nonlinear model.
► We proposed a new variable region selection method with support vector regression.
► The performance of the proposed method was confirmed with a variety of data sets.

Product quality modelling and prediction based on wavelet relevance vector machines
15 February 2013
Publication year: 2013
Source:Chemometrics and Intelligent Laboratory Systems, Volume 121
To predict product quality and optimize the production process, product quality models need to be built. However, there are complex nonlinear relationships among the product quality parameters and the production process variables. Common methods cannot model the production process with high accuracy, nor can they provide prediction intervals. In contrast, kernel methods transform the original input data into a feature space via a kernel function, so that linear methods can then be used to resolve the nonlinear problem accurately. Moreover, the relevance vector machine, as a kernel method, can give prediction intervals, and a wavelet kernel inherits the local analysis and feature extraction abilities of the wavelet function. Product quality models based on wavelet relevance vector machines are proposed in this paper. A simulation data set, two chemistry data sets and a real field data set of zinc coating weights from strip hot-dip galvanizing are used to validate the model. The results demonstrate that the model based on wavelet relevance vector machines has higher prediction precision than common methods such as partial least squares (PLS), orthogonal signal correction-partial least squares (OSC-PLS), quadratic PLS, kernel partial least squares (KPLS), orthogonal signal correction-kernel partial least squares (OSC-KPLS), least squares support vector machines (LS-SVM) and ordinary relevance vector machines (RVM). Prediction intervals are also given by the presented model. Mexican hat, Morlet and Difference of Gaussian (DOG) wavelet relevance vector machines (WRVMs) show superior prediction performance for multi-group data compared to the other methods mentioned above.
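To show what a wavelet kernel looks like in code, the sketch below implements the Mexican hat wavelet kernel and plugs it into kernel ridge regression purely as a stand-in, since a relevance vector machine is not part of scikit-learn; the dilation parameter and ridge penalty are illustrative assumptions.

```python
# Minimal sketch of a Mexican-hat wavelet kernel used with a precomputed
# Gram matrix; kernel ridge regression stands in for the RVM here.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def mexican_hat_kernel(X, Z, a=1.0):
    # K(x, z) = prod_d (1 - u_d**2) * exp(-u_d**2 / 2), with u_d = (x_d - z_d) / a
    U = (X[:, None, :] - Z[None, :, :]) / a
    return np.prod((1.0 - U ** 2) * np.exp(-U ** 2 / 2.0), axis=2)

def fit_wavelet_krr(X_train, y_train, a=1.0, alpha=1e-2):
    K = mexican_hat_kernel(X_train, X_train, a)
    model = KernelRidge(alpha=alpha, kernel="precomputed").fit(K, y_train)
    def predict(X_new):
        return model.predict(mexican_hat_kernel(X_new, X_train, a))
    return predict
```

In an actual RVM the same precomputed Gram matrix would be used, with the sparse Bayesian learning step additionally yielding predictive variances and hence the prediction intervals discussed above.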
Highlights
► A product quality model based on the wavelet relevance vector machine (WRVM) is proposed.
► The wavelet relevance vector machine model can give exact prediction intervals.
► Zinc coating weights from strip hot-dip galvanizing are predicted to validate the model.
► WRVM has higher prediction precision than PLS, Q-PLS, KPLS, SVM and RVM.

Automatic image-based estimation of texture analysis as a monitoring tool for crystal growth
15 February 2013
Publication year: 2013
Source:Chemometrics and Intelligent Laboratory Systems, Volume 121
Online monitoring and feedback control are crucial elements of a commercial crystallization operation because they ensure that key production variables are closely regulated so as to achieve the specified textural and physical properties of the end product. Digital image texture analysis is a promising method for monitoring and control systems, and is becoming increasingly attractive due to the availability of high-speed imaging devices and equally powerful computers. This paper investigates the use of texture analyses, in the form of fractal dimension (FD) and energy signatures, as characteristic parameters to track crystal growth. The methodology addresses issues such as touching and overlapping crystals in images, which limit the available off-line and on-line imaging techniques. The algorithm uses a combination of thresholding and wavelet-texture analysis. Thresholding is used to identify crystal clusters and remove empty background. Wavelet–fractal and energy signatures are then computed to estimate texture on the crystal clusters. A series of images obtained at different crystal growth stages of a NaCl–water–ethanol anti-solvent crystallization system is investigated, and their texture characteristics, as well as their evolution during the crystallization process, are evaluated.
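The thresholding and wavelet energy-signature steps can be sketched as below, using an Otsu threshold to isolate crystal clusters and PyWavelets for the 2-D decomposition; the wavelet family, level count and threshold choice are assumptions for illustration, and the fractal-dimension estimate is omitted.

```python
# Minimal sketch: threshold the image to crop to crystal clusters, then
# compute normalised wavelet detail energies per level as texture features.
import numpy as np
import pywt
from skimage.filters import threshold_otsu

def wavelet_energy_signature(image, wavelet="db2", levels=3):
    image = np.asarray(image, dtype=float)
    mask = image > threshold_otsu(image)          # crop to crystal clusters
    rows, cols = np.nonzero(mask)
    crop = image[rows.min(): rows.max() + 1, cols.min(): cols.max() + 1]
    coeffs = pywt.wavedec2(crop, wavelet, level=levels)
    energies = []
    for detail in coeffs[1:]:                     # (cH, cV, cD) per level
        energies.append(sum(float((c ** 2).sum()) for c in detail))
    total = sum(energies) + 1e-12
    return np.array(energies) / total             # normalised energy per level
```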