AA1: Machine learning


Thursday, 16 March 2023

Machine Learning to Predict the Adsorption Capacity of Microplastics

Nanomaterials 2023, 13(6), 1061

Plastic materials are now produced and used extensively across industrial activities. These plastics, either from their primary production sources or through their own degradation, can contaminate ecosystems with micro- and nanoplastics. Once in the aquatic environment, microplastics can adsorb chemical pollutants, which helps those pollutants disperse more quickly in the environment and affect living beings. Because information on adsorption is scarce, three machine learning models (random forest, support vector machine, and artificial neural network) were developed to predict different microplastic/water partition coefficients (log Kd) using two different approximations (based on the number of input variables). The best-selected machine learning models present, in general, correlation coefficients above 0.92 in the query phase, which indicates that these types of models could be used for the rapid estimation of the adsorption of organic contaminants on microplastics.
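As a rough illustration of the correlation coefficient quoted above, the sketch below computes Pearson's r between experimental and predicted log Kd values. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the function name and all numbers are hypothetical, invented for illustration only.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical log Kd values (not taken from the paper).
observed_log_kd = [2.1, 3.4, 4.0, 4.8, 5.5]
predicted_log_kd = [2.3, 3.1, 4.2, 4.6, 5.7]

r = pearson_r(observed_log_kd, predicted_log_kd)
print(f"r = {r:.3f}")
```

A value of r close to 1 on data held out from training (the query phase) suggests the predictions follow the experimental trend, although it says nothing on its own about systematic bias.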

Wednesday, 14 December 2022

Global Solar Irradiation Modelling and Prediction Using Machine Learning Models for Their Potential Use in Renewable Energy Applications

Mathematics 2022, 10(24), 4746

Global solar irradiation is an important variable that can be used to determine the suitability of an area for installing solar systems; nevertheless, because measurement stations are not available everywhere, it can instead be correlated with different meteorological parameters. To confront this issue, different locations in Rias Baixas (Autonomous Community of Galicia, Spain) and combinations of parameters (month and average temperature, among others) were used to develop various machine learning models: random forest (RF), support vector machine (SVM) and artificial neural network (ANN). These three approaches were used to model and predict (one month ahead) monthly global solar irradiation using the data from six measurement stations. Afterwards, these models were applied to seven different measurement stations to check whether the knowledge acquired could be extrapolated to other locations. In general, the ANN models offered the best results for the development and testing phases of the model, as well as for the phase of knowledge extrapolation to other locations. In this sense, the selected ANNs obtained a mean absolute percentage error (MAPE) between 3.9 and 13.8% for the model development and an overall MAPE between 4.1 and 12.5% for the other seven locations. ANNs can be a capable tool for modelling and predicting monthly global solar irradiation in areas where data are available and for extrapolating this knowledge to nearby areas.
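For readers unfamiliar with the metric, MAPE is the mean of |observed - predicted| / |observed|, expressed as a percentage. The following is a minimal sketch of that calculation; the function name and the monthly values are hypothetical and are not data from the paper.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


# Hypothetical monthly global solar irradiation values (illustrative only).
observed_monthly = [150.0, 210.0, 330.0, 480.0]
predicted_monthly = [160.0, 200.0, 345.0, 455.0]

print(f"MAPE = {mape(observed_monthly, predicted_monthly):.1f}%")
```

Because each error is scaled by the observed value, MAPE lets months with low and high irradiation contribute on a comparable footing, but it becomes unstable when observed values approach zero.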

Friday, 2 December 2022

Comparison of machine learning techniques for reservoir outflow forecasting

Nat. Hazards Earth Syst. Sci., 22, 3859–3874, 2022

Reservoirs play a key role in many human societies due to their capability to manage water resources. In addition to their role in water supply and hydropower production, their ability to retain water and control the flow makes them a valuable asset for flood mitigation. This is a key function, since extreme events have increased in the last few decades as a result of climate change, and therefore, the application of mechanisms capable of mitigating flood damage will be key in the coming decades. Having a good estimation of the outflow of a reservoir can be an advantage for water management or early warning systems. When historical data are available, data-driven models have proven to be a useful tool for different hydrological applications. In this sense, this study analyzes the efficiency of different machine learning techniques to predict reservoir outflow, namely multivariate linear regression (MLR) and three artificial neural networks: multilayer perceptron (MLP), nonlinear autoregressive exogenous (NARX) and long short-term memory (LSTM). These techniques were applied to forecast the outflow of eight water reservoirs of different characteristics located in the Miño River (northwest of Spain). In general, the results obtained showed that the proposed models provided a good estimation of the outflow of the reservoirs, improving the results obtained with classical approaches such as assuming that the reservoir outflow equals that of the previous day. Among the different machine learning techniques analyzed, the NARX approach was the option that provided the best estimations on average.
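The classical approach mentioned above, taking today's outflow to be yesterday's, is often called a persistence forecast. The sketch below shows how such a baseline could be scored so that a learned model has something to beat. The mean absolute error is an assumed metric (the abstract does not name one), and the function names and outflow series are hypothetical.

```python
def persistence_forecast(series):
    """Forecast for each day from the second onward: the previous day's value."""
    return series[:-1]


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical daily reservoir outflow in m^3/s (illustrative only).
daily_outflow = [120.0, 135.0, 150.0, 140.0, 160.0]

observed = daily_outflow[1:]                     # days 2..5
baseline = persistence_forecast(daily_outflow)   # days 1..4, shifted forward
baseline_mae = mean_absolute_error(observed, baseline)
print(f"persistence MAE = {baseline_mae:.1f} m^3/s")
```

Any data-driven model evaluated on the same days would need a lower error than this baseline to justify its added complexity.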

Thursday, 17 February 2022

Integrated Machine Learning and Chemoinformatics-Based Screening of Mycotic Compounds against Kinesin Spindle Protein Eg5 for Lung Cancer Therapy

Molecules 2022, 27(5), 1639

Among the various types of cancer, lung cancer is the second most-diagnosed cancer worldwide. The kinesin spindle protein, Eg5, is a vital protein behind bipolar mitotic spindle establishment and maintenance during mitosis. Eg5 has been reported to contribute to cancer cell migration and angiogenesis impairment and has no role in resting, non-dividing cells. Thus, it could be considered a vital target against several cancers, such as renal cancer, lung cancer, urothelial carcinoma, prostate cancer, squamous cell carcinoma, etc. In recent years, fungal secondary metabolites from the Indian Himalayan Region (IHR) have been identified as an important lead source in the drug development pipeline. Therefore, the present study aims to identify potential mycotic secondary metabolites against the Eg5 protein by applying integrated machine learning, chemoinformatics-based in silico screening methods and molecular dynamics simulation targeting lung cancer. Initially, a library of 1830 mycotic secondary metabolites was screened by a predictive machine-learning model based on the random forest algorithm with high sensitivity (1) and an ROC area of 0.99. Further, the 319 out of 1830 compounds that the model predicted as potentially active were evaluated for their drug-likeness properties by applying four filters simultaneously, viz., Lipinski's rule, CMC-50 like rule, Veber rule, and Ghose filter. A total of 13 compounds that passed all the above filters were considered for molecular docking, functional group analysis, and cell line cytotoxicity prediction. Finally, four hit mycotic secondary metabolites found in fungi from the IHR were identified, viz., (−)-Cochlactone-A, Phelligridin C, Sterenin E, and Cyathusal A. All compounds have efficient binding potential with Eg5, contain functional groups such as aromatic rings, rings, carboxylic acid esters, and carbonyls, and show cell line cytotoxicity against lung cancer cell lines, namely, MCF-7, NCI-H226, NCI-H522, A549, and NCI H187. Further, the molecular dynamics simulation study confirms the docked complex rigidity and stability by exploring root mean square deviations, root mean square fluctuations, and radius of gyration analysis from 100 ns simulation trajectories. The screened compounds could be used further to develop effective drugs against lung and other types of cancer.
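To make the drug-likeness step more concrete, here is a minimal sketch of two of the four filters named above, Lipinski's rule of five and the Veber rule, applied to precomputed descriptors. It is not the authors' pipeline: the CMC-50 like rule and Ghose filter are omitted, the strict "zero Lipinski violations" default is an assumption, and the compound names and descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    name: str
    mol_weight: float       # molecular weight, g/mol
    logp: float             # octanol/water partition coefficient
    h_donors: int           # hydrogen-bond donors
    h_acceptors: int        # hydrogen-bond acceptors
    rotatable_bonds: int
    tpsa: float             # topological polar surface area, square angstroms


def lipinski_violations(d):
    """Count violations of Lipinski's rule of five."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_veber(d):
    """Veber rule: at most 10 rotatable bonds and TPSA of at most 140."""
    return d.rotatable_bonds <= 10 and d.tpsa <= 140


def passes_filters(d, max_lipinski_violations=0):
    """Apply both filters together; the violation threshold is an assumption."""
    return lipinski_violations(d) <= max_lipinski_violations and passes_veber(d)


# Hypothetical descriptor values (not from the paper's compound library).
library = [
    Descriptors("compound_A", 342.3, 2.1, 2, 6, 4, 96.0),
    Descriptors("compound_B", 712.9, 6.3, 7, 12, 14, 188.0),
]

hits = [d.name for d in library if passes_filters(d)]
print(hits)
```

In practice the descriptors would come from a cheminformatics toolkit rather than being typed in by hand; they are hard-coded here only to keep the sketch self-contained.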

Friday, 8 October 2021

Machine Learning Applied to the Oxygen-18 Isotopic Composition, Salinity and Temperature/Potential Temperature in the Mediterranean Sea

Mathematics 2021, 9(19), 2523

This study proposed different techniques to estimate the isotope composition (δ18O), salinity and temperature/potential temperature in the Mediterranean Sea using five different variables: (i–ii) geographic coordinates (longitude, latitude), (iii) year, (iv) month and (v) depth. Three kinds of models based on artificial neural network (ANN), random forest (RF) and support vector machine (SVM) were developed. According to the results, the random forest models present the best prediction accuracy for the querying phase and can be used to predict the isotope composition (mean absolute percentage error (MAPE) around 4.98%), salinity (MAPE below 0.20%) and temperature (MAPE around 2.44%). These models could be useful for research works that require the use of past data for these variables.

Wednesday, 23 June 2021

Metal and metalloid profile as a fingerprint for traceability of wines under any Galician protected designation of origin

Journal of Food Composition and Analysis, 102, 104043, 2021

Effective and cheap methods for detecting fraud and guaranteeing wine authenticity are of paramount importance in the sector. In this sense, three different kinds of prediction models (random forest, artificial neural networks, and support vector machines) were developed to classify wines according to their element contents (metals and metalloids, obtained using an inductively coupled plasma with a quadrupole mass spectrometer and an optical emission spectrophotometer). The models were first developed using 45 input variables and were then subjected to a variable-reduction process to simplify them and save material and time costs. Total accuracy was reached in all phases for the white wine random forest models. From a practical point of view, the accuracy and the errors obtained by the selected models (except for the red wine artificial neural network developed using reduced variables) are acceptable. The models developed with fewer variables can make the prediction task easier.

Monday, 13 July 2020

Stability assessment of extracts obtained from Arbutus unedo L. fruits in powder and solution systems using machine-learning methodologies

Food Chem. 2020, 333, 127460

Arbutus unedo L. (strawberry tree) has shown a considerable content of phenolic compounds, especially flavan-3-ols (catechin, gallocatechin, among others). Interest in flavan-3-ols has increased due to their bioactive actions, namely antioxidant and antimicrobial activities, and to the association of their consumption with diverse health benefits, including the prevention of obesity, cardiovascular diseases or cancer. These compounds, mainly catechin, have shown potential for use as natural preservatives in foodstuffs; however, their degradation is increased by the pH and temperature of processing and storage, which can limit their use by the food industry. To model the degradation kinetics of these compounds under different storage conditions, three kinds of machine learning models were developed: i) random forest, ii) support vector machine and iii) artificial neural network. The selected models can be used to track the kinetics of the different compounds and properties under study without requiring prior knowledge of the reaction system.

Saturday, 4 July 2020

Stability assessment of extracts obtained from Arbutus unedo L. fruits in powder and solution systems using machine-learning methodologies

Food Chemistry, 2020, 333, 127460

DOI:10.1016/j.foodchem.2020.127460

Arbutus unedo L. (strawberry tree) has shown a considerable content of phenolic compounds, especially flavan-3-ols (catechin, gallocatechin, among others). Interest in flavan-3-ols has increased due to their bioactive actions, namely antioxidant and antimicrobial activities, and to the association of their consumption with diverse health benefits, including the prevention of obesity, cardiovascular diseases or cancer. These compounds, mainly catechin, have shown potential for use as natural preservatives in foodstuffs; however, their degradation is increased by the pH and temperature of processing and storage, which can limit their use by the food industry. To model the degradation kinetics of these compounds under different storage conditions, three kinds of machine learning models were developed: i) random forest, ii) support vector machine and iii) artificial neural network. The selected models can be used to track the kinetics of the different compounds and properties under study without requiring prior knowledge of the reaction system.
