Quantitative Modeling of Physical Properties of Crude Oil Hydrocarbons Using Volsurf+ Molecular Descriptors
Saaidpour Saadi *, Ghaderi Faraidon
Department of Chemistry, Faculty of Science
Sanandaj Branch, Islamic Azad University, Sanandaj, Iran
*E-mail: sasaaidpour@iausdj.ac.ir
Received: 29 April 2016; revised: 18 July 2016; accepted: 26 July 2016; published online: 24 August 2016
DOI: 10.12921/cmst.2016.0000022
Abstract:
The quantitative structure-property relationship (QSPR) method was used to correlate the structures of crude oil hydrocarbons with their physical properties. VolSurf+ descriptors were used for QSPR modeling of the boiling point, Henry’s law constant and water solubility of eighty crude oil hydrocarbons. A subset of the calculated descriptors, selected by stepwise regression (SR), was used to develop the models, and multiple linear regression (MLR) was used to construct the linear models. The predicted values agree well with the experimental values of these properties. The comparison results indicate the superiority of the presented models and show that they can be used effectively to predict the boiling point, Henry’s law constant and water solubility of crude oil hydrocarbons from molecular structure alone. The stability and predictivity of the proposed models were assessed by internal validation (leave-one-out and leave-many-out) and external validation. Application of the developed models to a test set of 16 compounds demonstrates that the new models are reliable, with good predictive accuracy and a simple formulation.
Key words:
boiling point, crude oil hydrocarbons, Henry’s law constant, VolSurf+ descriptors, water solubility
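The workflow summarized in the abstract (stepwise descriptor selection, an MLR fit, leave-one-out internal validation and an external test set) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic random numbers rather than VolSurf+ descriptors, the function names are hypothetical, the forward-selection rule based on a gain in Q²(LOO) is one common variant of stepwise regression and not necessarily the authors' criterion, and the 64/16 split assumes the 16 test compounds are drawn from the eighty.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply a fitted linear model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def q2_loo(X, y):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / SS_tot."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i          # drop compound i, refit, predict it
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict_mlr(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def forward_stepwise(X, y, max_terms=3, min_gain=0.01):
    """Greedy forward selection: add the descriptor that most improves Q^2(LOO)."""
    selected, best_q2 = [], -np.inf
    while len(selected) < max_terms:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        scores = {j: q2_loo(X[:, selected + [j]], y) for j in candidates}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_q2 < min_gain:
            break                          # no worthwhile improvement: stop
        selected.append(j_best)
        best_q2 = scores[j_best]
    return selected, best_q2


# Synthetic stand-in data (assumption): 80 "compounds", 6 "descriptors",
# with the property depending only on descriptors 1 and 4 plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + rng.normal(scale=0.1, size=80)

# Hold out the last 16 rows as an external test set.
X_train, y_train = X[:64], y[:64]
X_test, y_test = X[64:], y[64:]

selected, q2 = forward_stepwise(X_train, y_train)
coef = fit_mlr(X_train[:, selected], y_train)
r2_test = r_squared(y_test, predict_mlr(coef, X_test[:, selected]))

print("selected descriptor columns:", selected)
print("Q2 (LOO, training set):", round(q2, 3))
print("R2 (external test set):", round(r2_test, 3))
```

Scoring each candidate descriptor by Q²(LOO) rather than by the fitted R² keeps the selection step from rewarding descriptors that only fit noise, which is the purpose of the internal validation described above.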