Project author: UAMCAntwerpen

Project description:
The QED (Quantitative Estimation of Drug-likeness) molecular descriptor for use in combination with RDKit
Language: Python
Project URL: git://github.com/UAMCAntwerpen/qed.git
Created: 2020-11-26T13:45:24Z
Project community: https://github.com/UAMCAntwerpen/qed

License: The Unlicense


QED package

QED stands for quantitative estimation of drug-likeness. The concept was introduced by Richard Bickerton and coworkers: Bickerton, G.R.; Paolini, G.V.; Besnard, J.; Muresan, S.; Hopkins, A.L. (2012) ‘Quantifying the chemical beauty of drugs’, Nature Chemistry, 4, 90-98. This module relies on RDKit as its cheminformatics toolkit.

Introduction

This section is about installing and using QED within your own Python scripts or as a standalone Python tool.

The empirical rationale of the QED measure reflects the underlying distribution of molecular properties including molecular weight, logP, topological polar surface area, number of hydrogen bond donors and acceptors, the number of aromatic rings and rotatable bonds, and the presence of unwanted chemical functionalities.
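
For reference, the original publication combines the eight properties through per-property desirability functions. In the sketch below the symbols are those of Bickerton et al. and are not names used in this module: d_i is the desirability of property i (a value between 0 and 1), w_i is its weight, and n = 8 is the number of properties.

```latex
\mathrm{QED} = \exp\!\left(\frac{1}{n}\sum_{i=1}^{n}\ln d_i\right),
\qquad
\mathrm{QED}_w = \exp\!\left(\frac{\sum_{i=1}^{n} w_i \ln d_i}{\sum_{i=1}^{n} w_i}\right)
```

With all weights set to one, the weighted form reduces to the unweighted geometric mean of the desirabilities.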

The QED results generated by this module are not completely identical to those from the original publication. The differences stem from the property calculators used by the two implementations. For example, the logP values differ, even though both approaches (Pipeline Pilot in the original publication and RDKit in this implementation) state that they use the Wildman and Crippen methodology to calculate logP. To address this, Gregory Gerebtzoff kindly refitted the QED parameters with logP values generated by RDKit (see rdkit-discuss). These refitted values are the default in QED; the original publication values can still be used if desired.

This section assumes that you have installed RDKit correctly and are familiar with its basic functions. Reading the original QED publication is also recommended.

Installation

There are two ways to install QED: with Python’s standard pip tool, or by downloading the package from the GitHub repository and installing it manually. Both methods are explained below.

1. Installing with pip

This is probably the easiest way to install QED. First make sure you have pip installed:

  > pip -V
  pip 20.0.2 from /Users/hans/anaconda3/envs/rdkit/lib/python3.7/site-packages/pip (python 3.7)

If pip is not installed, then install it first by following these pip installation notes.

Now you are ready to install QED:

  > pip install uamc-qed

Depending on where Python is installed, you might need admin rights. In that case, type:

  > sudo pip install uamc-qed

2. Alternative: download from GitHub and run setup.py install

Download QED from GitHub. In this section we assume that you have downloaded the file into your ~/Downloads directory. Untar the file in this directory:

  > cd ~/Downloads
  > sudo tar -xvf uamc-qed-1.0.2.tar.gz
  > cd uamc-qed-1.0.2

You should now have a number of files in your ~/Downloads/uamc-qed-1.0.2 directory:

  > ls -l
  qed-1.0.2/
  qed-1.0.2/LICENSE
  qed-1.0.2/README.md
  qed-1.0.2/dist/
  qed-1.0.2/how_to_make_distribution.txt
  qed-1.0.2/qed/
  qed-1.0.2/setup.py

Move into the dist directory and untar again:

  > cd dist
  > tar -xvf qed-1.0.2.tar.gz
  > cd qed-1.0.2

Now install with Python:

  > python setup.py install

This creates a qed folder containing all the QED files in your default Python site-packages directory. You might need root permissions:

  > sudo python setup.py install

Check your installation by launching Python:

  >>> from qed import qed
  >>> from rdkit import Chem
  >>> m = Chem.MolFromSmiles('c1ccccc1')
  >>> qed.default(m)
  0.44619898311523704
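
If the import fails, it can help to check which of the two packages is not visible to the interpreter. The following is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper and not part of QED, and the import names `rdkit` and `qed` are the ones used in the session above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "rdkit" and "qed" are the import names used in this README.
missing = missing_modules(["rdkit", "qed"])
print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only shows that the modules can be located; the session above remains the actual functional check.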

Using QED

The qed() function takes an RDKit molecule as its argument and returns the corresponding QED value.
The qed() function comes in three flavors that differ in the relative weights applied to the underlying molecular descriptors. These three flavors correspond to the three QED measures described in the original publication:

  • QEDw,max using the set of weights that give maximal information content.
  • QEDw,mo using the mean weights of the optimal 1,000 weight combinations that give the highest information content.
  • QEDw,u with all weights as unity, hence unweighted.

Specifying the required QED weighting scheme in QED is done using the corresponding function:

  • qed.weights_mean() uses the mean weighting scheme and corresponds to QEDw,mo. Another name for this function is qed.default().
  • qed.weights_max() specifies the max weighting scheme and corresponds to QEDw,max.
  • qed.weights_none() specifies unit weights and corresponds to QEDw,u.

These functions are used as follows:

  >>> from qed import qed
  >>> from rdkit import Chem
  >>> m = Chem.MolFromSmiles('c1ccccc1')
  >>> qed.default(m)
  0.44619898311523704
  >>> qed.weights_mean(m)
  0.44619898311523704
  >>> qed.weights_max(m)
  0.4733526950948539
  >>> qed.weights_none(m)
  0.3047153431243204
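
When scoring molecules in a script, it is convenient to collect all three weighting schemes at once and to guard against SMILES strings that RDKit cannot parse (Chem.MolFromSmiles returns None in that case). The following is a minimal sketch; `score_all_schemes` is a hypothetical helper, and its `parse` and `schemes` parameters exist only so that it can be exercised with stand-ins when RDKit and QED are not installed.

```python
def score_all_schemes(smiles, parse=None, schemes=None):
    """Return {scheme: QED value} for one SMILES string, or None if unparsable.

    By default the SMILES is parsed with RDKit and scored with the three
    qed functions named in this README.
    """
    if parse is None:
        from rdkit import Chem
        parse = Chem.MolFromSmiles
    if schemes is None:
        from qed import qed
        schemes = {
            "mean": qed.weights_mean,  # QEDw,mo (same as qed.default)
            "max": qed.weights_max,    # QEDw,max
            "none": qed.weights_none,  # QEDw,u
        }
    mol = parse(smiles)
    if mol is None:  # invalid SMILES
        return None
    return {name: func(mol) for name, func in schemes.items()}
```

With both packages installed, `score_all_schemes('c1ccccc1')` should return the three benzene values shown above under the keys "mean", "max" and "none".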

As mentioned above, the current implementation of QED uses the refitted logP parameters from Gregory Gerebtzoff. The original values can still be used by passing False as the second argument to the appropriate function call:

  >>> qed.default(m, False)
  0.4426283718993647
  >>> qed.weights_mean(m, False)
  0.4426283718993647
  >>> qed.weights_max(m, False)
  0.4706596045646091
  >>> qed.weights_none(m, False)
  0.3021185764176498
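
For benzene the two parameter sets differ by roughly 0.003 to 0.004 in each weighting scheme (for example 0.4462 versus 0.4426 with the default weights). To see how large the effect is for other molecules, the difference can be computed directly. This is a minimal sketch; `refit_shift` is a hypothetical helper that only assumes the calling convention shown above (one argument for the refitted value, False as second positional argument for the original one).

```python
def refit_shift(mol, score):
    """Return score(mol) minus score(mol, False).

    `score` is expected to behave like qed.default, qed.weights_max or
    qed.weights_none as described in this README.
    """
    return score(mol) - score(mol, False)


# Intended use (requires RDKit and QED):
#   from qed import qed
#   from rdkit import Chem
#   refit_shift(Chem.MolFromSmiles('c1ccccc1'), qed.default)
```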

The QED module also provides its own test set:

  >>> test = qed.test_data()
  >>> for name in test: print(test[name], name)
  ...
  Nc1nc(NC2CC2)c2ncn([C@@H]3C[C@H](CO)C=C3)c2n1 Abacavir
  CC(=O)NCCCS(O)(=O)=O Acamprosate
  CCCC(=O)Nc1ccc(OCC(O)CNC(C)C)c(c1)C(C)=O Acebutolol
  CC(=O)Nc1ccc(O)cc1 Acetaminophen
  CC(=O)Nc1nnc(s1)S(N)(=O)=O Acetazolamide
  CC(=O)c1ccc(cc1)S(=O)(=O)NC(=O)NC1CCCCC1 Acetohexamide
  CC(=O)c1ccc2Sc3ccccc3N(CCCN3CCN(CCO)CC3)c2c1 Acetophenazine
  Fc4ccc(C1CCNCC1COc3ccc2OCOc2c3)cc4 Paroxetine
  Cc1oncc1C(=O)Nc2ccc(C(F)(F)F)cc2 Leflunomide
  CN1C4CCCC1CC(NC(=O)c2nn(C)c3ccccc23)C4 Granisetron
  CCCN2CC(CSC)CC1c3cccc4[nH]cc(CC12)c34 Pergolide
  CCc3c(C)[nH]c2CCC(CN1CCOCC1)C(=O)c23 Molindone
  CCCCCCCCCCCCCCCC(=O)OCC(NC(=O)C(Cl)Cl)C(O)c1ccc([N+]([O-])=O)cc1 ChloramphenicalPalmitate
  CCCCCCCCCCCCCCCOC(=O)C2C(O)C(O)C(C(NC(=O)C1CC(CCC)CN1C)C(C)Cl)OC2SC ClindamycinPalmitate
  CCOc3nc2cccc(C(=O)OC(C)OC(=O)OC1CCCCC1)c2n3Cc6ccc(c4ccccc4c5nn[nH]n5)cc6 CandesartanCilexetil
  CN(C)CCC=c2c1ccccc1sc3ccc(Cl)cc23 Chlorprothixene
  O=c3c(O)c(C2CCC(c1ccc(Cl)cc1)CC2)c(=O)c4ccccc34 Atovaquone
  CN(C)CCCN3c1ccccc1CCc2ccc(Cl)cc23 Clomipramine
  CN4CCCC(CC3c1ccccc1Sc2ccccc23)C4 Methixene
  CCN(CC)C(C)Cn3c1ccccc1sc2ccccc23 Ethopropazine
  N=C(CCSCc1csc(N=C(N)N)n1)NS(N)(=O)=O Famotidine
  CNC(=NCCSCc1nc[nH]c1C)NC#N Cimetidine
  CCCCCNC(=N)NN=Cc1c[nH]c2ccc(CO)cc12 Tegaserod
  C=CC3=C(C(=O)O)N2C(=O)C(NC(=O)C(=NO)c1csc(N)n1)C2SC3 Cefdinir
  CC5(C)SC4C(NC(=O)C(C(=O)Oc2ccc1CCCc1c2)c3ccccc3)C(=O)N4C5C(=O)O CarbenicillinIndanyl
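
Since the test set behaves like a mapping from compound name to SMILES string (as the loop above suggests), it can be ranked by QED value. The following is a minimal sketch; `rank_by_score` is a hypothetical helper that takes the parser and the scoring function as arguments, so any of the weighting schemes can be plugged in.

```python
def rank_by_score(smiles_by_name, parse, score):
    """Return [(name, value), ...] sorted from highest to lowest value.

    Entries whose SMILES cannot be parsed (parse returns None) are skipped.
    """
    ranked = []
    for name in smiles_by_name:
        mol = parse(smiles_by_name[name])
        if mol is None:
            continue
        ranked.append((name, score(mol)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Intended use (requires RDKit and QED):
#   from qed import qed
#   from rdkit import Chem
#   for name, value in rank_by_score(qed.test_data(), Chem.MolFromSmiles, qed.default):
#       print("%6.3f  %s" % (value, name))
```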

The individual properties used to calculate the QED value can be inspected by calling the properties() function:

  >>> for name in test:
  ...     mol = Chem.MolFromSmiles(test[name])
  ...     p = qed.properties(mol)
  ...     print("%6.2f\t%6.3f\t%6d\t%6d\t%6.2f\t%6d\t%6d\t%6d\t%6.3f\t%-s" % (p[0],p[1],p[2],p[3],p[4],p[5],p[6],p[7],qed.default(mol),name))
  ...
  286.34   1.092       6       3  101.88       4       2       1   0.737  Abacavir
  181.21  -0.600       4       2   83.47       4       0       2   0.467  Acamprosate
  336.43   2.365       5       3   87.66      10       1       1   0.571  Acebutolol
  151.16   1.351       2       2   49.33       1       1       1   0.602  Acetaminophen
  222.25  -0.856       5       2  115.04       2       1       1   0.662  Acetazolamide
  324.40   2.210       4       2   92.34       4       1       1   0.833  Acetohexamide
  411.57   3.492       6       1   47.02       7       2       1   0.688  Acetophenazine
  329.37   3.327       4       1   39.72       4       2       0   0.917  Paroxetine
  270.21   3.254       3       1   55.13       2       2       0   0.896  Leflunomide
  312.42   2.318       3       1   50.16       2       2       0   0.927  Granisetron
  314.50   4.271       2       1   19.03       4       2       0   0.871  Pergolide
  276.38   1.963       3       1   45.33       3       1       0   0.923  Molindone
  561.55   6.941       6       2  118.77      21       1       5   0.056  ChloramphenicalPalmitate
  663.41   6.279       8       3  108.33      22       0       3   0.071  ClindamycinPalmitate
  610.67   6.319      10       1  143.34      10       5       2   0.141  CandesartanCilexetil
  315.87   5.188       2       0    3.24       3       3       0   0.629  Chlorprothixene
  366.84   5.505       3       1   54.37       2       2       0   0.741  Atovaquone
  314.86   4.528       2       0    6.48       4       2       0   0.782  Clomipramine
  309.48   5.015       2       0    3.24       2       2       0   0.735  Methixene
  312.48   5.020       3       0    6.48       5       2       0   0.734  Ethopropazine
  337.46  -0.558       6       5  173.33       7       1       3   0.263  Famotidine
  252.35   0.597       5       3   88.89       5       1       5   0.239  Cimetidine
  301.39   2.298       3       5   96.29       7       2       4   0.235  Tegaserod
  395.42  -0.172       8       4  158.21       5       1       4   0.239  Cefdinir
  494.57   2.496       7       2  113.01       6       2       4   0.274  CarbenicillinIndanyl
  >>> print(" MW\t ALOGP\t HBA\t HBD\t PSA\t ROTB\t AROM\tALERTS\t QED\tNAME")
      MW   ALOGP     HBA     HBD     PSA    ROTB    AROM  ALERTS     QED  NAME
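
Indexing the result as p[0] to p[7] is easy to get wrong, so it can help to attach the column names to the values. The following is a minimal sketch; `label_properties` is a hypothetical helper, and the order of the names is an assumption inferred from the format string and header line above rather than taken from the QED source code.

```python
# Column order as printed in the header line of the table above (assumed to
# match the order of the values returned by qed.properties()).
PROPERTY_NAMES = ("MW", "ALOGP", "HBA", "HBD", "PSA", "ROTB", "AROM", "ALERTS")


def label_properties(props):
    """Pair the eight property values with their column names."""
    values = tuple(props)
    if len(values) != len(PROPERTY_NAMES):
        raise ValueError(
            "expected %d values, got %d" % (len(PROPERTY_NAMES), len(values))
        )
    return dict(zip(PROPERTY_NAMES, values))


# Values copied from the Abacavir row of the table above.
abacavir = label_properties((286.34, 1.092, 6, 3, 101.88, 4, 2, 1))
print(abacavir["MW"], abacavir["ALERTS"])
```

In a real session the argument would be the result of qed.properties(mol) instead of the hard-coded tuple.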

Revision history

Version 1.0.2

  • Added to the PyPI repository to enable installation using pip.

Version 1.0.1

  • Incorporation of the refitted logP parameters of Gregory Gerebtzoff and making these values default (rdkit-discuss).
  • Modification of the qed() function to enable the selection of the original parameters, if desired.

DOI identifier

DOI:10.5281/zenodo.4293730