Project author: sebotic

Project description:
A Python wrapper for the Chemistry Development Kit (CDK)
Language: Python
Repository: git://github.com/sebotic/cdk_pywrapper.git
Created: 2016-08-26T09:23:39Z
Project page: https://github.com/sebotic/cdk_pywrapper

License: GNU Affero General Public License v3.0



Python Wrapper for the Chemistry Development Kit

tl;dr

  • A Python wrapper for the CDK (which is written in Java)
  • Primary purpose:
    • Generate diverse chemical compound identifiers (SMILES, InChI)
    • Inter-convert between these identifiers
  • Fully compatible with Python 3.x

Motivation

The chemistry world has only a small number of open-source tools, e.g. OpenBabel and the
Chemistry Development Kit (CDK, hosted on GitHub).

I have been using OpenBabel for some time now. It is a great tool offering many options,
but I found several issues that make it hard to use:

  • Generating InChI (keys) from SMILES often either does not work or struggles with stereochemistry.
  • InChI cannot be used as an input format.

Installation

    git clone https://github.com/sebotic/cdk_pywrapper.git
    cd cdk_pywrapper
    pip install .

This installs the package on your local system, downloads the CDK, and builds cdk_bridge.java.
After that, cdk_pywrapper should be ready to use, as in the example below.

Don’t forget to use, for example, sudo for a global installation, or pip3 for Python 3.

I will also host this on PyPI soon, so no repo cloning will be required. I have tested it on Linux and macOS; I am not sure whether it works on Windows.
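
Because the CDK is written in Java and the install step builds cdk_bridge.java, a Java runtime and compiler presumably need to be on the PATH. The following is a minimal pre-flight sketch using only the Python standard library. The function name preflight is hypothetical and not part of cdk_pywrapper, and the assumption that java and javac are the executables the build relies on is not confirmed by this README.

```python
import importlib.util
import shutil
import sys


def preflight():
    """Report whether the pieces this README mentions appear to be available.

    Assumption: the build needs `java` and `javac` on the PATH. This is a
    guess based on the install step compiling cdk_bridge.java.
    """
    return {
        "python3": sys.version_info.major >= 3,
        "java": shutil.which("java") is not None,
        "javac": shutil.which("javac") is not None,
        # True only once `pip install .` has completed successfully.
        "cdk_pywrapper": importlib.util.find_spec("cdk_pywrapper") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name:15s} {'found' if ok else 'MISSING'}")
```

Running this before and after the install shows which prerequisite is missing if the import in the example fails.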

Example

    from cdk_pywrapper.cdk_pywrapper import Compound
    smiles = 'CCN1C2=CC=CC=C2SC1=CC=CC=CC3=[N+](C4=CC=CC=C4S3)CC.[I-]'
    cmpnd = Compound(compound_string=smiles, identifier_type='smiles')
    ikey = cmpnd.get_inchi_key()
    print(ikey)

Output: ‘MNQDKWZEUULFPX-UHFFFAOYSA-M’
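
When generating keys in bulk, it can help to sanity-check the returned strings. The sketch below tests only the general InChIKey layout (14 uppercase letters, a hyphen, 10 uppercase letters, a hyphen, and one final uppercase letter, 27 characters in total). It does not verify that the key matches the structure, and the helper name looks_like_inchi_key is hypothetical, not part of cdk_pywrapper.

```python
import re

# InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHI_KEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchi_key(value):
    """Return True if value is a string with the 27-character InChIKey layout."""
    return isinstance(value, str) and INCHI_KEY_PATTERN.fullmatch(value) is not None


print(looks_like_inchi_key("MNQDKWZEUULFPX-UHFFFAOYSA-M"))  # True
print(looks_like_inchi_key("not-a-key"))                    # False
```

A check like this catches empty or truncated results early, before the keys are written to a database or compared against other sources.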