PyPI download | rdkit_to_params-1.1.9.tar.gz (CSDN Library resource)


Uploaded 2022-02-01 23:12:39, 43 KB, GZ archive

24 files in total

py: 16

txt: 4

pkg-info: 2

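PyPI download: rdkit_to_params-1.1.9.tar.gz, a Python library and its use in distributed environments

PyPI (Python Package Index) is the central repository of third-party Python packages, where developers publish libraries for others to install. The file `rdkit_to_params-1.1.9.tar.gz` downloaded from PyPI is the source distribution of version 1.1.9 of a cheminformatics library. The archive unpacks to a single top-level directory, `rdkit_to_params-1.1.9`, which is the usual layout for a Python source distribution.

RDKit is an open-source cheminformatics toolkit for working with molecular data. It parses and writes many molecular file formats, supports modelling tasks and a range of chemical calculations, and can draw chemical structures.

A source distribution of this kind typically contains the parts listed below. Of these, the file listing for this archive shows `setup.py`, `rdkit_to_params` and `tests`, but no `docs`, `requirements.txt` or `LICENSE`.

1. **setup.py**: the build script. It declares the project metadata (name, version, author and so on) and describes how the project is built, installed and packaged.
2. **rdkit_to_params**: the package directory holding the `.py` modules that implement the library. According to the README, they create or modify Rosetta params (topology) files from RDKit molecules.
3. **tests**: unit and integration tests that let developers check that the code behaves as intended.
4. **docs**: project documentation, tutorials and other help material, usually written in Markdown or reStructuredText so that HTML or PDF output can be generated.
5. **requirements.txt**: the Python dependencies and their versions, so that the project installs and runs consistently across environments.
6. **LICENSE**: the licence terms that govern how the library may be used and distributed.

A library of this kind may be used in drug discovery, materials science and environmental chemistry, particularly in data-analysis and machine-learning pipelines. For large data sets the work can be spread over many workers, and a coordination service such as Apache ZooKeeper can hold the shared configuration so that several instances cooperate consistently. Nothing in the package listing or README indicates built-in ZooKeeper support, so any such integration belongs to the surrounding system rather than to the library. In a cloud-native setting, a Python library like `rdkit_to_params` can be packaged into containers (Docker) and orchestrated (Kubernetes) to scale with demand.

In summary, `rdkit_to_params-1.1.9.tar.gz` is one release of a Python library that builds on RDKit to prepare molecular parameter files, and it can be embedded in distributed or cloud-native workflows like any other Python package.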
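The metadata that `setup.py` declares is written into the `PKG-INFO` file of a source distribution (the archive listed on this page contains two of them). That file uses e-mail-style headers followed by the long description, so the standard library can read it. The following is a minimal sketch; `read_core_metadata` is a hypothetical helper and the sample text is an invented excerpt, not the real 17 KB file.

```python
from email.parser import Parser

# Hypothetical excerpt for illustration; the field values are assumptions.
SAMPLE_PKG_INFO = """\
Metadata-Version: 2.1
Name: rdkit-to-params
Version: 1.1.9
Summary: placeholder summary for illustration

# RDKit to params
"""


def read_core_metadata(text: str) -> dict:
    """Return a few header fields from PKG-INFO text (None if absent)."""
    message = Parser().parsestr(text)  # headers end at the first blank line
    return {key: message[key] for key in ("Name", "Version", "Summary")}


metadata = read_core_metadata(SAMPLE_PKG_INFO)
print(metadata["Name"], metadata["Version"])
```

To inspect a real archive, the same function could be fed the text of `rdkit_to_params-1.1.9/PKG-INFO` after extraction.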


rdkit_to_params-1.1.9.tar.gz (24 files)

rdkit_to_params-1.1.9/
    PKG-INFO 17KB
    rdkit_to_params.egg-info/
        PKG-INFO 17KB
        SOURCES.txt 721B
        entry_points.txt 65B
        top_level.txt 22B
        dependency_links.txt 1B
    tests/
        __init__.py 2KB
    rdkit_to_params/
        constraint.py 17KB
        __init__.py 16KB
        _io_mixin.py 3KB
        rdkitside/
            utilities.py 2KB
            _rdkit_prep.py 29KB
            __init__.py 877B
            _rdkit_convert.py 20KB
            _rdkit_inits.py 6KB
            _rdkit_rename.py 9KB
        _pyrosetta_mixin.py 3KB
        entries.py 20KB
        version.py 217B
        _init_mixin.py 3KB
        cli/
            __init__.py 1KB
    setup.cfg 38B
    setup.py 2KB
    README.md 14KB
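The per-extension counts shown for this archive can be reproduced locally once the file has been downloaded. The sketch below uses only the standard library; `count_by_extension` is a hypothetical helper name, and files without a suffix (such as `PKG-INFO`) are grouped under `"(none)"`, which is an assumption that differs from the page's own `pkg-info` label.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def count_by_extension(archive_path: str) -> Counter:
    """Count the regular files in a tar archive by lower-cased suffix."""
    counts = Counter()
    # "r:*" lets tarfile detect the compression (gzip here) automatically.
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive.getmembers():
            if member.isfile():  # skip directories and links
                suffix = PurePosixPath(member.name).suffix.lower()
                counts[suffix or "(none)"] += 1
    return counts


# Usage, assuming the archive sits in the current directory:
# print(count_by_extension("rdkit_to_params-1.1.9.tar.gz"))
```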

# RDKit to params

Create or modify Rosetta params files (topology files) from scratch, RDKit mols or another params file.

> RDKit and Pyrosetta are optional modules, but most of the useful functionality comes from the former!

To install from pip type:

    pip install rdkit-to-params

To install the latest version (probably the same) from GitHub:

    git clone https://github.com/matteoferla/rdkit_to_params.git
    cd rdkit_to_params
    pip install .

(To install rdkit, `conda install -c conda-forge rdkit` or `apt-get`).

## Website

For a web app using this see [https://direvo.mutanalyst.com/params](https://direvo.mutanalyst.com/params).

For the code running the website, see:

* [templates](https://github.com/matteoferla/DirEvo_tools/tree/master/direvo/templates/params)
* [views](https://github.com/matteoferla/DirEvo_tools/blob/master/direvo/views/params.py)

## Legal thingamabob

The author, Matteo Ferla, is not affiliated with either Rosetta or RDKit and the presence of the latter's name in the package's title is completely coincidental. And yes, I am copying my legal mumbojumbo from South Park.

## Rationale

This is a fresh rewrite of ``mol_to_params.py``. For three reasons:

* I cannot share my 2to3 port and modded module-version of ``mol_to_params.py`` due to licence.
* I want to modify `params` files and more as opposed to use a standalone script.
* RDKit does not save ``mol2`` files, yet knows about atom names and Gasteiger-Marsili charges and more...

It sounds mad, but did not actually take too long.

## Roundtrip

Native amino acid params files can be found in the Rosetta folder `rosetta/main/database/chemical/residue_type_sets/fa_standard/residue_types/l-caa`

Let's do a roundtrip changing an atom name:

    import pyrosetta
    pyrosetta.init(extra_options='-mute all')  # required for test
    from rdkit_to_params import Params

    p = Params.load('PHE.params')
    p.IO_STRING[0].name3 = 'PHX'
    p.IO_STRING[0].name1 = 'Z'
    p.AA = 'UNK'  # If it's not one of the twenty (plus extras), UNK!
    del p.ROTAMER_AA[0]
    p.rename_atom(' CB ', ' CX ')  # this renames
    p.dump('fake.params')
    p.test().dump_pdb('test.pdb')

`p.test()` returns a pyrosetta pose. The static method `params_to_pose('something.params', name3)` accepts a params file and a three-letter residue name:

    import nglview
    # name3 is the three-letter residue name used in the params file
    pose = Params.params_to_pose('some_topology_I_found.params', name3)
    view = nglview.show_rosetta(pose)
    view

## From mol object

### Requirements

For the sake of sanity, `EmbedMolecule`, `Chem.AddHs(mol)` or any other operation is assumed to have been done beforehand. It is likewise assumed that the user will call `Chem.MolToPDBFile(params.mol)` or `Chem.MolToPDBBlock(params.mol)`, or use the bound methods of `Params`, `dump_pdb` and `dump_pdb_conf` (see below).

The molecule should preferably **not** be Kekulised. The three-letter residue name is taken from the title row (``_Name``) if that is a three-letter word, otherwise from the PDBInfo, otherwise it is 'LIG'.

A dummy atom (*/R) is assumed to be a CONNECT (ligand only at the moment).

Here is a conversion to an amino acid from a SMILES (quickest way):

    import pyrosetta
    pyrosetta.init(extra_options='-mute all')
    from rdkit import Chem
    from rdkit_to_params import Params

    p = Params.from_smiles('*C(=O)C(Cc1ccccc1)[NH]*',  # recognised as amino acid.
                           name='PHX',  # optional.
                           atomnames={3: 'CZ'}  # optional, rando atom name as seen in previous edit
                           )
    print(p.is_aminoacid())  # True
    p.dump('fake.params')
    p.test().dump_pdb('test.pdb')
    Chem.MolToPDBFile(p.mol, 'ref.pdb')  # p.mol is the RDKit mol held by Params

Here is a conversion to a ligand the circuitous way, just for fun:

    import pyrosetta
    pyrosetta.init(extra_options='-mute all')
    # note that pyrosetta needs to be started before rdkit.
    from rdkit_to_params import Params

    # make the molecule in RDKit or chemdraw or download it or whatever.
    from rdkit import Chem
    from rdkit.Chem import AllChem
    mol = Chem.MolFromSmiles('NC(C(=O)O)Cc1ccccc1')
    mol = Chem.AddHs(mol)
    AllChem.EmbedMolecule(mol)
    AllChem.MMFFOptimizeMolecule(mol)
    # add names to the mol beforehand
    Params.add_names(mol, names=['N', 'CA', 'C', 'O', 'OXT', 'CB'], name='PHZ')
    # parameterise
    p = Params.from_mol(mol, name='PHZ')
    p.test().dump_pdb('test.pdb')
    Chem.MolToPDBFile(mol, 'ref.pdb')

The class method `add_names` is based upon atom index (which is derived from the SMILES or sdf/mol file unless atoms have been replaced). The instance method `rename_by_substructure` accepts a substructure and a list of atom names in the order they are in the substructure.

Note that conformer generation is not fully automatic and is not done by default.

    # make your conformers as you desire
    AllChem.EmbedMultipleConfs(mol, numConfs=10)  # or whatever you choose. This is a somewhat important decision.
    AllChem.AlignMolConformers(mol)  # I do not know if the conformers need to be aligned for Rosetta
    # params time!
    p = Params.from_mol(mol, name='LIG')
    p.dump_pdb_conf('LIG_conf.pdb')
    p.PDB_ROTAMERS.append('LIG_conf.pdb')
    p.dump('my_params.params')

Note `dump_pdb` and `dump_pdb_conf` will save the molecule(s) without the dummy atoms; to stop this add `stripped=False`.

## From SMILES string

The above is actually a bit convoluted for example purposes, as `Params.from_smiles` accepts a SMILES string directly.

## From SMILES string and PDB for names

In some cases one has a PDB file with a ligand that needs parameterising. Assuming one also has the SMILES of the ligand (PubChem has a super easy search), one can do

    p = Params.from_smiles_w_pdbfile(pdb_file, smiles, 'XXX')  # the name has to match.

The SMILES does not need to match fully. It can contain more atoms or even one `*` (CONNECT). The SMILES is what gets parameterised, so be sure to add the correct charges (hydrogens are added). It could be used for scaffold hopping, but if position matters so much, you may be interested in [Fragmenstein](https://github.com/matteoferla/Fragmenstein).
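Since the residue name passed to `from_smiles_w_pdbfile` has to match the one in the PDB file, it can help to check beforehand which atom names the file holds for that residue. The sketch below relies only on the fixed columns of the PDB format (atom name in columns 13-16, residue name in columns 18-20); `ligand_atom_names` and `hetatm` are hypothetical helpers and the coordinates are invented, none of it is part of `rdkit_to_params`.

```python
def ligand_atom_names(pdb_block: str, resn: str) -> list:
    """Return the atom names of ATOM/HETATM records whose residue name is resn."""
    names = []
    for line in pdb_block.splitlines():
        # 0-based slices [12:16] and [17:20] are PDB columns 13-16 and 18-20.
        if line.startswith(("ATOM  ", "HETATM")) and line[17:20].strip() == resn:
            names.append(line[12:16].strip())
    return names


def hetatm(serial, name, resn, chain, resi, x, y, z):
    """Format a minimal HETATM record so that the columns line up."""
    return (f"HETATM{serial:>5} {name:<4} {resn:>3} {chain}{resi:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Invented three-atom example: a two-atom ligand 'XXX' and one water.
pdb_block = "\n".join([
    hetatm(1, " C1", "XXX", "A", 301, 1.0, 2.0, 3.0),
    hetatm(2, " O1", "XXX", "A", 301, 2.2, 2.0, 3.0),
    hetatm(3, " O", "HOH", "A", 401, 9.0, 9.0, 9.0),
])
print(ligand_atom_names(pdb_block, "XXX"))
```

An empty list for the chosen name would suggest that the three-letter code does not occur in the file.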
For more see [autogenerated documentation](sphinx-documentation.md). Sphinx with markdown cannot deal with typehinting, so checking the code might be clearer.

## Rename

A key part is the atom names. The following renaming methods are present:

* `p.rename(???)`: "overloaded" method that directs to the others
* `p.rename_from_str('XX,YY,ZZ')` or `p.rename_from_str('0:XX,3:YY')`
* `p.rename_from_list(['XX','YY', 'ZZ'])`
* `p.rename_from_dict({0:'XX',3:'YY'})`
* `p.rename_from_template(Chem.Mol)`
* `p.rename_by_substructure(Chem.Mol, ['XX','YY', 'ZZ'])` where the list gives the names in the order of the atom indices in the substructure

Note, ``retype_by_name`` does not have all these options (only atomname -> Rosetta atomtype).

The class method ``add_names`` simply uses these, but returns a mol.

### DIY

If you have two mol objects from whatever routes, the basic operation is:

    p = Params.load_mol(mol, generic=False, name='LIG')
    p.rename_from_template(template)  # or whatever middle step
    p.convert_mol()

Note that `convert_mol` should be called once and is already called in the two `from_XXX` classmethods.

    p = Params.from_mol(...)
    p.convert_mol()  # No!!!
    p.mol  # is the mol...
    p2 = Params.load_mol(p.mol)
    p2.convert_mol()  # Yes

## Constraints

The selfstanding class `Constraints` is for generating constraint files, which are a must with covalent attachments in order to stop janky topologies.

The class is instantiated with a pair of SMILES, each with at least one real atom and one attachment point; the first is the ligand and the second is its peptide target. It also takes the names of the heavy atoms and the Rosetta residue "numbers".

    from rdkit_to_params import Constraints
    c = Constraints(smiles=('*C(=N)', '*SC'),
                    names=['*', 'CX', 'NY', '*', 'SG', 'CB'],
                    ligand_res='1B',
                    target_res='145A')
    c.dump('con.con')
    # individual strings can be accessed
    c.atom_pair_constraint
    c.angle_constraint
    c.dihedral_constaint
    c.custom_constaint  # if you want to add your own before `str`
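For orientation, an atom-pair entry in a Rosetta constraint file names two atoms with their residues followed by a scoring function. The sketch below formats one such line for the bond between the two attachment atoms of the example above (`CX` on the ligand, `SG` on the target). It illustrates the file syntax only: `atom_pair_line` is a hypothetical helper, the 1.8 Å distance and 0.2 standard deviation are assumed values, and the text that `Constraints` itself writes may differ.

```python
def atom_pair_line(atom1, res1, atom2, res2, distance, sd=0.2):
    """Format one AtomPair line using a HARMONIC function (x0, sd)."""
    return (f"AtomPair {atom1} {res1} {atom2} {res2} "
            f"HARMONIC {distance:.2f} {sd:.2f}")


# Assumed target distance for the covalent C-S bond; adjust to the chemistry.
line = atom_pair_line("CX", "1B", "SG", "145A", distance=1.8)
print(line)
```

A custom line built this way is the kind of string one might assign to `c.custom_constaint` before dumping, assuming that attribute takes free text as the comment above suggests.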