# Augmenting genetic algorithms with deep neural networks for exploring the chemical space
This repository contains code for the paper: [Augmenting genetic algorithms with deep neural networks for exploring the chemical space](https://arxiv.org/abs/1909.11655).
A video summary of the paper can be found here: https://www.youtube.com/watch?v=9VilhlEXm9w&t=16s
Here is a visualization of molecular progress:
<img align="center" src="./readme_docs/mol_view.gif"/>
## Prerequisites
For cloning the repository, please have a look at the Branch Navigator section.
Before running the code, please ensure you have the following:
- [SELFIES (any version)](https://github.com/aspuru-guzik-group/selfies): the code was run with v0.1.1 (the fastest version), but it is compatible with any version.
- [RDKit](https://www.rdkit.org/docs/Install.html)
- [tensorboardX](https://pypi.org/project/tensorboardX/)
- [PyTorch v0.4.1](https://pytorch.org/)
- [Python 3.0 or later](https://www.python.org/download/releases/3.0/)
- [numpy](https://pypi.org/project/numpy/)
Please note that the Synthetic Accessibility calculator (i.e. the directory SAS_calculator) comes from [https://github.com/EricTing/SAscore](https://github.com/EricTing/SAscore).
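Before running anything, it can help to confirm that the prerequisites are importable. The following is a minimal sketch, not part of the repository: the helper name `check_prerequisites` is hypothetical, and the module names are the usual import names of the packages listed above (PyTorch is imported as `torch`). It only checks that each package can be found; it does not check versions.

```python
import importlib.util

# Usual import names of the prerequisites listed above (PyTorch is imported as "torch").
REQUIRED_MODULES = ["selfies", "rdkit", "tensorboardX", "torch", "numpy"]


def check_prerequisites(modules=REQUIRED_MODULES):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Print a short report; nothing is actually imported.
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any package reported as missing needs to be installed before running `core_GA.py`.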
## How to run the code?
We highly recommend using the following version for running your experiments.
```
python ./core_GA.py
```
The following settings can be customized (found at the end of the file 'core_GA.py'):
- num_generations: Number of generations to run the GA
- generation_size: Molecular population size encountered in each generation
- starting_selfies: Initial population of molecules
- max_molecules_len: Length of the largest molecule string
- disc_epochs_per_generation: Number of epochs of training the discriminator neural network
- disc_enc_type: Type of molecular encoding shown to the discriminator
- disc_layers: Discriminator architecture
- training_start_gen: Generation after which discriminator training begins
- device: Device the discriminator is trained on
- properties_calc_ls: Property evaluations to be completed for each molecule of the GA
- num_processors: Number of CPU cores to parallelize calculations over
- beta: Value of the parameter beta (weight of the discriminator predictions in the fitness)
- impose_time_adapted_pen: Boolean indicating whether a time-adapted discriminator penalty is used
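To make the role of `beta` concrete, here is a minimal sketch of a fitness that adds a beta-weighted discriminator score to a property score. It assumes the property score is logP minus the SA score minus the ring penalty, which matches the quantities reported in the results, but the function name, arguments and exact form are illustrative and are not taken from `core_GA.py`.

```python
def fitness(logp, sa_score, ring_penalty, discriminator_score, beta=0.0):
    """Hypothetical fitness: property score plus beta-weighted discriminator score."""
    # Assumed property objective: reward logP, penalize SA score and ring penalty.
    property_score = logp - sa_score - ring_penalty
    # With beta = 0 the discriminator prediction has no effect on the fitness.
    return property_score + beta * discriminator_score
```

With `beta=0.0` the discriminator is ignored entirely; larger values give its prediction more weight relative to the property score.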
## How are the results saved?
All results are saved in the 'results' directory, organized as follows (note: 'i' is the run iteration):
1. images_generation_0_i:
Images of the top 100 molecules of each generation. Below each molecule are its fitness, logP, SA, ring penalty and discriminator scores.
2. results_0_i:
Each sub-directory is named after its generation. The SMILES strings (ordered by fitness) and the corresponding molecular properties are provided as text files: 'smiles_ordered.txt', 'logP_ordered.txt', 'sas_ordered.txt', 'ringP_ordered.txt', 'discrP_ordered.txt'.
Information about the best molecules of each generation is stored outside the sub-directories.
3. saved_models_0_i:
The trained discriminators after each generation. Please note: we did not make use of the discriminator predictions in the fitness for this experiment (beta is set to 0).
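The per-generation text files can be combined into one record per molecule. The sketch below is an assumption-laden example rather than code from the repository: the helper `load_generation` is hypothetical, and it assumes each file holds one whitespace-separated value per molecule, in the same fitness order as 'smiles_ordered.txt'. If the files are written in another format, the parsing would need to be adapted.

```python
from pathlib import Path

# File names as listed above; one value per molecule is an ASSUMPTION about the format.
PROPERTY_FILES = {
    "logP": "logP_ordered.txt",
    "sas": "sas_ordered.txt",
    "ringP": "ringP_ordered.txt",
    "discrP": "discrP_ordered.txt",
}


def load_generation(gen_dir):
    """Read one generation sub-directory into a list of per-molecule dicts."""
    gen_dir = Path(gen_dir)
    smiles = (gen_dir / "smiles_ordered.txt").read_text().split()
    columns = {}
    for name, filename in PROPERTY_FILES.items():
        path = gen_dir / filename
        if path.exists():  # skip property files that are not present
            columns[name] = [float(v) for v in path.read_text().split()]
    records = []
    for i, smi in enumerate(smiles):
        record = {"smiles": smi}
        for name, values in columns.items():
            if i < len(values):  # guard against files of unequal length
                record[name] = values[i]
        records.append(record)
    return records
```

Calling `load_generation` on a generation sub-directory inside `results_0_i` would then return the molecules in fitness order, each with whatever properties were found.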
## Branch Navigator:
The code for the paper, arranged by experiment, can be found in the [paper_results branch](https://github.com/akshat998/GA/tree/paper_results). The experiments are arranged as follows:
- [Experiment 4.1: ](https://github.com/akshat998/GA/tree/paper_results/4.1) Unconstrained optimization and comparison with other generative models
- [Experiment 4.2: ](https://github.com/akshat998/GA/tree/paper_results/4.2) Long term experiment with a time-dependent adaptive penalty
- [Experiment 4.3: ](https://github.com/akshat998/GA/tree/paper_results/4.3) Analysis of molecule classes explored by the GA
- [Experiment 4.4: ](https://github.com/akshat998/GA/tree/paper_results/4.4) Constrained optimization
- [Experiment 4.5: ](https://github.com/akshat998/GA/tree/paper_results/4.5) Simultaneous logP and QED optimization
- [Experiment 4.6: ](https://github.com/akshat998/GA/tree/paper_results/4.6) Modification of the hyperparameter beta
Instructions on running the experiments of the paper are provided in the above links. Please note that the code has been parallelized based on the number of CPU cores for quick property evaluations.
To get the code quickly, we recommend cloning only the master branch with the following command:
```
git clone -b master --single-branch https://github.com/aspuru-guzik-group/GA.git --depth 1
```
This contains the raw GA code, without any results from the paper. It is very quick to clone and has a small file size.
Because the full set of results is large, we have created a separate branch that contains the outputs of all the experiments. To clone it, please run the following (note: this is a 4 GB branch and needs about 20 minutes of cloning time):
```
git clone --single-branch --branch paper_results https://github.com/akshat998/GA.git
```
## Questions, problems?
Open a GitHub issue. Please be as clear and descriptive as possible. Please feel free to reach
out directly: (akshat[DOT]nigam[AT]mail[DOT]utoronto[DOT]ca & pascal[DOT]friederich[AT]kit[DOT]edu)
## License
[Apache License 2.0](https://choosealicense.com/licenses/apache-2.0/)