Supplementary Information for: Highly accurate protein structure prediction with AlphaFold

https://doi.org/10.1038/s41586-021-03819-2
John Jumper¹*⁺, Richard Evans¹*, Alexander Pritzel¹*, Tim Green¹*, Michael Figurnov¹*, Olaf Ronneberger¹*, Kathryn Tunyasuvunakool¹*, Russ Bates¹*, Augustin Žídek¹*, Anna Potapenko¹*, Alex Bridgland¹*, Clemens Meyer¹*, Simon A A Kohl¹*, Andrew J Ballard¹*, Andrew Cowie¹*, Bernardino Romera-Paredes¹*, Stanislav Nikolov¹*, Rishub Jain¹*, Jonas Adler¹, Trevor Back¹, Stig Petersen¹, David Reiman¹, Ellen Clancy¹, Michal Zielinski¹, Martin Steinegger², Michalina Pacholska¹, Tamas Berghammer¹, Sebastian Bodenstein¹, David Silver¹, Oriol Vinyals¹, Andrew W Senior¹, Koray Kavukcuoglu¹, Pushmeet Kohli¹, and Demis Hassabis¹*⁺

¹DeepMind, London, UK
²School of Biological Sciences and Artificial Intelligence Institute, Seoul National University, Seoul, South Korea

*These authors contributed equally
⁺Corresponding authors: John Jumper, Demis Hassabis
Contents

1 Supplementary Methods
  1.1 Notation
  1.2 Data pipeline
    1.2.1 Parsing
    1.2.2 Genetic search
    1.2.3 Template search
    1.2.4 Training data
    1.2.5 Filtering
    1.2.6 MSA block deletion
    1.2.7 MSA clustering
    1.2.8 Residue cropping
    1.2.9 Featurization and model inputs
  1.3 Self-distillation dataset
  1.4 AlphaFold Inference
  1.5 Input embeddings
  1.6 Evoformer blocks
    1.6.1 MSA row-wise gated self-attention with pair bias
    1.6.2 MSA column-wise gated self-attention
    1.6.3 MSA transition
    1.6.4 Outer product mean
    1.6.5 Triangular multiplicative update
    1.6.6 Triangular self-attention
    1.6.7 Transition in the pair stack
  1.7 Additional inputs
    1.7.1 Template stack
    1.7.2 Unclustered MSA stack
  1.8 Structure module
    1.8.1 Construction of frames from ground truth atom positions
    1.8.2 Invariant point attention (IPA)
    1.8.3 Backbone update
    1.8.4 Compute all atom coordinates
    1.8.5 Rename symmetric ground truth atoms
    1.8.6 Amber relaxation
  1.9 Loss functions and auxiliary heads
    1.9.1 Side chain and backbone torsion angle loss
    1.9.2 Frame aligned point error (FAPE)
    1.9.3 Chiral properties of AlphaFold and its loss
    1.9.4 Configurations with FAPE(X, Y) = 0
    1.9.5 Metric properties of FAPE
    1.9.6 Model confidence prediction (pLDDT)
    1.9.7 TM-score prediction
    1.9.8 Distogram prediction
    1.9.9 Masked MSA prediction
    1.9.10 “Experimentally resolved” prediction
    1.9.11 Structural violations
  1.10 Recycling iterations
  1.11 Training and inference details
    1.11.1 Training stages
    1.11.2 MSA resampling and ensembling
    1.11.3 Optimization details
    1.11.4 Parameters initialization
    1.11.5 Loss clamping details
    1.11.6 Dropout details
    1.11.7 Evaluator setup
    1.11.8 Reducing the memory consumption
  1.12 CASP14 assessment
    1.12.1 Training procedure
    1.12.2 Inference and scoring
  1.13 Ablation studies
    1.13.1 Architectural details
    1.13.2 Procedure
    1.13.3 Results
  1.14 Network probing details
  1.15 Novel fold performance
  1.16 Visualization of attention
  1.17 Additional results
List of Supplementary Figures

1  Input feature embeddings
2  MSA row-wise gated self-attention with pair bias
3  MSA column-wise gated self-attention
4  MSA transition layer
5  Outer product mean
6  Triangular multiplicative update using “outgoing” edges
7  Triangular self-attention around starting node
8  Invariant Point Attention Module
9  Accuracy distribution of a model trained with dRMSD instead of the FAPE loss
10 Accuracy of ablations depending on the MSA depth
11 Performance on a set of novel structures
12 Visualization of row-wise pair attention
13 Visualization of attention in the MSA along sequences
14 Median all-atom RMSD₉₅ on the CASP14 set
15 RMSD histograms on the template-reduced recent PDB set
List of Supplementary Tables

1 Input features to the model
2 Rigid groups for constructing all atoms from given torsion angles
3 Ambiguous atom names due to 180°-rotation-symmetry
4 AlphaFold training protocol
5 Training protocol for CASP14 models
6 Quartiles of RMSD distributions on the template-reduced recent PDB set
List of Algorithms

1  MSABlockDeletion: MSA block deletion
2  Inference: AlphaFold Model Inference
3  InputEmbedder: Embeddings for initial representations
4  relpos: Relative position encoding
5  one_hot: One-hot encoding with nearest bin
6  EvoformerStack: Evoformer stack
7  MSARowAttentionWithPairBias: MSA row-wise gated self-attention with pair bias
8  MSAColumnAttention: MSA column-wise gated self-attention
9  MSATransition: Transition layer in the MSA stack
10 OuterProductMean: Outer product mean
11 TriangleMultiplicationOutgoing: Triangular multiplicative update using “outgoing” edges
12 TriangleMultiplicationIncoming: Triangular multiplicative update using “incoming” edges
13 TriangleAttentionStartingNode: Triangular gated self-attention around starting node
14 TriangleAttentionEndingNode: Triangular gated self-attention around ending node
15 PairTransition: Transition layer in the pair stack
16 TemplatePairStack: Template pair stack
17 TemplatePointwiseAttention: Template pointwise attention
18 ExtraMsaStack: Extra MSA stack
19 MSAColumnGlobalAttention: MSA global column-wise gated self-attention
20 StructureModule: Structure module
21 rigidFrom3Points: Rigid from 3 points using the Gram–Schmidt process
22 InvariantPointAttention: Invariant point attention (IPA)
23 BackboneUpdate: Backbone update
24 computeAllAtomCoordinates: Compute all atom coordinates
25 makeRotX: Make a transformation that rotates around the x-axis
26 renameSymmetricGroundTruthAtoms: Rename symmetric ground truth atoms
27 torsionAngleLoss: Side chain and backbone torsion angle loss
28 computeFAPE: Compute the Frame aligned point error
29 predictPerResidueLDDT: Predict model confidence pLDDT
30 RecyclingInference: Generic recycling inference procedure
31 RecyclingTraining: Generic recycling training procedure
32 RecyclingEmbedder: Embedding of Evoformer and Structure module outputs for recycling
1 Supplementary Methods
1.1 Notation
We denote the number of residues in the input primary sequence by $N_\text{res}$ (cropped during training), the number of templates used in the model by $N_\text{templ}$, the number of all available MSA sequences by $N_\text{all\_seq}$, the number of clusters after MSA clustering by $N_\text{clust}$, the number of sequences processed in the MSA stack by $N_\text{seq}$ (where $N_\text{seq} = N_\text{clust} + N_\text{templ}$), and the number of unclustered MSA sequences by $N_\text{extra\_seq}$ (after sub-sampling, see subsubsection 1.2.7 for details). Concrete values for these parameters are given in the training details (subsection 1.11). On the model side, we also denote the number of blocks in Evoformer-like stacks by $N_\text{block}$ (subsection 1.6), the number of ensembling iterations by $N_\text{ensemble}$ (subsubsection 1.11.2), and the number of recycling iterations by $N_\text{cycle}$ (subsection 1.10).
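
To make the bookkeeping concrete, here is a small sketch (our own illustration in Python; the dimension and channel values are placeholders rather than the trained configuration, which is given in subsection 1.11) tying these quantities to the array shapes they index.

# Placeholder values only; see subsection 1.11 for the actual settings.
N_res, N_templ, N_clust, N_extra_seq = 256, 4, 128, 1024
N_seq = N_clust + N_templ              # sequences processed in the MSA stack

c_m, c_z = 256, 128                    # hypothetical channel sizes
msa_repr_shape = (N_seq, N_res, c_m)   # one row per cluster or template
pair_repr_shape = (N_res, N_res, c_z)  # one vector z_ij per residue pair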
We present architectural details in Algorithms, where we use the following conventions. We use capitalized operator names when they encapsulate learnt parameters, e.g. we use Linear for a linear transformation with a weights matrix $W$ and a bias vector $\vec{b}$, and LinearNoBias for the linear transformation without the bias vector. Note that when we have multiple outputs from the Linear operator at the same line of an algorithm, we imply different trainable weights for each output. We use LayerNorm for the layer normalization [85] operating on the channel dimensions with learnable per-channel gains and biases. We also use capitalized names for random operators, such as those related to dropout. For functions without parameters we use lower-case operator names, e.g. sigmoid, softmax, stopgrad. We use $\odot$ for the element-wise multiplication, $\otimes$ for the outer product, $\oplus$ for the outer sum, and $\mathbf{a}^\top \mathbf{b}$ for the dot product of two vectors. Indices $i, j, k$ always operate on the residue dimension, indices $s, t$ on the sequence dimension, and index $h$ on the attention heads dimension. The channel dimension is implicit and we type the channel-wise vectors in bold, e.g. $\mathbf{z}_{ij}$. Algorithms operate on sets of such vectors, e.g. we use $\{\mathbf{z}_{ij}\}$ to denote all pair representations.
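
As a minimal sketch of these conventions (our own JAX illustration, not the released implementation), the snippet below shows two Linear outputs with independent weights feeding a gated element-wise product, a pattern that recurs throughout the Algorithms.

import jax
import jax.numpy as jnp

def linear(params, x):
    # Linear: learnt weights matrix W and bias vector b.
    return x @ params["W"] + params["b"]

c_in, c_out = 8, 4
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
# Two outputs "at the same line" imply two independent parameter sets.
params_v = {"W": jax.random.normal(k1, (c_in, c_out)), "b": jnp.zeros(c_out)}
params_g = {"W": jax.random.normal(k2, (c_in, c_out)), "b": jnp.zeros(c_out)}

x = jnp.ones(c_in)
v = linear(params_v, x)                  # value path
g = jax.nn.sigmoid(linear(params_g, x))  # gate path
out = g * v                              # element-wise multiplication (⊙)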
In the structure module, we denote Euclidean transformations corresponding to frames by $T = (R, \vec{t}\,)$, with $R \in \mathbb{R}^{3 \times 3}$ for the rotation and $\vec{t} \in \mathbb{R}^3$ for the translation components. We use the $\circ$ operator to denote application of a transformation to an atomic position $\vec{x} \in \mathbb{R}^3$:

$$\vec{x}_\text{result} = T \circ \vec{x} = (R, \vec{t}\,) \circ \vec{x} = R \vec{x} + \vec{t}\,.$$
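
A short numeric check of this rule (our own illustration, not the released code):

import jax.numpy as jnp

def apply_transform(T, x):
    # T = (R, t) with R a [3, 3] rotation and t a [3] translation.
    R, t = T
    return R @ x + t

# A 90-degree rotation about the z-axis followed by a unit shift along x.
R = jnp.array([[0., -1., 0.],
               [1.,  0., 0.],
               [0.,  0., 1.]])
T = (R, jnp.array([1., 0., 0.]))
x = jnp.array([1., 0., 0.])
print(apply_transform(T, x))  # [1. 1. 0.]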
The $\circ$ operator also denotes composition of Euclidean transformations:

$$T_\text{result} = T_1 \circ T_2$$
$$(R_\text{result}, \vec{t}_\text{result}) = (R_1, \vec{t}_1) \circ (R_2, \vec{t}_2) = (R_1 R_2,\ R_1 \vec{t}_2 + \vec{t}_1)$$
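
The composition rule can be verified numerically; the sketch below (again our own illustration) checks that applying $T_1 \circ T_2$ to a point agrees with applying $T_2$ first and then $T_1$.

import jax.numpy as jnp

def compose(T1, T2):
    # (R1, t1) ∘ (R2, t2) = (R1 R2, R1 t2 + t1)
    R1, t1 = T1
    R2, t2 = T2
    return (R1 @ R2, R1 @ t2 + t1)

def apply_transform(T, x):
    R, t = T
    return R @ x + t

Rz = jnp.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
T1 = (Rz, jnp.array([1., 0., 0.]))
T2 = (jnp.eye(3), jnp.array([0., 2., 0.]))
x = jnp.array([0., 0., 3.])
lhs = apply_transform(compose(T1, T2), x)          # (T1 ∘ T2) ∘ x
rhs = apply_transform(T1, apply_transform(T2, x))  # T1 ∘ (T2 ∘ x)
assert jnp.allclose(lhs, rhs)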