zlacker

[return to "AlphaFold reveals the structure of the protein universe"]
1. COGlor+JD 2022-07-28 15:03:35
>>MindGo+(OP)
Before my comment gets dismissed: I am a professional structural biologist who works in this field every day.

These threads are always the same: lots of comments about protein folding, how amazing DeepMind is, how AlphaFold is a success story, how it has flipped an entire field on its head, etc. The language from Google is so deceptive about what they've actually done that I think it's intentionally disingenuous.

At the end of the day, AlphaFold is amazing homology modeling. I love it, I think it's an awesome application of machine learning, and I use it frequently. But it's doing the same thing we've been doing for 2 decades: pattern matching sequences of proteins with unknown structure to sequences of proteins with known structure, just about 2x as well as we used to be able to.

That's extremely useful, but it's not knowledge of protein folding. It can't predict a fold de novo, it can't predict folds that haven't been seen (EDIT: this is maybe not strictly true, depending on how you slice it), it fails in a number of edge cases (remember, in biology, edge cases are everything) and again, I can't stress this enough, we have no new information on how proteins fold. We know that all of the information (most of it, at least) for a protein's final fold is in the sequence. But we don't know much about the in-between.

I like AlphaFold, it's convenient and I use it (although for anything serious or anything interacting with anything else, I still need a real structure), but I feel as though it has been intentionally and deceptively oversold. There are 3-4 other deep learning projects I think have had a much greater impact on my field.

EDIT: See below: https://news.ycombinator.com/item?id=32265662 for information on predicting new folds.

◧◩
2. flobos+lG 2022-07-28 15:17:29
>>COGlor+JD
> AlphaFold is amazing homology modeling

If it is homology modelling, then how can it work without input template structures?

◧◩◪
3. COGlor+BJ 2022-07-28 15:30:08
>>flobos+lG
It has template structures. AlphaFold uses the following databases:

    BFD,
    MGnify,
    PDB70,
    PDB (structures in the mmCIF format),
    PDB seqres – only for AlphaFold-Multimer,
    Uniclust30,
    UniProt – only for AlphaFold-Multimer,
    UniRef90.
◧◩◪◨
4. flobos+kK 2022-07-28 15:33:10
>>COGlor+BJ
Those databases are used to derive the evolutionary couplings and distance matrices used by the algorithm. Several of those databases aren’t even structural ones. Furthermore, AlphaFold can function with only an MSA as an input, without retrieving a single PDB coordinate.
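
For readers unfamiliar with the term, here is a toy sketch of what an "evolutionary coupling" signal looks like in an MSA: two alignment columns that mutate together carry mutual information, which hints that the residues are in contact. This is only an illustration of the idea; it is not AlphaFold's method (AlphaFold learns such relationships with a neural network rather than computing a statistic like this), and the function name and the four-sequence alignment are made up for the example.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences (hypothetical toy input).
    """
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)             # residue counts in column i
    col_j = Counter(seq[j] for seq in msa)             # residue counts in column j
    pairs = Counter((seq[i], seq[j]) for seq in msa)   # joint residue-pair counts

    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Made-up alignment: columns 1 and 3 always change together (K with E, R with D),
# while columns 0 and 2 never change at all.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
]

coupled = column_mutual_information(toy_msa, 1, 3)    # covarying pair
conserved = column_mutual_information(toy_msa, 0, 2)  # invariant pair, no signal
```

Note that the fully conserved columns score zero: covariation needs sequence diversity, which is why a deep MSA drawn from large sequence databases matters even when those databases contain no structures.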
◧◩◪◨⬒
5. COGlor+ZL 2022-07-28 15:40:09
>>flobos+kK
It's all about boosting signal by finding other proteins that are similar, until you get to the point that you can identify a fold to assign to a region of the protein. That's why some are structural, and some are not.

> Furthermore, AlphaFold can function with only an MSA as an input, without retrieving a single PDB coordinate.

Yes, it has a very nice model of what sequences should look like in 3D. That model is derived from experimental data. So if I give AlphaFold an MSA of a new, unknown protein fold (substantially different from any known fold), it cannot predict it.
