zlacker

[return to "AlphaFold reveals the structure of the protein universe"]
1. biffta+q 2022-07-28 11:22:35
>>MindGo+(OP)
How do they know their structures are correct?
◧◩
2. lrem+X6 2022-07-28 12:17:47
>>biffta+q
Disclaimer: I work at Google, organizationally far away from DeepMind, and my PhD is in something very unrelated.

They can't possibly know that. What they know is that their guesses are very significantly better than the previous best and that they could do this for the widest range in history. Now, verifying the guess for a single protein (of the hundreds of millions in the db) can be an expensive project of up to two years. Inevitably some will show discrepancies. These will be fed back into regression learning, giving us a new generation of even better guesses at some point in the future. That's what I believe to be standard operating practice.

A more important question is: is today's db good enough to be a breakthrough for something useful, e.g. pharma or agriculture? I have no intuition here, but the reporting claims it will be.

◧◩◪
3. f38zf5+hb 2022-07-28 12:47:33
>>lrem+X6
The press release reads like an absurdity. It's not the "protein universe", it's the "list of presumed globular proteins Google found and some inferences about their structure as given by their AI platform".

Proteins don't exist as crystals in a vacuum, that's just how humans solved the structure. Many of the non-globular proteins were solved using sequence manipulation or other tricks to get them to crystallize. Virtually all proteins exist to have their structures interact dynamically with the environment.

Google is simply supplying a list of what it presumes to be low-RMSD models based on their tooling, for some sequences they found, and the tooling is itself based on data mostly from X-ray studies that may or may not have errors. Heck, we've barely even sequenced most of the DNA on this planet, and with mechanisms like alternative splicing the transcriptome, and hence the proteome, has to be many orders of magnitude larger than what we have knowledge of.

But sure, Google has solved the structure of the "protein universe", whatever that is.
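
For readers unfamiliar with the term, RMSD (root-mean-square deviation) is the usual measure of how far a predicted structure's atoms sit from a reference structure's atoms. The following is a minimal sketch, not anything from AlphaFold's tooling: it assumes both structures are given as equal-length lists of (x, y, z) coordinates that have already been superimposed (real comparisons first find an optimal superposition, e.g. with the Kabsch algorithm). The function name `rmsd` and the toy coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: the two structures are already superimposed, so no
    rotation or translation is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared distance between corresponding atoms.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (hypothetical, three atoms each); only the middle atom differs.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
experimental = [(0.0, 0.0, 0.0), (1.5, 1.0, 0.0), (3.0, 0.0, 0.0)]

print(round(rmsd(predicted, experimental), 3))  # prints 0.577
```

A low RMSD only says the prediction is close to the reference it was compared against; it says nothing about whether that reference captures how the protein behaves outside a crystal.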

◧◩◪◨
4. lrem+Nc 2022-07-28 12:56:43
>>f38zf5+hb
I recognize your superior knowledge in the topic and assume you're right.

But you also ignore where we're at in the standard cycle:

https://phdcomics.com/comics/archive_print.php?comicid=1174

;)

◧◩◪◨⬒
5. f38zf5+ge 2022-07-28 13:05:51
>>lrem+Nc
That's exactly what this is, but it's embarrassing that it's coming from somewhere purported to be a lab. Any of the hundreds (or more) of labs working in protein structure prediction for the past 50 years could have made this press release at any time and said, "look, we used a computer and it told us these are the structures, we solved the protein universe!"

This is not to diminish the monumental accomplishment that was the application of modern machine learning techniques to outpace structure prediction in labs, but other famous labs have already moved to ML predictions and are competitive with DeepMind now.

◧◩◪◨⬒⬓
6. Viking+7m 2022-07-28 13:47:20
>>f38zf5+ge
> but other famous labs have already moved to ML predictions and are competitive with DeepMind now.

That's great! AlphaFold DB has made 200 million structure predictions available for everyone. How many structure predictions have other famous labs made available for everyone?

◧◩◪◨⬒⬓⬔
7. f38zf5+Kn 2022-07-28 13:55:59
>>Viking+7m
As many as you wanted to throw at them, considering the vast majority are open source and could be run on your own server cluster. CASP15 is ongoing, so by the end of the year we will know how much absolute progress has been made by others.

Google has the advantage of the biggest guns here: the fastest TPUs with the most memory in the biggest clusters, so running inference with a massive number of protein sequences is much easier for them.
