Environment & Energy

NNadir


9. Oh. I see. We have to make distinctions about how servers are used. I'm a bad guy, I guess. I use Google Scholar...

Thu Nov 21, 2024, 06:26 PM


...often to support my scientific work, and given the richness of the literature and its vast scope, I certainly wish I had something like CCU-Llama, which I described in the science forum: CCU-Llama.

Who's going to monitor the "correct use" of servers? The Trump administration?

It's funny, because just the other day I was having a conversation with another scientist about whether we should always trust the sophisticated software we use, both online and in house, to interpret mass spec data. I'm so old, of course, that I remember sitting with a pencil and paper, calculating the masses of potential fragments, and then looking at the data to see whether such a mass was visibly there. I could spend a week or more on a complex compound that way. Now, in a matter of minutes, I can see all the PTMs and sequences from a very large protein, no trouble at all. I almost never find a result that seems to be invalidated by experiment, unless it involves an isobaric species, and now there are ways around that as well. The public servers, like UniProt, must do something very much like AI, although honestly I don't know how it works, just that it's as fast as hell and that I have direct experience with it being perfectly correct on multiple occasions, for example, finding the exact correct species associated with a highly conserved protein found across many living things when I was blinded. And trust me, the protein in question is highly conserved across a wide range of species, from single-celled organisms to human beings.
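For anyone curious what the pencil-and-paper arithmetic looks like, here is a minimal sketch, assuming a plain unmodified peptide, singly charged fragments, and standard monoisotopic residue masses. The function name `fragment_mz` and the example peptides are made up for illustration; this is not how any particular vendor package does it.

```python
# Hypothetical illustration: singly charged b- and y-ion m/z values
# for an unmodified peptide, using monoisotopic residue masses.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of the charge-carrying proton


def fragment_mz(peptide):
    """Return (b_ions, y_ions) as lists of 1+ m/z values.

    b_i holds the first i residues; y_i holds the last i residues
    plus one water. Fragments run from i = 1 to len(peptide) - 1.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_mz("PEPTIDE")
    for i, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {b_mz:.4f}   y{i} = {y_mz:.4f}")
```

Note that leucine and isoleucine carry the same residue mass in the table, so `fragment_mz("PEPTIDEL")` and `fragment_mz("PEPTIDEI")` give identical lists. That is the isobaric problem in miniature: mass alone cannot tell the two apart.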

However, we can, and do, set false discovery rates in the software, and these are designed to establish the error parameters. The fact that there is a "false discovery rate" means that we have to be careful with the data; it is not determinative so much as (highly) suggestive.
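As a rough illustration of what setting a false discovery rate can mean in practice, here is a minimal sketch of the common target-decoy estimate, in which matches to deliberately wrong (decoy) sequences stand in for the false hits among the real ones. That the software in question works this way is an assumption on my part, and the function names and toy scores are hypothetical.

```python
# Hypothetical illustration of a target-decoy FDR estimate.
# Each match is a (score, is_decoy) pair; higher scores are better.

def fdr_at_threshold(matches, threshold):
    """Estimate FDR among matches scoring >= threshold as decoys / targets."""
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        # No target hits: the estimate is undefined unless there are no decoys.
        return float("inf") if decoys else 0.0
    return decoys / targets


def score_cutoff(matches, max_fdr):
    """Return the lowest score whose estimated FDR is <= max_fdr, or None."""
    best = None
    for threshold in sorted({score for score, _ in matches}, reverse=True):
        if fdr_at_threshold(matches, threshold) <= max_fdr:
            best = threshold
    return best


if __name__ == "__main__":
    toy = [(10, False), (9, False), (8, False), (7, True), (6, False)]
    print(score_cutoff(toy, max_fdr=0.10))  # accepts only scores of 8 and up
    print(score_cutoff(toy, max_fdr=0.25))  # a looser setting accepts down to 6
```

The point of the sketch is the one made above: whatever cutoff you choose, the accepted list comes with an estimated error rate attached, so it is suggestive rather than determinative.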

Your link, by the way, refers to an article about a paper, this one: Terwilliger, T.C., Liebschner, D., Croll, T.I. et al. AlphaFold predictions are valuable hypotheses and accelerate but do not replace experimental structure determination. Nat Methods 21, 110–116 (2024). It certainly doesn't discount the value of AlphaFold, remarking that it often produces results that are remarkably similar to crystallized proteins, which, as the authors note, do not necessarily correspond to protein structure in vivo.

To wit:

...Both experimentally determined protein structures and predicted models have important limitations [11,13,14]. Proteins are flexible and dynamic, and their distributions of conformations depend on temperature, solution conditions and binding of ligands or other proteins (including crystal contacts in the case of crystallography) [15]. A model of a high-resolution crystal structure can accurately represent the dominant conformation(s) present in a crystal in a particular environment [11], but the structure may differ under another set of conditions [14]. Artificial intelligence (AI)-based models can in many cases be very accurate; however, they do not yet take into account the presence of ligands, covalent modifications or environmental factors, and take protein–protein interactions and multiple conformations into account in a limited way [1,2,16,17]...

From the conclusion of the paper:

...Despite their limitations, AlphaFold predictions are already changing the way that hypotheses about protein structures are generated and tested [1,2,5,6]. Indeed, even though not all parts of AlphaFold predictions are accurate, they provide plausible hypotheses that can suggest mechanisms of action and allow designing of experiments with specific expected outcomes. Using these predictions as starting hypotheses can also greatly accelerate the process of experimental structure determination [27,34,35]. AlphaFold predictions often have very good stereochemical characteristics, making them excellent hypotheses for local structural features. For example, for the 102 structures analyzed here, the mean percentage of residues with ‘favored’ Ramachandran configurations was 98%, greater than that of the corresponding deposited models (97%), and the mean percentage of side-chain conformations classified as outliers was just 0.2%, compared with 1.5% for deposited models [31]. Such AlphaFold predictions with highly plausible geometry could be used in later stages of experimental structure determination as potential conformations for segments of structure that are not fully clear in experimental density maps...

To me, this doesn't read like a dismissal of AlphaFold, but rather like a wise cautionary suggestion as to how it should be used.

Of course, in the history of science, there have been many calculations that proved not to hold up against experimental data. Experiment always prevails over theory, or should anyway. One should always check results against theory, and in fact, that is what automated machine learning does: it compares data with theory to determine whether the theory holds, and adjusts the theory accordingly. But yes, the output of this process needs human review.

None of this means that there is something corrupt or illegitimate about data centers. My remark about my son's work was intended not to be "right" or "wrong," but rather to suggest that we ought to be careful about how we judge technologies. Sure, there are kids who produce term papers on ChatGPT. That doesn't mean that ChatGPT is evil. I have an assistant, not a scientist, who brings me text from it regularly, with my knowledge. It often fails the Turing test, but as long as one recognizes that, it can help break writer's block. We never use it directly in our reports, but it suggests, not defines, a path.

