There’s an old joke about Computer Science: it’s not a science, and it’s not about computers.
This is largely correct. Computer science isn’t about chips or electronics or transistors; it’s mostly about algorithms and data structures. Sorting datasets, path finding, searching datasets, depth-first, breadth-first, Dijkstra, A*, bubble sort, Quicksort, heap sort, merge sort, insertion sort, linked lists, red-black trees, tries, hash maps…
It’s more a branch of applied mathematics or logic, I’d say. In fact, computer programmers out-logic pretty much anybody: my friend and I went to a philosophy lecture on logic in college, showing up completely drunk, and we out-logicked all the philosophy majors with ease.
Programming is to logic what carpentry is to nails. It’s a daily tool that you get to know inside-out if you work in the field.
Unfortunately, almost all of the stuff you learn in “Computer Science” you will never use again in your career as a programmer. Again, most programming is a lot like carpentry or plumbing: if it doesn’t work, hit it with a mallet and wiggle until it fits, or hope nobody will notice it behind the cladding.
The opportunities to use an elegant algorithm are few and far between.
That’s why I was quite intrigued when I saw a recent post by Slime Mold Time Mold about their…
Potato Diet Riff Trial
Potato Diet Riff Trial, sign up now, lol!
In their infamously serious and solemn style, the 3 raccoons in a lab coat (allegedly!) have come up, as far as I can tell, with a new sort of scientific trial.
The Riff Trial. I searched for this term on the internet, but couldn’t find anything. The first hit is their post on Hacker News; the second hit is somebody called Riff Raff being cleared of charges in a brothel in Nevada. This is apparently a rapper. Good times.
So what is this “Riff Trial?” It’s a new way of doing research: essentially trading off some certainty to gain volume and generate more hypotheses, faster.
The idea is that you do a diet that is largely about potatoes in some sense, but then you riff on it - adding anything you feel like. E.g. if I were to do one, I would likely do a potato + cream trial, making a lot of mashed potatoes or just adding a whipped cream dessert to my potatoes.
Somebody else could run a french fries trial, and somebody could specifically only fry the fries in beef tallow to see if that makes a difference.
To disprove the BCAA theory, somebody could run a potato trial but add lots of BCAA protein supplement powder.
Of course, in a sense, this “proves less” because not everybody is doing the same controlled diet; there are lots of different ones.
The trade-off is that we get a lot more hypotheses tested.
Say you have 35 experiments you’d like to do (the current number of entries in my `hypotheses` file). If each of them involves a group of 30 people locked up in a metabolic chamber, fed a controlled diet of just potatoes + X, it’d take you 6 months just to organize each of these. Collecting and processing the data would likely take another 6. Even on highly derivative (aka “we had one hunch and now we’re following it down the rabbit hole further”) trials with mice, scientists often only put out 1 study per year per lab. Not per person, per lab (!), involving a whole team of researchers.
That means it’ll likely take you at least 35 years to run these experiments. And then you might still be wrong about all of them.
The idea of the Riff Trial is to farm out different ideas. Slime Mold Time Mold are not just running a number of trials highly concurrently (a whole other Computer Science idea!), they’re also leaving the point of the trial up to the individuals. That means that if Suzy from Minnesota or Paul from Arizona come up with a unique idea that Slime Mold Time Mold had never thought of, there’s going to be at least one run of it.
Of course, if Suzy from Minnesota loses 30lbs eating potato + garlic and Paul from Arizona doesn’t lose anything eating potato + onions, that doesn’t necessarily prove garlic rules and onions can suck it.
But it’s a hint. It’s a trial to qualify hypotheses. Anybody can come up with an interesting diet experiment, but if you do it exclusively for a month you either get a signal or you don’t. If you lost over 5lbs in that month, that’s a pretty good qualifier. If you lose 10lbs, like the average person on the potato diet, that’s a REALLY good qualifier. Most people don’t lose 10lbs ever.
All the diets that got somebody to lose 10lbs in a month are probably worth looking at, and many of the 5lbs ones might be, too. Although we’re getting danger close to water weight levels even in normal people (=not me), here, so it’s much less of a signal.
How to Science in the Internet Age
I think the internet is changing science. Ain’t nobody got time to run a million-dollar RCT on any old crazy idea you read on Twitter. But hey, if a person somehow loses a bunch of weight drinking cocoa or supplementing potassium or eating ad-lib watermelon, or eating heavy cream lol… you get the idea.
It’s very easy to come up with a fun idea (“what if I only ate chocolate truffle for 2 weeks?!”), but 99% of them will be wrong. That’s, unfortunately, the nature of sciencing.
If we want to science successfully, we can’t spend a million dollars and a year on each crazy idea. It’s great that the internet generates so many fun experiments for us to try. But we need to qualify them gradually.
We need to lower the cost of trying things. What if we can qualify/disqualify an idea in 30 days, at basically zero cost?
If you think potato + peanut butter is great, try it. We’ll know if you lost weight 30 days later. If you didn’t, well, that’s a sign. Maybe there’s somebody out there who would lose weight on potato + peanut butter, but it’s not off to a great start.
Should you lose 10lbs on potato + peanut butter, that doesn’t prove it’s the end-all, be-all diet of course. But it’s a good start. Now we can try it with a handful of people. Maybe cajole your friends into trying it for a few weeks.
Eventually, you’ll be confident enough that it might work for more people in general. Then, you can assemble a rag-tag team of volunteers and run an n=small experiment for a month, like I did with the ex150 trial.
I wouldn’t even classify SM TM’s original potato trial as an n=small experiment: their n was over 200. That’s already a significant amount of people, effort, and time. Imagine, they could’ve run 20+ different n=10 experiments in the same amount of time. Of course, it wouldn’t have proved that any of them work as well as the n>200 potato experiment did. But it would’ve covered more ground.
In this case they got “lucky” that the potato diet produced results in almost everybody - but not really, since the potato idea didn’t come from nothing. There were plenty of pre-qualifying anecdotes about people magically losing weight on it.
Still, it was a gamble - one that paid off. Yet, as is the sad nature of sciencing, most gambles do not pay off. At least not if they’re not highly qualified beforehand.
Divide & Conquer
Now to bring the 2 together: how is the Potato Diet Riff Trial like a Computer Science algorithm?
First a little bit of CS background: sorting things is pretty annoying. If I gave you a deck of 52 shuffled cards and told you to sort them by card value, how would you do it? You’d likely take the first card, then the second card. If the second was of higher value, you’d put it behind. Otherwise, in front. You’d take the third card and slot it in at the right spot.
By the 40th card, if, say, it was of higher value than the 35th but lower than the 36th you already had, you’d slot it in there.
This is called “insertion sort” - we insert each new value in the correct (so far) place. When we reach the last card, all cards are ordered correctly.
Here’s a visualization of insertion sort:
The problem with this approach is that it becomes very slow the more cards I give you. You can see it in the animation above: the first few spots are near-instant, but the last third takes forever, because every other “card” has to be moved. (This isn’t actually the case with physical cards, because you can slot a new one in without “moving” all the other ones, so the analogy breaks down a bit.)
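For the curious, the card-sorting procedure above is short enough to write out. Here’s a sketch in Python, with a list standing in for the deck:

```python
def insertion_sort(cards):
    """Sort a list in place, the way you'd sort a hand of cards."""
    for i in range(1, len(cards)):
        card = cards[i]  # pick up the next card
        j = i - 1
        # Shift every higher card one slot to the right...
        while j >= 0 and cards[j] > card:
            cards[j + 1] = cards[j]
            j -= 1
        cards[j + 1] = card  # ...and slot the new card into its place
    return cards

print(insertion_sort([7, 2, 9, 4, 1]))  # → [1, 2, 4, 7, 9]
```

The inner `while` loop is exactly that “every other card has to be moved” problem: the later a card arrives, the more cards it may have to shuffle past.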
52 cards might take you only a few minutes, or under a minute if you’re really fast. But the time it takes grows quadratically with the number of cards. In Computer Science, we’d express the worst-case performance as O(n²), where O roughly means “on the order of” and n is the number of cards.
In small cases (a 52 card deck) it doesn’t really matter. But if you’re sorting millions of things, you really don’t want your runtime growing with the square of the input.
This is where an algorithm like Quicksort comes in. Quicksort belongs to the “divide & conquer” family of algorithms, and the name says it all. It was invented in 1959 by Tony Hoare, an absolute genius of Computer Science.
To give you an idea of how much of a genius Hoare is, I had totally forgotten that he invented Quicksort. When I just looked up who did, I went, “Wait, he invented Quicksort, too?!” Because he’s also known for many other great ideas (CSP) and not so great ones (the null reference, because he basically invented references).
Quicksort begins with sort of a meta-question: we have a problem, an unordered deck of 1,000,000,000 cards. The problem of sorting that many cards is very difficult.
What if, as a first step, we decided to make it into 2 smaller problems instead?
Quicksort makes an educated guess about a “pivot point” and splits the deck into two piles: all the cards below the pivot in one, all the cards above it in the other. No ordering has been done within the piles yet.
Then, Quicksort goes: this is pretty neat. We now have 2 easier problems. But what if we made those into 4 even easier problems?
So it goes ahead and splits each of the new “problems” again, making educated guesses about the pivot point of each one.
Now it has 4 much easier problems to solve.
You see where this is going, don’t you?
Modern versions of Quicksort don’t split all the way down until there’s only 2 cards per “deck”; it turns out it’s actually faster to do the last 10 or so cards with a regular sort like insertion sort.
And there are now other sort algorithms that are fine-tuned, or offer other advantages.
But Quicksort is still taught in CS courses, because the idea is just so revolutionary: don’t solve hard problems. Turn hard problems into easy problems, and then solve easy problems. Why would you solve a hard problem if you could avoid it?!
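The whole divide & conquer shape fits in a few lines. Here’s a minimal sketch in Python - pick a pivot, split into “lower” and “higher” piles, then recurse on each smaller pile (real implementations partition in place and pick pivots more cleverly, but the idea is the same):

```python
def quicksort(cards):
    """Divide & conquer: split around a pivot, then sort the smaller piles."""
    if len(cards) <= 1:
        return cards  # an easy problem: 0 or 1 cards are already sorted
    pivot = cards[len(cards) // 2]  # educated guess at a middle value
    lower = [c for c in cards if c < pivot]
    equal = [c for c in cards if c == pivot]
    higher = [c for c in cards if c > pivot]
    # Two (well, three) smaller problems instead of one big one
    return quicksort(lower) + equal + quicksort(higher)

print(quicksort([5, 3, 8, 1, 9, 2]))  # → [1, 2, 3, 5, 8, 9]
```

Notice that `quicksort` never directly solves the hard problem; it only splits until the remaining problems are trivially easy, then glues the answers back together.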
Here’s Quicksort visualized. Notice how the “deck” of values is split into smaller and smaller parts.
Riff Trials are Divide & Conquer
I propose that the Potato Diet Riff Trial is like Quicksort because it divides the hard problem of “which diet will make all people, of any sort, reliably lose weight?” into smaller problems.
The idea of a deck of 1 billion cards may sound ridiculous. But let’s seriously think about obesity: what causes it? It could be a food (cake?). It could be an element (lithium, anyone?). It could be a pesticide (glyphosate?). It could be a certain gene.
There are millions of foods, 118 known chemical elements, thousands or tens of thousands of pesticides and other environmental toxins. How many genes are there in a human? Let me answer that by asking another question: how do you define “gene”? Depending on the definition, it’s apparently either around 25,000-30,000 or 200,000-300,000 genes.
And, of course, it could be any combination of these. So while we can “only” come up with a couple hundred thousand to a million potential suspects, the combinatorial complexity explodes quickly, and we understand why obesity hasn’t been solved yet: with existing methods, or “algorithms,” it’s essentially unsolvable. Unless some researcher got supremely lucky, there’s no way you’d sort a problem this complex using a linear algorithm like Big Science is using in any reasonable amount of time.
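To put rough numbers on that explosion - the suspect count below is my own ballpark, just for illustration:

```python
import math

suspects = 300_000  # foods + elements + pesticides + genes, ballpark figure
pairs = math.comb(suspects, 2)  # suspects acting in combinations of 2
triples = math.comb(suspects, 3)  # ...or combinations of 3
print(f"{pairs:,}")  # 44,999,850,000 — ~45 billion pairs
print(f"{triples:.1e}")  # ~4.5e15 triples
```

Even if you could somehow test one combination per second, the pairs alone would take well over a thousand years. That’s why testing suspects one at a time is hopeless.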
Quickscience
What if, first, we decided to iteratively qualify/disqualify hypotheses? And what if we decided to distribute the work of not just running the experiment, but even coming up with the experiment, to random, interested people on the internet?
The role of Slime Mold Time Mold in this is basically that of instigators, plus a bit of cat herding. They added the potato part, but you’re totally free to come up with the second half of the experiment. You’ll run it yourself, on yourself, on your own time.
All you do is let them know your hypothesis, how you will test it, and the results.
If I’m honest, I’m pretty jealous I didn’t come up with this myself.
And in case this wasn’t clear, if you’re even remotely interested in potatoes or losing weight, you should absolutely sign up for the Potato Diet Riff Trial.
As another famous potato enthusiast once said: