sorv

sample of randomized voters

Transparency at the Algorithm Level Is Not Enough

One of the arguments in favor of the sorv algorithm is transparency. Under sorv, your content is released to a random sample of the target audience, that sample gives feedback, and the content is or isn't promoted to the rest of the audience based on that feedback. You know whether the sample liked it relative to other content, and how that determined whether or not the site showed it to the rest of the audience.
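The sample-then-promote flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the site's implementation: the function names `sorv_score` and `should_promote`, the sample size of 500, the 50% promotion threshold, and the made-up audience in which 30% of users like the post are all assumptions chosen for the example.

```python
import random

def sorv_score(audience, likes_it, sample_size, rng):
    """Return the fraction of a uniform random sample of `audience` that liked the content."""
    sample = rng.sample(audience, sample_size)
    votes = [likes_it(user) for user in sample]
    return sum(votes) / sample_size

def should_promote(score, threshold=0.5):
    """Hypothetical promotion rule: show the content more widely if the score clears a threshold."""
    return score >= threshold

rng = random.Random(0)
audience = list(range(10_000))
# Assumed ground truth for this example: 30% of the audience likes the post.
likers = set(rng.sample(audience, 3_000))

score = sorv_score(audience, lambda user: user in likers, 500, rng)
print(f"sample score: {score:.1%}, promote: {should_promote(score)}")
```

The point of the sketch is that the only inputs to the promotion decision are the sample's votes, so the creator can see both the score and the rule that was applied to it.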

This raises the question of whether existing measures of "transparency" are enough. For example, Reddit has released the source code of the entire site, and on Twitter/X, even though the company has not released its source code, you can trace the history of when a particular tweet was first posted, who liked and retweeted it, and when it gained traction as a result.

The answer, unfortunately, is no. Transparency at the algorithm level, or at the level of revealing the likes and shares that led to a piece of content going viral, only shows you what happened, not why. In particular, it doesn't tell you how much of the outcome had to do with the quality of the post itself, and how much was due to randomness and/or the pile-on effect.

If I post 10 silly jokes on Reddit in a row, and most of them get 0-10 votes and one of them gets 50,000, I know the wildly disproportionate number is probably due to the pile-on lottery, but I know that because of common sense, not because Reddit's source code is public. And with a more subtle difference (say one joke gets 500 votes and another gets 3,000), it might be tempting to believe that the average user actually likes the second joke better -- but given the randomness of the pile-on effect, we can't conclude that.
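A toy simulation can make the pile-on lottery concrete. The sketch below is not a model of Reddit's actual ranking code; it assumes a simple rich-get-richer rule (a Pólya-urn-style process) in which ten posts of identical quality compete for votes, and each new vote goes to a post with probability proportional to its current votes plus one. The function name `simulate_pile_on` and all the numbers are hypothetical.

```python
import random

def simulate_pile_on(num_posts, num_votes, rng):
    """Hand out votes one at a time among equally good posts using a rich-get-richer rule."""
    votes = [0] * num_posts
    for _ in range(num_votes):
        # A post's chance of getting the next vote is proportional to (current votes + 1),
        # so early luck compounds even though no post is better than any other.
        weights = [v + 1 for v in votes]
        winner = rng.choices(range(num_posts), weights=weights, k=1)[0]
        votes[winner] += 1
    return votes

rng = random.Random(42)
votes = simulate_pile_on(num_posts=10, num_votes=5_000, rng=rng)
print(sorted(votes, reverse=True))
```

Because every post in this model has the same quality, any spread in the final vote counts comes purely from chance and the pile-on rule, which is why a gap like 500 versus 3,000 votes cannot by itself be read as a difference in how much people liked the posts.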

With sorv, by contrast, you have complete transparency about why a piece of content got the score that it did: it's released to a representative random sample of the target audience, and the site tells you exactly what percentage of that sample "liked" or "upvoted" it. Depending on the model used to collect initial feedback, the feedback may even tell the creator why people did or didn't like it. That score then translates directly into how widely the content is shown to the rest of the target audience.
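One reason a score from a random sample is easier to interpret than a raw vote count is that standard sampling statistics apply to it. As a rough sketch, the snippet below uses the usual normal approximation for a sample proportion to estimate how far the sample's score might be from the whole audience's opinion. The helper name `margin_of_error`, the 95% confidence level (z = 1.96), and the example figures of a 30% score from 500 voters are assumptions for illustration, not part of the sorv description.

```python
import math

def margin_of_error(score, sample_size, z=1.96):
    """Approximate 95% margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(score * (1 - score) / sample_size)

# Assumed example: 30% of a 500-person random sample liked the post.
score, sample_size = 0.30, 500
moe = margin_of_error(score, sample_size)
print(f"{score:.0%} of the sample liked it, give or take about {moe:.1%}")
```

Under these assumptions the uncertainty shrinks as the sample grows, so two posts scored this way can be compared directly, which is not possible with vote totals shaped by the pile-on effect.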