It's welcoming. When joining an online community (either a new subreddit on a site
like Reddit, or an entirely new site like StackExchange), one frustrating experience is to read the
rules and guidelines and try to participate accordingly, only to find out that your post gets
removed or penalized for
breaking "unwritten rules" which are different from the rules shown to new users.
Alternatively, even if
you are not penalized for actual rulebreaking, you can find that your contribution gets rated as
a low-quality post because the
"official" community guidelines are different from the unwritten rules about what content the
members like in practice. (This is apart from the fact that your content might get a low
rating for other reasons, like the randomness produced by the "pile-on effect".)
The sorv system mitigates the first problem, as long as you read the written rules and try to
follow them. If your post is flagged for violating a rule,
a random sample of jurors will be asked to vote on whether your post violated the rule, and they
will be shown the specific text of the rule you allegedly violated. If a typical person would
not think that your
actions violated a common-sense reading of the rule, then most jurors should not think so, either.
This makes it less likely for a new user to be penalized for violating the "unwritten rules" of
a community, because any flagging of a post is required to cite one of the written rules.
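To make that concrete, here is a minimal sketch in Python of the idea that a flag cannot exist without citing a specific written rule, and that each juror is shown that rule's exact text along with the post. All names are my own illustrative placeholders, not any site's actual API.

```python
# Minimal sketch of "every flag must cite a written rule" -- hypothetical names only.
from dataclasses import dataclass
import random

@dataclass
class Rule:
    rule_id: str
    text: str              # the exact wording shown to new users and to jurors

@dataclass
class Flag:
    post_id: str
    cited_rule: Rule        # a flag is not accepted without a specific written rule

def convene_jury(flag: Flag, online_jurors: list, jury_size: int = 10):
    """Pick a random jury and build the ballot each juror sees: the post ID
    plus the exact text of the rule the post allegedly violated."""
    jury = random.sample(online_jurors, jury_size)
    question = f'Does post {flag.post_id} violate this rule: "{flag.cited_rule.text}"?'
    return jury, question
```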
Note that this only works if the community actually wants the rules to be understandable
by new members. Sometimes, a community's rules may be deliberately obscure,
like the rules of the card game "Mao",
so that new members have to discover them as a form of benign "hazing" and part of the fun
(or, perhaps, an informal price you have to pay to prove that you're serious about wanting to join the
community).
Alternatively, a community's owners might prefer to keep the "real rules" secret from their users.
In 2017, internal documents were
leaked from
Facebook describing the rules that Facebook's internal censors used to determine whether to
remove a post -- rules that had been kept hidden from the general public and were much more
specific than Facebook's published
"Community Standards". Perhaps Facebook had
been worried that the detailed rules would prove too controversial if the public saw them. (That does appear
to be what happened -- the resulting article was titled, accurately,
"Facebookâs
Secret Censorship Rules Protect White Men From Hate Speech But Not Black Children." Facebook's
official rationale
was that "white men" is an intersection of two protected groups -- "white" and "men -- whereas "black children"
is an intersection with a group, "children", that is not protected.)
Similarly,
Reddit CEO Steve Huffman said at RightsCon 2017 in Toronto that Reddit designed their rules to be "specifically
vague." In this case, the given reason was not that users would find the real rules offensive, but
rather: "If users knew where the line was, they would walk right up to it."
But even in cases where a community doesn't want to be transparent about all of its rules, it could still use sorv
to enforce some of them. In Reddit's case, even if the company's policy is to be "specifically vague" about some rules,
the rule against doxxing (publishing another person's private contact information) is pretty clear-cut. By using sorv
"peer jurors"
to adjudicate a doxxing complaint, Reddit could still get the content removed much faster than the normal abuse
desk could process it.
Finally, even if a particular community wants to
be transparent about all of its rules, and wants new members to be able to understand them, there is
the possibility of "interpretation drift," where the community's interpretation of a written rule changes over time to
something different from how a new user would understand it. (This applies both to "rules" about what constitutes
rule-breaking and to looser "rules" about what content counts as "good content" in the community and should
be upvoted and promoted to the widest audience.)
The default sorv implementation will not fix this problem, because when members of the community are adjudicating
a rule violation or voting on the quality of new content, they're voting based on the drifted interpretation of the
rules.
One way to prevent "interpretation drift", if the community is a subset of a larger site
(for example, a subreddit which is a subset of users
on Reddit), is to have community rule violations adjudicated by a random sample of users outside the community, who
could be expected to interpret the rules in the same way as a new user would.
(Here we are talking about rules specific to that community, not the site-wide rules. Site-wide rule violations would already
be adjudicated by a random sample of jurors chosen from the broader community.) For example:
- User X joins a subreddit like /r/todayilearned and posts a link
to a news story that is 10 days old.
- User Y, a different user, flags the post as violating the subreddit's
rule against posting news stories less than 60 days old.
- If User X doesn't immediately concede the point and withdraw the post
(which saves the trouble of convening a "jury"), then a random sample
of jurors is chosen from all users across the Reddit site. Even users who have never heard of /r/todayilearned
can clearly understand the rule -- "Any sources (blog, article, press release, video, etc.) more recent than two months are not
allowed" -- so they vote to uphold the report that the post broke a rule.
This would only work if the community itself were open to having community rule violations reviewed by non-members;
some private communities might not want outsiders to have access to any community posts (much less the controversial
ones that get flagged for rulebreaking).
And of course, for "outside review" to be possible, there
have to be users who are outside the community but are still part of the same system (so that the system can
prompt them to vote on a rule violation); it wouldn't work if your "community" comprised all users of YouTube, because
there are no users outside that set that you can turn to. On
the other hand, it's reasonable to assume that "interpretation drift" is more likely to happen in a small
group of users who reinforce and feed off each other's shifting interpretation of the rules than in a massive group
like the hundreds of millions of users of YouTube.
If "outside review" is not possible, you could also prevent "interpretation drift" by having rule violations voted on by
a random sample of relatively new members of the community. This depends on new members joining the community
at a fast enough rate that there is always a pool of jurors who haven't been around long enough to be subject to the "drift" themselves. This also makes it
potentially easier for malicious users to game the system.
This is because one way that sites prevent "sockpuppet voting" -- where a single person creates multiple
user accounts to influence votes and ratings -- is to give more weight to the votes of accounts that have been around longer
or accumulated a higher score through their past actions. If a community gives disproportionate weighting
to the votes of new accounts,
then it becomes easier for an attacker to create multiple accounts and influence voting (especially if the total pool of new
accounts is small). A possible countermeasure here is to give weight to the votes of members who are new to that community but
have longstanding, high-scoring accounts on the site as a whole -- for example, on Reddit, giving jury votes to users
who are new to a particular subreddit (like /r/movies) but have accumulated a high
"karma" score through their actions elsewhere on the site.
Whether you use the votes of outside users or of new users to prevent "interpretation drift", you don't have
to use them to adjudicate every rule violation that goes to a vote. Instead, you can take a small random subset of the rule
violations that were adjudicated by the community, have outsiders and/or new users vote on
the same rule violations, and see whether they reach the same conclusions. If, in that random sample of re-votes,
many of the new-or-outsider juries reach a different conclusion from the regular community juries, that's a sign that "interpretation drift"
is occurring.
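A sketch of that audit: re-run a small random subset of already-adjudicated cases past an outsider or new-member jury and measure how often the two verdicts disagree. The function names and the "verdict" representation here are placeholders; `revote` stands in for whatever convenes the second jury.

```python
# Sketch of a drift audit; `revote(case)` is assumed to convene an
# outsider/new-member jury and return its verdict (details omitted).
import random

def drift_audit(adjudicated_cases, revote, sample_size: int = 50) -> float:
    """adjudicated_cases: list of (case, community_verdict) pairs.
    Returns the fraction of re-voted cases where the outside verdict differs;
    a persistently high fraction suggests interpretation drift."""
    sample = random.sample(adjudicated_cases, min(sample_size, len(adjudicated_cases)))
    disagreements = sum(1 for case, verdict in sample if revote(case) != verdict)
    return disagreements / len(sample)
```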
It's non-gameable.
In existing abuse-reporting systems, if many users report the same content simultaneously, the system assumes the reports
are valid and removes the content. (See for example
this story
about an 'alternative medicine' community that mass-reported a pro-GMO page on Facebook until Facebook shut the page down.)
Similarly, users can round up "voting cliques" to upvote their own content (or downvote a rival's content) even without filing
official abuse reports. Virtually all social media sites prohibit this, but if the system sees 10 upvotes for a particular piece of content,
there's no reliable way for the system to know whether the votes are "real" or whether there is some secret agreement between those
10 users.
However, in a sorv system, in both cases the voters don't self-select; rather, the system selects them from the
eligible population. If 1 million Reddit
users have opted in as jurors, and 100,000 of them are online at a given moment,
and a user files an abuse report, the system randomly picks (say) 10 users from the entire
population of 100,000 online jurors, and asks them to vote on the validity of the report. It is very hard for
an adversary to "stack the
deck" and control half of the votes of the 10-person jury, unless they control about half of the 100,000 accounts of jurors who
are eligible to be picked.
Of course, it's much harder for an adversary to create 50,000 fake accounts or coordinate the efforts of 50,000
users without getting caught.
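To put rough numbers on how hard "deck-stacking" is, here is a back-of-the-envelope calculation using the illustrative figures above (a 10-person jury drawn at random from 100,000 online jurors): the chance that an adversary who controls some number of those accounts lands a majority of the jury's seats.

```python
# Hypergeometric back-of-the-envelope: probability that an adversary controlling
# `controlled` of the online juror accounts gets a majority of a random jury.
from math import comb

def p_majority(controlled: int, population: int = 100_000, jury_size: int = 10) -> float:
    need = jury_size // 2 + 1   # 6 of 10 seats
    total = comb(population, jury_size)
    return sum(comb(controlled, k) * comb(population - controlled, jury_size - k)
               for k in range(need, jury_size + 1)) / total

print(p_majority(1_000))    # roughly 2e-10 with 1% of accounts controlled
print(p_majority(50_000))   # roughly 0.38 even with half of all accounts controlled
```

Even controlling 1% of the eligible accounts leaves a stacked majority vanishingly unlikely; the odds only become meaningful as the adversary approaches half of the entire juror pool.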
It's extremely fast, when it needs to be. Unlike existing systems, where it may take hours or days before you get a response
to your abuse report, the sorv system selects "jurors" from users
who are currently online, so all of them can vote on your abuse report at the same time. If, for example, you've reported
a post for "doxxing" someone -- publishing their private phone number and address (on a site like Reddit where this is not allowed)
-- it
should take less than 60 seconds for a juror to determine if the report is valid, and since all jurors can make their
decision at the same time, the results should be in within 60 seconds. You can even watch in real time as the "Yes"
and "No" votes are tallied, along with any comments that jurors have appended to their votes.
This quick turnaround time is especially important for doxxing cases, because the longer a person's private phone number
and address remain online in a public post, the more widely they will be shared.
For evaluating content quality, the quick turnaround time might not always apply. If you've submitted a
new joke to a joke-sharing community, then each member of your randomly chosen jury can usually decide within 60 seconds
if they think the joke is any good. However, if you've submitted a musical piece, a poem, a short story, or anything else
whose merit cannot be evaluated quickly, the turnaround time will be slower.
But even in that case, the system is still "as fast as it can be". If you're submitting a short story which takes a reader,
say, one hour to evaluate on its merits, then one hour is the hard minimum time that it would take any community-based
system to give the story a rating. But if the system selects content raters who are currently online, and they all
evaluate the content at the same time, then they can submit their ratings and you can have your
results in about an hour.
It's transparent (i) -- at the level of the individual abuse reports and content ratings.
In most existing abuse-report systems, after you file an abuse report, you have no information about how it was evaluated
(by a person? an automated algorithm?), how many people reviewed the decision (if any humans were involved at all),
and whether any human evaluators had any comments about the report. In the sorv jury system, you would get real-time
feedback about how each juror voted, and you could see any comments they added about why they agreed or disagreed
with the abuse report.
Similarly, if you submit a piece of content to be rated by members of a community, you get transparent
feedback about why it received the rating that it did (and therefore why the system did or didn't promote it to
the rest of the community). Each randomly chosen content-rater can submit a rating along with their own comments
about why they gave the rating.
(At that point, the content author can even exchange messages with the individual jurors about their ratings, and perhaps attempt
to change their minds -- however, the system probably should not allow the jurors to change their votes as a result
of any conversations with the content author. This is because the jurors' votes should be representative of how the
community as a whole would evaluate the content, and if the jurors are changing their votes after discussions with the
author, then their votes are no longer representative of how a regular user would view it. Whatever information the author
is giving the jurors that is causing them to change their minds, if the author wants that information reflected in the rating,
then the author should make edits to the content itself to incorporate that information, and the content should be re-submitted to be
evaluated by a new jury.)
It's transparent (ii) -- at the level of the algorithm as a whole.
A maxim of computer security is that for an algorithm to be considered secure,
an adversary should be able to know all the details of the algorithm and
still not be able to defeat it. In this case, the adversary can know everything about the sorv process, and
it is still very difficult to game the system unless the adversary manages to control a majority of all accounts
that are eligible to be selected as jurors.
It gives users a sense of ownership in the proper functioning of the site.
On most existing social media sites, of course, users can already "participate" in content moderation
by filing abuse reports against posts that break a rule. But it fosters cynicism about the system when
users file abuse reports and never receive a response, or when their own content (or their friend's
content) is removed for "violating a rule" and they never receive any information about
which rule it allegedly broke. If users can sign up as jurors to adjudicate abuse complaints, they can
feel like they are playing a role in keeping the site from being overrun with spam and abusive posts,
while having complete transparency into why a particular complaint was upheld or rejected.
It makes it easier to accept criticism when multiple peers separately and independently disagree
with you. Everyone should be willing to accept some criticism, but a rational person knows that
a single person criticizing you might still be wrong. Even if a group of four people have gathered at a table
to discuss your work with you, and all four of them are criticizing the same aspect of your work, it's still
possible that the first person who spoke up was wrong, but that they unduly influenced the opinions of the
other three. Whether or not this actually is the case, the fact that it could be the case makes it
possible for the recipient of the criticism to dismiss it as the invalid opinion of one person who steered
everyone else the same way.
However, if multiple people separately and independently reach the same conclusion about
something you've submitted, and the sample is too large for this to be a coincidence, at this point a rational
person should accept that the group's opinions are representative of what the target audience would think.
Of course you could still decide that you like your creation the way it is, and you don't care that the intended
audience wouldn't appreciate it. But it's no longer possible to claim that your target audience would
like your creation and it's just an inefficiency in the system that is preventing them from seeing it.
Of course, this only makes it easier for a rational person to accept criticism. Many people are not
rational, especially when it comes to accepting criticism! But the best you can possibly do is to present someone
with evidence that multiple people independently came to the same conclusion.
It's meritocratic and non-arbitrary.
This is perhaps the most subtle and least-appreciated advantage of the sorv system, but
in many ways the most important.
Transparency, fast turnaround time, scalability, and hack-proofing are all beneficial properties
of the algorithm. But if you watch a popular conspiracy-theory video on YouTube and you
post a rebuttal video explaining the correct facts -- or, if you find a highly
viewed tutorial video that gives the wrong directions
for solving a chemistry problem, and you make an alternate video with the right directions -- or,
if you find a how-to video that isn't actually wrong, but you are able to make one that
most people agree (in a double-blind test) is actually better -- in all of those cases, your
"better alternative" probably won't get more than 1% of the hits of the content that you were trying
to improve on, and the reason has nothing to do with Russian bots, people gaming the system,
or overburdened abuse-reporting departments. It's because of the pile-on effect.
In the pile-on effect, when a post is made, several people see it in a short time frame and like/share/upvote
it. This causes the system to share the post more widely, causing more users to see it,
and if enough of those users like/share/upvote it within a short time frame as well, it will be promoted to
a wider audience, and the cycle continues.
This sounds innocuous enough, but because so much depends on luck (a critical mass of users seeing
and liking the post at each stage), it produces outcomes that are highly skewed (a small number
of posts "go through the roof", while most of them languish with vastly fewer views) and highly
non-meritocratic.
The highest-voted post that I ever submitted to Reddit was a submission on
/r/todayilearned/ which said:
'TIL
that years before she was famous, Keira Knightley played Natalie Portman's double in "The Phantom
Menace", and when the girls were in full makeup, even their mothers had trouble telling them apart.'
This received 66,100 upvotes (if you're not familiar with Reddit, that's a lot) and got to the top of the
front page, but many of the comments were along the lines of: "This is a pretty boring 'fact'; why
the hell did it get 66,000 upvotes?" I replied to as many of the comments as possible, saying, "Yes,
the reason a boring post like this can get 66,000 upvotes is because of the pile-on effect,
here is the alternative system that I've been promoting" and
proceeding to type out an early description of the sorv algorithm, before I ran out of time to respond
to all of the people complaining. But out of all the times I've posted to /r/todayilearned/, the far more common outcome
was for a post to get fewer than 10 upvotes and fizzle out, even if the fact I was posting was objectively
probably more interesting than this piece of Keira Knightley trivia.
The first-order effect of this randomness is that the posts that become most popular are not necessarily
the ones that users would most like to see. But the second-order effect is that once content creators
realize that the outcome is dominated by luck, there is much less incentive to create good content
in the first place.
By releasing the content to a representative random sample of users, and averaging their votes, sorv avoids these
problems. The votes are not subject to the "pile-on effect" because users are voting without seeing each other's
votes. As long as the random sample is large enough, the outcome is non-arbitrary in the sense that if you
submitted the same content twice, the average rating would be about the same.
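As a closing sketch (with hypothetical names), the core rating flow is simple: show the content to a random sample of the opted-in raters who are currently online, collect their ratings independently so that no one sees anyone else's vote, and average the result. `get_rating` below stands in for however an individual rater's score would actually be collected.

```python
# Sketch of the core sorv rating flow; names are illustrative placeholders.
import random
from statistics import mean

def sorv_rating(content, online_raters: list, get_rating, sample_size: int = 30) -> float:
    """Average score from a random, independent sample of online raters."""
    sample = random.sample(online_raters, min(sample_size, len(online_raters)))
    return mean(get_rating(rater, content) for rater in sample)
```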