It's welcoming. When joining an online community (either a new subreddit on a site
like Reddit, or an entirely new site like StackExchange), one frustrating experience is to read the
rules and guidelines and try to participate accordingly, only to find out that your post gets
removed or penalized for
breaking "unwritten rules" which are different from the rules shown to new users.
Alternatively, even if
you are not penalized for actual rulebreaking, you can find that your contribution gets rated as
a low-quality post because the
"official" community guidelines are different from the unwritten rules about what content the
members like in practice. (This is apart from the fact that your content might get a low
rating for other reasons, like the randomness produced by the "pile-on effect".)
The sorv system mitigates the first problem, as long as you read the written rules and try to
follow them. If your post is flagged for violating a rule,
a random sample of jurors will be asked to vote on whether your post violated the rule, and they
will be shown the specific text of the rule you allegedly violated. If a typical person would
not think that your
actions violated a common-sense reading of the rule, then most jurors should not think so, either.
This makes it less likely for a new user to be penalized for violating the "unwritten rules" of
a community, because any flagging of a post is required to cite one of the written rules.
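To make that concrete, here is a minimal sketch in Python of the idea that a flag cannot exist without citing a specific written rule, and that each juror is shown that rule's exact text along with the post. All names are my own illustrative placeholders, not any site's actual API.

```python
# Minimal sketch of "every flag must cite a written rule" -- hypothetical names only.
from dataclasses import dataclass
import random

@dataclass
class Rule:
    rule_id: str
    text: str              # the exact wording shown to new users and to jurors

@dataclass
class Flag:
    post_id: str
    cited_rule: Rule        # a flag is not accepted without a specific written rule

def convene_jury(flag: Flag, online_jurors: list, jury_size: int = 10):
    """Pick a random jury and build the ballot each juror sees: the post ID
    plus the exact text of the rule the post allegedly violated."""
    jury = random.sample(online_jurors, jury_size)
    question = f'Does post {flag.post_id} violate this rule: "{flag.cited_rule.text}"?'
    return jury, question
```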
Note that this only works if the community actually wants the rules to be understandable
by new members. Sometimes, a community's rules may be deliberately obscure,
like the rules of the card game "Mao",
so that new members have to discover them as a form of benign "hazing" and part of the fun
(or, perhaps, an informal price you have to pay to prove that you're serious about wanting to join the
community).
Alternatively, a community's owners might prefer to keep the "real rules" secret from their users.
In 2017, internal documents were
leaked from
Facebook describing the rules that Facebook's internal censors used to determine whether to
remove a post -- rules that had been kept hidden from the general public and were much more
specific than Facebook's published
"Community Standards". Perhaps Facebook had
been worried that the detailed rules would prove too controversial if the public saw them. (That does appear
to be what happened -- the resulting article was titled, accurately,
"Facebookâs
Secret Censorship Rules Protect White Men From Hate Speech But Not Black Children." Facebook's
official rationale
was that "white men" is an intersection of two protected groups -- "white" and "men -- whereas "black children"
is an intersection with a group, "children", that is not protected.)
Similarly,
Reddit CEO Steve Huffman said at RightsCon 2017 in Toronto that Reddit designed their rules to be "specifically
vague." In this case, the given reason was not that users would find the real rules offensive, but
rather: "If users knew where the line was, they would walk right up to it."
But even in cases where a community doesn't want to be transparent about all of its rules, it could still use sorv
to enforce some of them. In Reddit's case, even if the company's policy is to be "specifically vague" about some rules,
the rule against doxxing (publishing another person's private contact information) is pretty clear-cut. By using sorv
"peer jurors"
to adjudicate a doxxing complaint, Reddit could still get the content removed much faster than the normal abuse
desk could process it.
Finally, even if a particular community wants to
be transparent about all of its rules, and wants new members to be able to understand them, there is
the possibility of "interpretation drift," where the community's interpretation of a written rule changes over time to
something different from how a new user would understand it. (This applies both to "rules" about what constitutes
rule-breaking and to looser "rules" about what content counts as "good content" in the community and should
be upvoted and promoted to the widest audience.)
The default sorv implementation will not fix this problem, because when members of the community are adjudicating
a rule violation or voting on the quality of new content, they're voting based on the drifted interpretation of the
rules.
One way to prevent "interpretation drift", if the community is a subset of a larger site
(for example, a subreddit which is a subset of users
on Reddit), is to have community rule violations adjudicated by a random sample of users outside the community, who
could be expected to interpret the rules in the same way as a new user would.
(Here we are talking about rules specific to that community, not the site-wide rules. Site-wide rule violations would already
be adjudicated by a random sample of jurors chosen from the broader community.) For example:
- User X joins a subreddit like /r/todayilearned and posts a link
to a news story that is 10 days old.
- User Y, a different user, flags the post as violating the subreddit's
rule against posting news stories less than 60 days old.
- If User X doesn't immediately concede the point and withdraw the post
(which saves the trouble of convening a "jury"), then a random sample
of jurors is chosen from all users across the Reddit site. Even users who have never heard of /r/todayilearned
can clearly understand the rule -- "Any sources (blog, article, press release, video, etc.) more recent than two months are not
allowed" -- so they vote to uphold the report that the post broke a rule.
This would only work if the community itself were open to having community rule violations reviewed by non-members;
some private communities might not want outsiders to have access to any community posts (much less the controversial
ones that get flagged for rulebreaking).
And of course, for "outside review" to be possible, there
have to be users who are outside the community but are still part of the same system (so that the system can
prompt them to vote on a rule violation); it wouldn't work if your "community" comprised all users of YouTube, because
there are no users outside that set that you can turn to. On
the other hand, it's reasonable to assume that "interpretation drift" is more likely to happen in a small
group of users who reinforce and feed off each other's shifting interpretation of the rules than in a massive group
like the hundreds of millions of users of YouTube.
If "outside review" is not possible, you could also prevent "interpretation drift" by having rule violations voted on by
a random sample of relatively new members of the community. This depends on new members joining the community
at a fast enough rate that there is always a pool of jurors who haven't been around long enough to be subject to the "drift" themselves. This also makes it
potentially easier for malicious users to game the system.
This is because one way that sites prevent "sockpuppet voting" -- where a single person creates multiple
user accounts to influence votes and ratings -- is to give more weight to the votes of accounts that have been around longer
or accumulated a higher score through their past actions. If a community gives disproportionate weighting
to the votes of new accounts,
then it becomes easier for an attacker to create multiple accounts and influence voting (especially if the total pool of new
accounts is small). A possible countermeasure here is to give weight to the votes of members who are new to that community but
have longstanding, high-scoring accounts on the site as a whole -- for example, on Reddit, giving jury votes to users
who are new to a particular subreddit (like /r/movies) but have accumulated a high
"karma" score through their actions elsewhere on the site.
Whether you use the votes of outside users or of new users to prevent "interpretation drift", you don't have
to use them to adjudicate every rule violation that goes to a vote. Instead, you can take a small random subset of the rule
violations that were adjudicated by the community, have outsiders and/or new users vote on
the same rule violations, and see whether they reach the same conclusions. If, in that random sample of re-votes,
many of the new-or-outsider juries reach a different conclusion from the regular community juries, that's a sign that "interpretation drift"
is occurring.
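A sketch of that audit: re-run a small random subset of already-adjudicated cases past an outsider or new-member jury and measure how often the two verdicts disagree. The function names and the "verdict" representation here are placeholders; `revote` stands in for whatever convenes the second jury.

```python
# Sketch of a drift audit; `revote(case)` is assumed to convene an
# outsider/new-member jury and return its verdict (details omitted).
import random

def drift_audit(adjudicated_cases, revote, sample_size: int = 50) -> float:
    """adjudicated_cases: list of (case, community_verdict) pairs.
    Returns the fraction of re-voted cases where the outside verdict differs;
    a persistently high fraction suggests interpretation drift."""
    sample = random.sample(adjudicated_cases, min(sample_size, len(adjudicated_cases)))
    disagreements = sum(1 for case, verdict in sample if revote(case) != verdict)
    return disagreements / len(sample)
```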
It's non-gameable.
In existing abuse-reporting systems, if many users report the same content simultaneously, the system assumes the reports
are valid and removes the content. (See for example
this story
about an 'alternative medicine' community that mass-reported a pro-GMO page on Facebook until Facebook shut the page down.)
Similarly, users can round up "voting cliques" to upvote their own content (or downvote a rival's content) even without filing
official abuse reports. Virtually all social media sites prohibit this, but if the system sees 10 upvotes for a particular piece of content,
there's no reliable way for the system to know whether the votes are "real" or whether there is some secret agreement between those
10 users.
However, in a sorv system, in both cases the voters don't self-select; rather, the system selects them from the
eligible population. If 1 million Reddit
users have opted in as jurors, and 100,000 of them are online at a given moment,
and a user files an abuse report, the system randomly picks (say) 10 users from the entire
population of 100,000 online jurors, and asks them to vote on the validity of the report. It is very hard for
an adversary to "stack the
deck" and control half of the votes of the 10-person jury, unless they control about half of the 100,000 accounts of jurors who
are eligible to be picked.
Of course, it's much harder for an adversary to create 50,000 fake accounts or coordinate the efforts of 50,000
users without getting caught.
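To put rough numbers on how hard "deck-stacking" is, here is a back-of-the-envelope calculation using the illustrative figures above (a 10-person jury drawn at random from 100,000 online jurors): the chance that an adversary who controls some number of those accounts lands a majority of the jury's seats.

```python
# Hypergeometric back-of-the-envelope: probability that an adversary controlling
# `controlled` of the online juror accounts gets a majority of a random jury.
from math import comb

def p_majority(controlled: int, population: int = 100_000, jury_size: int = 10) -> float:
    need = jury_size // 2 + 1   # 6 of 10 seats
    total = comb(population, jury_size)
    return sum(comb(controlled, k) * comb(population - controlled, jury_size - k)
               for k in range(need, jury_size + 1)) / total

print(p_majority(1_000))    # roughly 2e-10 with 1% of accounts controlled
print(p_majority(50_000))   # roughly 0.38 even with half of all accounts controlled
```

Even controlling 1% of the eligible accounts leaves a stacked majority vanishingly unlikely; the odds only become meaningful as the adversary approaches half of the entire juror pool.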
It's extremely fast, when it needs to be. Unlike existing systems, where it may take hours or days before you get a response
to your abuse report, the sorv system selects "jurors" from users
who are currently online, so all of them can vote on your abuse report at the same time. If, for example, you've reported
a post for "doxxing" someone -- publishing their private phone number and address (on a site like Reddit where this is not allowed)
-- it
should take less than 60 seconds for a juror to determine if the report is valid, and since all jurors can make their
decision at the same time, the results should be in within 60 seconds. You can even watch in real time as the "Yes"
and "No" votes are tallied, along with any comments that jurors have appended to their votes.
This quick turnaround time is especially important for doxxing cases, because the longer a person's private phone number
and address remain online in a public post, the more widely they will be shared.
For evaluating content quality, the quick turnaround time might not always apply. If you've submitted a
new joke to a joke-sharing community, then each member of your randomly chosen jury can usually decide within 60 seconds
if they think the joke is any good. However, if you've submitted a musical piece, a poem, a short story, or anything else
whose merit cannot be evaluated quickly, the turnaround time will be slower.
But even in that case, the system is still "as fast as it can be". If you're submitting a short story which takes a reader,
say, one hour to evaluate on its merits, then one hour is the hard minimum time that it would take any community-based
system to give the story a rating. But if the system selects content raters who are currently online, and they all
evaluate the content at the same time, then they can submit their ratings and you can have your
results in about an hour.
It's transparent (i) -- at the level of the individual abuse reports and content ratings.
In most existing abuse-report systems, after you file an abuse report, you have no information about how it was evaluated
(by a person? an automated algorithm?), how many people reviewed the decision (if any humans were involved at all),
and whether any human evaluators had any comments about the report. In the sorv jury system, you would get real-time
feedback about how each juror voted, and you could see any comments they added about why they agreed or disagreed
with the abuse report.
Similarly, if you submit a piece of content to be rated by members of a community, you get transparent
feedback about why it received the rating that it did (and therefore why the system did or didn't promote it to
the rest of the community). Each randomly chosen content-rater can submit a rating along with their own comments
about why they gave the rating.
(At that point, the content author can even exchange messages with the individual jurors about their ratings, and perhaps attempt
to change their minds -- however, the system probably should not allow the jurors to change their votes as a result
of any conversations with the content author. This is because the jurors' votes should be representative of how the
community as a whole would evaluate the content, and if the jurors are changing their votes after discussions with the
author, then their votes are no longer representative of how a regular user would view it. Whatever information the author
is giving the jurors that is causing them to change their minds, if the author wants that information reflected in the rating,
then the author should make edits to the content itself to incorporate that information, and the content should be re-submitted to be
evaluated by a new jury.)
It's transparent (ii) -- at the level of the algorithm as a whole.
A maxim of computer security is that for an algorithm to be considered secure,
an adversary should be able to know all the details of the algorithm and
still not be able to defeat it. In this case, the adversary can know everything about the sorv process, and
it is still very difficult to game the system unless the adversary manages to control a majority of all accounts
that are eligible to be selected as jurors.
It gives users a sense of ownership in the proper functioning of the site.
On most existing social media sites, of course, users can already "participate" in content moderation
by filing abuse reports against posts that break a rule. But it fosters cynicism about the system when
users file abuse reports and never receive a response, or when their own content (or their friend's
content) is removed for "violating a rule" and they never receive any information about
which rule it allegedly broke. If users can sign up as jurors to adjudicate abuse complaints, they can
feel like they are playing a role in keeping the site from being overrun with spam and abusive posts,
while having complete transparency into why a particular complaint was upheld or rejected.
It makes it easier to accept criticism when multiple peers separately and independently disagree
with you. Everyone should be willing to accept some criticism, but a rational person knows that
a single person criticizing you might still be wrong. Even if a group of four people have gathered at a table
to discuss your work with you, and all four of them are criticizing the same aspect of your work, it's still
possible that the first person who spoke up was wrong, but that they unduly influenced the opinions of the
other three. Whether or not this actually is the case, the fact that it could be the case makes it
possible for the recipient of the criticism to dismiss it as the invalid opinion of one person who steered
everyone else the same way.
However, if multiple people separately and independently reach the same conclusion about
something you've submitted, and the sample is too large for this to be a coincidence, at this point a rational
person should accept that the group's opinions are representative of what the target audience would think.
Of course you could still decide that you like your creation the way it is, and you don't care that the intended
audience wouldn't appreciate it. But it's no longer possible to claim that your target audience would
like your creation and it's just an inefficiency in the system that is preventing them from seeing it.
Of course, this only makes it easier for a rational person to accept criticism. Many people are not
rational, especially when it comes to accepting criticism! But the best you can possibly do is to present someone
with evidence that multiple people independently came to the same conclusion.
It's meritocratic and non-arbitrary.
This is perhaps the most subtle and least-appreciated advantage of the sorv system, but
in many ways the most important.
Transparency, fast turnaround time, scalability, and hack-proofing are all beneficial properties
of the algorithm. But if you watch a popular conspiracy-theory video on YouTube and you
post a rebuttal video explaining the correct facts -- or, if you find a highly
viewed tutorial video that gives the wrong directions
for solving a chemistry problem, and you make an alternate video with the right directions -- or,
if you find a how-to video that isn't actually wrong, but you are able to make one that
most people agree (in a double-blind test) is actually better -- in all of those cases, your
"better alternative" probably won't get more than 1% of the hits of the content that you were trying
to improve on, and the reason has nothing to do with Russian bots, people gaming the system,
or overburdened abuse-reporting departments. It's because of the pile-on effect.
In the pile-on effect, when a post is made, several people see it in a short time frame and like/share/upvote
it. This causes the system to share the post more widely, causing more users to see it,
and if enough of those users like/share/upvote it within a short time frame as well, it will be promoted to
a wider audience, and the cycle continues.
This sounds innocuous enough, but because so much depends on luck (a critical mass of users seeing
and liking the post at each stage), it produces outcomes that are highly skewed (a small number
of posts "go through the roof", while most of them languish with vastly fewer views) and highly
non-meritocratic.
The highest-voted post that I ever submitted to Reddit was a submission on
/r/todayilearned/ which said:
'TIL
that years before she was famous, Keira Knightley played Natalie Portman's double in "The Phantom
Menace", and when the girls were in full makeup, even their mothers had trouble telling them apart.'
This received 66,100 upvotes (if you're not familiar with Reddit, that's a lot) and got to the top of the
front page, but many of the comments were along the lines of: "This is a pretty boring 'fact'; why
the hell did it get 66,000 upvotes?" I replied to as many of the comments as possible, saying, "Yes,
the reason a boring post like this can get 66,000 upvotes is because of the pile-on effect,
here is the alternative system that I've been promoting" and
proceeding to type out an early description of the sorv algorithm, before I ran out of time to respond
to all of the people complaining. But out of all the times I've posted to /r/todayilearned/, the far more common outcome
was for a post to get fewer than 10 upvotes and fizzle out, even if the fact I was posting was objectively
probably more interesting than this piece of Keira Knightley trivia.
The first-order effect of this randomness is that the posts that become most popular are not necessarily
the ones that users would most like to see. But the second-order effect is that once content creators
realize that the outcome is dominated by luck, there is much less incentive to create good content
in the first place.
By releasing the content to a representative random sample of users, and averaging their votes, sorv avoids these
problems. The votes are not subject to the "pile-on effect" because users are voting without seeing each other's
votes. As long as the random sample is large enough, the outcome is non-arbitrary in the sense that if you
submitted the same content twice, the average rating would be about the same.
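As a closing sketch (with hypothetical names), the core rating flow is simple: show the content to a random sample of the opted-in raters who are currently online, collect their ratings independently so that no one sees anyone else's vote, and average the result. `get_rating` below stands in for however an individual rater's score would actually be collected.

```python
# Sketch of the core sorv rating flow; names are illustrative placeholders.
import random
from statistics import mean

def sorv_rating(content, online_raters: list, get_rating, sample_size: int = 30) -> float:
    """Average score from a random, independent sample of online raters."""
    sample = random.sample(online_raters, min(sample_size, len(online_raters)))
    return mean(get_rating(rater, content) for rater in sample)
```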