Beyond fact-checking - using sorv to handle other types of misleading posts
Suppose we succeed beyond our wildest dreams in getting X/Twitter (for example) to implement
sorv for their Community Notes system
(as described here), so that for the first time ever,
any account with a medium-or-bigger
following that publishes a provable falsehood will probably see a Community Note attached to it, possibly
within a few minutes of it being published.
One consequence is that posters might now try to evade Community Note labels by avoiding easily provable
lies while deceiving users in other ways (for example, publishing statements without a source, so that the
burden is on the Community Notes volunteer to disprove them).
To mitigate this, X/Twitter could expand Community Notes criteria to include other types of posts that
lower the quality of the discourse:
-
Statements where a source should obviously exist, without providing one. Under the current rules,
a statement can only have a Community Note attached if you can show that it's false. This leads to
asymmetric warfare: as long as the burden of proof is on the person submitting the Community Note
to show that the tweet is false, a deliberate liar can create false tweets much faster than
volunteers can disprove them.
For example, on September 2, 2024, Twitter user @akafacehots tweeted
to their 370,000 followers:
BREAKING: CNN just released their fact check of Kamala Harris' speech at the DNC Convention.
It turns out Kamala Harris lied 113 times within her 37 minute speech. Retweet so all Americans see this
vital fact check on Kamala Harris.
Several people (including me) posted replies
saying this was false, but none of the replies got close to the 4.6 million views for the original false tweet.
And it still took me a few minutes of research just to post a reply saying it was false -- and the only
reason it was even that easy was that the author of the original tweet decided to use the
correct duration (37 minutes) for Harris's speech, which made it easy to Google CNN's article about it. Even if
that only took a few minutes of work, that's still much longer than the 20 seconds or so that it would have taken
the author to write the original hoax tweet. (That tweet did eventually get
Community Noted, but not until after millions of people had
already seen it.)
Compare that to a situation where the Community Notes rules say that if you make a statement that should obviously have
a source (e.g. a claim about a CNN article), you have to link to the source. Then as soon as a fact-checker sees the
hoax tweet, they just click a button to report it -- "Makes a claim without a source" -- boom. (A rough sketch of
what these expanded report categories could look like appears after this list.)
-
Statements that are not literally false but are at least partly negated by additional information.
One thing that I think Twitter got exactly right is that its Community Notes labels are applied with the words
"Readers added context:". Sometimes the "context" is literally calling out the tweet as false, but other times
the tweet is not literally false, and yet the additional context would cause most people to change their perception
of the statement.
On
September 25 and
October 3, 2024, content creator Benny Johnson
tweeted out videos of Trump-supporting crowds on college campuses with slightly differently worded captions,
both ending with, "Trump is winning the youth vote." The videos are almost certainly "real". On the other hand,
an October poll
of college students showed Harris beating Trump among college voters, 57% to 19%.
Now perhaps most people wouldn't be silly enough to think that the conclusion "Trump is winning the youth vote" follows
logically from a video of a crowd. But misinformation like this, even if it's easily debunkable, can be part of an information
bubble that isolates the user ever further from the truth, and a simple fact-check label can mitigate that.
-
Plausibly deniable non-endorsement endorsements of a viewpoint. Radio host Larry Elder once
took a call
from a vaccine skeptic who said, among other things, that Bill Gates knew the COVID-19 vaccine was harmful but was pushing
it on minorities in the name of "population control". Elder posted it to his YouTube channel with the title
"You'll want to hear this physician’s take on the vaccines." (Elder is widely known to have conservative positions, but is
not usually associated with views as fringe as this.)
But the wording "You'll want to hear..." is weaselly enough that a literalist could claim he never actually said
he agreed -- he was just saying "You will want to hear this, and then make up your own mind!" To prevent people from evading a fact-checking
penalty with these kinds of word games, the rules can simply say that if a reasonable person would interpret a phrase as
an endorsement, then it counts as an endorsement.
-
Accounts promoting only stories about crimes committed by a specific demographic. This applies to
accounts like @MrAndyNgo and @LibsOfTikTok, which overwhelmingly promote news stories about crimes committed
by immigrants and/or gay and trans people. Some of their claims are blatantly false (see
this post from @LibsOfTikTok owner
@ChayaRaichik10), but a lot of them are true stories that, collectively, still present an overwhelmingly
false narrative, by disproportionately blaming crime on the targeted groups. (In real life, most
studies have found that undocumented immigrants commit violent crime at lower rates than people
born in the U.S.) In their own way, these highly skewed selections of stories are just as deceptive as outright lies.
This could be mitigated by simply amending Community Notes policy to allow Notes in those situations -- a note which
says, "This is a true story, but this account exists to publicize crimes or incidents involving certain demographics,
which has the effect of misleading the audience." This also means the Notes can stop being applied if the account
ever starts focusing on other types of content. (Theoretically, the Notes could also stop being applied if the
account starts publishing stories about crimes committed by all demographics, but I've never seen that happen,
probably because a feed of stories about regular crimes would be boring to almost everyone without the racism or
transphobia angle.)
Alternatively, the system could be modified to apply a Community-Note-type warning to the account itself (which would
then be displayed on all tweets from that account) -- essentially a "credibility banishment", at least to people
who give less credibility to tweets with Community Notes on them. And it could be revoked if the account ever changes
its practices. But this would require not just a policy change in Community Notes but also a code change in Twitter,
to allow a fact-check warning to be applied to an entire account (see the second sketch below).
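
To make the expanded criteria concrete, here is a minimal sketch in Python of how the new report
categories might be modeled. Everything here is hypothetical -- the type names, fields, and tweet ID
are mine for illustration, not Twitter's actual code:

    from dataclasses import dataclass
    from enum import Enum, auto

    class NoteCategory(Enum):
        """Report categories under the expanded rules proposed above."""
        FALSE_STATEMENT = auto()       # the existing criterion: provably false
        CLAIM_WITHOUT_SOURCE = auto()  # should obviously cite a source, but doesn't
        MISSING_CONTEXT = auto()       # not literally false, but negated by added context
        IMPLIED_ENDORSEMENT = auto()   # "you'll want to hear..."-style deniable endorsements
        SKEWED_SELECTION = auto()      # true stories curated into a false narrative

    @dataclass
    class NoteReport:
        tweet_id: int
        category: NoteCategory
        text: str  # shown under "Readers added context:" if the note is approved

    # Reporting the CNN-fact-check hoax takes seconds under the expanded rules,
    # instead of the minutes of research needed to disprove it:
    report = NoteReport(
        tweet_id=123456789,  # hypothetical tweet ID
        category=NoteCategory.CLAIM_WITHOUT_SOURCE,
        text="Makes a claim about a CNN article without linking to it.",
    )

The point of the taxonomy is that a volunteer picks a category and writes a short note; the
expensive step -- proving a negative -- disappears for the categories where the burden of proof
sits with the poster.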
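The account-level variant needs one more (equally hypothetical) piece: a note record that targets
an account instead of a single tweet, plus display logic that merges any active account-level note
into every tweet from that account:

    from dataclasses import dataclass

    @dataclass
    class AccountNote:
        """A Community-Note-style warning attached to a whole account."""
        account_id: int
        text: str
        revoked: bool = False  # lifted if the account changes its practices

    def notes_to_display(tweet_id: int, author_id: int,
                         tweet_notes: dict[int, list[str]],
                         account_notes: dict[int, AccountNote]) -> list[str]:
        """Notes shown on one tweet: any notes on the tweet itself, plus any
        active note on its author, so the warning follows the account around."""
        shown = list(tweet_notes.get(tweet_id, []))
        note = account_notes.get(author_id)
        if note is not None and not note.revoked:
            shown.append(note.text)
        return shown

Revoking the warning when an account changes its practices is then just a matter of flipping
revoked on one record, with no per-tweet cleanup required.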
The goal is not to catch every type of deceptive post that a user might try to make after the Community Notes
rules have been tightened. The goal is to create an environment where crafting a dishonest statement takes
enough effort that it's actually less trouble to just go ahead and write a good-faith argument in the first place.