How to implement scalable handling of abuse reports, using sorv

(Note this is very similar to the recommendation for handling scalable fact-checking.)

This is how the sorv algorithm could be applied to the handling of abuse reports (a rough code sketch follows the list):

  1. Some subset of the site's users opt in as "jurors" to handle abuse reports.
  2. When a user submits an abuse report (which includes the content being reported, and the rule that it allegedly violates), the abuse report is released to a random sample of (for example) 10 people who have opted in as "jurors".
  3. Jurors separately and independently vote on whether they think the content violated the rule.
  4. If more than (for example) 7 out of 10 jurors agree that the content violated the rule, then it gets removed. (Depending on the implementation, the creator of the content may have the option to appeal, and the appeal could be handled further by a sorv-type algorithm as well.)
  5. Also depending on the implementation, jurors could add comments explaining why they agreed or disagreed that the content violated the rule. (The votes and the comments could be anonymized, if jurors are afraid of retaliation from people whose content got taken down.)
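
To make the flow concrete, here is a minimal Python sketch of steps 1-4, assuming a simple in-memory model. The names (AbuseReport, select_jury, record_vote, verdict) and the parameters (a jury of 10, removal when more than 7 agree) are just illustrations of the example numbers above, not a prescribed implementation.

    import random
    from dataclasses import dataclass, field

    JURY_SIZE = 10              # example jury size from step 2
    AGREEMENT_THRESHOLD = 7     # step 4: remove if more than 7 of the 10 agree

    @dataclass
    class AbuseReport:
        content_id: str
        rule: str                                     # the rule the content allegedly violates
        votes: dict = field(default_factory=dict)     # juror_id -> True/False
        comments: dict = field(default_factory=dict)  # juror_id -> optional comment

    def select_jury(opted_in_jurors, online_jurors, size=JURY_SIZE):
        # Sample from the entire opted-in pool (so a vote can't be packed with
        # friends), preferring jurors who are currently online for a fast turnaround.
        pool = [j for j in opted_in_jurors if j in online_jurors] or list(opted_in_jurors)
        return random.sample(pool, min(size, len(pool)))

    def record_vote(report, juror_id, violated_rule, comment=None):
        # Step 3: each juror votes separately and independently; comments are
        # optional and could be stored anonymized.
        report.votes[juror_id] = violated_rule
        if comment:
            report.comments[juror_id] = comment

    def verdict(report):
        # Step 4: the content comes down only if more than the threshold agree.
        yes_votes = sum(1 for v in report.votes.values() if v)
        return "remove" if yes_votes > AGREEMENT_THRESHOLD else "keep"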

This has all the advantages that sorv has when used for any other purpose: (1) it's scalable (the more users sign up for the site, the more people will opt in as jurors, assuming the "juror opt-in rate" remains constant over time); (2) it's non-gameable (you can't get all of your friends to "go vote that this content broke a rule", because the system selects the jurors from the entire available population); (3) it's fast (the system can ping jurors who are currently online, which means the abuse report could be handled in real time); (4) it's optimized for users to be able to accept criticism (basically: nobody likes to be told they broke a rule, but if you tell them that a group of people separately and independently concluded that they broke a rule, that's the best you can do as far as convincing them goes).

Now, you might reasonably ask why anyone would opt in to be a "juror" in the first place. The best indicator we have is that lots of people did sign up to be volunteers for Community Notes on X/Twitter -- working for free, even for the benefit of a company (later) owned by a billionaire that a lot of them didn't particularly like -- perhaps because they thought it would be interesting, perhaps because they believed it served the greater good even if it was for someone else's profit. And of course, people already go through the multi-step process of reporting rule-violating content even when there's nothing in it for them, and even knowing that their report might never get looked at. The sorv system would give people a chance to participate in rules enforcement knowing that it would work (and it's reasonable to assume this would increase people's willingness to participate).

This does however depend on the rules being written in a way that the average person can understand them, so that jurors are able to enforce them as intended.

Finally, there is the question of whether the social media site finds it acceptable to show "jurors" content that may violate the Terms of Service. Suppose you are weighing two choices -- having the social media company adjudicate complaints itself, vs. having the complaints adjudicated by volunteer jurors -- and consider this scenario:

  1. An account posts some content that violates the TOS. As long as it remains up, it will be viewed by 10 people per minute.
  2. Once a complaint is submitted, if the social media site adjudicates the complaint using its own staff, it will take 1 hour for the complaint to get through the backlog before someone looks at it.
  3. On the other hand, if the social media site uses volunteer jurors to adjudicate the complaint, it will be adjudicated within 5 minutes.

If the site uses its own employees to handle complaints, then the post will be viewed by 600 users (60 minutes x 10 users/minute) before being taken down (plus 1 if you count the employee who looks at it). If the site uses volunteer jurors to handle the complaint, the post will be viewed by 60 people (50 regular users in the 5 minutes that the post remains up, plus the 10 jurors who are summoned to vote on it) before being taken down. So from a pure "trolley problem" point of view, if you're just trying to minimize the number of people who view bad stuff, using the jurors would appear to be the better option. (If you assume jurors are less bothered by potentially TOS-violating content because they signed up to moderate it, then that option looks even better.)
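
For what it's worth, the back-of-the-envelope comparison can be written out as a tiny calculation; the figures are just the example numbers from the scenario above.

    VIEWS_PER_MINUTE = 10   # example view rate while the post is up
    STAFF_DELAY_MIN = 60    # backlog before an employee reviews the complaint
    JURY_DELAY_MIN = 5      # time for volunteer jurors to adjudicate
    JURY_SIZE = 10          # jurors who are deliberately shown the content

    staff_exposure = STAFF_DELAY_MIN * VIEWS_PER_MINUTE             # 600 regular users
    jury_exposure = JURY_DELAY_MIN * VIEWS_PER_MINUTE + JURY_SIZE   # 50 users + 10 jurors = 60

    print(staff_exposure, jury_exposure)   # 600 vs. 60 -- roughly a 90% reduction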

But just like with the original trolley problem, things change if you look at it as less of a pure accounting problem and more of a philosophical problem. In particular: the regular users who see the illicit content are just "running into" it, but in the jury vote, the site is intentionally showing the content to those users. This may feel different to the jurors themselves (even if it's what they signed up for); it may feel different to other users of the site, just from knowing that their preferred site is running things that way; and it may indeed feel different to the company's lawyers, who believe there's a legal difference between "leaving something up too long where people see it" versus "showing it to people". This probably wouldn't make much difference for low-level violations like N-bombs (which are not illegal, and which few people find truly traumatic -- and the few who do shouldn't sign up as jurors anyway). But it could make a bigger difference in the case of illegal content like child porn, or abusive content that is legal but potentially still traumatizing to viewers, even jurors who signed up for it. I can imagine a lawyer arguing the trolley problem logic in court: "Your Honor, we don't have the resources to review every complaint immediately, so by using the juror system, we are able to reduce the number of people who view the content by 90%." But whether a judge would accept that argument is hard to predict.

The solution, I think, is to have separate handling for truly illegal or potentially traumatizing material, versus more "mundane" TOS violations. If a user submits a report about illegal/traumatizing material, it can be routed to the actual company employees for review. If a user submits a report about content that is not illegal and not traumatizing to most people (racial slurs, casual threats of violence, etc.), then it can get routed to the volunteer jurors. Volunteer jurors can adjudicate a complaint much faster, so this would give users an incentive to use that option for low-level TOS violations, if they want a quick turnaround.
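
In code, that two-track routing might look something like the sketch below. The category names and the route_report helper are hypothetical, just to show the split between staff review and the volunteer jury queue.

    # Hypothetical report categories -- a real site would define its own taxonomy
    # in its reporting form.
    EMPLOYEE_ONLY = {"illegal_content", "potentially_traumatizing"}

    def route_report(report_category):
        # Serious material goes to paid staff; mundane TOS violations go to the
        # (much faster) volunteer jury queue.
        if report_category in EMPLOYEE_ONLY:
            return "employee_review_queue"
        return "volunteer_jury_queue"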