sorv: sample of randomized voters
The pile-on lottery (aka the Salganik effect)

I post a lot of silly jokes and fun facts on Reddit. Most of them get between 0 and 5 upvotes, and once in a while, one of them will blow up to about 50,000. If I show people the list of jokes, most of them can't guess, based on "quality", which ones blew up.

One would be tempted to think this is a quirk of the Reddit algorithm. But in 2004, Matthew Salganik and some colleagues at Princeton carried out an experiment that showed how such a result can happen "organically". Salganik approached a music-sharing website and got them to divide their user base randomly into eight "artificial worlds". Users in all eight worlds had access to the same songs. Users could rate songs, and they could see the average rating that each song had -- but that average only included ratings given by other users in the same "world". They could also recommend songs to other users, but only to users in the same world. For comparison, the researchers also released each song to a random sample of users and asked them to rate it, without telling them what other users thought -- this was considered the "merit" rating of the song. They then looked at the ratings and downloads that each song received in each of the eight worlds.

Your intuition might be that in each of the worlds, downloads and ratings would be distributed roughly evenly among the songs, or at least among the songs that were "good enough" according to their merit score, with higher-merit songs getting a bump. But that intuition turns out to be wildly wrong. In the authors' words: "[T]he 'best' songs never do very badly, and the 'worst' songs never do extremely well, but almost any other result is possible." Some songs that spiked to superstardom in one world would fizzle completely in all of the others.
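To get a feel for how an eight-world setup like this can produce such different outcomes from the same songs, here is a toy simulation in Python. It is only a sketch, not the model used in the Salganik study: the function names, the `influence` parameter, and the rule that a listener picks a song with probability proportional to `merit * (1 + influence * downloads_so_far)` are all assumptions made for illustration.

```python
import random


def simulate_world(merits, n_listeners, influence, rng):
    """Simulate one isolated world and return download counts per song.

    Assumed rule (illustrative only): each listener picks one song with
    probability proportional to merit * (1 + influence * downloads so far).
    With influence = 0, popularity is invisible and only merit matters.
    """
    downloads = [0] * len(merits)
    for _ in range(n_listeners):
        weights = [m * (1 + influence * d) for m, d in zip(merits, downloads)]
        song = rng.choices(range(len(merits)), weights=weights)[0]
        downloads[song] += 1
    return downloads


def run_experiment(n_songs=20, n_worlds=8, n_listeners=2000,
                   influence=1.0, seed=0):
    """Give the same songs (same merits) to several independent worlds."""
    rng = random.Random(seed)
    # Hypothetical merit scores; every song is at least "good enough".
    merits = [rng.uniform(0.5, 1.5) for _ in range(n_songs)]
    worlds = [simulate_world(merits, n_listeners, influence, rng)
              for _ in range(n_worlds)]
    return merits, worlds


def top_share(downloads):
    """Fraction of all downloads captured by the most popular song."""
    return max(downloads) / sum(downloads)


if __name__ == "__main__":
    merits, worlds = run_experiment()
    for i, downloads in enumerate(worlds, start=1):
        winner = downloads.index(max(downloads))
        print(f"world {i}: top song = #{winner}, "
              f"share of downloads = {top_share(downloads):.0%}")
```

Running it with `influence=0.0` and then with a larger value is one way to compare a world where listeners cannot see popularity against worlds where they can. The exact numbers depend entirely on the assumed picking rule and the seed, so they should not be read as a reproduction of the published results.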
What appeared to happen is that if some critical mass of people happened to like a song in one world, they would recommend it to friends in the same world, who would see that the song was already popular and would be more inclined to check it out and in turn recommend it to more users. This created a snowball effect that we could call the "pile-on lottery", or the "Salganik effect".

The eerie implication, of course, is that we're living in one of those "artificial worlds" -- where the songs, beliefs, individuals, and products that have spiked to "superstardom" did so as a result of a random process separate from any independent measure of their "merit". The things that broke through to superstardom all had to be "good enough" to be eligible for the pile-on lottery, but they did not have to be the "best". The Salganik study was a smoking gun, but there was always circumstantial evidence toward this conclusion as well:
Conversely, there are a number of fallacies that might lead a person to underestimate the role that the "pile-on lottery" plays in social media (and other areas of life):
Pile-on lottery vs. network effects (the "rational pile-on effect")

Economists have long understood the "network effect", where users are incentivized to use the same product that everyone else is using, because the more other people use it, the more useful it is. Obvious examples would be PayPal and Venmo (you can only send money to other people who are using the same platform), and social media sites like Facebook and Instagram. This could be considered a "rational pile-on effect" because it makes sense to use the same product that everyone else is using.

By contrast, the "pile-on lottery" that took place in the Salganik experiments is less intuitive. If you are sitting in a room by yourself choosing songs to listen to, there is no particular benefit to listening to the same song as lots of other people -- and yet the Salganik experiment shows that's what happens anyway (even though there's a lot of randomness in which songs become popular, so that which "popular song" you are listening to depends on which of Salganik's artificial worlds you happen to be in).

Reconciling with Econ theory

The pile-on lottery is also counterintuitive because it appears to contradict the lessons of Economics 101. If a seller has a product that a buyer wants to buy at a mutually agreed price (where the "seller" is a social media creator and the "buyer" is a user paying with their attention), then in Econ 101 theory the "sale" will always take place, unless either the buyer or the seller finds a better deal elsewhere. If post X and post Y both exist on Instagram, and in a double-blind test 60% of users prefer post X and 40% prefer post Y, then in Econ 101 it makes no sense that post X would have 100,000 views and post Y would only have 10.

The difference is that in Econ 101 theory, the cost of sorting through your options is considered negligible compared to the cost of taking action (making a purchase):

[cost of sorting options] << [cost of taking action]

For example, in a standard textbook Econ 101 problem, if a bag of apples costs $10 at one store and $12 at another store, the cost to the buyer of sorting through options (driving between stores to compare prices, comparing the quality of the apples to make sure they are really comparable, etc.) is considered to be $0, and the buyer will always buy the $10 bag instead of the $12 bag. Obviously the cost of sorting options is never truly $0, but as long as this assumption holds approximately true, the conclusions of Econ 101 will be a useful approximation of reality. One such conclusion is the "Law of One Price", which states that in a free market, the prices of similar goods will be about equal (because if people can sort through options with $0 effort, nobody will ever pay the higher price for an identical good). In reality, a laptop in one store might cost $1,000 while a very similar laptop in another store in the same city might cost $1,100, because a buyer might not consider it worth $100 of effort to drive between two stores and compare all the options. But it would be rare to see the same laptop priced at $3,000.

However, social media turns that logic on its head, because generally the cost of "taking action" -- viewing a post/video, or "liking" or "sharing" it -- is small compared to the cost of sorting through options. If something is shown to a user by default by the Reddit or YouTube or Twitter algorithm, the user could spend hours searching through far more obscure posts looking for something that they would enjoy even more in a "true double-blind test". But that would be vastly more effort than viewing the default selection, so viewing the default is what most people do. The pile-on lottery is only possible when:

[cost of sorting options] >> [cost of taking action]

The solution

The pile-on lottery is highly random, highly unequal, and highly non-meritocratic. But sorv solves all of this with its three-step process:
The point of sorv is to create a system where a creator can create content without relying on the pile-on lottery. A person can create a song performance, a recipe video, or a political essay, and be assured that if the target audience likes it -- as measured by the response from the initial random sample -- they'll get content views, new connections/subscribers, and (depending on the system) even revenue in proportion to the quality, without having to rely on luck, or favors from other high-profile users, or algorithmic manipulation. (Where "algorithmic manipulation" includes pushing out content frequently for months or years because the algorithm favors high-volume creators independent of quality -- something that would no longer be a factor with sorv.) It also means that if the content isn't successful, the creator can be assured that it was because of the random sample's average opinion (which, depending on the implementation, may also come with constructive feedback), and not because the creator got screwed by the pile-on lottery or "the algorithm".
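The core idea described above (rate content with a random sample of the target audience, then reward it in proportion to that rating) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a specification of sorv: the function names `sample_score` and `allocate_views`, the `rate` callback, the sample size, and the simple proportional split of views are all hypothetical.

```python
import random
from statistics import mean


def sample_score(audience, rate, sample_size, rng):
    """Average rating of one item from a random sample of the audience.

    `audience` is a list of user ids and `rate(user)` returns that user's
    rating of the item; both are placeholders for whatever a real system
    would use (assumption).
    """
    voters = rng.sample(audience, min(sample_size, len(audience)))
    return mean(rate(user) for user in voters)


def allocate_views(scores, total_views):
    """Split a fixed number of views in proportion to the sample scores.

    A proportional split is only one possible reading of "in proportion
    to the quality"; a real system might use a different rule.
    """
    total = sum(scores.values())
    if total == 0:
        return {item: 0 for item in scores}
    return {item: round(total_views * score / total)
            for item, score in scores.items()}


if __name__ == "__main__":
    rng = random.Random(42)
    audience = list(range(10_000))
    # Hypothetical items whose ratings are noise around a hidden quality.
    hidden_quality = {"essay": 4.0, "recipe": 3.0, "song": 1.0}
    scores = {
        item: sample_score(audience,
                           lambda user, q=q: q + rng.uniform(-1, 1),
                           sample_size=200, rng=rng)
        for item, q in hidden_quality.items()
    }
    print(allocate_views(scores, total_views=100_000))
```

Because every item is judged by its own independent random sample, an item's reach in this sketch depends only on how the sample rated it, not on how many views it had already accumulated.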