If Social Media Keeps Moderating Harmful Content, Why Does Harm Still Spread?

Figure 1. Harmful content remains a common feature of everyday social media use despite platforms’ claims of safety and moderation. Source: iStock/Getty Images.

Spend any appreciable amount of time scrolling through Instagram, TikTok, Facebook or X, and chances are you’ll come across something disturbing, whether it’s racism, misogyny, homophobic comments, bullying, dogwhistling, or material meant to provoke fear. This would seem odd if one took social media platforms at their word. They constantly insist that they have policies, moderators, machine learning and safety features in place. So why does such content still spread so easily?

Why Harmful Content Is Still Everywhere

This blog post argues that social media networks do not lack policies to stop toxic content from circulating on their sites; the underlying problem lies deeper. Social media firms promote themselves as neutral spaces while operating as economic infrastructures where speech is ranked, promoted, and amplified for the sake of making money. Toxic content therefore cannot be treated merely as a matter of content moderation, but must be treated as a matter of governance (Flew, 2022; Massanari, 2017; Sinpeng et al., 2021). Recent policy developments at Meta highlight the contradiction: Meta aims for “more speech and fewer mistakes,” yet this way of thinking implies that harmful speech is an acceptable price for freedom and efficiency.

Platforms Are Not Neutral

To grasp why such content persists, it is necessary to stop seeing platforms as mere hosts. Terry Flew’s work on the regulation of digital platforms is helpful here, as he argues that platforms are more than mere intermediaries (Flew, 2022). On social media platforms, speech is never “just there.” It is arranged, mediated, and governed by recommendations, engagement metrics, rankings, reporting mechanisms, governance systems, trust and safety teams, and advertisements. This means that social media platforms do not only host speech but also regulate its flow. Some have termed this platform governance or algorithmic governance.

With the recognition of platforms as governors, the continuity of harm is no longer unexpected. The moderation policies could be well-defined within the platform while the overall system continues to incentivize controversial or polarizing content. This is precisely what makes Adrienne Massanari’s work on Reddit so relevant.

“Toxic culture[s] are not simply maintained through user action but also because of platform design, governance structures, and visibility mechanisms” (Massanari, 2017).

Aggressive content flourishes where the design of the platform makes it easy for users to collect, distribute, praise and normalize aggression. What matters here is not whether the company was able to remove the nasty post. The real issue is whether the design of the platform continues to enable such content to flourish.

Another factor that contributes to the persistence of the problem is the tendency of these platforms to enforce global rules with very broad definitions. According to a study by Sinpeng et al. of Facebook in the Asia-Pacific region, treating the definition of hate speech as universal can lead to significant mistakes. Specifically, the study found that although Facebook’s policies became more rigorous and automation was used to regulate content, Facebook still allowed vilification and discrimination to flourish in public spaces because its system lacked sufficient context from local communities. Many page owners within the LGBTQ+ community reported feeling vulnerable to hate speech (Sinpeng et al., 2021). The lesson is that the nature of abusive speech depends on context. Words, irony, coded language, insults, political references, and threats from majority groups do not always translate into universal guidelines or machine-readable policies. What works well in one nation could backfire completely in another.

Meta’s ‘More Speech, Fewer Mistakes’ Shift

This matters because platforms love the idea of scalability. Scalability sounds efficient, but it usually means that platforms have flattened cultural and political distinctions into a global rule book, which risks a double failure: over-blocking safe speech and under-blocking dangerous speech. That is precisely how Meta now describes its own moderation process. In an announcement made in January 2025, the firm explained that, beginning in the United States, it would replace third-party fact-checking with Community Notes, permit more speech on certain controversial subjects, and concentrate enforcement on illegal and high-severity violations. The firm also said that it wanted to make fewer “mistakes” resulting from automation, and it later claimed to have reduced enforcement errors in the United States by approximately 50% without changing prevalence rates.

Figure 2. Meta’s January 2025 policy shift framed reduced intervention as “more speech and fewer mistakes.” Source: Meta-related media image.

On the surface, that seems fair enough. Who would not want to avoid censorship and misidentification? But that is precisely where digital policy becomes tricky. The less content a social network moderates in the name of avoiding “mistakes,” the lower the risk of false positives, but the higher the risk that real harms go unaddressed. Meta’s statement frames the moderation problem as a balance between free expression and moderation errors, rather than examining the role that the rest of the amplification system still plays within the company’s content governance framework. This is why harmful content still appears: despite any rhetoric about principles of free expression, the system itself continues ranking provocative content, tailoring users’ feeds, and optimizing for engagement. Loosening moderation can therefore lead to worse consequences, especially for marginalized communities who have historically been subjected to the most abuse on those platforms. Meta’s Transparency Center page on Oversight Board recommendations also indicates that the January 7, 2025 amendments to the Hateful Conduct policy prompted additional recommendations regarding human rights due diligence processes.

It is precisely the discrepancy between such rules and real-life harms that makes the current Meta case study especially relevant. Because Meta is one of the most prominent social media companies in the world, its decisions have implications well beyond its own platforms. When it says that it will allow more speech in political discussions and reduce active moderation in certain areas, it redefines the public environment in which users communicate and can become targets of abuse. To users who are rarely targeted by hate campaigns, this could appear to be a principled position on freedom of expression. However, to people who regularly encounter racial, sexual, or homophobic harassment, it might signal that the company will refrain from intervening until harm is clear and obvious.

Regulators no longer treat moderation as simply a matter of having a policy page. The EU’s Digital Services Act (DSA) takes a wider approach, obliging platforms to reduce systemic risks, increase the transparency of moderation processes, provide avenues for appeal and, in the case of very large platforms, offer non-personalized feeds. The DSA recognizes that very large platforms can spread harmful or even illegal content and therefore need obligations that go beyond notice and takedown. In 2024, the European Commission notified X of its preliminary finding that X had violated the DSA in relation to dark patterns, advertising transparency and researcher data access. This is important because it indicates a change in regulatory strategy: platform accountability now covers not only transparency but also data access for independent research. Without researcher access, the public cannot verify corporations’ promises about harm reduction on their platforms.

Australia’s approach through eSafety follows a similar path. eSafety describes the transparency notices it issues to platforms about online hate as binding and as aimed at increasing industry transparency and accountability. With regard to X in 2024, eSafety indicated that the organization had “failed to meet their obligations under a previous notice relating to online hate,” and offered responses that were “incorrect, significantly incomplete or irrelevant.” This is significant both for what it says about a single organization and because it speaks directly to why platforms’ own reporting of harm cannot simply be trusted.

Why These Harms Are Not Abstract

The social implications of this are not theoretical. Research by eSafety published in 2025 showed that 96% of Australian children between 10 and 15 years old had used at least one social media network, and that seven out of ten had been exposed to online material associated with harm, such as misogynistic or other forms of hateful content. The majority of these children indicated that their most recent exposure was via a social networking site. Nearly two thirds reported experiences of cyberbullying, while one in seven reported experiences of grooming behavior. These statistics matter because they indicate that harmful online content does not affect just a few unlucky users. It is a regular feature of the social media landscape.

Figure 3. eSafety research shows that social media use among Australian children is widespread, and exposure to harmful content is common. Source: eSafety Commissioner (2025).

What, then, is the problem exactly? It is several problems at once. First, there is the definitional issue. Harm is relative, cultural and political, but platforms prefer universal definitions. Second, there is the enforcement issue. A platform can define what kinds of behavior constitute violations, yet enforcement still falters over reporting, staffing, linguistic knowledge, and appeals procedures. Third, there is the amplification problem. A recommendation algorithm can amplify anger, aggression and novelty despite the existence of policies against such content. Fourth, there is the business model issue. Attention-based business models reward content that attracts attention, which tends to be content that is more emotionally intense. Finally, there is the legitimacy issue. The platform constantly changes the way it presents itself, whether as a neutral party, private company, public space, or protector of free speech, depending on which works best to shield it from criticism.

None of this suggests that moderation is futile. Instead, it shows that moderation alone is too limited. A meaningful approach to online harms requires, at minimum, four considerations. First, platforms need increased knowledge and expertise on a localized and linguistic level, since context makes a difference. Second, they require transparency obligations allowing for external monitoring of the process and practices of moderation and recommendations. Third, they require greater protections for users, including appeal procedures, explanations, and safety guarantees for vulnerable communities. Finally, they require governance that takes into account both amplification and design rather than focusing solely on harmful content. Ultimately, the most important takeaway from the readings in this course is that online harms are not merely a question of “bad speech.”

Harmful content persists on social media sites because these sites are more than merely public places. They are commercial and algorithmic structures that manage visibility while appearing neutral. Their policies are important, but much else matters besides them, including incentives, structure, and politics. Meta’s January 2025 moderation shift is one example of how quickly this can change when a corporation decides that more speech and less moderation is the right message to send. If we wish to have safer digital publics, it’s time to stop thinking about only one aspect of this issue.

Figure 4. Online harms are embedded in everyday social media environments rather than isolated to a single platform. Source: Australian Parliament Library page image.

References

Flew, T. (2022). Regulating platforms. Polity. https://www.politybooks.com/bookdetail?book_slug=regulating-platforms--9781509537075

Massanari, A. (2017). #Gamergate and The Fappening: How Reddit’s algorithm, governance, and culture support toxic technocultures. New Media & Society, 19(3), 329–346. https://doi.org/10.1177/1461444815608807

Sinpeng, A., Martin, F. R., Gelber, K., & Shields, K. (2021). Facebook: Regulating hate speech in the Asia Pacific. The University of Sydney & The University of Queensland. https://espace.library.uq.edu.au/view/UQ:68bb3d7

eSafety Commissioner. (2026, February 4). Responses to transparency notices: Online hate. https://www.esafety.gov.au/industry/basic-online-safety-expectations/online-hate

eSafety Commissioner. (2025, July 11). Latest eSafety research reveals social media use is widespread among kids – and so are the harms. https://www.esafety.gov.au/newsroom/media-releases/latest-esafety-research-reveals-social-media-use-is-widespread-among-kids-and-so-are-the-harms

European Commission. (2024, July 12). Commission sends preliminary findings to X for breach of the Digital Services Act. https://digital-strategy.ec.europa.eu/en/news/commission-sends-preliminary-findings-x-breach-digital-services-act

European Commission. (2026, March 10). The Digital Services Act. https://digital-strategy.ec.europa.eu/en/policies/digital-services-act

Meta. (2025, January 7). More speech and fewer mistakes. https://about.fb.com/news/2025/01/meta-more-speech-fewer-mistakes/

Meta Transparency Center. (2025). Oversight Board recommendations. https://transparency.meta.com/oversight/oversight-board-recommendations/

ARIN6902: Digital Policy and Governance