Cukier observed in 2010 that a revolution in science is often preceded by a revolution in measurement technology. In today’s society, that technological revolution is taking place on the Internet, and within Internet technology it is taking place in algorithms and big data. Big data reframes key questions about what constitutes knowledge, how research is conducted, what we should do with information, and the nature and classification of reality. Big data itself is objective, while algorithms are not. Once algorithms are developed as a technology for computing big data, the central questions become: who developed the algorithm, and what are its specific details? Algorithms built on big data map social biases into the age of artificial intelligence. They are not created out of thin air; they reflect the background of their developers, the demands of the audience they serve, and the ideology of the surrounding society. When data is read, when the algorithm computes, and when its results are pushed to users, existing prejudice and discrimination can be further deepened. At present, however, the public and platform audiences are still caught up in the excitement of new developments in big data and algorithms, so academia has an ethical and social responsibility to explain the hidden problems to the public. In this blog, we will first look at big data and algorithms, then examine algorithms as a platform technology along with some cases of platform algorithmic bias, and finally consider how platform algorithmic bias is governed through existing platform governance policies and tools.
Big data is not defined simply by volume (Kitchin, 2014). Data sets are generated all the time in society, and industry, government, and academia are all important sources. Our understanding of big data needs to keep pace with the times. A national census, or a survey of hundreds of respondents, is diverse and has a relatively large base, so it looks like big data, but it is essentially static, and analyzing it does not require very complex models or iterative algorithms. Dynamic data that is generated continuously, flexibly, and at scale, such as digital TV, retail sales records, user-device interaction records, and website click traffic, requires more complex algorithms and model architectures. Processing and storing such massive amounts of data also raises cost and technical issues that are beyond the reach of the general public. The rise of algorithms stems from new forms of data analysis developed in response to this abundance of data (Kitchin, 2014). New methods of data analysis offer the public a whole new way to understand the world. Miller noted in 2010 that, rather than testing a predetermined conclusion against relevant data, algorithms draw new ‘insights’ from the data itself.
Algorithmic bias is not only a data problem
Experience shows that developing a new algorithm involves several key steps. First, the problem is identified. Second, the developer specifies which parts of the problem need to be analyzed. Third, the developer designs an analytical model that fits the overall framework of the algorithm. Fourth, detailed data is fed in. Finally, the algorithm is re-examined. The two steps most likely to introduce bias are the design of the algorithmic model and the introduction of detailed data. Possible sources of bias in data collection include sample selection bias, measurement bias, missing data, and data exclusion (Solon, 2016). When data is collected, too few samples make the results unreliable, and selection bias leads to the loss of key factors.
But if algorithmic bias were just a data problem, could we solve it? As Hooker noted in 2021, the technology and budgets available for real-world data collection are insufficient to support comprehensive labeling of all sensitive features. Moreover, because big data is diverse, dynamic, and continuously generated, it is difficult to fit every case into a standard classification. More fundamentally, algorithmic bias is not just a data problem. Bias and discrimination in algorithms are long-standing injustices rooted in the way people are counted, represented, and categorized in data (Perez, 2019). If we cannot fully resolve the bias in the data itself, then the harm to the system as a whole is a product of the interaction between the data and the design of the algorithmic model. Acknowledging the effects of model design bias can play an important role in curbing harm (Hooker, 2021). No algorithmic model design is perfect or perfectly just, but some algorithms are better than others. Recognizing how model design can harm the overall system is far less burdensome than collecting perfect, comprehensive data.
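To make the sample selection bias mentioned above more concrete, here is a minimal Python sketch. All group names, shares, and rates are hypothetical numbers chosen only for illustration; they do not come from any of the cited studies. The sketch shows how an estimate drifts when one group is over-represented in the collected data, even though nothing about either group has changed.

```python
def pooled_rate(group_shares, group_rates):
    """Overall rate implied by each group's share of the data and its own rate."""
    return sum(group_shares[g] * group_rates[g] for g in group_shares)


# Hypothetical values, for illustration only.
group_rates = {"group_a": 0.60, "group_b": 0.20}        # outcome rate inside each group
population_shares = {"group_a": 0.50, "group_b": 0.50}  # real make-up of the population
sample_shares = {"group_a": 0.90, "group_b": 0.10}      # make-up of the collected sample

true_rate = pooled_rate(population_shares, group_rates)
sampled_rate = pooled_rate(sample_shares, group_rates)

print(f"rate in the population:    {true_rate:.2f}")
print(f"rate in the skewed sample: {sampled_rate:.2f}")
```

In this toy setting the skew could be corrected by reweighting, because the true group shares are known. The argument above is that in real data those shares, and even the relevant groups, are often unknown or unlabeled, which is why model design also matters.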
Algorithm as a platform technology
In 2016, Walmart generated more than 3.6 PB of data per hour, relating to millions of customers (Open Data Center Alliance, 2016). Instagram reports that it processes more than 4 billion pieces of content every day, as well as 4.2 billion ‘Likes’ and close to 1 billion photo uploads (Constine, 2019). Faced with such volumes of data, platforms, especially social media platforms, choose to design models using algorithms specific to each platform.
On well-known social media platforms, the algorithms almost always draw on a few key factors about users. For example, Instagram focuses more on relationships, interests, and relevance, while TikTok focuses more on interactions, behaviors, and trends (Newberry, 2022). Algorithms have seeped into every corner of our lives. The relationship between algorithm and audience is one of mutual domestication (Siles, 2019). Each platform has its own algorithmic design model, which means that audiences constantly undergo domestication in their daily lives. Platforms give users algorithmically recommended content based on the information users provide and on the platforms’ own algorithms. Users integrate these recommendations into their daily lives, and the platforms complete the domestication process by colonizing users through their algorithms. Every dynamic between the platform algorithm and the user depends on and feeds back into the other (Silverstone, 1994), which makes domestication a cyclical process. As a result, users are subconsciously influenced by the platform’s algorithmic model, which in turn shapes society as a whole: you only see what the platform’s algorithm wants you to see.
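The cyclical process described above can be sketched as a toy simulation. This is not how any real platform ranks content; the function name, the topics, the scores, and the update rule are all assumptions made for illustration. The recommender always shows the highest-scoring topic, and every exposure is treated as engagement that raises that topic's score again.

```python
def simulate_feedback_loop(initial_scores, rounds, boost=1.0):
    """Toy recommender: show the top-scoring topic, then reinforce it.

    Returns how many times each topic was shown.
    """
    scores = dict(initial_scores)
    shown = {topic: 0 for topic in scores}
    for _ in range(rounds):
        top = max(scores, key=scores.get)  # topic the platform decides to show
        shown[top] += 1
        scores[top] += boost               # exposure counted as engagement
    return shown


# Hypothetical starting interests: nearly equal, "news" is ahead by a hair.
exposure = simulate_feedback_loop(
    {"news": 1.1, "sports": 1.0, "music": 1.0}, rounds=10
)
print(exposure)
```

Under these assumptions, a tiny initial difference is enough for one topic to take every recommendation slot, which is one way to read the claim that users end up seeing only what the algorithm selects.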
Why platform algorithm bias cannot be ignored
Beyond the social platform algorithm bias and over-exploitation of users highlighted above, many other forms of platform algorithm bias are prevalent in everyday life. Amazon, one of the largest tech giants in the world, uses a large number of algorithms for precise control over employee recruitment and job assignment.
Amazon’s recruiting engine is an artificial intelligence algorithm designed to analyze the resumes of applicants and decide which ones will be called for further interviews and selection. The algorithm was shown to be biased against women during the hiring process. This may be because it was trained to evaluate resumes by studying Amazon’s responses to resumes submitted over the previous 10 years (Haikiran, 2022). Since most of the people who reviewed resumes in the past were men, their gender bias was incorporated into the algorithm. Combined with inadequacies in the design of the algorithm’s model, this eventually led the algorithm to automatically downgrade resumes containing the word ‘female’.
In terms of work allocation, Amazon has a complete algorithmic design model. The algorithm sets work standards for each employee by collecting basic employee data and work records. Employees in Amazon’s facilities end up eliminating themselves through constant self-exploitation (Molla, 2019). Based on their work record, the algorithm keeps raising the target of any employee who meets it, until the employee fails to meet the target and is finally dismissed by the algorithm. That departure marks the beginning of the algorithm’s bias against the next employee. The convenience of algorithms has brought about an unstoppable technological revolution in all walks of life, but undeniably, it has also brought about bias.
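The target-raising dynamic described above can be illustrated with a short sketch. This is not Amazon's actual system, whose details are not given here; the function name, the fixed worker capacity, and the 10% increase per period are hypothetical assumptions chosen to show the logic of a target that only ever moves up.

```python
def periods_until_dismissal(capacity, initial_target, increase_rate=0.10,
                            max_periods=1000):
    """Toy ratchet: raise the target each time it is met.

    Returns the period in which the worker first misses the target,
    or None if that never happens within max_periods.
    """
    target = initial_target
    for period in range(1, max_periods + 1):
        if capacity < target:
            return period            # target missed: dismissed in this model
        target *= 1 + increase_rate  # target met: the bar is raised
    return None


# Hypothetical worker who can handle 100 units per period, starting target 80.
print(periods_until_dismissal(capacity=100, initial_target=80))
```

Under these assumptions, a worker who comfortably exceeds the first target still fails within a few periods, because meeting the target is itself what raises it.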
Governing platform algorithmic bias
The term governance never refers only to the government’s ability to specify and enforce rules and provide services; it also describes a specific and complex network of interactions across different actors and behaviors (Gorwa, 2019).
In 2019, Gorwa offered some important advice on platform governance, along with the platform governance triangle model.
Figure 1: The Platform Governance Triangle (Gorwa, 2019)
Self-governance is currently the dominant model for governing platforms and algorithmic bias. Under this model, transparency is generally voluntary, and most platform decisions are made with minimal external oversight (Suzor, 2019). Social platforms, for example, believe they can ensure that their algorithms are fair and equitable to users by enriching their content policies and being stricter on issues such as hate speech, racism, and sexism. In Facebook’s 2021 Regulating hate speech in the Asia Pacific, a case study of hate speech legislation in five countries was used to establish an ideal definition of hate speech, which was then incorporated into Facebook’s algorithmic recommendations for users. This reduces the bias users encounter on the push side and reduces biased data on the data side, thereby improving the database.
After numerous platform scandals and cases of algorithmic bias, there are increasing calls for external governance, i.e. direct intervention in platform behavior through national legislation. Germany’s Network Enforcement Act (NetzDG), for example, explicitly mandates that platform algorithms be open and transparent to the government (especially for large, non-German platforms). Similarly, the proposed American Data Privacy and Protection Act (ADPPA) would explicitly require platforms and big data holders to maintain data records, report annually to the government for review, and provide algorithmic risk control reports to all users every two years.
Co-governance sits between the first two approaches. Its main thrust is to provide some of the value of democratic accountability without extreme changes to the status quo (Gorwa, 2019). Civil society organizations and platforms work together to establish a third-party cooperative organization that performs a variety of functions, from advising on data sources and algorithmic model design to working with the platform on building and improving the algorithm. For example, in June 2020 Facebook launched the Oversight Board. This third-party organization selects experienced members with governance expertise from 27 countries around the world in order to reflect the diversity of Facebook’s user base. By bringing together many cultural and academic backgrounds, it seeks to make Facebook’s algorithmic model better.
Algorithms, as an emerging technology of the digital information age, have become unstoppable. We cannot resist the convenience that technology brings to productive life, and at the same time we cannot escape the social problems it brings. Under the amplification of algorithms, each of us is like a piece of virtual data. The data is objective, but the algorithm is man-made, and platforms will keep making more use of algorithms in pursuit of profit. We need better governance of algorithms and platforms to reduce the occurrence of algorithmic bias. More importantly, once we recognize that algorithmic bias cannot be eliminated at its root in the data, we must adopt better algorithmic model design to reduce it.
References
Flew, T. (2021). Regulating Platforms. Cambridge: Polity, pp. 72-79.
Gorwa, R. (2019). The platform governance triangle: Conceptualising the informal regulation of online content. Internet Policy Review, 8(2).
Hooker, S. (2021). Moving beyond “Algorithmic Bias Is a Data Problem.” Patterns, 2(4). https://doi.org/10.1016/j.patter.2021.100241
Kitchin, R. (2014). Big Data, New Epistemologies and Paradigm Shifts. Big Data & Society, 1(1). https://doi.org/10.1177/2053951714528481
Leavy, S., et al. (2020). Data, Power and Bias in Artificial Intelligence. https://doi.org/10.48550/arXiv.2008.07341
Molla, R. (2019). Activists are pressuring lawmakers to stop Amazon Ring’s police surveillance partnerships. https://www.vox.com/recode/2019/10/8/20903536/amazon-ring-doorbell-civil-rights-police-partnerships
Newberry, C. (2022, November 7). Social Media Algorithms: A 2023 Guide for Every Network. Hootsuite. https://blog.hootsuite.com/social-media-algorithm/
Shin, T. (2020, June 5). Real-Life Examples of Discriminating Artificial Intelligence. Towards Data Science. https://towardsdatascience.com/real-life-examples-of-discriminating-artificial-intelligence-cae395a90070
Siles, I., et al. (2019). The Mutual Domestication of Users and Algorithmic Recommendations on Netflix. Communication, Culture & Critique. https://doi.org/10.1093/ccc/tcz025