Truths and Lies in the Age of Artificial Intelligence Generated Content

On Friday, March 22, 2024, Catherine, Princess of Wales (often referred to as Kate) released a video statement via the BBC stating that she was undergoing cancer treatment. The video attracted a great deal of attention on social media, and along with concern for the Princess’s condition came a wave of claims that the video was a deepfake (Hunter, 2024). Evidence cited for this included the observation that one of the rings on Kate’s finger appeared to vanish during the video, and that Kate appeared to be wearing the same shirt as in a promotional video for a 2016 mental health campaign. The video also came on the heels of high-profile news organizations retracting two photos released by the royal family because of indications that the photos had been doctored (Ott, 2024). The Princess Kate incident has once again raised concerns about the authenticity of content in the age of artificial intelligence, with AI expert Henry Ajder observing that as AI technology and public awareness of what it can do increase rapidly, people’s “sense of shared reality, I think, is being eroded further or more quickly than it was before” (Hunter, 2024).

What is Artificial Intelligence Generated Content?

With the widespread use of large-scale Artificial Intelligence (AI) models such as ChatGPT, Artificial Intelligence Generated Content is no longer unfamiliar to the general public. Artificial Intelligence Generated Content (AIGC) refers to the use of generative AI algorithms to assist or replace humans in creating rich, personalized, and high-quality content based on user inputs or needs, faster and at a lower cost (Wu et al., 2023). AIGC encompasses a large amount of synthetic content, including text (e.g., poetry), images (e.g., artwork), audio (e.g., music), video (e.g., animation), and interactive 3D content (e.g., virtual avatars, assets, and environments).

How is AIGC generated?

AIGC may look like content that AI tools generate automatically, but what actually underpins it is massive amounts of big data and algorithmic techniques (Just & Latzer, 2016). It works by using Natural Language Processing (NLP) and Natural Language Generation (NLG) algorithms to learn and capture vast amounts of valuable information from massive datasets. By training on this data, language models can dynamically generate increasingly accurate words, phrases, sentences, and paragraphs based on contextual information. Today, by learning from enormous quantities of human language text, AI systems like ChatGPT are fully capable of producing texts so similar in style to human writing that most people are unable to tell them apart (Dey, 2022).
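
To make this next-token principle concrete, here is a minimal sketch using the open-source Hugging Face transformers library and the small GPT-2 model. The model choice and parameters are purely illustrative; GPT-2 is far less capable than ChatGPT, but it generates text the same way, one predicted token at a time:

```python
# Minimal sketch of how a language model generates text token by token.
# Requires: pip install transformers torch
# Illustrative only; GPT-2 is a small open model, far weaker than ChatGPT.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The model repeatedly predicts a probability distribution over the next
# token given the context so far, samples one token, appends it, and repeats.
result = generator(
    "Artificial intelligence generated content is",
    max_new_tokens=30,   # how many tokens to generate beyond the prompt
    do_sample=True,      # sample from the predicted distribution
    top_k=50,            # restrict sampling to the 50 most likely tokens
)
print(result[0]["generated_text"])
```

Each step conditions on everything generated so far; sampling from the predicted distribution, token after token, is what produces fluent-looking paragraphs.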

Current state of AIGC

The use of artificial intelligence (AI) to help write content and news stories is nothing new. Back in 2014, the Associated Press began publishing AI-generated financial stories, and since then, media outlets including the Washington Post and Reuters have developed their own AI writing technology (Marr, 2023). But until a few years ago, this technology was entirely proprietary and available only to media companies that could afford to buy and operate it. Now, with the release of ChatGPT, anyone can use AI to generate an article in seconds, and with minimal technical knowledge can set up a “content farm” that produces and publishes online content around the clock. A survey by NewsGuard found 49 websites spanning seven languages whose content was written almost entirely by AI (NewsGuard, 2023).

And just last month, Sora, a video generator developed by AI company OpenAI, came out of nowhere and dropped a bombshell on the AI world. From just a single text prompt, Sora generates high-definition videos with multiple characters, specific types of motion, and largely accurate subjects and backgrounds. Wired called the clips “photorealistic,” while the New York Times said they “look as if they were lifted from a Hollywood movie.”

Although Sora still has technical flaws to iron out, the clarity and realism of the detail and the superb accuracy of these AI-created videos provoke a sobering question:

Can we still believe that seeing is believing in the age of artificial intelligence?

Advantages and Disadvantages of AIGC

Advantages of AIGC

1. Fast, efficient and user-friendly

AIGC algorithms can analyze massive datasets and generate, at scale, large amounts of content tailored to specific criteria, topics, or target audiences. This is especially useful for industries that need to produce content quickly, such as news reporting or social media marketing.

2. Reducing costs

ChatGPT and other similar tools are open to the public and in many cases free to use, so anyone can start using them immediately. For organizations, AI can generate large amounts of content quickly and efficiently without significant additional resources or extra staff. This scalability eliminates the need for massive workforce expansion, which saves on labor costs.

3. Diversity and multimodal support

Multimodal support in AIGC refers to the ability of an AI model to process and generate information across multiple modalities or sources (e.g., text, images, video, and audio) (Wang et al., 2023). As such, it not only allows for the creation of diverse content in multiple modalities, but also improves human-machine interaction (HMI) by enabling fully immersive experiences. For creators, AI content generators can work across different languages, facilitating the creation of content in multiple languages. They can also assist with content localization: adapting content to specific cultural differences, preferences, and linguistic variations so that it resonates with the target audience on a deeper level.

Potential threats to AIGC services

Despite the promise of AIGC, a number of privacy, trust, and ethical issues pose significant barriers to its widespread adoption.

1. Privacy threats

The privacy threats in AIGC services arise from several sources, starting with widespread collection of private data. AIGC services rely heavily on large-scale data collection from a variety of sources, including the Internet, third-party datasets, and private user data. For example, GPT-3 was trained on a total of 45 terabytes of text from various domains. Compared with traditional AI models, the privacy threat posed by this routine collection of private data is even more serious in the AIGC era (Kerry, 2020). To generate the desired high-quality content, AIGC models typically require users to provide multimodal inputs (e.g., personal data and PDF files) that can be private and sensitive. In addition, there are privacy risks in using the large amounts of personal data publicly available on the Internet for AIGC model training. Second, there is a risk of privacy leakage during AI interactions. AIGC models (e.g., ChatGPT) also collect massive amounts of users’ historical conversation data and multimodal inputs during their conversations with users in order to further train themselves. Using this information, OpenAI can infer users’ preferences and map their profiles during interactions with ChatGPT, potentially compromising user privacy.

For example, in April 2023, a Samsung employee, while using ChatGPT to help fix source code, inadvertently leaked the company’s trade secrets by entering confidential data, including the source code of a new program, into ChatGPT.

Third, this massive amount of collected data is retained in the AI’s memory, and this storage is not secure. Previous research has shown that LLMs such as ChatGPT can memorize confidential training data within the model, which malicious actors can then coax into the output, potentially compromising users’ private information (Li et al., 2023). For example, in March 2023, it was revealed that some ChatGPT users were able to view payment information from other users’ conversations, including first and last names, email addresses, payment addresses, and the last four digits and expiration dates of credit card numbers.

2. Trust threats

Trust threats represent one of the most pressing concerns in the current landscape of AIGC services. With the increasing ease and accessibility of these services, ordinary individuals can fabricate highly convincing fake news at minimal expense. For instance, a Chinese man was apprehended for exploiting ChatGPT to propagate false news about a train accident. This fabricated story was disseminated across more than 20 blog platforms, amassing over 15,000 views within a matter of hours. Moreover, when such misleading information becomes associated with prominent figures like politicians or celebrities, it further undermines public trust. Former US President Donald Trump, for example, has repeatedly dismissed unflattering material as AI-generated in order to refute accusations against him.

Beyond disrupting public opinion, the misuse of AIGC technology poses a tangible threat to the public’s personal and economic security. By leveraging publicly available data on social networks, such as photos and videos, attackers can employ AIGC services to impersonate specific individuals and engage in fraudulent activities, including identity theft. According to VMware’s findings, cybercriminals are already adept at circumventing security measures by integrating deepfake technology with existing attack methods. Professor Hany Farid, an authority on digital propaganda and misinformation at the University of California, Berkeley, observes that artificial intelligence continues to provide a “liar’s dividend.” In this era, “When you actually do catch a police officer or politician saying something awful, they have plausible deniability” (Verma & Vynck, 2024).

3. Ethical and regulatory risks

As large-scale AI generative models continue to advance, the intellectual property rights and responsibilities surrounding AI-generated content face novel legal challenges. While ownership of AIGC models typically rests with the organizations that develop them, the resulting content often blends pre-existing data with novel elements. This blurred distinction has sparked concerns about who owns the intellectual property rights to AI-generated content. For instance, in January 2023, three artists, including Sarah Andersen, filed a lawsuit against multiple AI image-generation platforms, including Stability AI, for unauthorized use of their original works (Chen, 2023). These platforms used the artists’ creations to train their models, potentially affecting the market for the artists’ existing works.

Moreover, the cross-border regulatory issues and concerns regarding technological monopolies prevalent in AIGC services warrant ongoing attention and discussion.

Conclusion

The flourishing development of ChatGPT and AI-generated content technology has propelled significant strides in content creation. However, it also introduces a range of challenges concerning privacy, trust, and the need for stronger societal ethics and regulation. Currently, the most advanced methods for safeguarding intellectual property rights and regulating AIGC primarily involve techniques such as watermarking, cryptography, hardware solutions, and blockchain. Nevertheless, these approaches face significant hurdles in managing the escalating volume of AIGC data. Governments worldwide are actively formulating laws and regulations to steer the healthy development of AI-generated content. For example, China released a draft of the Administrative Measures for Generative AI Services on April 11, 2023, aiming to foster the responsible development and regulated use of AI-generated content technology. Similarly, on May 11, 2023, the European Union advanced a draft of the “Artificial Intelligence Act” to provide regulatory guidance for AI-generated content.
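
To give a flavor of the watermarking idea, here is a toy sketch of embedding and detecting an invisible provenance tag in generated text. This is purely illustrative, and every name in it is my own invention: the zero-width-character trick shown here is trivially removable, and production AIGC watermarks (e.g., statistical token-bias schemes built into the model’s sampling step) are considerably more robust:

```python
# Toy illustration of text watermarking: hide a provenance tag in generated
# text using zero-width Unicode characters. Didactic sketch only; real AIGC
# watermarks are statistical and far harder to strip. All names are made up.

ZW0 = "\u200b"  # zero-width space      -> encodes bit 0
ZW1 = "\u200c"  # zero-width non-joiner -> encodes bit 1

def embed_watermark(text: str, tag: str) -> str:
    """Append the tag's bits as invisible characters after the text."""
    bits = "".join(f"{byte:08b}" for byte in tag.encode("utf-8"))
    return text + "".join(ZW1 if b == "1" else ZW0 for b in bits)

def extract_watermark(text: str) -> str:
    """Recover the hidden tag, if any, from the invisible characters."""
    bits = "".join("1" if ch == ZW1 else "0"
                   for ch in text if ch in (ZW0, ZW1))
    usable = len(bits) - len(bits) % 8  # drop any incomplete trailing byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")

marked = embed_watermark("This paragraph was machine-generated.", "AIGC-v1")
print(extract_watermark(marked))  # -> AIGC-v1
```

One reason this naive approach fails in practice is that invisible characters are stripped by ordinary copy-paste normalization, which is why current research favors watermarks baked into the generation process itself.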

Finally, recent discussions on social media concerning topics like royal figures and the use of AIGC services in US elections highlight people’s concerns and skepticism about the era of artificial intelligence. As with any technology, AIGC services are a double-edged sword, which underscores the need to prioritize the protection of public safety and societal stability alongside the benefits of technological advancement.

References

Chen, M. (2023). Artists and illustrators are suing three A.I. art generators for scraping and “collaging” their work without consent. Artnet News. https://news.artnet.com/art-world/class-action-lawsuit-ai-generators-deviantart-midjourney-stable-diffusion-2246770

Dey, V. (2022). Deep dive: How AI content generators work. VentureBeat. https://venturebeat.com/ai/deep-dive-how-ai-content-generators-work/

Hunter, T. (2024). Princess Catherine cancer video spawns fresh round of AI conspiracies. The Washington Post. https://www.washingtonpost.com/technology/2024/03/27/kate-middleton-video-cancer-ai/

Just, N., & Latzer, M. (2016). Governance by algorithms: Reality construction by algorithmic selection on the internet. Media, Culture & Society, 39(2), 238–258. https://doi.org/10.1177/0163443716643157

Kerry, C. F. (2020). Protecting privacy in an AI-Driven World. Brookings. https://www.brookings.edu/articles/protecting-privacy-in-an-ai-driven-world/

Li, H., Guo, D., Fan, W., Xu, M., Huang, J., Meng, F., & Song, Y. (2023). Multi-step jailbreaking privacy attacks on ChatGPT. Findings of the Association for Computational Linguistics: EMNLP 2023. https://doi.org/10.18653/v1/2023.findings-emnlp.272

Marr, B. (2023). The danger of AI content farms. Forbes. https://www.forbes.com/sites/bernardmarr/2023/05/16/the-danger-of-ai-content-farms/?sh=1ca281394fca

Ott, H. (2024). Princess Kate admits photo editing, apologizes “for any confusion” as agencies drop image of her and her kids. CBS News. https://www.cbsnews.com/news/kate-princess-of-wales-photo-apologizes-family-picture-editing/

NewsGuard. (2023, August 2). Rise of the newsbots: AI-generated news websites proliferating online. https://www.newsguardtech.com/special-reports/newsbots-ai-generated-news-websites-proliferating/

Verma, P., & Vynck, G. D. (2024, January 22). AI is destabilizing ‘the concept of truth itself’ in 2024 election. The Washington Post. https://www.washingtonpost.com/technology/2024/01/22/ai-deepfake-elections-politicians/

Wang, Y., Pan, Y., Yan, M., Su, Z., & Luan, T. H. (2023). A survey on ChatGPT: AI-generated contents, challenges, and solutions. IEEE Open Journal of the Computer Society, 4, 280–302. https://doi.org/10.1109/ojcs.2023.3300321
