The Boundaries of Artificial Intelligence Creation: Challenges and Governance of AI Generative Models

Jakub Żarkiewicz, 2024

Introduction: The Double-edged Sword of AI Generative Models

Accompanied by a wave of digitization, artificial intelligence has long been widely integrated into all parts of human society, including medical care, education, culture and tourism, urban management, justice, and finance, providing unlimited ease to people’s lives. Among these technologies, generative models, which include language models like GPT and diffusion models like Sora, have been of great significance in AI technology development, playing an essential part in a variety of domains ranging from speech recognition and machine translation to text summarization and content creation. These models can not only generate realistic text and visual material, capture the tone and semantic orientation of the text, and simulate language patterns for discussions, as well as comprehend and answer queries with and without context (AltexSoft, 2023). With deep learning techniques, these models demonstrate incredible flexibility and adaptability in terms of creativity and interaction.

Diffusion models (in this case Stable Diffusion) generate images from noise over many iterations. Stable Diffusion / Benlisquare / Wikimedia, CC BY-SA

However, the huge benefits of technical advancement are often accompanied with dangers and problems, and the widespread usage of AI generative models has raised a number of hidden concerns, which extend beyond the technological and include many ethical, legal, and societal elements. From privacy and security concerns to online disinformation and fake news, from the control of personal data to algorithmic bias and opacity, these issues have wide-ranging impacts on broader social, economic, political, and cultural environment. In this context, large digital platform companies such as Google, Apple, and Amazon, which rely on AI as their core technology, are increasingly at the center of attention of lawmakers, policymakers, and regulators around the world (Flew, 2021). Thus, while AI generative models hold the promise of a road to efficiency and creativity, there is an urgent need for strong governance and control over their possible negative consequences to guarantee that this technology evolves in a safe and responsible context.

Risk Disclosure: Key Challenges in AI Generative Models

Content Authenticity

The most significant risk to AI-generated models is their potential to generate misleading information and deep fake content. Audio and video content, in particular, has reached the point where it can almost be faked, making the distinction between reality and AI-generated content increasingly elusive and posing new challenges to redefine our notion of authenticity in a digital mirage. In this context, how should the public recognize the authenticity of information, and how can the media ensure the authenticity of content and prevent the dissemination of misleading information?

A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage.
Generated by Sora

Several giant wooly mammoths approach treading through a snowy meadow
(Screenshot of a video generated by Sora)

Privacy Vulnerabilities and Ethical Risk

Doubts regarding content authenticity create ethical risk. AI generative models may “memorize” facts from its training data, occasionally providing text and visual output that includes sensitive personal information. When used to process sensitive data, it might have a direct impact on personal privacy. For example, if a model is trained on medical health records, it may output language that implies personal health information. And people may not be aware of this. Once obtained by an attacker, these embeddings can be reverse-engineered to disclose sensitive information about the victim, perhaps leading to more harassment (Pan, Zhang, et al., 2020).

More seriously, deepfake information could even directly endanger the lives of those who are the targets (especially when it comes to pornography), as well as jeopardize public health measures, be used to influence elections, and even burden the justice system with potentially false evidence (Pooryousef & Besançon, 2024). All of these situations could seriously undermine trust in the public media, the public administration, and the political system, which would in turn lead to a degree of social unrest.

Data Bias

Flew (2021) mentions that algorithms and the way they utilize big data to shape individual and societal outcomes have challenges with fairness and transparency, and that the data that informs decision-making often has or reflects bias in terms of sampling practices. AI generative models supported by algorithms and big data suffer from this same problem. A model may unintentionally produce biased information against other pparticular groups if it was trained predominantly on a corpus from that group. In real-world applications, such as a chatbot that misunderstands specific cultures in its discussions or a job screening system that gives preference to a certain gender or ethnicity, this bias can result in biased decision support. According to Abid, Farooqi, and Zou (2021), there is a larger prevalence of violent prejudice against Muslims than against any other religious group, even in the most advanced contextual language model, GPT-3.

Copyright Controversy

There has always been a lot of debates about the copyright problem with AI. The content produced by AI generative models, including writing, music, and visual art, raises questions about copyright ownership. While human artists’ works are protected by traditional copyright law, it has not been clearly defined in the law who should own the content created by AI tools—the AI itself, the developer, or the user. On the other hand, defining the use of unprocessed training data, consent, licensing, etc., is equally challenging. AI generative models require a lot of data to be trained, and this data might include copyrighted text, photos, videos, and more. The AI can be used to create legally uncopyrighted material if the unapproved use of this copyrighted data is prohibited. The content produced by a model may unintentionally reproduce or modify certain copyrighted works if unapproved usage of such copyrighted material is utilized to train the model. This might give rise to copyright infringement concerns. The New York Times’ copyright lawsuit against OpenAI serves as an example. According to the complaint, AI chatbots were trained on millions of Times stories in order to present them as competitors of the news organization (Grynbaum & Mac, 2023). The case highlights the ongoing legal dispute over the unapproved use of available material to train artificial intelligence (AI) systems.

A lawsuit by The New York Times could test the emerging legal contours of generative A.I. technologies.
Sasha Maslov for The New York Times

State of governance: global and regional policy responses

Policymakers around the world have started to adopt a number of focused policy measures and governance frameworks in response to the debates and difficulties surrounding the AI generative models discussed above. The goal is to ensure the healthy development of these cutting-edge technologies while reducing any potential social risks.

The significance of content authenticity is emphasized in the news industry, social media, and political campaigns. With the Digital Services Act (DSA), the EU has imposed particular regulations that underscore the necessity for all online platforms to be in charge of content audits, particularly with regard to information produced by artificial intelligence (AI). Platforms must, among other things, strengthen the transparency of material sources and put in place more stringent content monitoring procedures (EUR- Lex, 2022).

Much more enduring and well-established are the measures taken to prevent and manage privacy violations. The European Union (EU) implemented the General Data Protection Regulation (GDPR) on May 25, 2018, with the intention of creating uniform data standards throughout the EU and safeguarding EU people against any privacy violations (Flew, 2021). The GDPR mandates that all organizations that work with personal data must have the express agreement of the user and utilize the best data protection measures available. This protects users’ informed permission for personal data to be used for AI generative model training, even if it doesn’t specifically point to AI technology. On the other hand, the White House Executive Order, which was published in October 2023, clearly highlights the necessity of protecting citizens’ civil liberties and privacy in the context of artificial intelligence. It also states that federal agencies will make sure that data is gathered, used, and stored in a way that is compliant with the law and is safe.

General Data Protection Regulation (GDPR) Definition and Meaning
Investopedia / Mira Norian

Regarding copyright issues, the U.S. Copyright Office stated in a March 2023 policy statement that “AI generative content should be explicitly excluded from [copyright] claims,” despite the fact that there are now no clear rules or regulations in place. The copyright controversy surrounding artificial intelligence-generated work has momentarily become less difficult thanks to this position, but future developments may call for more international harmonization and legal clarification.

Case study: GPT to Sora in practice

With the development of ChatGPT by OpenAI, generative models in AI entered a completely new age; Sora expands on this development. The most recent in a long series of AI generative models tools, Sora was introduced by OpenAI in February 2024. Unlike ChatGPT, which still generates text, Sora develops dynamic scenes that are both inventive and realistic after receiving commands(OpenAI, 2024).

Screenshot of Official Website of OpenAI

The complexities of technology advancement are being discussed again, along with the risks associated with artificial intelligence (AI) and its governance and regulation. The problem of permission for Sora text-to-video AI has been brought up by legal sources (Canton, 2024). Are people notified when their likenesses or creative efforts are used? Is their permission and consent requested? The use of training data from artists, photographers, performers, and filmmakers without express permission, on the other hand, has drawn the attention of generative AI expert Dr. Dominic Lees (University of Reading, 2024). Lees highlighted the potential legal challenges, copyright issues, and ethical concerns that can arise when AI systems like Sora generate new content.

An contrasting viewpoint is provided by Vox (Piper, 2024), which emphasizes the evolutionary aspect of AI functioning. By addressing the clumsiness and inaccuracy of previous models, recognizing uncertainty and the potential trajectory of progress in generative AI, and emphasizing that we shouldn’t lose sight of the increased functionality that future iterations and competition may bring to the field, Sora opens up new possibilities in the creation of AI videos.

Future directions: innovation and accountability proceed concurrently

Going forward, we might be able to get a conceptual framework for the efficient regulation and governance of AI generation models and the platforms that support them from the Platform Governance Triangle Model (Gorwa, 2019). The interaction of state agencies, platform companies, and non-governmental organizations (NGOs) is highlighted. State agencies offer legal and regulatory oversight, platform companies offer self-regulation practices and innovation, and NGOs represent civil society and promote the public interest.

Platform Governance Triangle (Gorwa, 2019)

The triangle’s seven zones of interaction show the many ways in which these entities collaborate, clash, or work together to carry out governance tasks. These areas include autonomy (platforms managing their operations autonomously), shared governance (cooperative agreements between two or more stakeholder groups), and direct regulation (unilateral application of regulations by State authorities).

While a provisional political agreement was reached in December 2023, the EU Artificial Intelligence Act, formally approved in 2024, provides a concrete legal framework. This is the first binding law on AI that both protects citizens’ fundamental rights, democracy, the rule of law, and environmental sustainability from high-risk AI, while fostering innovation and making Europe a leader in the field (European Parliament, 2024). It mentions the creation of independent experts and advisory forums in the new AI regulator (Dentons, 2023).In conjunction with the Platform Governance Triangle model—which posits that governments have a dual role in establishing and overseeing standards, platform operators are continuously innovating to enhance technological measures, and NGOs (also representing users and citizens) are responsible for improving their understanding of AI technology and engaging in its use and monitoring — the efforts of various stakeholders can collaborate to protect ethical standards, platform operations, and responsibility while also fostering an environment where innovation, accountability, and responsible use of technology can coexist.

References

Abid, A., Farooqi, M., & Zou, J. (2021, July). Persistent anti-muslim bias in large language models. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (pp. 298-306).

AltexSoft. (2023, January 18). Language models GPT. https://www.altexsoft.com/blog/language-models-gpt/

David Canton (2024, February 21). Consent and deception issues with text-to-video AI. Harrison Pensa. https://www.harrisonpensa.com/consent-deception-issues-with-text-to-video-ai/

Dentons. (2023, December 14). The new EU AI Act: The 10 key things you need to know now. https://www.dentons.com/en/insights/articles/2023/december/14/the-new-eu-ai-act-the-10-key-things-you-need-to-know-now

EUR-Lex. (2022). Regulation (EU) 2022/2065 of the European Parliament and of the Council. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32022R2065

Flew, Terry (2021) Regulating Platforms. Cambridge: Polity.

Gorwa, R. (2019). The platform governance triangle: Conceptualising the informal regulation of online content. Internet Policy Review, 8(2).

Kang, C. (2023, December 27). New York Times sues OpenAI, alleging copyright infringement. The New York Times. https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html

Kelsey Piper. (2024, February 23). Sora: OpenAI, Sam Altman, and AI-generated videos’ disinformation risk. Vox. https://www.vox.com/future-perfect/24080195/sora-openai-sam-altman-ai-generated-videos-disinformation-midjourney-dalle

Pan, X., Zhang, M., Ji, S., & Yang, M. (2020, May). Privacy risks of general-purpose language models. In 2020 IEEE Symposium on Security and Privacy (SP) (pp. 1314-1331). IEEE.

Pooryousef, V., & Besançon, L. (2024, February 20). What is Sora? A new generative AI tool could transform video production and amplify disinformation risks. The Conversation. https://theconversation.com/what-is-sora-a-new-generative-ai-tool-could-transform-video-production-and-amplify-disinformation-risks-223850

Rainie, L. (2018). Americans’ complicated feelings about social media in an era of privacy concerns. Pew Research Center. https://www.pewresearch.org/fact-tank/2018/03/27/americans-complicated-feelings-about-social-media-in-an-era-of-privacy-concerns.

The White House. (2023). Executive order on the safe, secure, and trustworthy development and use of artificial intelligence. https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/

United States Copyright Office. (2023). Copyright registration guidance: Works containing material generated by artificial intelligence. Federal Register, 88, 16190-01.

University of Reading. (2024). AI video tool SORA raises big legal questions – expert. https://www.reading.ac.uk/news/2024/Expert-Comment/AI-video-tool-SORA-raises-big-legal-questions—expert