In 2023, a nightmarish video went viral online: an AI-generated Will Smith ravenously wolfing down spaghetti. The resolution was poor, the subject’s appearance was distorted and demonic, and the lighting and physics were a complete mess, bearing almost no resemblance to real footage. At the time, rather than being perceived as a threat from Artificial Intelligence, it felt more like a comical experiment in video generation tools. Consequently, “Will Smith eating spaghetti” became a meme and dramatically evolved into an internet benchmark, serving as a new kind of “Turing Test” for generative video AI.

Over the next three years, numerous models took on this test. With each iteration, Will Smith’s face became increasingly photoreal, the spaghetti on the fork stopped writhing, and the background details steadily stabilized. In February 2026, ByteDance’s model Seedance 2.0 was formally declared to have passed the test in a short video on X, marking yet another milestone for video generation AI.
After three years, this virtual Will Smith finally ate his spaghetti with perfect realism. It seemed to spark a carnival: shortly after, this “Will Smith” could fight a spaghetti monster, and Jackie Chan could “star” in new movies based on user prompts. Stunningly smooth long videos and near-flawless animations began to emerge. Although hallucinations and incoherent flaws persist, the arrival of completely indistinguishable AI video is becoming an increasingly credible reality.
Accompanying this technological leap, however, is an unfolding crisis centered on identity appropriation. Video generation AI is establishing an “Identity Extraction Pipeline”, fueling social identity usurpation at the output stage through data extraction at the input stage, thereby creating profound new dilemmas for digital governance.
Video Generation AI: An Extractive Industry
Video generation AI covers several distinct methods, whose mechanics Tully (2026) explains in detail. The approach featured in the “Will Smith eating spaghetti” test combines Large Language Models (LLMs), used for prompt comprehension, with diffusion models or transformers, used to generate the visual output. Models of this kind, such as OpenAI’s Sora, Google’s Veo, and Runway’s Gen-3, turn a user’s text input into visual assets. For now, they can mostly generate only short videos lasting a few minutes, and their continuity still leaves room for improvement.
Another type of video generation AI is avatar-based: it builds an avatar model from a static image and then animates it. Because it relies on an image reference, this method is more predictable than generation from text prompts alone, allowing for more logical motion and temporal progression.
However, regardless of the architectural approach, they share a common foundation: they utilize machine learning models trained on massive datasets to convert inputs (text, images, audio) into video frames. Large-scale data collection and usage are unequivocally the core of their operation.
On this basis, Couldry and Mejias (2019) argue that in the digital age, data created by real users is alienated by capital into freely exploitable “oil.” Users’ daily lives become resources, constructing a new form of data colonialism.
As we enter the era of generative AI, this process has expanded further:
“this is an expanded view of artificial intelligence as an extractive industry. The creation of contemporary AI systems depends on exploiting energy and mineral resources from the planet, cheap labour, and data at scale.” — Kate Crawford (2021)
Beyond machine learning’s extraction of individual resources, Crawford (2021) further notes that when personal information is treated merely as infrastructure for training large AI models, the individual’s original context is stripped away. Massive amounts of material are fed into the machine learning process with their context erased but their utility retained, and are put to use only after being “refined” into model output. Without authorization, this constitutes not only an invasion of personal privacy but also the alienation and deconstruction of individual identity and production.
In video generation AI, this decontextualized data extraction takes a distinctive form. For instance, when a celebrity’s appearance is used to generate a video, the content and plot can be arbitrarily rewritten, yet the individual’s physical appearance itself is preserved as a spectacle. Consequently, the public recognition and social relations tied to that personal image are also co-opted to a certain extent.
This is why Will Smith’s fame as an actor was a crucial driving force behind the “Will Smith eating spaghetti” test: AI-generated media featuring celebrities attracts significantly more attention. Compared with still images, AI-generated video extracts personal identity more broadly and far more potently. The evolution of “Will Smith eating spaghetti” is, in reality, a history of generative AI steadily lowering the cost of identity appropriation.
The Weaponization of Pixels: Beyond the Meme
Identity fraud triggered by generative AI has intensified in recent years. It is estimated that the generative AI market will grow by 560% between 2025 and 2031, reaching a valuation of $442 billion. Furthermore, 46% of fraud experts have encountered synthetic identity fraud, with 29% of these instances involving video deepfakes.
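As a rough sanity check on the scale involved, the two market figures above imply a starting point. The sketch below assumes that “grow by 560%” means an increase of 560% over the 2025 value, so that the 2031 value is 6.6 times the base; the function name is illustrative, and the result is derived arithmetic rather than a figure taken from the cited forecast.

```python
def implied_base(final_value: float, growth_pct: float) -> float:
    """Starting value implied by a final value and a percentage increase.

    Assumes growth_pct is an increase over the base, i.e.
    final_value = base * (1 + growth_pct / 100).
    """
    return final_value / (1 + growth_pct / 100)


# Figures quoted in the text: 560% growth, reaching 442 (billions of USD) in 2031.
base_2025 = implied_base(442.0, 560.0)
print(f"Implied 2025 market size: about ${base_2025:.0f} billion")
```

Under that reading, the 2025 market would be roughly $67 billion.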
Although the technology’s current limitations leave many people confident that they can tell real footage from awkward AI-generated video, Köbis et al. (2021) demonstrated that a large share of people cannot actually identify deepfake videos, even though they believe they can. Furthermore, Vaccari and Chadwick (2020) point out that the circulation of deepfake videos generates uncertainty, which in turn can diminish societal trust in information. Therefore, when this technology shifts from mere public entertainment to purposeful identity fraud, its destructive potential can be disastrous.
Numerous deepfake cases, along with the turmoil caused by other AI-generated video content, have repeatedly sounded the alarm in recent years. For example, the deepfake case that occurred in early 2024 at the Hong Kong branch of the multinational engineering firm Arup demonstrates the severe harm video generation AI can cause in the absence of standardized fact-checking and comprehensive anti-fraud governance.
In this case, scammers induced a finance employee to remit 200 million HKD (approximately 25 million USD) across multiple transactions. The employee did not initially trust the scammers’ emails. To dispel his doubts, the scammers invited him to join a multi-person video call. The call was generated in real time with deepfake technology, and on seeing the “executives” on his screen, the employee lowered his guard. He then followed the instructions of these fabricated executives, and the company suffered massive financial losses.
Astonishingly, this highly convincing deepfake did not even require the scammers to breach Arup’s internal databases. Instead, they used the executives’ public speeches and interview videos, available on YouTube, the corporate website, and industry forums, as training material. Biometric data carrying personal identity and professional credibility (facial expressions, voiceprints, body language) was extracted and re-encoded at low cost. When these fabricated physical traits were presented in real time with unprecedented realism, the traditional, vision-based mechanisms by which people evaluate trust collapsed.
Current Governance Response in Hong Kong
Faced with the systemic vulnerabilities exposed by this case, Hong Kong’s regulatory authorities took swift defensive countermeasures. Following the incident, the Hong Kong Monetary Authority (HKMA) rolled out a series of policies, including the establishment of a “GenA.I. Sandbox” to test anti-fraud detection models. It also issued new e-banking security guidelines requiring high-risk transactions to move away from purely visual verification towards mandatory physical hardware binding or face-to-face authentication.
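The principle behind such guidelines can be sketched as a simple approval rule. Everything below is a hypothetical illustration, not the HKMA’s actual rule set: the `Evidence` categories, the `HIGH_RISK_THRESHOLD_HKD` value, and the `may_approve` function are all assumptions made for the example. The point it shows is that, for a high-risk transfer, nothing the approver merely saw or heard on a screen counts as proof of identity.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Evidence(Enum):
    """Kinds of identity evidence an approver might hold (hypothetical)."""
    EMAIL = auto()
    VIDEO_CALL = auto()
    HARDWARE_TOKEN = auto()
    IN_PERSON = auto()


# Evidence a deepfake cannot supply: a bound physical device or a face-to-face check.
STRONG_EVIDENCE = {Evidence.HARDWARE_TOKEN, Evidence.IN_PERSON}

# Hypothetical threshold (HKD) above which a transfer counts as high-risk.
HIGH_RISK_THRESHOLD_HKD = 1_000_000


@dataclass(frozen=True)
class TransferRequest:
    amount_hkd: int
    evidence: frozenset


def may_approve(request: TransferRequest) -> bool:
    """Return True only if the evidence suffices for the transfer's risk level."""
    if request.amount_hkd < HIGH_RISK_THRESHOLD_HKD:
        # Low-risk: any evidence at all is accepted in this sketch.
        return len(request.evidence) > 0
    # High-risk: on-screen impressions (email, video call) do not count.
    return bool(request.evidence & STRONG_EVIDENCE)


# A large transfer backed only by email and a video call is refused.
seen_on_screen = TransferRequest(10_000_000, frozenset({Evidence.EMAIL, Evidence.VIDEO_CALL}))
print(may_approve(seen_on_screen))  # False

# The same transfer confirmed with a bound hardware token is allowed.
token_confirmed = TransferRequest(10_000_000, frozenset({Evidence.VIDEO_CALL, Evidence.HARDWARE_TOKEN}))
print(may_approve(token_confirmed))  # True
```

Under a rule of this shape, a convincing video call alone could not have authorized the transfers in the Arup case, however realistic the faces on screen.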
While these measures undoubtedly ease the pressure of AI scams, a closer look at the case reveals further vulnerabilities beneath the surface.
As mentioned earlier, video generation AI can achieve high credibility and preserve the social relations of the imitated subject in its output. In this case, what AI video weaponised was not just pixels, but the social power structure itself. In that fake conference room, the employee faced not only indistinguishable imagery but also the social status of the company executives whose identities the AI had appropriated, which triggered his subconscious obedience as a subordinate. Individuals have limited capacity to respond to this kind of structural forgery, which is why systematic countermeasures are necessary.
Furthermore, from a platform perspective, the data the scammers used to train the executive models was scraped entirely from public platforms like YouTube and LinkedIn. As public spheres hosting massive amounts of personal data, digital media platforms reap traffic benefits by encouraging users to upload data, yet this heightens the risk of centralized data extraction in the context of datafication. Unless they strengthen the protection of users’ personal information, platforms can easily become “accomplices” in identity fraud.
Beyond data collection, the governance of the output channels for generated video content also warrants attention. The deepfake conference was conducted over a video call using synthetic streams. However, current mainstream video conferencing providers still largely lack the capability to detect such streams. This governance vacuum at the infrastructure level allows scammers to bypass security checks and inject fake signals into legitimate business channels.
The regulatory actions taken in Hong Kong represent a practical response to the immediate threats of AI deepfake fraud. The core of this approach is to reinforce the compromised trust systems through far more rigorous, multi-layered verification protocols. While these defensive measures are crucial for protecting institutional security today, they also highlight the ongoing necessity for a more comprehensive strategy that eventually addresses the input end of identity appropriation.
Global Governance and Our Future with Video Generation AI: What’s Next?
Shifting from a regional focus to a global perspective, various regions have introduced diverse, targeted legislation in response to deepfake fraud. For example, the core of the EU AI Act is its transparency obligations, which require machine- and human-readable watermarks on AI-generated audiovisual content. In the US, the draft federal NO FAKES Act proposes establishing an individual’s digital replica right to restrict the abuse of generated content, while California’s deepfake laws put direct pressure on platforms, requiring them to block the spread of political deepfake videos within 72 hours. Meanwhile, the guidance on training generative AI models released by the Australian OAIC emphasizes authorization for the collection of personal and sensitive information. However, how to translate these guidelines into effective governance and legislation remains an open question.
Comparing these governance schemes side by side, it is clear that most of them, whether the EU’s transparency watermarks or the US’s restrictions on spread, are reactive remedies targeting the output end of generative AI. Shifting focus to the input end, the mass scraping of training data, presents a genuine structural dilemma. The architecture of the World Wide Web relies on open protocols designed for information transparency, so a blanket ban on web scraping could dismantle the foundations of search engines and open research. Nevertheless, these objective limits on regulating public data must never become a free pass for the unauthorized exploitation of personal biometric identities.
Therefore, future regulations must approach this crisis from multiple dimensions, raising the ethical standards and transparency of the machine learning process. Beyond macro-level legislation, there is a critical need for concrete micro-level actions, such as establishing specialized third-party oversight agencies to rigorously audit AI developers and enforce accountability for data provenance. “Will Smith eating spaghetti” was once our most harmless joke of the AI age, but the deepfake video call at Arup represents the harsh reality in which this extractive logic is fully unmasked. Our faces, voices, and trust are inalienable treasures, and only through comprehensive, actionable governance can we ultimately protect our identity security in an increasingly synthetic world.
References
CLI. (2024). Bill Text – AB-2655 Defending Democracy from Deepfake Deception Act of 2024. Ca.gov. https://leginfo.legislature.ca.gov/faces/billNavClient.xhtml?bill_id=202320240AB2655
Couldry, N., & Mejias, U. A. (2019). Data Colonialism: Rethinking Big Data’s Relation to the Contemporary Subject. Television & New Media, 20(4), 336–349.
Crawford, K. (2021). Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. Yale University Press.
European Union. (2024, June 13). Regulation – EU – 2024/1689 – EN – EUR-Lex. Eur-Lex.europa.eu. https://eur-lex.europa.eu/eli/reg/2024/1689/oj
HKMA. (2024). HKMA Banking Regulatory Document Repository. Hkma.gov.hk. https://brdr.hkma.gov.hk/eng/doc-ldg/docId/20241107-1-EN
Hu, C. (2026, March 12). Policies on AI-Generated Video Content Across Platforms. Medium. https://chierhu.medium.com/policies-on-ai-generated-video-content-across-platforms-01b1130c3112
Köbis, N. C., Doležalová, B., & Soraperra, I. (2021). Fooled twice – People cannot detect deepfakes but think they can. IScience, 24(11), 103364. https://doi.org/10.1016/j.isci.2021.103364
Milmo, D. (2024, February 5). Company worker in Hong Kong pays out £20m in deepfake video call scam. The Guardian. https://www.theguardian.com/world/2024/feb/05/hong-kong-company-deepfake-video-conference-call-scam
OAIC. (2024, October 20). Guidance on privacy and developing and training generative AI models. OAIC. https://www.oaic.gov.au/privacy/privacy-guidance-for-organisations-and-government-agencies/guidance-on-privacy-and-developing-and-training-generative-ai-models#section-top-five-takeaways
Di Placido, D. (2026, February 11). AI Nailed The “Will Smith Eating Spaghetti” Test—What Comes Next? Forbes. https://www.forbes.com/sites/danidiplacido/2026/02/11/ai-can-flawlessly-generate-will-smith-eating-spaghetti-what-now/
Statista. (2025). Generative AI market size worldwide | Statista. Statista. https://www.statista.com/forecasts/1449838/generative-ai-market-size-worldwide/
Tully, M. (2026, March 29). AI Video Generation Explained: What It Is, How It Works | Colossyan. Colossyan.com. https://www.colossyan.com/posts/ai-video-generation-what-is-it-and-how-does-it-work/
U.S. Congress. (2025). Text – S.1367 – 119th Congress (2025-2026): NO FAKES Act of 2025. Congress.gov. https://www.congress.gov/bill/119th-congress/senate-bill/1367/text
Vaccari, C., & Chadwick, A. (2020). Deepfakes and Disinformation: Exploring the Impact of Synthetic Political Video on Deception, Uncertainty, and Trust in News. Social Media + Society, 6(1). https://doi.org/10.1177/2056305120903408