This week, officials from the U.K., E.U., U.S., and seven other countries met in San Francisco to kick off the “International Network of AI Safety Institutes.” The gathering, hosted at the Golden Gate Club in the Presidio, centered on tackling the risks of AI-generated content, testing foundation models, and assessing advanced AI systems. AI safety institutes from Australia, Canada, France, Japan, Kenya, South Korea, and Singapore joined the initiative.
During the meeting, participants signed a mission statement and announced more than $11 million in funding for research on AI-generated content. They also reviewed findings from the network’s first joint safety testing exercise. Attendees included regulatory officials, AI developers, academics, and civil society leaders, all focused on emerging AI challenges and potential safeguards.
This event built on discussions from the AI Seoul Summit held last May, where ten nations committed to fostering international cooperation in artificial intelligence as advancements continue to influence economies and societies.
The European Commission highlighted that the International Network of AI Safety Institutes will facilitate collaboration, pooling technical expertise to address safety risks. The initiative emphasizes understanding cultural and linguistic diversity while working toward a unified view of AI safety risks and mitigation strategies.
Toward that goal, member institutes are expected to demonstrate progress in AI safety testing ahead of the AI Action Summit in Paris, scheduled for February 2025, where regulatory matters will also be discussed.
Here are some key outcomes from the conference:
- Mission Statement: Members agreed to collaborate in four key areas:
  - Research: Work with the AI safety research community and share findings.
  - Testing: Establish and share best practices for testing advanced AI systems.
  - Guidance: Facilitate common approaches to interpreting AI safety test results.
  - Inclusion: Promote shared information and tools for broader participation in AI safety science.
- Funding for Research: Members and several nonprofits unveiled over $11 million in funding to explore ways to mitigate risks from AI-generated content. Critical areas of concern include child sexual abuse material, non-consensual sexual imagery, and AI’s use in fraud and impersonation. Priority will go to researchers looking into digital content transparency techniques and safeguards against producing harmful content.
- First Joint Testing Exercise: The network completed its initial joint testing exercise on Meta’s Llama 3.1 405B, examining its general knowledge, multilingual capabilities, and instances of closed-domain hallucinations. The results prompted discussions on improving AI safety testing across various languages and contexts.
- Agreed Scientific Basis for Risk Assessments: Members reached consensus on a scientific foundation for AI risk assessments, focusing on making them actionable, transparent, comprehensive, iterative, and replicable.
- New Task Force Established: The U.S. AI Safety Institute set up the TRAINS task force, comprising experts from several U.S. agencies. This group aims to test AI models to manage national security risks across several domains.
At the conference, U.S. Commerce Secretary Gina Raimondo emphasized the need to balance AI innovation with safety. She argued that while advancing AI is important, rushing without considering consequences is unwise. The tension between progress and safety has been prominent lately, with regulators trying to protect consumers while not stifling access to beneficial technologies.
Dario Amodei, CEO of Anthropic, echoed the call for safety testing, noting the importance of controlling AI before it develops unpredictable capabilities.
Over the past year, we’ve seen a rise in AI safety institutes around the world. The inaugural AI Safety Summit, held at Bletchley Park in the U.K. in November 2023, led to the founding of the U.K. AI Safety Institute. Its goals include evaluating existing AI systems, conducting foundational research, and sharing information internationally.
The U.S. AI Safety Institute, created by NIST in February 2024, chairs the network. It is responsible for implementing actions from the AI Executive Order issued in October 2023, which emphasizes establishing safety standards for AI systems.
In April 2024, the U.K. and U.S. announced they would cooperate on developing tests for advanced AI models, formalizing collaboration between their AI Safety Institutes. The clarity the San Francisco conference brought to the U.S. position on AI safety is significant, especially given uneven support for regulation across the country. California Governor Gavin Newsom recently vetoed a controversial AI regulation bill (SB 1047), reflecting the ongoing debate around AI regulation.