Visualização normal

Antes de ontemStream principal

GitHub Security Lab Archives - The GitHub Blog
Hack the AI agent: Build agentic AI security skills with the GitHub Secure Code Game Joseph Katsioloudes
I was scrolling through my feed one evening when I came across OpenClaw, an open source personal AI assistant that people were calling everything from “Jarvis” to “a portal to a new reality.” The idea is beautiful: an AI that lives on your machine or in the cloud, talks to you over WhatsApp or Telegram, clears your inbox, manages your calendar, browses the web, runs shell commands, and even writes its own plugins. Users were having it check them in for flights, build entire websites from their
14 de Abril de 2026, 15:17

Hack the AI agent: Build agentic AI security skills with the GitHub Secure Code Game

GitHub Security Lab Archives - The GitHub Blog

14 de Abril de 2026, 15:17

I was scrolling through my feed one evening when I came across OpenClaw, an open source personal AI assistant that people were calling everything from “Jarvis” to “a portal to a new reality.” The idea is beautiful: an AI that lives on your machine or in the cloud, talks to you over WhatsApp or Telegram, clears your inbox, manages your calendar, browses the web, runs shell commands, and even writes its own plugins. Users were having it check them in for flights, build entire websites from their phones, and automate things they never thought possible.

My first reaction was the same as everyone else’s: this is incredible.

My second reaction was…different. I started thinking about what happens when that kind of power meets a malicious prompt. What if someone tricks the agent into reading files it should not access? What if a poisoned web page rewrites the agent’s instructions? What if one agent in a multi-agent chain passes bad data to another that blindly trusts it?

Those questions became Season 4 of the Secure Code Game.

The Secure Code Game: Learn secure coding and have fun doing it

The Secure Code Game is a free, open source in-editor course where players exploit and fix intentionally vulnerable code. When I created the first season in March 2023, the goal was straightforward: make security training that developers would enjoy. Fix the vulnerable code, keep it functional, level up. That core philosophy has not changed across any season.

Season 2 expanded into multi-stack challenges with community contributions across JavaScript, Python, Go, and GitHub Actions. Season 3 took players into LLM security, where they learned to hack and then harden large language models. Along the way, over 10,000 developers across the industry, open source, and academia have played to sharpen their skills.

What has changed with each season is the landscape. When we launched Season 1, AI coding assistants were just starting to become mainstream. By Season 3, we were teaching players to craft malicious prompts and then defend against them. Now, with Season 4, we are tackling the security challenges of AI systems that can act autonomously. They can browse the web, call APIs, coordinate with other agents, and act on your behalf.

Why agentic AI security matters right now

The timing is not a coincidence. AI agents have moved from research prototypes to production tools at remarkable speed, and the security community is racing to keep up.

The OWASP Top 10 for Agentic Applications 2026, developed with input from over 100 security researchers, now catalogues risks like agent goal hijacking, tool misuse, identity abuse, and memory poisoning as critical threats. A Dark Reading poll found that 48% of cybersecurity professionals believe agentic AI will be the top attack vector by the end of 2026. And Cisco’s State of AI Security 2026 report highlighted that while 83% of organizations planned to deploy agentic AI capabilities, only 29% felt ready to do so securely.

The gap between adoption and readiness is exactly where vulnerabilities thrive. And the best way to close that gap is by learning to think like an attacker.

Meet ProdBot: your deliberately vulnerable AI assistant

Season 4 puts you inside ProdBot, your productivity bot, a deliberately vulnerable agentic coding assistant for your terminal. Inspired by tools like OpenClaw and GitHub Copilot CLI, ProdBot turns natural language into bash commands, browses a simulated web, connects to MCP (Model Context Protocol) servers, runs org-approved skills, stores persistent memory, and orchestrates multi-agent workflows.

Your mission across five progressive levels is simple: use natural language to get ProdBot to reveal a secret it should never expose. If you can read the contents of password.txt, you have found a security vulnerability.

No AI or coding experience is needed…just curiosity and willingness to experiment. Everything happens through natural language in the CLI.

Five levels, five upgrades, five vulnerabilities

Each level of the game mirrors a stage in how real AI-powered tools evolve. As ProdBot gains new capabilities, the upgrade opens a new attack surface for you to discover. Here is what ProdBot looks like as it grows:

Level 1 starts with the basics: ProdBot generates and executes bash commands inside a sandboxed workspace. Can you break out of the sandbox?
Level 2 gives ProdBot web access. It can now browse a simulated internet of news, finance, sports, and shopping sites. What could go wrong when an AI reads untrusted content?
Level 3 connects ProdBot to MCP servers…external tool providers for stock quotes, web browsing, and cloud backup. More tools, more power, more ways in.
Level 4 adds org-approved skills and persistent memory. ProdBot can now run pre-built automation plugins and remember your preferences across sessions. Trust is layered…but is it earned?
Level 5 is everything coming together: six specialized agents, three MCP servers, three skills, and a simulated open-source project web. The platform claims all agents are sandboxed and all data is pre-verified. Time to put that to the test.

Each level builds on the previous one, and that progression is the point.

We aren’t going to tell you exactly which vulnerabilities you will find at each level as that would ruin the fun. But we will say this: the attack patterns you will discover in Season 4 are not theoretical. They reflect the kinds of risks that security teams are grappling with right now as organizations deploy autonomous AI systems into production.

Think about CVE-2026-25253 (CVSS 8.8 – High): Known as “ClawBleed” or the one-click Remote Code Execution (RCE) vulnerability. It allowed attackers to steal authentication tokens via a malicious link and gain full control of the OpenClaw instance.

The goal is not just to learn a specific exploit. It is to build the instinct that helps you spot these patterns in the wild, whether you are reviewing an agent’s architecture, auditing a tool integration, or simply deciding how much autonomy to give the AI assistant that just landed on your team.

Get started in under 2 minutes

This entire experience runs in GitHub Codespaces, so there is nothing to install, nothing to configure, and it doesn’t cost you a penny (Codespaces offers up to 60 hours of free usage per month). You can be inside ProdBot’s terminal in under two minutes, and each season is self-contained, so you can jump straight into Season 4 without covering the earlier ones.

You may find Season 3 to be a helpful foundation since it builds the basics of AI security. But it is not required. Just bring your hacker mindset.

Ready? Start Season 4 now >

Special thanks to Rahul Zhade, Staff Product Security Engineer at GitHub, and Bartosz Gałek, creator of Season 3, for testing and improving Season 4.

FAQ

Do I need AI or coding experience to play Season 4?

No. Everything happens through natural language in the CLI. You type plain English, or any language, prompts and ProdBot responds. Curiosity and a willingness to experiment are all you need.

Do I need to complete previous seasons first?

No. Each season is self-contained. You can jump directly into Season 4 by running ProdBot and typing level <N>. That said, Season 3 builds a helpful foundation in AI security and takes about 1.5 hours.

How long does Season 4 take?

Approximately two hours, though it varies depending on how deeply you explore each level. Some players like to try multiple approaches per level.

Is this free?

Yes. The Secure Code Game is open source and free to play. It runs in GitHub Codespaces, which provides up to 60 hours of free usage per month.

What are the rate limits?

Season 4 uses GitHub Models, which have rate limits. If you hit a limit, wait for it to reset and resume. Learn more about responsible use of GitHub Models.

The post Hack the AI agent: Build agentic AI security skills with the GitHub Secure Code Game appeared first on The GitHub Blog.

GitHub Security Lab Archives - The GitHub Blog
Hack the model: Build AI security skills with the GitHub Secure Code Game Joseph Katsioloudes
We just launched season three of the GitHub Secure Code Game, and this time we’re putting you face to face with the security risks introduced by artificial intelligence. Get ready to learn by doing and have fun doing it! First, you’ll step into the shoes of an adversary crafting malicious prompts. Then, you’ll secure your application against those attacks. The Secure Code Game is a free software security course suitable for all developer levels, where players fix intentionally vulnerable cod
3 de Junho de 2025, 13:37

Hack the model: Build AI security skills with the GitHub Secure Code Game

GitHub Security Lab Archives - The GitHub Blog

Por:Joseph Katsioloudes

3 de Junho de 2025, 13:37

We just launched season three of the GitHub Secure Code Game, and this time we’re putting you face to face with the security risks introduced by artificial intelligence. Get ready to learn by doing and have fun doing it! First, you’ll step into the shoes of an adversary crafting malicious prompts. Then, you’ll secure your application against those attacks.

The Secure Code Game is a free software security course suitable for all developer levels, where players fix intentionally vulnerable code to build code security skills. By placing gameplay directly in the code editor—a developer’s natural habitat—it helps them practice spotting vulnerabilities where they normally work. Its straightforward setup allows players to start in under two minutes using Codespaces.

We launched the first season of the Secure Code Game in March 2023 to fill a gap in developer training, where security often takes a backseat to functionality. To grab attention, we presented seemingly flawless code snippets riddled with critical vulnerabilities, like the OWASP Top 10. We then gamified the learning by challenging players to efficiently fix these issues without introducing new ones or missing edge cases. Supported by a strong community that contributed challenges, season two premiered a year later. Since then, over 10,000 developers across enterprise, open source, and education communities have played to sharpen their skills.

This training was a one-of-a-kind experience. Initially, when I looked at the entire code file, I just thought it was completely normal. I didn’t see any flaws. The unit tests were completely normal, and the code seemed perfect. But the vulnerability was right in front of my eyes. It made me realize how much of a gap there is in identifying even the most basic flaws. It made me more cautious.
Sanyam Mehta, Back-end developer

This is a pretty fun way to learn! I would definitely recommend this game, as it helped me be more aware of the vulnerabilities out there.
Tyler Anton, Computer science student

Now’s the perfect time to elevate your skills!

Today, we’re excited to launch the third season, which immerses players into the fascinating world of artificial intelligence through six realistic challenges. As the world moves decisively into a new era—McKinsey & Company reports that the use of generative AI increased from 33% in 2023 to 71% in 2024 and GitHub Copilot is now being used by more than 77,000 organizations—there’s no better time to enhance your skills.

What you’ll learn

Each of the six security challenges focuses on a different defensive technique. The levels get progressively harder as they build on the defensive techniques of the previous ones. Some of the topics you’ll learn about include:

Crafting robust system prompts: Securely design the initial instructions that guide the model’s behavior, ensuring desired, safe, and relevant outputs by setting its role, constraints, format, and context.
Output validation: Prevent leaks by verifying that the output conforms to certain predefined rules, formats, or expectations.

Input filtering: Examine, modify, or block user-provided text before it’s fed into the model to prevent harmful or irrelevant content from influencing the model as it generates output.

LLM self-verification: Use this technique to have the Large Language Model (LLM) check its own output for accuracy, consistency, and compliance with defined rules or constraints. This may involve checking for errors, validating reasoning, or confirming adherence to policies. Self-verification can be prompted directly or built into the model’s response generation.

Progressing through season three requires players to hack LLMs. Each challenge begins with a set of guiding instructions for the LLM provided in the form of a code and a system message. These elements might include gaps or edge cases that could be exploited using a malicious prompt. Your task is to identify the vulnerabilities and craft prompts to manipulate the model into exposing a hidden secret.

After exploiting the vulnerability, your next task is to refine the code and system message to prevent future leaks from those malicious prompts. You must do this while maintaining the functionality of the code.

I learned to spot vulnerabilities, where and how they occur, and to correct them effectively before pushing them out to the world. I would absolutely recommend this training to anyone, not only in cybersecurity but software development too.
Rajeev Mandalam, Application security at Boeing

One of the major takeaways was that it’s good to focus on details because they could often lead to vulnerabilities that you wouldn’t have imagined. In terms of the format, I liked the fact that it didn’t focus on one particular programming language, and the whole emphasis is on the code concepts. I also liked how easy it was to navigate between game levels. The game provided me with a great new skillset to add to my professional journey.
Reshmi Mehta, Security analyst at Alcon

How to get started

If you’re eager to start learning, we’ve got a Secure Code Game repository all set up and ready to go. It includes instructions for all the seasons, so you can experience everything it has to offer. Just browse through the README and get started.

How season three came together

Don’t wait to be an expert. Build, share, and the right people will find you. Open source changed my life, and it can do the same for you.
Bartosz Gałek (@bgalek), contributor of Secure Code Game season three

It all started in FOSDEM 2025 in Belgium, where I presented the Secure Code Game to a group of open source maintainers. Bartosz was in the audience and here’s what happened in his own words:

The Secure Code Game immediately clicked, as security and game development is my thing! After the talk, I thanked Joseph on LinkedIn and shared HackMerlin with him—a game I created that challenges players to test their prompting skills. To my surprise, he liked it!

One thing led to another, and HackMerlin became the base of our collaboration for season three. The whole journey started as a silly idea, but it turned into something much bigger.

Screenshot of an interactive game interface at Level 6. The game instructs the player to outsmart Merlin by asking clever questions to reveal a password. A cartoon wizard, Merlin, appears at the top, saying 'Hello traveler! Ask me anything...'. Below is a text box where the user can type questions, an 'Ask' button, and a field labeled 'Enter the secret password' followed by a 'Submit' button to check the validity of the answer.

A key challenge in moving from HackMerlin’s UI-centric approach to the Secure Code Game’s code-first season three was ensuring user friendliness. HackMerlin’s backend provided an LLM gateway through the UI, abstracting much of the integration complexity—an area where GitHub products truly excel! Codespaces let us preconfigure the development environment and automate access to GitHub Models, a catalog and playground of AI models that provided us with built-in LLM integration.

Moreover, HackMerlin used an OpenAI model hosted on Azure, allowing for the game’s creator to adjust or disable the model’s provider default safeguards. In contrast, season three prioritized realism by adhering to the default safeguards of GitHub Models. Finally, GitHub Models made it easy for our players to switch between models and explore their differences.

What do you think? Do you have ideas for new challenges? Be our next contributor and help shape the next season of the game! See our contribution guidelines for details.

Time to play! Get ready to unleash your creativity on the challenges of season three! Start playing now >

The post Hack the model: Build AI security skills with the GitHub Secure Code Game appeared first on The GitHub Blog.