Agent JailbreakLab Docs
Learn how to use our platform to test and improve your AI agent's security
Getting Started
Agent JailbreakLab is a platform designed for security researchers and AI developers to test the robustness of AI agents against various jailbreak attempts and security vulnerabilities.
What is an AI Jailbreak?
An AI jailbreak is a method that attempts to bypass an AI's built-in safety measures and guardrails. These attempts may try to get the AI to generate harmful, unethical, or inappropriate content that it was designed to avoid.
How to Run Your First Test
- Navigate to the New Test page
- Select the AI model you want to test
- Choose a jailbreak template or create your own prompt
- Submit the test and review the results
- Share your findings or export the report
Ethical Usage Notice
Agent JailbreakLab is designed for ethical security research and defensive testing purposes only. Please use this platform responsibly and in accordance with our Terms of Service.
Test Types
Agent JailbreakLab offers various test types to thoroughly assess your AI's security posture. Each test is designed to probe different aspects of an AI system's safety mechanisms.
Prompt Injection
Tests that attempt to override or ignore the AI's system instructions through carefully crafted user prompts.
Run this testRestricted Content
Tests that try to elicit harmful, illegal, or otherwise inappropriate content from the AI.
Run this testObfuscation Techniques
Tests that use various obfuscation methods to hide malicious intent from the AI's safety filters.
Run this testAdvanced Techniques
Sophisticated multi-step attacks that combine various methods to bypass AI safeguards.
Run this testTo create a custom test with your own jailbreak techniques, visit our New Test page and select the "Custom" option.
Examples
Below are some example test scenarios to help you understand how to effectively use Agent JailbreakLab.
Example 1: Basic Prompt Injection
Example 2: Advanced Obfuscation
Browse our vulnerability database to see more examples of jailbreak attempts and their effectiveness against different AI models.
Best Practices
Follow these best practices to get the most out of Agent JailbreakLab and improve your AI system's security.
Regular Testing
Test your AI systems regularly, especially after updates to your models or safety measures, to ensure ongoing protection against new vulnerabilities.
Comprehensive Coverage
Use a variety of test types to ensure thorough evaluation of your AI's security posture across different vulnerability categories.
Remediation Process
Establish a clear process for addressing identified vulnerabilities, including prioritization based on severity and potential impact.
Documentation
Keep detailed records of tests performed, vulnerabilities identified, and remediation actions taken to track your security improvements over time.
Security By Design
The most effective approach is to incorporate security considerations from the beginning of your AI development process rather than treating it as an afterthought.
Frequently Asked Questions
Is Agent JailbreakLab free to use?
Yes, Agent JailbreakLab offers free basic testing functionality. Advanced features and API access may require a subscription. Visit our pricing page for more details.
How is test data handled?
Test data is processed securely and stored according to our privacy policy. You can choose to keep your tests private or share them with the community to help improve AI safety research.
Can I test any AI model?
Agent JailbreakLab supports testing of most popular AI models. If you have a custom model you'd like to test, you can use our API integration to set up specialized testing.
How often are new test types added?
We continuously research and add new test types as novel AI jailbreaking techniques are discovered. Our platform is regularly updated to stay ahead of emerging threats.
How can I contribute to the platform?
Researchers and security professionals can contribute by sharing successful jailbreak attempts, suggesting new test types, or participating in our responsible disclosure program.
Have more questions? Contact our support team at support@agentjailbreaklab.com