Amazon Bedrock Guardrails Now Supports Image-Aided Multimodal Toxicity Detection (preview) | Amazon Web Services



Today, we’re announcing the preview of multimodal toxicity detection with image support in Amazon Bedrock Guardrails. This new capability detects and filters undesirable image content in addition to text, helping you improve user experiences and manage model outputs in your generative AI applications.

Amazon Bedrock Guardrails helps you implement safeguards for generative AI applications by filtering undesirable content, redacting personally identifiable information (PII), and enhancing content safety and privacy. You can configure policies for denied topics, content filters, word filters, PII redaction, contextual grounding checks, and Automated Reasoning checks (preview) to tailor safeguards to your specific use cases and responsible AI policies.

With this launch, you can now use existing content filtering policies in Amazon Bedrock Guardrails to detect and block harmful visual content across categories such as hate, insults, sexual, and violence. You can configure the thresholds from low to high to suit the needs of your application.

This new image support works with all foundation models (FMs) in Amazon Bedrock that support image data, as well as any custom fine-tuned models you bring. It provides a consistent layer of protection across text and image modalities, making it easier to build responsible AI applications.

Tero Hottinen, Vice President, Head of Strategic Partnerships at KONE, envisions the following use case:

In its ongoing assessment, KONE recognizes the potential of Amazon Bedrock Guardrails as key components in protecting gen AI applications, particularly for relevance and contextual grounding checks, as well as multi-modal security. The company envisions integrating design diagrams and product manuals into its applications, with Amazon Bedrock Guardrails playing a key role in enabling more accurate diagnosis and analysis of multimodal content.

Here’s how it works.

Multimodal toxicity detection in action
To get started, create a guardrail in the AWS Management Console and configure the content filters for text data, image data, or both. You can also use the AWS SDKs to integrate this capability into your applications.

Create a guardrail
In the console, navigate to Amazon Bedrock and select Guardrails. From there, you can create a new guardrail and use the existing content filters to detect and block image data in addition to text data. The Hate, Insults, Sexual, and Violence categories under Configure content filters can be configured for either text or image content, or both. The Misconduct and Prompt attacks categories can only be configured for text content.
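The same filter setup can be scripted with the AWS SDK for Python (Boto3). Here’s a minimal sketch, assuming the preview’s per-filter modality fields; the guardrail name, blocked messages, and filter strengths are illustrative:

```python
def multimodal_filters_config():
    """Content filter config: Hate and Violence apply to text and images,
    while Misconduct stays text-only (image filtering is not supported
    there). Strength values (NONE/LOW/MEDIUM/HIGH) are illustrative."""
    return {
        "filtersConfig": [
            {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH",
             "inputModalities": ["TEXT", "IMAGE"],
             "outputModalities": ["TEXT", "IMAGE"]},
            {"type": "VIOLENCE", "inputStrength": "MEDIUM", "outputStrength": "MEDIUM",
             "inputModalities": ["TEXT", "IMAGE"],
             "outputModalities": ["TEXT", "IMAGE"]},
            {"type": "MISCONDUCT", "inputStrength": "HIGH", "outputStrength": "HIGH",
             "inputModalities": ["TEXT"], "outputModalities": ["TEXT"]},
        ]
    }

if __name__ == "__main__":
    import boto3  # AWS SDK for Python

    bedrock = boto3.client("bedrock", region_name="us-east-1")
    response = bedrock.create_guardrail(
        name="multimodal-demo-guardrail",  # illustrative name
        blockedInputMessaging="Sorry, I can't process that input.",
        blockedOutputsMessaging="Sorry, I can't provide that response.",
        contentPolicyConfig=multimodal_filters_config(),
    )
    print(response["guardrailId"], response["version"])
```

Keeping the filter configuration in a small helper like this makes it easy to version alongside your application code and reuse when you update the guardrail later.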

Amazon Bedrock Guardrails multimodal support

After you select and configure the content filters you want to use, you can save the guardrail and start using it to build safe and responsible generative AI applications.

To test the new guardrail in the console, select the guardrail and choose Test. You have two options: test the guardrail by selecting and invoking a model, or test the guardrail without invoking a model by using the standalone Amazon Bedrock Guardrails ApplyGuardrail API.

With the ApplyGuardrail API, you can validate content at any point in your application flow before processing it or serving results to the user. You can also use the API to evaluate inputs and outputs for any self-managed (custom) or third-party FM, regardless of the underlying infrastructure. For example, you could use the API to evaluate a Meta Llama 3.2 model hosted on Amazon SageMaker or a Mistral NeMo model running on your laptop.
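As a sketch of that standalone flow with Boto3: the guardrail ID, version, and image file below are placeholders, and the image content block shape follows my reading of the preview API.

```python
def build_apply_guardrail_request(guardrail_id, guardrail_version,
                                  text, image_bytes=None, image_format="png"):
    """Build kwargs for the bedrock-runtime ApplyGuardrail API call,
    validating input content that may combine text and an image."""
    content = [{"text": {"text": text}}]
    if image_bytes is not None:
        content.append({"image": {"format": image_format,
                                  "source": {"bytes": image_bytes}}})
    return {
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": guardrail_version,
        "source": "INPUT",  # use "OUTPUT" to validate a model response instead
        "content": content,
    }

if __name__ == "__main__":
    import boto3

    runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
    with open("test-image.png", "rb") as f:  # placeholder image file
        request = build_apply_guardrail_request(
            "your-guardrail-id", "DRAFT", "Describe this image.", f.read())
    response = runtime.apply_guardrail(**request)
    if response["action"] == "GUARDRAIL_INTERVENED":
        print(response["outputs"][0]["text"])  # the configured blocked message
```

Because the API takes raw content rather than a model invocation, the same call works whether the content is destined for an Amazon Bedrock model, a SageMaker endpoint, or something running locally.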

Test the guardrail by selecting and invoking a model
Choose a model that supports image inputs, such as Anthropic’s Claude 3.5 Sonnet. Verify that the prompt and response filters are enabled for image content. Next, provide a prompt, upload an image file, and select Run.
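One way to wire up the same flow programmatically is the Converse API with a guardrail attached. A minimal sketch with Boto3, where the model ID, guardrail ID, and image file are placeholders:

```python
def build_converse_request(model_id, guardrail_id, guardrail_version,
                           prompt, image_bytes, image_format="png"):
    """Build kwargs for the bedrock-runtime Converse API call: a text
    prompt plus an image, with a guardrail applied and tracing enabled."""
    return {
        "modelId": model_id,
        "messages": [{
            "role": "user",
            "content": [
                {"text": prompt},
                {"image": {"format": image_format,
                           "source": {"bytes": image_bytes}}},
            ],
        }],
        "guardrailConfig": {
            "guardrailIdentifier": guardrail_id,
            "guardrailVersion": guardrail_version,
            "trace": "enabled",  # include the guardrail trace in the response
        },
    }

if __name__ == "__main__":
    import boto3

    runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
    with open("test-image.png", "rb") as f:  # placeholder image file
        kwargs = build_converse_request(
            "anthropic.claude-3-5-sonnet-20240620-v1:0",  # ID may vary by Region
            "your-guardrail-id", "DRAFT", "What is in this image?", f.read())
    response = runtime.converse(**kwargs)
    print(response["output"]["message"]["content"][0]["text"])
```

With `trace` enabled, the response also carries the guardrail’s assessments, which is handy while you tune filter strengths.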

Amazon Bedrock Guardrails multimodal support

In my example, the guardrail intervened. Choose View trace for more details.

The guardrail trace provides a record of how safety measures were applied during an interaction. It shows whether the guardrail intervened and what assessments were made on both the input (prompt) and the output (model response). In my example, the content filters blocked the input prompt because they detected insults in the image with high confidence.
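To act on such a trace programmatically, you can flatten the assessment list returned by the API. A small sketch, where the assessment shape follows my reading of the ApplyGuardrail response and the sample data is made up to mirror this example:

```python
def summarize_content_filters(assessments):
    """Flatten content policy filter hits from a guardrail assessment
    list into (type, confidence, action) tuples."""
    hits = []
    for assessment in assessments:
        for f in assessment.get("contentPolicy", {}).get("filters", []):
            hits.append((f["type"], f["confidence"], f["action"]))
    return hits

# Illustrative assessment mirroring this example: insults detected in
# the image with high confidence, so the input was blocked.
sample_assessments = [
    {"contentPolicy": {"filters": [
        {"type": "INSULTS", "confidence": "HIGH", "action": "BLOCKED"},
    ]}}
]
print(summarize_content_filters(sample_assessments))
# [('INSULTS', 'HIGH', 'BLOCKED')]
```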

Amazon Bedrock Guardrails multimodal support

Test the guardrail without invoking a model
In the console, select Use ApplyGuardrail API to test the guardrail without invoking a model. Choose whether you want to validate an input prompt or an example of model-generated output. Then, repeat the previous steps: verify that the prompt and response filters are enabled for image content, provide the content to validate, and select Run.

Amazon Bedrock Guardrails multimodal support

I reused the same image and prompt for my demo, and the guardrail intervened again. Choose View trace again for more details.

Amazon Bedrock Guardrails multimodal support

Join the preview
Multimodal toxicity detection with image support is available in preview today in Amazon Bedrock Guardrails in the US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Mumbai, Seoul, Singapore, Tokyo), Europe (Frankfurt, Ireland, London), and AWS GovCloud (US-West) AWS Regions. To learn more, visit Amazon Bedrock Guardrails.

Try the Multimodal Toxicity Detection Content Filter in the Amazon Bedrock Console today and let us know what you think! Send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS support contacts.

— Antje
