
Generate structured output (like JSON) using the Gemini API

The Gemini API returns responses as unstructured text by default. However, some use cases require structured text, like JSON. For example, you might be using the response for other downstream tasks that require an established data schema.

To ensure that the model's generated output always adheres to a specific schema, you can define a response schema, which works like a blueprint for model responses. You can then directly extract data from the model's output with less post-processing.

Here are some examples:

- Ensure that a model's response produces valid JSON and conforms to your provided schema.
  For example, the model can generate structured entries for recipes that always include the recipe name, list of ingredients, and steps. You can then more easily parse and display this information in the UI of your app.
- Constrain how a model can respond during classification tasks.
  For example, you can have the model annotate text with a specific set of labels (for instance, a specific set of enums like positive and negative), rather than labels that the model produces (which could have a degree of variability like good, positive, negative, or bad).

This guide shows you how to generate JSON output by providing a responseSchema in a call to generateContent. It focuses on text-only input, but Gemini can also produce structured responses to multimodal requests that include images, videos, and audio as input.

At the bottom of this page are more examples, like how to generate enum values as output. To view additional examples of how you can generate structured output, check out the list of Example schemas and model responses in the Google Cloud documentation.

Other options for working with the Gemini API

Optionally experiment with an alternative "Google AI" version of the Gemini API
Get free-of-charge access (within limits and where available) using Google AI Studio and Google AI client SDKs. These SDKs should be used for prototyping only in mobile and web apps.

After you're familiar with how the Gemini API works, migrate to our Vertex AI in Firebase SDKs (this documentation), which have many additional features important for mobile and web apps, like protecting the API from abuse using Firebase App Check and support for large media files in requests.

Optionally call the Vertex AI Gemini API server-side (like with Python, Node.js, or Go)
Use the server-side Vertex AI SDKs, Firebase Genkit, or Firebase Extensions for the Gemini API.

Before you begin

If you haven't already, complete the getting started guide for the Vertex AI in Firebase SDKs. Make sure that you've done all of the following:

Set up a new or existing Firebase project, including using the Blaze pricing plan and enabling the required APIs.
Connect your app to Firebase, including registering your app and adding your Firebase config to your app.
Add the SDK and initialize the Vertex AI service and the generative model in your app.

After you've connected your app to Firebase, added the SDK, and initialized the Vertex AI service and the generative model, you're ready to call the Gemini API.

Step 1: Define a response schema

Define a response schema to specify the structure of a model's output, the field names, and the expected data type for each field.

When a model generates its response, it uses the field names and the context from your prompt. To make your intent clear, we recommend that you use a clear structure, unambiguous field names, and descriptions where needed.
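As a rough illustration of these recommendations, a response schema for the recipe example could be written as plain data, as sketched below in TypeScript. The `SchemaNode` interface, the lowercase type names, and the field names `recipeName`, `ingredients`, and `steps` are assumptions made for this sketch. The Vertex AI in Firebase SDKs have their own way to declare schemas, which is not shown here.

```typescript
// Hypothetical plain-data shape for a response schema (not an SDK type).
// It models the schema fields discussed on this page plus `type` and `description`.
type SchemaType = "string" | "number" | "integer" | "boolean" | "array" | "object";

interface SchemaNode {
  type: SchemaType;
  description?: string;
  enum?: string[];
  items?: SchemaNode;
  maxItems?: number;
  nullable?: boolean;
  properties?: { [fieldName: string]: SchemaNode };
  required?: string[];
}

// Illustrative schema: every field has an unambiguous name and a description.
const recipeSchema: SchemaNode = {
  type: "object",
  description: "A single recipe",
  properties: {
    recipeName: { type: "string", description: "Human-readable name of the recipe" },
    ingredients: {
      type: "array",
      description: "Ingredients, one per entry",
      items: { type: "string" },
    },
    steps: {
      type: "array",
      description: "Preparation steps in the order they are performed",
      items: { type: "string" },
    },
  },
  required: ["recipeName", "ingredients", "steps"],
};

// Lists each top-level field with its declared type, which is handy for a quick review.
function describeFields(schema: SchemaNode): string[] {
  const properties = schema.properties ?? {};
  return Object.keys(properties).map((name) => `${name}: ${properties[name].type}`);
}
```

Calling `describeFields(recipeSchema)` gives a quick overview of the structure the model is asked to follow.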

Considerations for response schemas

Keep the following in mind when writing your response schema:

- The size of the response schema counts towards the input token limit.
- The response schema feature supports the following response MIME types:
  - application/json: output JSON as defined in the response schema (useful for structured output requirements)
  - text/x.enum: output an enum value as defined in the response schema (useful for classification tasks)
- The response schema feature supports the following schema fields:
  - enum
  - items
  - maxItems
  - nullable
  - properties
  - required

  If you use an unsupported field, the model can still handle your request, but it ignores the field. Note that the list above is a subset of the OpenAPI 3.0 schema object (see the Vertex AI schema reference).
- By default, for Vertex AI in Firebase SDKs, all fields are considered required unless you specify them as optional in an optionalProperties array. For these optional fields, the model can populate the fields or skip them.

Note that this is the opposite of the default behavior for the Vertex AI Gemini API, where fields are optional unless you mark them as required.
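To make the default concrete, the sketch below computes which fields end up required when some are listed as optional. The helper `requiredFields` and the field names are hypothetical; the code only mirrors the rule described above and is not how the SDKs implement it internally.

```typescript
// Mirrors the documented default: every declared field is required
// unless its name appears in the optionalProperties list.
function requiredFields(
  propertyNames: string[],
  optionalProperties: string[] = []
): string[] {
  return propertyNames.filter((name) => optionalProperties.indexOf(name) === -1);
}

// Illustrative field names; `notes` is the only field the model may skip.
const declaredFields = ["recipeName", "ingredients", "steps", "notes"];
const mustBePresent = requiredFields(declaredFields, ["notes"]);
```

With no optional list at all, `requiredFields(declaredFields)` returns every declared field, which matches the default described for the Vertex AI in Firebase SDKs.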

Step 2: Send a prompt with a response schema to generate JSON

The following example shows how to generate structured JSON output.

To generate structured output, you need to specify during model initialization the appropriate responseMimeType (in this example, application/json) as well as the responseSchema that you want the model to use.

The responseSchema option is supported by Gemini 1.5 Pro and Gemini 1.5 Flash.

Learn how to choose a Gemini model and optionally a location appropriate for your use case and app.
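The sketch below illustrates the two halves of this step in TypeScript: the configuration values named above, written as plain data, and one possible way to turn the model's JSON text into a typed object. The variable `recipeGenerationConfig`, the `Recipe` shape, and the helper `parseRecipe` are assumptions made for illustration. The actual model initialization and the generateContent call depend on the SDK you use and are not shown.

```typescript
// Plain-data view of the settings supplied when the model is initialized.
// No `required` list is given, so with the Vertex AI in Firebase default
// every field below is treated as required.
const recipeGenerationConfig = {
  responseMimeType: "application/json",
  responseSchema: {
    type: "object",
    properties: {
      recipeName: { type: "string" },
      ingredients: { type: "array", items: { type: "string" } },
      steps: { type: "array", items: { type: "string" } },
    },
  },
};

// Hypothetical app-side type that matches the schema above.
interface Recipe {
  recipeName: string;
  ingredients: string[];
  steps: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

function isRecipe(value: unknown): value is Recipe {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as { [key: string]: unknown };
  return (
    typeof candidate["recipeName"] === "string" &&
    isStringArray(candidate["ingredients"]) &&
    isStringArray(candidate["steps"])
  );
}

// Parses the response text and fails loudly if the shape is not as expected.
function parseRecipe(responseText: string): Recipe {
  const parsed: unknown = JSON.parse(responseText);
  if (!isRecipe(parsed)) {
    throw new Error("Response does not match the expected recipe shape");
  }
  return parsed;
}
```

Even with a response schema in place, a defensive check like `isRecipe` keeps a malformed or truncated response from propagating into the UI.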

Additional examples

To view additional examples of how you can use and generate structured output, check out the list of Example schemas and model responses in the Google Cloud documentation.

Generate enum values as output

The following example shows how to use a response schema for a classification task. The model is asked to identify the genre of a movie based on its description. The output is one plain-text enum value that the model selects from a list of values that are defined in the provided response schema.

To perform this structured classification task, you need to specify during model initialization the appropriate responseMimeType (in this example, text/x.enum) as well as the responseSchema that you want the model to use.
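A minimal TypeScript sketch of this setup follows. It writes the configuration as plain data and checks that the plain-text response is one of the allowed values. The genre list, the variable `genreGenerationConfig`, and the helper `toGenre` are illustrative assumptions; the model initialization and the request itself are not shown.

```typescript
// Illustrative set of allowed labels; the real list is whatever your schema defines.
const GENRES = ["drama", "comedy", "documentary"] as const;
type Genre = (typeof GENRES)[number];

// Plain-data view of the settings supplied when the model is initialized.
const genreGenerationConfig = {
  responseMimeType: "text/x.enum",
  responseSchema: { type: "string", enum: GENRES },
};

// Maps the plain-text response onto the enum, or returns undefined
// if the text is not one of the allowed values.
function toGenre(responseText: string): Genre | undefined {
  const value = responseText.trim();
  const matches = GENRES.filter((genre) => genre === value);
  return matches.length > 0 ? matches[0] : undefined;
}
```

Returning `undefined` for an unexpected value lets the calling code decide whether to retry, fall back to a default label, or surface an error.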