The Gemini API returns responses as unstructured text by default. However, some use cases require structured text, like JSON. For example, you might be using the response for other downstream tasks that require an established data schema.
To ensure that the model's generated output always adheres to a specific schema, you can define a response schema, which works like a blueprint for model responses. You can then directly extract data from the model's output with less post-processing.
Here are some examples:
Ensure that a model's response produces valid JSON and conforms to your provided schema.
For example, the model can generate structured entries for recipes that always include the recipe name, list of ingredients, and steps. You can then more easily parse and display this information in the UI of your app.
Constrain how a model can respond during classification tasks.
For example, you can have the model annotate text with a specific set of labels (for instance, a specific set of enums like positive and negative), rather than labels that the model produces (which could have a degree of variability like good, positive, negative, or bad).
This guide shows you how to generate JSON output by providing a responseSchema in a call to generateContent. It focuses on text-only input, but Gemini can also produce structured responses to multimodal requests that include images, videos, and audio as input.
At the bottom of this page are more examples, like how to generate enum values as output. To view additional examples of how you can generate structured output, check out the list of Example schemas and model responses in the Google Cloud documentation.
Before you begin
If you haven't already, complete the getting started guide for the Vertex AI in Firebase SDKs. Make sure that you've done all of the following:
Set up a new or existing Firebase project, including using the Blaze pricing plan and enabling the required APIs.
Connect your app to Firebase, including registering your app and adding your Firebase config to your app.
Add the SDK and initialize the Vertex AI service and the generative model in your app.
After you've connected your app to Firebase, added the SDK, and initialized the Vertex AI service and the generative model, you're ready to call the Gemini API.
Step 1: Define a response schema
Define a response schema to specify the structure of a model's output, the field names, and the expected data type for each field.
When a model generates its response, it uses the field name and context from your prompt. So that your intent is clear, we recommend that you use a clear structure, unambiguous field names, and even descriptions as needed.
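As a rough illustration of these recommendations, the sketch below describes the recipe example from the introduction as a plain OpenAPI-style object. The SchemaSketch type, the recipeSchema name, and the use of type and description keys are assumptions made for this example; the Vertex AI in Firebase SDKs provide their own schema types, so read this as the shape of the idea rather than as SDK code.

```typescript
// Hypothetical, simplified schema shape (OpenAPI 3.0 style); not an SDK type.
interface SchemaSketch {
  type: "object" | "array" | "string" | "number" | "integer" | "boolean";
  description?: string;
  properties?: Record<string, SchemaSketch>;
  items?: SchemaSketch;
  enum?: string[];
  nullable?: boolean;
}

// A recipe entry with unambiguous field names and short descriptions.
const recipeSchema: SchemaSketch = {
  type: "object",
  description: "A single recipe",
  properties: {
    recipeName: { type: "string", description: "Human-readable name of the dish" },
    ingredients: {
      type: "array",
      description: "One entry per ingredient, including the quantity",
      items: { type: "string" },
    },
    steps: {
      type: "array",
      description: "Preparation steps in the order they are performed",
      items: { type: "string" },
    },
  },
};

// List the field names the model is asked to produce.
console.log(Object.keys(recipeSchema.properties ?? {}));
```

Descriptive names such as recipeName and a description on each array give the model more context than a bare field like list1 would.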
Considerations for response schemas
Keep the following in mind when writing your response schema:
The size of the response schema counts towards the input token limit.
The response schema feature supports the following response MIME types:
application/json: output JSON as defined in the response schema (useful for structured output requirements)
text/x.enum: output an enum value as defined in the response schema (useful for classification tasks)
The response schema feature supports the following schema fields:
enum
items
maxItems
nullable
properties
required
If you use an unsupported field, the model can still handle your request, but it ignores the field. Note that the list above is a subset of the OpenAPI 3.0 schema object (see the Vertex AI schema reference).
By default, for Vertex AI in Firebase SDKs, all fields are considered required unless you specify them as optional in an optionalProperties array. For these optional fields, the model can populate the fields or skip them. Note that this is opposite from the default behavior for the Vertex AI Gemini API.
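The difference in defaults can be sketched as a small conversion. Assuming an OpenAPI-style schema in which fields are optional unless they are listed under required, the hypothetical helper below derives that required list from a set of field names and an optionalProperties array. The function name requiredFields and the sample field names are invented for this example and are not part of any SDK.

```typescript
// Hypothetical helper: every field is required unless it appears in optionalProperties.
function requiredFields(fieldNames: string[], optionalProperties: string[] = []): string[] {
  const optional = new Set(optionalProperties);
  return fieldNames.filter((fieldName) => !optional.has(fieldName));
}

const fields = ["name", "age", "species", "accessory"];

// No optionalProperties given: all four fields are treated as required.
console.log(requiredFields(fields));

// "accessory" is marked optional, so the model may populate it or skip it.
console.log(requiredFields(fields, ["accessory"]));
```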
Step 2: Send a prompt with a response schema to generate JSON
The following example shows how to generate structured JSON output.
To generate structured output, you need to specify during model initialization the appropriate responseMimeType (in this example, application/json) as well as the responseSchema that you want the model to use.
Using responseSchema is supported by Gemini 1.5 Pro and Gemini 1.5 Flash.
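Because a live request needs a configured Firebase project, the sketch below covers only the parts around the call: the two generation settings named in this step, and parsing the returned text. The plain config object, the Character interface, the parseCharacters helper, and the hand-written sampleResponseText are all assumptions for illustration. In a real app the text would come from the response to generateContent, and the schema would be built with the SDK's own schema types.

```typescript
// Settings named in this step, shown as a plain object (not an SDK type).
const generationConfig = {
  responseMimeType: "application/json",
  responseSchema: {
    type: "array",
    items: {
      type: "object",
      properties: {
        name: { type: "string" },
        species: { type: "string" },
      },
    },
  },
};

interface Character {
  name: string;
  species: string;
}

// Parse the model's text and check each entry against the expected fields.
function parseCharacters(responseText: string): Character[] {
  const parsed: unknown = JSON.parse(responseText);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array");
  }
  return parsed.map((entry: unknown, index: number): Character => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`Entry ${index} is not an object`);
    }
    const record = entry as Record<string, unknown>;
    const name = record["name"];
    const species = record["species"];
    if (typeof name !== "string" || typeof species !== "string") {
      throw new Error(`Entry ${index} does not match the schema`);
    }
    return { name, species };
  });
}

// Hand-written stand-in for the text a model might return.
const sampleResponseText =
  '[{"name":"Pip","species":"otter"},{"name":"Moss","species":"tortoise"}]';

const characters = parseCharacters(sampleResponseText);
console.log(generationConfig.responseMimeType, characters.length);
```

Keeping a light check after JSON.parse is a design choice rather than a requirement: the schema constrains the model's output, while the check guards the app against an unexpected payload.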
Learn how to choose a Gemini model and optionally a location appropriate for your use case and app.
Additional examples
To view additional examples of how you can use and generate structured output, check out the list of Example schemas and model responses in the Google Cloud documentation.
Generate enum values as output
The following example shows how to use a response schema for a classification task. The model is asked to identify the genre of a movie based on its description. The output is one plain-text enum value that the model selects from a list of values that are defined in the provided response schema.
To perform this structured classification task, you need to specify during model initialization the appropriate responseMimeType (in this example, text/x.enum) as well as the responseSchema that you want the model to use.
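A minimal sketch of this classification setup follows. It shows the two settings as a plain object and then checks that a returned value is one of the allowed enum values. The genre list, the toGenre helper, and the sample text are assumptions for illustration; they are not taken from the SDK or from a real model response.

```typescript
// Allowed labels for the classification task (hypothetical genre list).
const genres = ["drama", "comedy", "documentary"] as const;
type Genre = (typeof genres)[number];

// Settings named in this example, shown as a plain object (not an SDK type).
const enumGenerationConfig = {
  responseMimeType: "text/x.enum",
  responseSchema: { type: "string", enum: [...genres] },
};

// With text/x.enum the output is one plain-text value, so no JSON parsing is needed.
function toGenre(responseText: string): Genre {
  const value = responseText.trim();
  const match = genres.find((genre) => genre === value);
  if (match === undefined) {
    throw new Error(`Unexpected label: ${value}`);
  }
  return match;
}

// Hand-written stand-in for the model's plain-text output.
console.log(enumGenerationConfig.responseMimeType, toGenre("documentary"));
```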
Learn how to choose a Gemini model and optionally a location appropriate for your use case and app.
Other options to control content generation
- Learn more about prompt design so that you can influence the model to generate output specific to your needs.
- Configure model parameters to control how the model generates a response. These parameters include max output tokens, temperature, topK, and topP.
- Use safety settings to adjust the likelihood of getting responses that may be considered harmful, including hate speech and sexually explicit content.
- Set system instructions to steer the behavior of the model. This feature is like a "preamble" that you add before the model gets exposed to any further instructions from the end user.