Introducing Structured Outputs to the OpenAI API
OpenAI has introduced Structured Outputs in the API, designed to enforce strict adherence to developer-supplied JSON Schemas. This builds on the foundation laid by the earlier JSON mode, which provided a framework for generating valid JSON outputs.
While JSON mode improved the reliability of outputs, it didn’t guarantee that the model’s response would conform to a specific schema. Structured Outputs address this by giving developers confidence that model-generated outputs will exactly match the specified JSON Schemas.
The feature is particularly important for developers working on applications that require precise and consistent data handling—where any deviation from the expected output format can lead to integration issues or errors in downstream processes.
With Structured Outputs, OpenAI models can now produce responses that are valid JSON and precisely structured according to the requirements defined by developers; eliminating the need for extensive post-processing or manual validation, streamlining the development process, and reducing the potential for errors.
Solving real-world challenges with Structured Outputs
Structured Outputs are designed to address the complex challenge of generating structured data from unstructured inputs—a task that is increasingly common in modern AI applications. In practical terms, this means that developers can now reliably convert messy, free-form data into well-organized, machine-readable formats.
In more advanced workflows, such as multi-step agentic processes where an AI model takes sequential actions based on user inputs, Structured Outputs offer a reliable way to manage the flow of information between different steps.
Developers can maintain consistency in the data format throughout the process to build more robust and scalable applications, knowing that the AI will handle data in a predictable manner.
Breaking away from old constraints
Before the introduction of Structured Outputs, developers faced major challenges in making sure that AI-generated outputs conformed to specific formats—typically requiring them to resort to open-source tools, custom prompts, and repeated API calls to get the desired results.
The lack of a built-in mechanism to enforce output structure meant that even small deviations could lead to time-consuming troubleshooting and rework.
Structured Outputs directly solve these problems by embedding the schema adherence within the model itself. OpenAI has trained its models to better understand and conform to complex schemas—simplifying the development process—as developers no longer need to rely on workarounds to achieve the desired output format.
Developers can focus more on the core functionality of their applications rather than on managing inconsistencies in the AI’s output, which speeds up development time and improves the reliability of the applications built on OpenAI’s models, as they’re more suitable for production environments.
Why Structured Outputs outperform previous models
In evaluations that test adherence to complex JSON schemas, gpt-4o-2024-08-06 scored 100%, a dramatic improvement over the previous gpt-4-0613 model, which scores less than 40%. This precision is key for developers building applications that require exact data formatting, as any deviation can lead to costly errors or system failures.
How to implement Structured Outputs in your workflow
Integrating Structured Outputs into your development workflow is straightforward, with two primary methods available depending on your use case:
- Function calling: Structured Outputs can be enabled via tools by setting the strict: true parameter within the function definition. It’s compatible with all models from gpt-4-0613 onward. Developers can make sure that the model’s outputs will conform to the tool definition, making it easier to integrate AI-generated data into existing systems and workflows. Function calling is particularly useful for scenarios where the model interacts with external APIs or services, as the output meets the required format without needing additional processing.
- Response format parameter: For cases where the model responds directly to the user in a structured way, developers can specify a JSON Schema via the json_schema option in the response_format parameter. This is supported by the latest GPT-4o models, including gpt-4o-2024-08-06 and gpt-4o-mini-2024-07-18. It’s particularly useful in applications where precise data formatting is a must, such as in form submissions, report generation, etc.
Both methods give developers flexibility to choose the most appropriate implementation based on their specific needs. Leveraging these features, developers can build more reliable, scalable, and efficient AI-driven solutions that meet the exacting standards of modern business.
Building with SDK-supported Structured Outputs
How Structured Outputs improve security
Structured Outputs are designed with safety as a top priority. In practice, this means that the models are equipped to handle potentially unsafe requests by refusing to generate outputs that could violate these guidelines.
When a request triggers a safety concern, the model’s response includes a special refusal string value within the API response—allowing developers to programmatically detect when a refusal has occurred, offering a clear indication that the model has decided not to comply with the request.
The detection mechanism is useful in automated systems where immediate human oversight may not be feasible. Developers can use this refusal string to implement fallback procedures or alert mechanisms, so that their applications remain both safe and functional without manual intervention.
Streamline development with native SDK support
OpenAI has made it easier than ever to implement Structured Outputs by introducing native support in its Python and Node SDKs. This streamlines the development process by allowing developers to define schemas directly within their code using familiar libraries like Pydantic for Python and Zod for Node.js.
These libraries are widely used in the development community for defining and validating data structures, and are a natural fit for managing JSON Schemas in the context of Structured Outputs.
With native support, developers can now incorporate Structured Outputs into their applications without extensive custom code.
The SDKs handle the heavy lifting, automatically converting data types, serializing and deserializing JSON responses, and managing refusals.
For developers working in fast-paced environments, where time-to-market is a key consideration, streamlined integration makes a great difference. OpenAI’s SDKs let developers create schema-compliant applications with greater efficiency and fewer resources.
Innovative uses for Structured Outputs
Design dynamic UIs and refine responses more easily
Structured Outputs open up new possibilities for dynamically generating user interfaces (UIs) based on user input. Developers can create AI-powered systems that respond to user actions in real-time, generating custom UIs that match the user’s intent.
For example, an application might use Structured Outputs to generate different forms or dashboards depending on the user’s selections or input data.
Structured Outputs also provide a powerful way to improve response quality by separating the final answer from supporting reasoning or additional commentary—letting developers create AI systems that deliver more polished and coherent outputs, wherein the main response is clearly distinguished from any supplementary information.
Extract and structure data from any source
In business settings, where data comes in many different forms—ranging from meeting notes to emails—Structured Outputs provide a reliable method for organizing this information into usable formats.
For instance, a model equipped with Structured Outputs can be tasked with extracting key details such as to-dos, due dates, and assignments from a set of meeting notes. Through processing the unstructured text and organizing it into a predefined schema, the AI helps transform scattered information into a structured format that can be easily integrated into project management tools or databases.
Whether it’s pulling out action items from a conference call or categorizing feedback from customer emails, Structured Outputs make sure that important data is captured and organized systematically, ultimately reducing the risk of oversight and improving operational efficiency.
The power of constrained decoding
Structured Outputs leverage a sophisticated method known as constrained sampling or constrained decoding to guarantee that the outputs align precisely with the developer-supplied JSON Schema.
The system operates by dynamically determining which tokens (the smallest units of language, like words or symbols) are valid at each step of the generation process. After every token is generated, the system assesses what the next valid token should be, based on the JSON Schema provided by the developer—preventing the AI from making errors that could result in invalid or malformed JSON outputs.
What sets Structured Outputs apart is its use of context-free grammar (CFG) to guide this token sampling process. CFG is more advanced and flexible than traditional methods like finite state machines (FSMs) or regular expressions (regex). While FSMs and regexes are suitable for simpler tasks, they fall short in handling complex, nested, or recursive data structures often found in real-world JSON Schemas.
CFG, on the other hand, can manage these complexities with ease, making sure that even the most intricate schemas are adhered to without error.
6 limitations of Structured Outputs and how to address them
While Structured Outputs provide a solution for generating schema-compliant outputs, it’s important to understand the limitations and restrictions associated with this feature to manage expectations and guarantee proper implementation.
- Schema subset: Structured Outputs currently support only a subset of the full JSON Schema specification. This restriction is intentional, designed to optimize performance and reliability by focusing on the most commonly used and critical features of JSON Schema. Developers should consult the documentation to understand which schema features are supported and plan their implementations accordingly.
- Latency: When using a new schema for the first time, developers may experience additional latency as the system processes the schema, which can take up to a minute for complex schemas. Subsequent responses are much faster, as the system caches the necessary components to speed up future requests. Initial latency should be factored into planning, especially for applications where response time is critical.
- Schema adherence: There are scenarios where the model might fail to adhere to the schema. This could happen if the model refuses a request deemed unsafe, if it reaches the maximum number of tokens before completing the output, or if it encounters another predefined stop condition. Developers need to build their applications to handle these potential cases, possibly by setting up fallback mechanisms or alternative workflows.
- Model mistakes: While Structured Outputs help ensure the format of the data is correct, they do not prevent all types of errors. The model might still make mistakes within the values of the JSON objects—such as errors in calculations or incorrect data entries. Developers must validate the content of the JSON data.
- Parallel function calls: Structured Outputs are not compatible with parallel function calls. When parallel function calls are used, the output may not match the supplied schemas, which can lead to inconsistencies in the data. Developers should disable parallel function calls when using Structured Outputs or consider alternative methods for handling parallel tasks.
- Zero data retention: JSON Schemas used with Structured Outputs are not eligible for Zero Data Retention (ZDR), meaning that data associated with these schemas might be retained according to OpenAI’s data retention policies. This limitation is essential to consider for applications dealing with sensitive or regulated data where compliance with data privacy laws is a major concern.
Developers must understand these limitations if they are to make informed decisions about when and how to use Structured Outputs in their projects.
Get the most out of Structured Outputs
Structured Outputs are now fully available in the OpenAI API, and are accessible across several models, including the latest gpt-4o and gpt-4o-mini series, as well as models that have been available since gpt-4-0613.
One of the most compelling reasons to switch to the gpt-4o-2024-08-06 model is the cost savings it offers.
Developers can achieve a 50% reduction in input costs, bringing the price down to $2.50 per 1 million input tokens. Output costs are also reduced by 33%, priced at $10.00 per 1 million output tokens.
Taking advantage of these savings, companies can reduce their overall AI-related costs while still benefiting from the advanced capabilities of Structured Outputs. Cost efficiency makes the new gpt-4o-2024-08-06 model a viable option for businesses looking to optimize their AI investments without compromising on performance or reliability.