The ultimate crash course on OpenAI APIs: A beginner's handbook
Bringing you all information about awesome Developer tools.
Throughout the AI boom, OpenAI has led the race to build the most advanced large language models (LLMs) and has opened its doors to developers who want to build applications on top of them. With benchmark-crushing models in action, developers are eager to integrate them into their projects, and if you are one of them, the following guide will be very useful.
The OpenAI developer platform consists of a set of powerful APIs and SDKs that let developers integrate advanced AI models and OpenAI services into their applications across different platforms.
Interestingly, across the generative AI industry, many companies that offer open-source or proprietary LLMs use the OpenAI API spec as their basis. So learning its conventions and API design will help you a lot as a developer.
In this article, we will explore the different APIs and libraries or SDKs provided by OpenAI and will also take a look at the various community-hosted SDKs for popular programming languages. Let’s start!
APIs offered by OpenAI
OpenAI offers a set of APIs that enable developers to access several OpenAI models and integrate them into their applications. The following are the APIs provided by OpenAI:
Chat Completions API
Purpose
Generates responses to prompts or conversations using the GPT family of models provided by OpenAI.
Use Cases
Completing, summarizing, and rewriting text.
Generating or completing code.
Creative writing, such as drafting a story or generating a script.
Generating responses to specific user prompts.
Example
Here is an example of the Chat Completions API in action:
Example Request:
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
Response:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?"
    },
    "logprobs": null,
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21,
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}
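If you prefer working from code rather than curl, here is a minimal sketch of the same request using OpenAI's official openai Node.js SDK (covered later in this article). It assumes the package is installed and OPENAI_API_KEY is set in your environment:
import OpenAI from "openai";

// The client reads OPENAI_API_KEY from the environment by default.
const client = new OpenAI();

const completion = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Hello!" },
  ],
});

// The generated reply lives in the first choice's message.
console.log(completion.choices[0].message.content);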
Image Generation (DALL-E) API
Purpose
This API allows users to generate images from a given descriptive prompt using the DALL-E model.
Use Cases
Generate visual content or artwork based on text inputs.
Prototype products using AI-generated visuals.
Create visuals for creative and marketing campaigns.
Example
Here is an example of the DALL-E API in action to generate an image:
Example Request:
curl https://api.openai.com/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A cute baby sea otter",
    "n": 1,
    "size": "1024x1024"
  }'
Response:
{
  "created": 1589478378,
  "data": [
    {
      "url": "https://..."
    }
  ]
}
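The official Node.js SDK exposes the same endpoint. Here is a minimal sketch, assuming the openai package, an OPENAI_API_KEY environment variable, and Node 18+ (for the global fetch):
import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI();

const result = await client.images.generate({
  model: "dall-e-3",
  prompt: "A cute baby sea otter",
  n: 1,
  size: "1024x1024",
});

// The returned URL is temporary, so download the image if you want to keep it.
const response = await fetch(result.data[0].url);
fs.writeFileSync("otter.png", Buffer.from(await response.arrayBuffer()));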
Embedding API
Purpose
This API converts text into numerical vector representations, or embeddings, that capture the semantic meaning of the text.
Use Cases
Find similar texts or group related texts into clusters.
Retrieve information by performing semantic search.
Power recommendations and content personalization.
Label or classify text based on its semantic meaning.
Example
Here is an example of the Embedding API in action:
Example Request:
curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The food was delicious and the waiter...",
    "model": "text-embedding-ada-002",
    "encoding_format": "float"
  }'
Response:
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        ... (1536 floats total for text-embedding-ada-002)
        -0.0028842222
      ],
      "index": 0
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
Sidenote: there are many vector databases in which you can store these embeddings, such as Elasticsearch, ChromaDB, Qdrant, Weaviate, and Pinecone.
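To see the text-similarity use case in practice, here is a small Node.js sketch (matching the code style used later in this article) that scores two embedding vectors with cosine similarity. The two vectors are assumed to come from the same embedding model:
// Cosine similarity: values near 1 mean very similar texts,
// values near 0 mean the texts are unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Usage: pass the "embedding" arrays from two Embedding API responses.
// const score = cosineSimilarity(embeddingA, embeddingB);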
Audio (Whisper) API
Purpose
This API converts speech to text using OpenAI's Whisper model.
Use Cases
Transcribe audio or video files into text.
Helpful for developing voice-command systems.
Develop speech-to-text applications for note-taking, call centers, and transcription services.
Example
Here is an example of the Whisper API in action:
Example Request:
curl https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3" \
  -F model="whisper-1"
Response:
{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
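Here is a minimal Node.js sketch of the same call via the official SDK, assuming a local audio.mp3 file and the openai package:
import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI();

const transcription = await client.audio.transcriptions.create({
  file: fs.createReadStream("audio.mp3"), // stream the local audio file
  model: "whisper-1",
});

console.log(transcription.text);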
Models API
Purpose
This API lists and describes the various models available in the API.
Use Cases
Use this API to list available models and display them to end users or internal systems when building an AI-powered application.
Helps in choosing the right model for your task based on token limits, pricing, and permissions.
Use this API to manage your fine-tuned models after fine-tuning for a specific task.
Check the permissions associated with each model to understand what it is allowed to do.
Example
Here is an example of the Models API in action:
Example Request:
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
Response:
{
  "object": "list",
  "data": [
    {
      "id": "model-id-0",
      "object": "model",
      "created": 1686935002,
      "owned_by": "organization-owner"
    },
    {
      "id": "model-id-1",
      "object": "model",
      "created": 1686935002,
      "owned_by": "organization-owner"
    },
    {
      "id": "model-id-2",
      "object": "model",
      "created": 1686935002,
      "owned_by": "openai"
    }
  ]
}
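In code, the official Node.js SDK paginates the list for you. A minimal sketch, assuming the openai package and an OPENAI_API_KEY environment variable:
import OpenAI from "openai";

const client = new OpenAI();

// Iterate over every model visible to your API key.
for await (const model of client.models.list()) {
  console.log(model.id, "owned by", model.owned_by);
}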
Moderation API
Purpose
This API detects harmful or unsafe content, such as hate speech, violence, or adult content, so that it can be filtered out.
Use Cases
Ensures that user-generated content on a platform is safe and appropriate.
Helps in moderating live online discussions, social media, and forum content.
Prevents the generation of harmful, abusive, or dangerous content in AI-powered applications.
Example
Here is an example of the Moderation API in action:
Example Request:
curl https://api.openai.com/v1/moderations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "input": "I want to kill them."
  }'
Response:
{
  "id": "modr-XXXXX",
  "model": "text-moderation-007",
  "results": [
    {
      "flagged": true,
      "categories": {
        "sexual": false,
        "hate": false,
        "harassment": false,
        "self-harm": false,
        "sexual/minors": false,
        "hate/threatening": false,
        "violence/graphic": false,
        "self-harm/intent": false,
        "self-harm/instructions": false,
        "harassment/threatening": true,
        "violence": true
      },
      "category_scores": {
        "sexual": 1.2282071e-06,
        "hate": 0.010696256,
        "harassment": 0.29842457,
        "self-harm": 1.5236925e-08,
        "sexual/minors": 5.7246268e-08,
        "hate/threatening": 0.0060676364,
        "violence/graphic": 4.435014e-06,
        "self-harm/intent": 8.098441e-10,
        "self-harm/instructions": 2.8498655e-11,
        "harassment/threatening": 0.63055265,
        "violence": 0.99011886
      }
    }
  ]
}
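A common pattern is to screen user input with this API before forwarding it to a model. Here is a hedged Node.js sketch of that flow, assuming the official openai package; the refusal message is just a placeholder:
import OpenAI from "openai";

const client = new OpenAI();

async function safeComplete(userInput) {
  // Step 1: check the input against OpenAI's moderation categories.
  const moderation = await client.moderations.create({ input: userInput });
  if (moderation.results[0].flagged) {
    return "Sorry, I can't help with that request.";
  }

  // Step 2: only unflagged input reaches the language model.
  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: userInput }],
  });
  return completion.choices[0].message.content;
}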
Fine-Tuning API
Purpose
This API lets developers fine-tune OpenAI's models on their own custom datasets to create models tailored to specific applications.
Use Cases
Customize language models for industry-specific tasks.
Helps in creating specialized chatbots with domain knowledge.
Helps in training a model to understand proprietary vocabulary and terminology.
Example
Here is an example of the Fine-Tuning API in action:
Example Request:
curl https://api.openai.com/v1/fine_tuning/jobs \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "training_file": "file-BK7bzQj3FfZFXr7DbL6xJwfo",
    "model": "gpt-4o-mini"
  }'
Response:
{
  "object": "fine_tuning.job",
  "id": "ftjob-abc123",
  "model": "gpt-4o-mini-2024-07-18",
  "created_at": 1721764800,
  "fine_tuned_model": null,
  "organization_id": "org-123",
  "result_files": [],
  "status": "queued",
  "validation_file": null,
  "training_file": "file-abc123"
}
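The training_file referenced above is a JSONL file (uploaded via the Files API, covered below) in which each line is one complete chat example. Here is a sketch of what a single line might look like, using a hypothetical support-bot example:
{"messages": [{"role": "system", "content": "You are a support bot for Acme Inc."}, {"role": "user", "content": "How do I reset my password?"}, {"role": "assistant", "content": "Open Settings, choose Security, and click Reset password."}]}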
Batch API
Purpose
This API lets you bundle large numbers of requests into a single batch for asynchronous processing.
Use Cases
Collect many requests into one input file and submit them as a single batch, reducing the number of individual API calls and processing large workloads asynchronously (typically within 24 hours) at a lower cost than synchronous requests.
Example
Here is an example of the Batch API in action:
Example Request:
curl https://api.openai.com/v1/batches \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "file-abc123",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h"
  }'
Response:
{
  "id": "batch_abc123",
  "object": "batch",
  "endpoint": "/v1/chat/completions",
  "errors": null,
  "input_file_id": "file-abc123",
  "completion_window": "24h",
  "status": "validating",
  "output_file_id": null,
  "error_file_id": null,
  "created_at": 1711471533,
  "in_progress_at": null,
  "expires_at": null,
  "finalizing_at": null,
  "completed_at": null,
  "failed_at": null,
  "expired_at": null,
  "cancelling_at": null,
  "cancelled_at": null,
  "request_counts": {
    "total": 0,
    "completed": 0,
    "failed": 0
  },
  "metadata": {
    "customer_id": "user_123456789",
    "batch_description": "Nightly eval job"
  }
}
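The input_file_id above points to a JSONL file uploaded through the Files API with purpose "batch", where each line is one self-contained request carrying a custom_id so you can match results back later. A sketch of one such line:
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 100}}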
Files API
Purpose
This API helps in uploading and managing the files used for fine-tuning models or other data operations.
Use Cases
Helps in uploading data files or datasets, for example to fine-tune a language model.
Helps in managing files, such as retrieving, deleting, or listing a file, once uploaded.
Use the uploaded files to fine-tune models for specific tasks.
Example
Here is an example of the Files API in action:
Example Request:
curl https://api.openai.com/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F purpose="fine-tune" \
  -F file="@mydata.jsonl"
Response:
{
  "id": "file-abc123",
  "object": "file",
  "bytes": 120000,
  "created_at": 1677610602,
  "filename": "mydata.jsonl",
  "purpose": "fine-tune"
}
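Here is the same upload in Node.js via the official SDK, as a minimal sketch assuming a local mydata.jsonl file:
import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI();

const file = await client.files.create({
  file: fs.createReadStream("mydata.jsonl"),
  purpose: "fine-tune",
});

console.log(file.id); // e.g. "file-abc123", usable as a training_file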
Uploads API
Purpose
This API complements the Files API and allows developers to upload large files in multiple parts.
Use Cases
Useful for tasks like fine-tuning a model with custom datasets.
Helps in storing large datasets for subsequent operations.
Example
Here is an example of the Uploads API in action:
Example Request:
curl https://api.openai.com/v1/uploads \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "purpose": "fine-tune",
    "filename": "training_examples.jsonl",
    "bytes": 2147483648,
    "mime_type": "text/jsonl"
  }'
Response:
{
  "id": "upload_abc123",
  "object": "upload",
  "bytes": 2147483648,
  "created_at": 1719184911,
  "filename": "training_examples.jsonl",
  "purpose": "fine-tune",
  "status": "pending",
  "expires_at": 1719127296
}
New Updates from OpenAI DevDay 2024
As this article was being written, OpenAI DevDay 2024 (OpenAI's annual developer conference) had just concluded, and it came with a bunch of new updates on the API front: the introduction of the Realtime API with function-calling capabilities, vision support in the Fine-Tuning API, and the addition of prompt caching and model distillation to the platform.
Realtime API
The Realtime API supports natural speech-to-speech conversations and allows developers to build applications similar to ChatGPT's Advanced Voice Mode. Under the hood, the API lets you create a persistent WebSocket connection to exchange messages with GPT-4o. It comes with function-calling capabilities, which allow voice assistants to respond to a user's request by triggering actions or pulling in new context. This comes in handy from a user's perspective because the assistant can place an order on behalf of the user or retrieve information with personalized responses. Learn more about the Realtime API here.
Usage of the Realtime API
The Realtime API is a stateful, event-based API that communicates over a WebSocket. The following parameters are required for the WebSocket connection:
URL -
wss://api.openai.com/v1/realtime
Query Parameters -
?model=gpt-4o-realtime-preview-2024-10-01
Headers:
Authorization: Bearer YOUR_API_KEY
OpenAI-Beta: realtime=v1
To establish the connection, a valid OpenAI API key is required. The popular ws WebSocket library for Node.js can be used to open the socket connection, send messages from the client, and receive responses from the server. Below is an example:
import WebSocket from "ws";

const url = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01";
const ws = new WebSocket(url, {
  headers: {
    "Authorization": "Bearer " + process.env.OPENAI_API_KEY,
    "OpenAI-Beta": "realtime=v1",
  },
});

ws.on("open", function open() {
  console.log("Connected to server.");
  // Ask the server to generate a text response.
  ws.send(JSON.stringify({
    type: "response.create",
    response: {
      modalities: ["text"],
      instructions: "Please assist the user.",
    }
  }));
});

// Log every event the server sends back.
ws.on("message", function incoming(message) {
  console.log(JSON.parse(message.toString()));
});
Sending events
While connected to the API, you can send events to it from the client. To do that, send a JSON string containing your event payload. Here is an example:
ws.on('open', () => {
  // A client event that appends a user message to the conversation.
  const event = {
    type: 'conversation.item.create',
    item: {
      type: 'message',
      role: 'user',
      content: [
        {
          type: 'input_text',
          text: 'Hello!'
        }
      ]
    }
  };
  ws.send(JSON.stringify(event));
});
Receiving events
Receiving events involves listening for the WebSocket message event and parsing the result as JSON. Here is how you can do that:
ws.on('message', data => {
  try {
    const event = JSON.parse(data);
    console.log(event);
  } catch (e) {
    console.error(e);
  }
});
Vision in the Fine-Tuning API
OpenAI introduced vision support in the Fine-Tuning API, making it possible to fine-tune models with images. This helps developers customize models with stronger image-understanding capabilities, enabling applications such as enhanced visual search, improved object detection, and more accurate medical image analysis. More information on vision in the Fine-Tuning API is available here.
Usage of Vision in the Fine-Tuning API
To use vision in fine-tuning, include images in your JSONL training files. Image inputs are added to the training data the same way they are sent to chat completions: as HTTP URLs or as data URLs containing base64-encoded images.
An example of an image message on a line of the JSONL file is given below:
{
  "messages": [
    { "role": "system", "content": "You are an assistant that identifies orbiters in space." },
    { "role": "user", "content": "What is this orbiter?" },
    { "role": "user", "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/5/57/Galileo_orbiter_arrival_at_Jupiter.jpg"
          }
        }
      ]
    },
    { "role": "assistant", "content": "Galileo" }
  ]
}
This JSON object is expanded here for readability, but in the data file it would appear on a single line.
Prompt Caching
A new API feature introduced during the developer conference is Prompt Caching. Developers building AI applications often reuse the same context across multiple API calls. With Prompt Caching, the API reuses recently seen input tokens, cutting costs (a 50% discount on cached input tokens) and reducing latency through faster prompt processing. Learn more about this feature in the documentation.
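Because caching matches on exact prompt prefixes, you benefit most by keeping the large, static part of your prompt identical and first in every request, appending only the per-request content at the end. A hedged Node.js sketch of that structure (ExampleCo and the prompt text are hypothetical):
import OpenAI from "openai";

const client = new OpenAI();

// Long, unchanging instructions: an identical prefix across calls is cacheable.
const STATIC_SYSTEM_PROMPT = "You are a support assistant for ExampleCo. ...";

async function answer(question) {
  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: STATIC_SYSTEM_PROMPT }, // stable prefix
      { role: "user", content: question },               // varies per request
    ],
  });
  return completion.choices[0].message.content;
}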
Model Distillation
Model distillation is another new feature that debuted at DevDay 2024. The aim is to fine-tune a cost-efficient model on the outputs of a large frontier model, and the workflow is available across OpenAI's platform. This helps smaller models achieve the capability of an advanced model on a specific task at a much lower cost. More information on this feature is available here.
SDKs offered by OpenAI
OpenAI offers the following SDKs, or libraries, that allow developers to build applications with OpenAI services:
Python SDK
The most commonly used SDK, helping Python developers call and interact with OpenAI's API, with support for language models, embeddings, and more.
GitHub Repository for the SDK is available here.
Node.js SDK
This is a JavaScript/TypeScript library with support for Node.js that allows JavaScript developers to call and interact with OpenAI's API in a server-side environment.
GitHub Repository for the SDK is available here.
.NET SDK
This SDK is currently in beta and lets .NET developers interact with OpenAI's API.
GitHub Repository for the SDK is available here.
Available Community SDKs for developing with OpenAI
Apart from the SDKs mentioned above, OpenAI offers a RESTful API, on top of which developers have built SDKs or libraries for other languages. Here are a few of them for popular programming languages:
C#/.NET SDK (OpenAI-API-dotnet)
A stable C# library that simplifies integration with OpenAI’s services.
GitHub Repository for the SDK is available here.
C++ SDK (liboai)
A C++ library that allows C++ developers to simplify integration with OpenAI’s services.
GitHub Repository for the SDK is available here.
Clojure SDK (openai-clojure)
A library for the Clojure programming language that simplifies integration with OpenAI's services.
GitHub Repository for the SDK is available here.
Dart/Flutter SDK (openai)
A library for the Dart programming language that allows Dart and Flutter developers to simplify integration with OpenAI's services.
GitHub Repository for the SDK is available here.
Elixir SDK (openai.ex)
A library for the Elixir programming language that allows developers using this language to work with OpenAI models and simplify integration with OpenAI’s services.
GitHub Repository for the SDK is available here.
Go SDK (go-gpt3)
A Go library that enables Go developers to simplify integration with OpenAI’s services.
GitHub Repository for the SDK is available here.
Java SDK (openai-java)
A Java library that provides a Java-friendly interface and simplifies integration with OpenAI’s services.
GitHub Repository for the SDK is available here.
Julia SDK (OpenAI.jl)
A Julia library that simplifies integration with OpenAI’s services.
GitHub Repository for the SDK is available here.
Kotlin SDK (openai-kotlin)
A Kotlin library that simplifies integration of Kotlin or Android projects with OpenAI’s services.
GitHub Repository for the SDK is available here.
PHP SDK (openai-php client)
A PHP library that provides an interface to call and interact with OpenAI’s API.
GitHub Repository for the SDK is available here.
Ruby SDK (ruby-openai)
A library that is useful for Ruby developers to interact with OpenAI's API and integrate the services in their projects.
GitHub Repository for the SDK is available here.
Rust SDK (fieri)
A library that provides a Rust-friendly interface and allows developers using the language to interact with the API and integrate the services in their applications.
GitHub Repository for the SDK is available here.
Scala SDK (openai-scala-client)
A Scala library that gives Scala developers simplified integration with OpenAI's services.
GitHub Repository for the SDK is available here.
Swift SDK (openai-kit)
A library supporting the Swift programming language, useful for iOS and macOS developers who want to simplify integration of OpenAI services into their applications.
GitHub Repository for the SDK is available here.
If you are overwhelmed by the different SDKs and still want to keep things simple, you can use an LLM gateway or AI gateway, such as Portkey's AI Gateway or LiteLLM.
In Conclusion
We hope you have taken note of the different OpenAI APIs and their use cases, along with the libraries available. Using these APIs with the library of your choice will help you integrate chat completion, image generation, audio transcription, and other capabilities into your application.
Interested in learning more about OpenAI APIs? Check out the developer documentation, and subscribe to DevShorts to keep yourself up to date with useful articles and our weekly tech roundups!