Generate podcasts (API method)

Gemini Enterprise offers an API that lets you generate podcasts based on source documents. The output is very similar to the podcasts that end users can generate from within their notebooks.

Podcast generation through the API is well suited for batch jobs where you might have dozens or hundreds of books, articles, or courses, and you want to generate a podcast for each.

The Podcast API is a standalone API. That is, you don't need a NotebookLM Enterprise notebook, a Gemini Enterprise license, or a data store. All you need is an enabled Google Cloud project and the Podcast API User role.

Inputs

The input for the API is an array of context elements. This is the source material that the podcast gets generated from. The input can be in the form of text, images, audio, and video. The total content of the context array must be less than 100,000 tokens.

For a list of supported types, see the technical specifications for images, documents, video, and audio in the Gemini 2.5 Flash documentation.
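Because requests that exceed the token limit can't be processed, it can help to screen text sources before you submit a batch. The following is a minimal sketch for text-only contexts. It assumes a rule of thumb of roughly four characters per token, which is not the tokenizer that the API uses, so treat the result as a coarse estimate only. The names estimate_tokens, probably_fits, and CHARS_PER_TOKEN are hypothetical and aren't part of the API.

```python
# Rough pre-flight size check for text-only contexts.
# ASSUMPTION: about 4 characters per token is a rule of thumb,
# not the tokenizer that the Podcast API actually uses.
TOKEN_LIMIT = 100_000      # limit stated in the Inputs section
CHARS_PER_TOKEN = 4        # hypothetical heuristic


def estimate_tokens(texts):
    """Return a rough token estimate for a list of plain-text strings."""
    total_chars = sum(len(t) for t in texts)
    # Ceiling division, so any non-empty input estimates to at least one token.
    return -(-total_chars // CHARS_PER_TOKEN)


def probably_fits(texts, limit=TOKEN_LIMIT):
    """Return True if the estimate is strictly below the limit."""
    return estimate_tokens(texts) < limit
```

Sources that come close to the limit under this estimate are worth splitting or trimming before you send them, since the real token count might be higher.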

Output

The output from the API is the podcast, in MP3 format.

Before you begin

Before you can generate a podcast using the API, you must have the following:

A Google Cloud project with the Discovery Engine API enabled. See Create a project and enable the API.
The Identity and Access Management (IAM) role of Podcast API User (roles/discoveryengine.podcastApiUser). For general information about granting roles, see Set up NotebookLM Enterprise.

Generate a podcast from context input

Use the following command to generate a podcast by calling the podcast method.

The input is an array of multimedia objects such as text, images, and audio and video clips.

REST

To generate and download a podcast, do the following:

Run the following curl command:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://discoveryengine.googleapis.com/v1/projects/PROJECT_ID/locations/global/podcasts" \
  -d '{
      "podcastConfig": {
        "focus": "FOCUS",
        "length": "LENGTH",
        "languageCode": "LANGUAGE_CODE"
      },
      "contexts": [
        {
          "MEDIA_TYPE_1": "MEDIA_CONTENT_1"
        },
        {
          "MEDIA_TYPE_2": "MEDIA_CONTENT_2"
        }
      ],
      "title": "PODCAST_TITLE",
      "description": "PODCAST_DESCRIPTION"
  }'

Replace the following:

PROJECT_ID: the ID of your project.
FOCUS: a prompt where you suggest the focus of the podcast.
LENGTH: the length of the podcast. There are two options:
- SHORT (typically 4-5 minutes)
- STANDARD (typically around 10 minutes, but it can be shorter with smaller data sets)
LANGUAGE_CODE: optional. Specify the language code for the podcast. Use language tags defined by BCP 47. If the language code isn't provided, then the podcast is generated in English.
MEDIA_TYPE_N: specify the type of media that you are referencing to generate the podcast. Allowed types are the following:
- text. Plain text.
- blob. For all media types except plain text, use this type and upload the data as raw bytes.
MEDIA_CONTENT_N: the content itself in plain text or raw bytes. The total content of the context array must be less than 100,000 tokens.
PODCAST_TITLE: a title for the podcast. This can be for internal use, or you can choose to display it to your end users.
PODCAST_DESCRIPTION: a description of the podcast. This can be for internal use, or you can choose to display it to your end users.
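For batch jobs, the same request can be sent from a script. The following is a minimal Python sketch that uses only the standard library. It builds the request body from plain-text sources and posts it to the endpoint shown in the curl command. The function names (build_podcast_request, get_access_token, create_podcast) are hypothetical helpers, not part of any Google client library. The sketch handles only text contexts, and it assumes that the gcloud CLI is installed and authenticated.

```python
import json
import subprocess
import urllib.request

API_ROOT = "https://discoveryengine.googleapis.com/v1"
ALLOWED_LENGTHS = {"SHORT", "STANDARD"}  # the two options listed above


def build_podcast_request(texts, title, description, focus,
                          length="STANDARD", language_code=None):
    """Build the JSON body for a podcast request from plain-text sources."""
    if length not in ALLOWED_LENGTHS:
        raise ValueError(f"length must be one of {sorted(ALLOWED_LENGTHS)}")
    config = {"focus": focus, "length": length}
    if language_code:
        # Optional; the podcast is generated in English when omitted.
        config["languageCode"] = language_code
    return {
        "podcastConfig": config,
        "contexts": [{"text": t} for t in texts],
        "title": title,
        "description": description,
    }


def get_access_token():
    """Return a token from the same gcloud command the curl example uses."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def create_podcast(project_id, body):
    """POST the body and return the operation name from the response."""
    request = urllib.request.Request(
        f"{API_ROOT}/projects/{project_id}/locations/global/podcasts",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {get_access_token()}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["name"]
```

Building the body with json.dumps instead of by hand means that quotes and line breaks inside the source text are escaped correctly, which matters when the sources are full articles or book chapters.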

Example command and result

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://discoveryengine.googleapis.com/v1/projects/my-project-123/locations/global/podcasts" \
-d '{
    "podcastConfig": {
      "focus": "Can you talk about how to find a job at Google?",
      "length": "SHORT"
    },
    "contexts": [
      {
        "text": "Breaking into Google is a highly competitive endeavor, attracting millions of applicants globally due to its reputation as a top employer, its innovative work, and comprehensive perks. Success hinges on a multi-faceted approach, starting with meticulously tailored online applications that incorporate job description keywords for ATS and showcasing Googlyness—a blend of curiosity, collaborative spirit, and leadership potential. The rigorous, multi-stage interview process involves recruiter screens, behavioral interviews (often using the STAR method), and for technical roles, demanding coding challenges and system design questions that assess not just correct answers but also problem-solving thought processes and communication skills. Networking for referrals and informational interviews can significantly boost visibility, but ultimately, thorough preparation through mock interviews and platforms like LeetCode, combined with patience and resilience through the often lengthy process, are paramount for navigating this challenging but rewarding path."
      },
      {
        "text": "Finding your way into a career at Google begins with their comprehensive careers website, a digital gateway brimming with opportunities. To embark on this journey, you first navigate the job board, using keywords like software engineer or product manager to pinpoint potential roles. To refine your search, utilize the array of filters available for location, experience level, degree, skills, and even specific Google organizations. You can even browse by team if you have a particular department in mind, like Engineering and Technology or Marketing and Communications. Once you discover a promising position, delve into its detailed description, paying close attention to the minimum qualifications – these are the foundational criteria against which your application will be assessed. Remember, Google seeks out leaders who can perform at the highest level, and while experience is valued, internships or graduate programs can be a great entry point for those earlier in their career. When you are ready to apply, you will need to create a Careers Profile, using your Google Account for seamless sign-in and communication. Crucially, tailor your resume for each specific role, highlighting relevant experiences and quantifying your achievements with concrete data. While a one-page resume is generally preferred, a two-page resume is acceptable for those with more extensive experience. Notably, cover letters are not typically required unless explicitly stated in the job description. Google encourages quality over quantity, so strategically apply for up to three jobs every 30 days, choosing roles that truly align with your skills and passions. Once you have submitted your applications, your Careers Profile becomes your tracking center, where you can monitor the status of each submission, from Draft to Submitted. If you do not hear back within eight weeks, the search continues, though Google recruiters may proactively reach out for other opportunities later. Remember, perseverance and a solid understanding of Google values, combined with a continuously refined skill set and a well-prepared resume, will greatly enhance your chances of securing a position at this innovative company."
      }
    ],
    "title": "Find a job at Google",
    "description": "This podcast is based on two plain text documents that describe various aspects of getting a job at Google."
}'

{
  "name": "projects/123456/locations/global/operations/create-podcast-54321"
}

It takes a few minutes to generate a podcast.

Make note of the operation name; you need it to download the podcast after the operation finishes. In the example above, the operation name is projects/123456/locations/global/operations/create-podcast-54321.
Optional. Poll the status of the podcast creation operation. See Get details about a long-running operation.
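Polling can also be scripted. The following is a minimal sketch, not the documented procedure. It assumes that a GET request on the operation name returns a standard long-running operation resource with a done field and, on failure, an error field. Confirm this against Get details about a long-running operation before relying on it. The names fetch_operation and wait_for_operation, and the default interval and attempt count, are hypothetical.

```python
import json
import time
import urllib.request

API_ROOT = "https://discoveryengine.googleapis.com/v1"


def fetch_operation(operation_name, token):
    """GET the operation resource. ASSUMPTION: standard long-running
    operation semantics with 'done' and 'error' fields."""
    request = urllib.request.Request(
        f"{API_ROOT}/{operation_name}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_operation(operation_name, fetch, interval_s=30,
                       max_attempts=40, sleep=time.sleep):
    """Call fetch(operation_name) until the operation reports done.

    fetch is injected so the loop can be exercised without network access.
    """
    for _ in range(max_attempts):
        operation = fetch(operation_name)
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        sleep(interval_s)
    raise TimeoutError(f"{operation_name} not done after {max_attempts} checks")
```

For example, wait_for_operation(name, lambda n: fetch_operation(n, token)) checks every 30 seconds for up to 20 minutes, which leaves room for the few minutes that generation typically takes.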

After the operation has finished, run the following curl command to download the podcast:

curl -v \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://discoveryengine.googleapis.com/v1/OPERATION_NAME:download?alt=media" \
  --output FILENAME.mp3 -L

Replace the following:

OPERATION_NAME: the name of the operation that was returned when you created the podcast.
FILENAME: a filename for the podcast.

This command downloads the podcast to an MP3 file in your local directory.
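The download step can be scripted in the same way. The following is a minimal sketch that builds the download URL from the operation name and streams the response to a local file. The function names download_url and download_podcast are hypothetical, and the access token is assumed to come from gcloud auth print-access-token, as in the curl command.

```python
import shutil
import urllib.request

API_ROOT = "https://discoveryengine.googleapis.com/v1"


def download_url(operation_name):
    """Return the media download URL for a finished podcast operation."""
    return f"{API_ROOT}/{operation_name}:download?alt=media"


def download_podcast(operation_name, token, filename):
    """Stream the MP3 to filename and return the filename."""
    request = urllib.request.Request(
        download_url(operation_name),
        headers={"Authorization": f"Bearer {token}"},
    )
    # urllib follows HTTP redirects by default, similar to curl's -L flag.
    with urllib.request.urlopen(request) as response, open(filename, "wb") as out:
        # Copy in chunks so that a large file is not held in memory at once.
        shutil.copyfileobj(response, out)
    return filename
```

Streaming with shutil.copyfileobj keeps memory use flat, which is useful when a batch job downloads many podcasts of several megabytes each.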

Example command and result

curl -v \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://discoveryengine.googleapis.com/v1/projects/123456/locations/global/operations/create-podcast-54321:download?alt=media" \
  --output my-podcast.mp3 -L
  
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                               Dload  Upload   Total   Spent    Left  Speed
0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Host discoveryengine.googleapis.com:443 was resolved.
  ...
{ [42044 bytes data]
100 14.3M  100 14.3M    0     0  10.9M      0  0:00:01  0:00:01 --:--:-- 29.7M
* Connection #0 to host discoveryengine.googleapis.com left intact

Compliance

The Podcast API isn't compliant with customer-managed encryption keys (CMEK) for Gemini Enterprise.