Which programming languages have official Google Cloud Text-to-Speech SDKs?

Official SDKs are available for Python, Node.js, Java, Go, C#, PHP, and Ruby, maintained by Google to ensure compatibility and stability.

How do I install the Google Cloud Text-to-Speech SDKs?

Installation typically involves using language-specific package managers: `pip` for Python, `npm` for Node.js, Maven/Gradle for Java, `go get` for Go, `dotnet add package` for C#, `composer` for PHP, and `gem install` for Ruby.

Is there a free tier for using Google Cloud Text-to-Speech with the SDKs?

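Yes. At the time of writing, the free tier covers the first 4 million characters per month for Standard voices and the first 1 million characters per month for WaveNet voices. Quotas can change, so confirm the current limits on the Google Cloud Text-to-Speech pricing page.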

What is the recommended way to authenticate when using the Text-to-Speech SDKs?

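Google recommends Application Default Credentials (ADC) in every environment. For local development, run `gcloud auth application-default login`. In production, attach a service account with the appropriate IAM roles to the compute resource rather than distributing service account key files.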

Can I use SSML (Speech Synthesis Markup Language) with the SDKs?

Yes. Pass an SSML string in the `ssml` field of the synthesis input instead of `text`. SSML gives fine-grained control over speech characteristics such as pauses, pronunciation, and emphasis. The API supports a subset of SSML elements, listed in its SSML reference.
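As a minimal sketch of passing SSML to the synthesis input: the helper names `build_ssml` and `synthesize_ssml`, the default pause length, and the output filename are illustrative assumptions, not part of the SDK. The SDK calls mirror the ones used in the quickstart on this page.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, pause_ms=500):
    """Join sentences into one SSML document with a fixed pause between them."""
    # Escape &, < and > so plain text cannot break the SSML markup.
    parts = [escape(sentence) for sentence in sentences]
    pause = f'<break time="{pause_ms}ms"/>'
    return "<speak>" + pause.join(parts) + "</speak>"


def synthesize_ssml(ssml, output_filename="ssml_output.mp3"):
    """Send an SSML string to the API and write the returned audio to a file."""
    # Imported here so build_ssml stays usable without the SDK installed.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        # Use the ssml field instead of text for SSML input.
        input=texttospeech.SynthesisInput(ssml=ssml),
        voice=texttospeech.VoiceSelectionParams(language_code="en-US"),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )
    with open(output_filename, "wb") as out:
        out.write(response.audio_content)


if __name__ == "__main__":
    synthesize_ssml(build_ssml(["Hello.", "This sentence follows a short pause."]))
```

Escaping the text before wrapping it in `<speak>` matters when the input comes from users, because a stray `&` or `<` would otherwise make the SSML invalid.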

Are there community-contributed libraries for Google Cloud Text-to-Speech?

While official SDKs are primary, the open-source community may develop additional wrappers or tools. Developers should evaluate these for maintenance and compatibility before use.

How do I select different voices or audio formats using the SDKs?

The SDKs provide parameters within the synthesis request to specify voice attributes (e.g., language code, gender, voice name) and desired audio encoding formats (e.g., MP3, OGG_OPUS).
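The following sketch shows one way these parameters might be set with the Python client. The helper names `encoding_for_filename` and `synthesize_to_file`, and the extension-to-encoding mapping, are assumptions made for illustration. Valid ranges for `speaking_rate` and `pitch` are defined in the API reference and are not checked here.

```python
import os

# Hypothetical mapping from file extension to AudioEncoding member name.
_ENCODINGS = {".mp3": "MP3", ".ogg": "OGG_OPUS", ".wav": "LINEAR16"}


def encoding_for_filename(filename):
    """Pick an audio encoding name from the output file's extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in _ENCODINGS:
        raise ValueError(f"No encoding mapped for extension {extension!r}")
    return _ENCODINGS[extension]


def synthesize_to_file(text, output_filename, language_code="en-US",
                       voice_name=None, speaking_rate=1.0, pitch=0.0):
    """Synthesize text with the chosen voice and audio settings."""
    # Imported here so encoding_for_filename works without the SDK installed.
    from google.cloud import texttospeech

    voice_kwargs = {"language_code": language_code}
    if voice_name:
        # A specific voice name narrows the selection within the language.
        voice_kwargs["name"] = voice_name

    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(**voice_kwargs),
        audio_config=texttospeech.AudioConfig(
            # Look up the enum member by name, e.g. AudioEncoding["OGG_OPUS"].
            audio_encoding=texttospeech.AudioEncoding[
                encoding_for_filename(output_filename)
            ],
            speaking_rate=speaking_rate,
            pitch=pitch,
        ),
    )
    with open(output_filename, "wb") as out:
        out.write(response.audio_content)


if __name__ == "__main__":
    synthesize_to_file("Testing voice options.", "sample.ogg",
                       voice_name="en-US-Wavenet-D", speaking_rate=0.9)
```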

Google Cloud Text-to-Speech SDKs & Libraries: Reference (2026)

The Google Cloud Text-to-Speech SDKs and client libraries provide programmatic access to the service's speech synthesis capabilities, so that it can be integrated into applications. Available for multiple programming languages, they wrap the underlying API and let developers convert text into natural-sounding speech with less code. They support features such as voice selection, pitch adjustment, and speaking rate control.

SDKs overview

Google Cloud Text-to-Speech offers client libraries to interact with its API programmatically. These Software Development Kits (SDKs) simplify the process of sending text input to the Text-to-Speech service and receiving synthesized audio in various formats. The SDKs handle authentication, request formatting, and response parsing, allowing developers to focus on application logic rather than HTTP specifics.

The official Google Cloud client libraries are generated to provide idiomatic interfaces for each supported language, ensuring consistency with common programming patterns and practices. These libraries are maintained by Google and are the recommended method for integrating Text-to-Speech into applications across different environments, including server-side, desktop, and mobile platforms.

The Text-to-Speech API itself converts text or Speech Synthesis Markup Language (SSML) into audio data. Developers can specify voice properties, such as language, gender, and voice type (for example, Standard or WaveNet), and an audio encoding format such as MP3, OGG_OPUS, or LINEAR16. For detailed API specifications, consult the Google Cloud Text-to-Speech REST API reference.
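To see which voices exist before choosing one, the client can list them. The sketch below groups voice names by type. It assumes names follow the commonly seen `language-REGION-Type-Variant` pattern (as in `en-US-Wavenet-D`), which is an observed convention rather than a documented contract, and the helper names are illustrative.

```python
from collections import defaultdict


def voice_family(voice_name):
    """Guess the voice type from a name such as 'en-US-Wavenet-D'."""
    parts = voice_name.split("-")
    # Assumption: the third dash-separated part names the voice type.
    return parts[2] if len(parts) >= 4 else "unknown"


def group_voice_names(voice_names):
    """Group voice names by their guessed type."""
    groups = defaultdict(list)
    for name in voice_names:
        groups[voice_family(name)].append(name)
    return dict(groups)


def list_voice_names(language_code="en-US"):
    """Fetch the names of all voices available for a language."""
    # Imported here so the grouping helpers work without the SDK installed.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()
    response = client.list_voices(language_code=language_code)
    return [voice.name for voice in response.voices]


if __name__ == "__main__":
    for family, names in sorted(group_voice_names(list_voice_names()).items()):
        print(f"{family}: {len(names)} voices")
```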

Official SDKs by language

Google Cloud Text-to-Speech provides official client libraries for several popular programming languages. These libraries are designed to offer a consistent and developer-friendly experience. Each SDK includes methods for common operations, such as synthesizing speech from text or SSML input, and managing configuration options like voice selection and audio output settings. The following table provides an overview of the officially supported SDKs:

| Language | Package/Module Name | Installation Command (Example) | Maturity |
| --- | --- | --- | --- |
| Python | `google-cloud-texttospeech` | `pip install google-cloud-texttospeech` | Stable |
| Node.js | `@google-cloud/text-to-speech` | `npm install @google-cloud/text-to-speech` | Stable |
| Java | `com.google.cloud:google-cloud-texttospeech` | Maven: Add dependency in `pom.xml` | Stable |
| Go | `cloud.google.com/go/texttospeech/apiv1` | `go get cloud.google.com/go/texttospeech/apiv1` | Stable |
| C# | `Google.Cloud.TextToSpeech.V1` | `dotnet add package Google.Cloud.TextToSpeech.V1` | Stable |
| PHP | `google/cloud-text-to-speech` | `composer require google/cloud-text-to-speech` | Stable |
| Ruby | `google-cloud-text_to_speech` | `gem install google-cloud-text_to_speech` | Stable |

For specific versioning and compatibility details, refer to the Google Cloud Text-to-Speech Client Libraries documentation.

Installation

Before installing the Google Cloud Text-to-Speech SDKs, ensure you have a Google Cloud project set up with the Text-to-Speech API enabled and appropriate authentication configured. This typically involves setting up service account credentials or using Application Default Credentials (ADCs) for local development. For details on setting up authentication, consult the Google Cloud authentication guide.
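When a client fails to authenticate, it can help to check which local credential source is present. The sketch below is a diagnostic aid only, not the auth library's actual implementation: it mirrors the documented ADC lookup order (the `GOOGLE_APPLICATION_CREDENTIALS` variable first, then the file written by `gcloud auth application-default login`) and assumes the Linux/macOS location of that file. The function name and return labels are hypothetical.

```python
import os


def find_adc_source(environ=None, home=None):
    """Return (label, path) for the first local credential source found."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home

    # 1. An explicit credentials file named by the environment variable.
    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        if os.path.isfile(key_path):
            return ("env_var", key_path)
        return ("env_var_missing_file", key_path)

    # 2. User credentials created by `gcloud auth application-default login`.
    #    Assumption: default Linux/macOS location; Windows uses %APPDATA%.
    gcloud_file = os.path.join(
        home, ".config", "gcloud", "application_default_credentials.json"
    )
    if os.path.isfile(gcloud_file):
        return ("gcloud_user_file", gcloud_file)

    # 3. Nothing local: ADC would fall back to an attached service account,
    #    which only exists on Google Cloud compute resources.
    return ("none_local", None)


if __name__ == "__main__":
    label, path = find_adc_source()
    print(f"Credential source: {label} ({path})")
```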

Installation methods vary by programming language and package manager. The following provides general installation instructions for the primary supported languages:

Python

Use pip to install the official Python client library:

pip install google-cloud-texttospeech

Node.js

Install the Node.js client library using npm:

npm install @google-cloud/text-to-speech

Java

For Maven projects, add the following dependency to your pom.xml file:

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-texttospeech</artifactId>
  <version>YOUR_VERSION</version> <!-- Replace with the latest version -->
</dependency>

For Gradle, add it to your build.gradle file:

implementation 'com.google.cloud:google-cloud-texttospeech:YOUR_VERSION'

Check the Google Cloud Java Text-to-Speech library overview for the latest version.

Go

Obtain the Go module:

go get cloud.google.com/go/texttospeech/apiv1

C#

Install the NuGet package using the .NET CLI:

dotnet add package Google.Cloud.TextToSpeech.V1

Alternatively, use the NuGet Package Manager in Visual Studio. Consult the Google Cloud C# Text-to-Speech library overview for the latest version.

PHP

Install via Composer:

composer require google/cloud-text-to-speech

Ruby

Install the gem:

gem install google-cloud-text_to_speech

Quickstart example

This Python example demonstrates how to synthesize speech from text using the Google Cloud Text-to-Speech client library. The code sends a text string to the API, specifies a voice and audio format, and saves the synthesized audio to an MP3 file.

import os
from google.cloud import texttospeech

# Set environment variable for authentication (replace with your service account key path)
# os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/your/keyfile.json"

def synthesize_text(text, output_filename="output.mp3"):
    """Synthesizes speech from the input text and saves it to a file."""

    client = texttospeech.TextToSpeechClient()

    input_text = texttospeech.SynthesisInput(text=text)

    # Select the voice. A named voice already implies its gender, so
    # ssml_gender is omitted to avoid conflicting with the chosen voice.
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US",
        name="en-US-Wavenet-D",  # Example WaveNet voice
    )

    # Select the type of audio file you want returned
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )

    # Perform the text-to-speech request
    response = client.synthesize_speech(
        input=input_text,
        voice=voice,
        audio_config=audio_config
    )

    # The response's audio_content is binary. Write it to a file.
    with open(output_filename, "wb") as out:
        out.write(response.audio_content)
        print(f'Audio content written to file "{output_filename}"')

if __name__ == "__main__":
    text_to_synthesize = "Hello, this is a test from Google Cloud Text-to-Speech using Python."
    synthesize_text(text_to_synthesize)

Before running this code, authenticate your environment. For local development, run `gcloud auth application-default login`, or set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of a service account key file. More authentication options are detailed in the Google Cloud production environment authentication guide.

This example uses a WaveNet voice (en-US-Wavenet-D), a voice type known for natural-sounding speech. WaveNet voices are priced differently from Standard voices; refer to the Google Cloud Text-to-Speech pricing page for current rates.

Community libraries

While Google provides official client libraries for a range of languages, the open-source community may develop additional tools and wrappers. These community-contributed libraries often aim to simplify specific integration patterns, provide framework-specific bindings, or offer alternative interfaces. For example, some developers might create libraries that integrate Text-to-Speech with web frameworks like Django or Flask, or provide command-line tools for quick synthesis tasks.

When considering community libraries, it is important to evaluate their maintenance status, documentation quality, and compatibility with the latest API versions. While they can offer specialized functionality, official SDKs generally provide the most stable and directly supported integration path. Resources like GitHub and language-specific package repositories (e.g., PyPI for Python, npm for Node.js) are common places to discover such community efforts. For general guidance on API client libraries, the Mozilla Developer Network's API client definition describes their role in software development.

The Text-to-Speech API can also be called directly with REST HTTP requests, which is what many community libraries wrap. Even without a dedicated library for a niche language, developers can integrate the service by constructing HTTP requests themselves, though this requires handling authentication and request/response serialization manually.
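A direct REST call can be sketched with only the Python standard library. The example assumes an OAuth access token is already available (for instance from `gcloud auth print-access-token`), and the helper names are illustrative. The request body and the base64-encoded `audioContent` response field follow the REST API's `text:synthesize` method.

```python
import base64
import json
import urllib.request

SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_body(text, language_code="en-US", audio_encoding="MP3"):
    """Build the JSON body for a text:synthesize request."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        "audioConfig": {"audioEncoding": audio_encoding},
    }


def decode_audio(response_json):
    """Decode the base64 audioContent field into raw audio bytes."""
    return base64.b64decode(response_json["audioContent"])


def synthesize_via_rest(text, access_token, project_id=None):
    """POST a synthesis request and return the audio bytes."""
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json; charset=utf-8",
    }
    if project_id:
        # User credentials may need a quota project set explicitly.
        headers["x-goog-user-project"] = project_id
    request = urllib.request.Request(
        SYNTHESIZE_URL,
        data=json.dumps(build_synthesize_body(text)).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return decode_audio(json.load(response))
```

Compared with the client library, this version has no retries, token refresh, or typed responses, which is the manual work the SDKs take on.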
