SDKs overview
AWS provides Software Development Kits (SDKs) that simplify interaction with Amazon Polly's text-to-speech capabilities. The SDKs wrap the underlying API calls so that developers can integrate voice synthesis using familiar programming languages and idioms. They expose a consistent interface to AWS services, including Polly, across development environments, and they handle credential resolution, request signing, retries, and error reporting, which reduces the boilerplate needed for integration.
Developers can use these SDKs to convert text to speech, list available voices, and generate speech marks, which are metadata describing the timing of speech events such as sentence and word boundaries. The SDKs expose request parameters for the audio output format, voice, and engine. Speaking rate and pitch are not request parameters; they are controlled through SSML prosody tags in the input text, and which attributes are honored depends on the engine. This programmatic access enables dynamic audio content, voice-enabled applications, and accessibility features at scale.
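As a rough illustration of two of these operations, the sketch below builds an SSML string that sets the speaking rate, prepares the keyword arguments for a speech-mark request, and parses the result. The helper names (build_prosody_ssml, speech_mark_request, parse_speech_marks) are hypothetical and not part of any SDK. It assumes the Boto3 synthesize_speech parameters Text, TextType, VoiceId, OutputFormat, and SpeechMarkTypes, and that speech marks arrive as one JSON object per line.

```python
import json
from typing import Dict, List
from xml.sax.saxutils import escape


def build_prosody_ssml(text: str, rate: str = "medium") -> str:
    """Wrap plain text in an SSML <prosody> tag that sets the speaking rate.

    The text is XML-escaped so characters such as '&' do not break the SSML.
    """
    return f'<speak><prosody rate="{rate}">{escape(text)}</prosody></speak>'


def speech_mark_request(ssml: str, voice_id: str = "Joanna") -> Dict[str, object]:
    """Build keyword arguments for synthesize_speech that ask for speech marks.

    Hypothetical helper: it only assembles a dict and does not call AWS.
    """
    return {
        "Text": ssml,
        "TextType": "ssml",
        "VoiceId": voice_id,
        "OutputFormat": "json",  # speech marks are returned as JSON, not audio
        "SpeechMarkTypes": ["sentence", "word"],
    }


def parse_speech_marks(raw: bytes) -> List[dict]:
    """Parse a speech-mark stream, assumed to hold one JSON object per line."""
    lines = raw.decode("utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]
```

With a configured Boto3 client, the pieces would be combined roughly as response = client.synthesize_speech(**speech_mark_request(ssml)) followed by parse_speech_marks(response["AudioStream"].read()). Keeping the request building and parsing separate from the network call makes both easy to check without AWS access.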
Official SDKs by language
AWS provides official SDKs for a range of popular programming languages. AWS maintains these SDKs, they cover the Polly API operations, and each follows the conventions and best practices of its language. The table below lists the main official SDKs that can be used with Amazon Polly:
| Language | SDK Package | Installation Command (Example) | Maturity |
|---|---|---|---|
| Python | boto3 | pip install boto3 | Stable, actively maintained |
| Java | software.amazon.awssdk:polly (2.x; aws-java-sdk-polly is the 1.x artifact) | Maven: add dependency to pom.xml | 2.x stable, actively maintained |
| JavaScript | aws-sdk (v2) or @aws-sdk/client-polly (v3) | npm install aws-sdk, or npm install @aws-sdk/client-polly | v3 stable, actively maintained; v2 in maintenance mode |
| .NET | AWSSDK.Polly | Install-Package AWSSDK.Polly (NuGet) | Stable, actively maintained |
| Go | aws-sdk-go (v1) or aws-sdk-go-v2 | go get github.com/aws/aws-sdk-go/service/polly, or go get github.com/aws/aws-sdk-go-v2/service/polly | v2 stable, actively maintained; v1 in maintenance mode |
| PHP | aws/aws-sdk-php | composer require aws/aws-sdk-php | Stable, actively maintained |
| Ruby | aws-sdk-polly | gem install aws-sdk-polly | Stable, actively maintained |
| C++ | aws-sdk-cpp | Build from source or use package manager | Stable, actively maintained |
For detailed information on each SDK, including specific package names and advanced configuration, refer to the AWS SDKs and Tools documentation.
Installation
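Installing an AWS SDK typically involves a language-specific package manager, so make sure the appropriate runtime and package manager are set up first. To authenticate requests to Polly you also need AWS credentials that the SDK can find, for example in environment variables, in the shared credentials file written by the AWS CLI (aws configure), or through an IAM role attached to the compute environment. The credentials must be allowed to call the Polly operations you use, such as polly:SynthesizeSpeech.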
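Before making a first request it can help to confirm that credentials are discoverable. The following is a minimal preflight sketch and not the SDK's actual lookup logic: find_credential_source is a hypothetical helper that checks only two common sources, the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables and the shared credentials file (by default ~/.aws/credentials, or the path in AWS_SHARED_CREDENTIALS_FILE). The real provider chain consults more sources, such as IAM roles, so a None result here does not prove that no credentials exist.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_credential_source(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Optional[str]:
    """Report which of two common credential sources is present, if any.

    Hypothetical helper: 'env' and 'home' are injectable so the check can be
    exercised without touching the real environment or home directory.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    # 1. Static keys exported in the environment.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"

    # 2. Shared credentials file, as written by `aws configure`.
    creds_file = Path(
        env.get("AWS_SHARED_CREDENTIALS_FILE", home / ".aws" / "credentials")
    )
    if creds_file.is_file():
        return f"shared credentials file ({creds_file})"

    # Other sources (for example IAM roles) are not checked by this sketch.
    return None
```

Calling find_credential_source() with no arguments inspects the current process environment and home directory.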
Python (Boto3)
pip install boto3
Boto3 is the AWS SDK for Python, providing an interface to AWS services like Polly. For more details, consult the Boto3 documentation.
Java
For Java projects, you typically add the AWS SDK for Java dependency to your build tool (e.g., Maven or Gradle).
Maven: Add the following to your pom.xml:
<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>polly</artifactId>
  <version>2.x.x</version>
</dependency>
Replace the 2.x.x placeholder with the latest stable release. Refer to the AWS SDK for Java setup guide for specific versioning and additional dependencies.
JavaScript (Node.js/Browser)
npm install aws-sdk
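The AWS SDK for JavaScript supports both Node.js and browser-based applications. The command above installs version 2 of the SDK, which bundles all services in one package. Version 3 is modular and is installed per service, for Polly with npm install @aws-sdk/client-polly, and is the better choice for new projects because version 2 is in maintenance mode. Detailed installation instructions are available in the AWS SDK for JavaScript developer guide.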
.NET
Install-Package AWSSDK.Polly
This command uses NuGet, the package manager for .NET. For further information, see the AWS SDK for .NET getting started guide.
Go
go get github.com/aws/aws-sdk-go/service/polly
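The command above fetches the Polly client from version 1 of the AWS SDK for Go. New projects should generally prefer version 2, whose Polly client lives at github.com/aws/aws-sdk-go-v2/service/polly, because version 1 is in maintenance mode. More details can be found in the AWS SDK for Go installation instructions.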
PHP
composer require aws/aws-sdk-php
The AWS SDK for PHP is distributed via Composer. The AWS SDK for PHP installation guide provides comprehensive instructions.
Ruby
gem install aws-sdk-polly
The AWS SDK for Ruby is available as a RubyGem. For setup and usage, refer to the AWS SDK for Ruby setup documentation.
C++
The AWS SDK for C++ typically requires building from source or using a system package manager. Detailed instructions are provided in the AWS SDK for C++ getting started documentation.
Quickstart example
This Python example uses Boto3 to synthesize speech from text with Amazon Polly and save it to an MP3 file. Make sure your AWS credentials are configured before running it.
import boto3
from botocore.exceptions import BotoCoreError, ClientError

# Create a Polly client
polly_client = boto3.client('polly', region_name='us-east-1')

text_to_synthesize = "Hello, my name is Joanna. I am a neural voice from AWS Polly."
output_filename = "hello_joanna.mp3"

try:
    response = polly_client.synthesize_speech(
        Text=text_to_synthesize,
        OutputFormat='mp3',
        VoiceId='Joanna',  # Or another voice ID, such as 'Matthew' or 'Amy'
        Engine='neural'    # Use 'standard' for standard voices
    )

    if "AudioStream" in response:
        # AudioStream is a streaming body; read it and write the bytes to disk
        with open(output_filename, 'wb') as file:
            file.write(response['AudioStream'].read())
        print(f"Speech saved to {output_filename}")
    else:
        print("Could not find AudioStream in response.")
except (BotoCoreError, ClientError) as e:
    # ClientError covers service-side errors; BotoCoreError covers
    # client-side problems such as missing credentials
    print(f"Error synthesizing speech: {e}")
This example initializes the Polly client and specifies the text, output format, voice ID, and engine type (neural or standard). It then calls the synthesize_speech method and writes the returned audio stream to a local MP3 file. The Engine parameter selects between Neural and Standard voices, which differ in pricing and quality. Not every voice supports every engine, so the voice ID and engine must be a valid combination.
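Because voice and engine must match, one option is to look up the engines a voice supports before synthesizing. The sketch below is one possible approach under stated assumptions. The helper names (supported_engines, pick_engine, save_audio_stream) are hypothetical. It relies on describe_voices returning a Voices list whose entries carry Id and SupportedEngines, plus an optional NextToken for paging, and on the audio stream offering read(size) and close() as Boto3's streaming body does.

```python
from contextlib import closing
from typing import List, Optional, Sequence


def supported_engines(client, voice_id: str) -> List[str]:
    """Return the engines listed for voice_id, following NextToken paging."""
    kwargs = {}
    while True:
        page = client.describe_voices(**kwargs)
        for voice in page.get("Voices", []):
            if voice["Id"] == voice_id:
                return list(voice.get("SupportedEngines", []))
        token = page.get("NextToken")
        if not token:
            return []  # voice not found on any page
        kwargs = {"NextToken": token}


def pick_engine(
    client,
    voice_id: str,
    preferred: Sequence[str] = ("neural", "standard"),
) -> Optional[str]:
    """Return the first engine in 'preferred' that the voice supports."""
    engines = supported_engines(client, voice_id)
    for engine in preferred:
        if engine in engines:
            return engine
    return None


def save_audio_stream(stream, path: str, chunk_size: int = 8192) -> int:
    """Copy an audio stream to 'path' in chunks and return the bytes written.

    Reading in chunks avoids holding the whole clip in memory, and closing()
    releases the underlying connection when the copy finishes.
    """
    written = 0
    with closing(stream), open(path, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written
```

In the quickstart, pick_engine(polly_client, 'Joanna') could supply the Engine argument, and save_audio_stream(response['AudioStream'], output_filename) could replace the single read-and-write step. Engine availability can also vary by region, which this sketch does not check.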
Community libraries
Alongside the official SDKs, the developer community occasionally publishes libraries or wrappers that offer specialized functionality, simplified interfaces, or integrations with specific frameworks. These projects can fill niche requirements or provide alternative ways of working with Amazon Polly.
Community libraries are not officially supported by AWS, and their maintenance, security, and compatibility with the latest Amazon Polly features vary. Before adopting a third-party library in a production system, verify that it is well maintained, secure, and meets the project's requirements. A good practice is to check its GitHub repository for active development, issue resolution, and community contributions, as recommended by Google's open-source project criteria. For most standard use cases, the official AWS SDKs are the recommended and most reliable way to integrate with Amazon Polly.
As of late 2024, the official AWS SDKs remain the predominant tools for working with Amazon Polly. No widely adopted community library offers fundamentally different core functionality or a significant advantage over them for general text-to-speech tasks.