Overview

Onyx Bazaar is a platform that provides programmatic access to a collection of public and commercial datasets, primarily through its RESTful API. The service is designed to facilitate data integration for various use cases, including academic research, data science prototyping, and business intelligence. It centralizes access to diverse data sources, reducing the overhead associated with discovering, cleaning, and integrating disparate public datasets.

The platform's core offerings include the Onyx Data Lake, which serves as a repository for aggregated data, and the Onyx Query Engine, which enables structured querying of these datasets. For data discovery, the Onyx Data Catalog allows users to browse available datasets and understand their schemas. This structure supports developers in building applications that require external data feeds, such as market analysis tools, environmental monitoring systems, or academic research platforms.

Onyx Bazaar is particularly suited to scenarios that benefit from a wide array of external data. In geospatial analysis, for instance, access to demographic, environmental, or infrastructure data can enrich models and visualizations. Similarly, small businesses can integrate market trends or competitor data without maintaining extensive data ingestion pipelines. According to its developer experience notes, the platform offers documentation and SDKs in Python, JavaScript, and Go to streamline common data retrieval and querying tasks.

The service is built to handle varying scales of data access, from individual developer projects to enterprise-level integrations. It lists compliance with SOC 2 Type II, GDPR, and CCPA, which matters for organizations handling sensitive or regulated information. This makes it a suitable option for projects requiring both data diversity and adherence to compliance frameworks.

Key features

  • API Access to Public Data: Provides programmatic access to a catalog of public and commercial datasets through a RESTful API.
  • Onyx Data Lake: A centralized repository for aggregated and structured open data.
  • Onyx Query Engine: Enables users to perform structured queries across various datasets within the platform.
  • Onyx Data Catalog: A browsable interface for discovering available datasets and reviewing their schemas and metadata.
  • SDKs for Multiple Languages: Official SDKs available for Python, JavaScript, and Go, designed to simplify API interactions.
  • Developer-Focused Documentation: Comprehensive documentation with code examples and guides for integration.
  • Data Explorer UI: An intuitive user interface for initial data discovery and exploration without requiring immediate API integration.
  • Compliance Standards: Adherence to SOC 2 Type II, GDPR, and CCPA for data security and privacy.

Pricing

As of May 2026, Onyx Bazaar offers a tiered pricing model that includes a free developer plan and multiple paid plans. Custom enterprise pricing is available for organizations with specific requirements.

| Plan              | Monthly Requests | Data Storage | Price (USD/month) | Key Features                                                |
|-------------------|------------------|--------------|-------------------|-------------------------------------------------------------|
| Developer Plan    | 5,000            | N/A          | Free              | API access, basic datasets, community support               |
| Standard Plan     | 50,000           | 10 GB        | $29               | All Developer features, priority support, additional datasets |
| Professional Plan | 250,000          | 50 GB        | $99               | All Standard features, advanced analytics, enhanced support |
| Enterprise Plan   | Custom           | Custom       | Custom            | Dedicated support, custom integrations, SLAs                |

For detailed and up-to-date pricing information, refer to the official Onyx Bazaar pricing page.
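
As a rough aid for choosing a tier, the sketch below picks the lowest-priced listed plan whose monthly request quota covers an expected volume. The quotas and prices are copied from the table above; the function name `cheapest_plan` is made up for this illustration, and the sketch deliberately ignores data storage limits and the custom Enterprise tier.

```python
from typing import Optional

# Quotas and prices copied from the pricing table (as of May 2026).
# The Enterprise plan is omitted because its limits and price are custom.
PLANS = [
    # (name, monthly request quota, price in USD per month)
    ("Developer", 5_000, 0),
    ("Standard", 50_000, 29),
    ("Professional", 250_000, 99),
]


def cheapest_plan(monthly_requests: int) -> Optional[str]:
    """Return the lowest-priced listed plan whose quota covers the volume.

    Returns None when the volume exceeds every listed quota, which is the
    point where custom Enterprise pricing would have to be discussed.
    """
    if monthly_requests < 0:
        raise ValueError("monthly_requests must be non-negative")
    # Walk the plans from cheapest to most expensive and stop at the first fit.
    for name, quota, _price in sorted(PLANS, key=lambda plan: plan[2]):
        if monthly_requests <= quota:
            return name
    return None
```

For example, `cheapest_plan(40_000)` selects the Standard plan, while any volume above 250,000 requests returns `None`. Storage needs above 50 GB would also point toward Enterprise, which this sketch does not model.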

Common integrations

  • Data Science Notebooks: Integration with Jupyter notebooks or Google Colab using the Python SDK for data analysis and visualization.
  • Business Intelligence Tools: Connecting to BI platforms like Tableau or Power BI for dashboarding and reporting by exporting data or using custom connectors.
  • Cloud Data Warehouses: Ingesting data into services such as Google BigQuery or Amazon Redshift for large-scale data warehousing and analytics.
  • Web Applications: Embedding data directly into web applications using the JavaScript SDK to display real-time or historical information.
  • ETL Pipelines: Incorporating Onyx Bazaar as a data source within ETL (Extract, Transform, Load) workflows using custom scripts or integration platforms like Tray.io.
  • Geospatial Analysis Platforms: Integrating with GIS software or mapping libraries to overlay public data onto geographical maps for spatial analysis.
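
Several of the integrations above (BI tools, data warehouses, ETL pipelines) come down to exporting query results in a tabular file format. The following is a minimal sketch of that step using only the Python standard library. It assumes that query results arrive as a list of flat dictionaries under a "data" key, as in the Getting started example; the helper name `records_to_csv` and the sample record are invented for illustration and are not real Onyx Bazaar output.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialize a list of flat record dicts to CSV text with a header row.

    Keys not listed in fieldnames are dropped; missing keys become empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",  # skip unexpected keys instead of raising
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


# Placeholder response in the assumed shape (values are made up).
sample_response = {
    "data": [
        {"city": "New York", "date": "2023-07-01", "temperature_c": 29.5, "humidity": 60},
    ]
}

csv_text = records_to_csv(
    sample_response.get("data", []),
    ["city", "date", "temperature_c", "humidity"],
)
print(csv_text)
```

The resulting text can be written to a file and loaded through the CSV import path of most BI tools and warehouses. Nested or array-valued fields would need to be flattened first, which this sketch does not attempt.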

Alternatives

  • Nasdaq Data Link (formerly Quandl): Offers a large repository of financial, economic, and alternative datasets, with a stronger focus on structured numerical data for financial markets.
  • Google Public Datasets: A collection of public datasets hosted on Google Cloud, accessible via BigQuery or other Google Cloud services, covering a broad range of topics.
  • AWS Open Data: Provides access to a wide variety of public datasets hosted on Amazon S3, enabling integration with AWS services for processing and analysis.

Getting started

To begin using Onyx Bazaar, developers can typically sign up for an account, obtain an API key, and then use one of the provided SDKs or make direct HTTP requests to the API endpoints. The following Python example demonstrates how to fetch data using the Onyx Bazaar API:


import requests

# Replace with your actual API key from the Onyx Bazaar dashboard
API_KEY = "YOUR_ONYX_API_KEY"
BASE_URL = "https://api.onyxbazaar.com/v1"

def get_dataset_info(dataset_id):
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Accept": "application/json"
    }
    url = f"{BASE_URL}/datasets/{dataset_id}"
    try:
        response = requests.get(url, headers=headers, timeout=10)  # Fail instead of hanging indefinitely
        response.raise_for_status()  # Raise an exception for HTTP errors
        return response.json()
    except requests.exceptions.HTTPError as http_err:
        print(f"HTTP error occurred: {http_err}")
    except Exception as err:
        print(f"Other error occurred: {err}")
    return None

def query_dataset(dataset_id, params=None):
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Accept": "application/json"
    }
    url = f"{BASE_URL}/datasets/{dataset_id}/query"
    try:
        response = requests.post(url, headers=headers, json=params, timeout=10)  # Fail instead of hanging indefinitely
        response.raise_for_status()
        return response.json()
    except requests.exceptions.HTTPError as http_err:
        print(f"HTTP error occurred: {http_err}")
    except Exception as err:
        print(f"Other error occurred: {err}")
    return None

if __name__ == "__main__":
    # Example: Get information about a hypothetical 'weather_data' dataset
    dataset_id = "weather_data_us_2023"
    print(f"Fetching info for dataset: {dataset_id}")
    info = get_dataset_info(dataset_id)
    if info:
        print("Dataset Info:")
        print(info)

    # Example: Query the 'weather_data' dataset for specific conditions
    print(f"\nQuerying dataset: {dataset_id}")
    query_params = {
        "select": ["city", "date", "temperature_c", "humidity"],
        "where": {
            "temperature_c": {"gt": 25},
            "city": {"eq": "New York"}
        },
        "limit": 5
    }
    query_results = query_dataset(dataset_id, query_params)
    if query_results:
        print("Query Results:")
        for record in query_results.get("data", []):
            print(record)

This Python script demonstrates how to authenticate with an API key, retrieve metadata for a specific dataset, and then execute a structured query to fetch filtered data. Developers would replace "YOUR_ONYX_API_KEY" with their actual key and adjust dataset_id and query_params to match their specific data requirements. Further details and additional examples for other languages can be found in the Onyx Bazaar API reference and documentation.
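
Hard-coding the key is convenient for a first test, but a common alternative is to read it from the environment so it stays out of source control. The sketch below builds the same two headers used in the script above. The variable name `ONYX_API_KEY` and the helper `build_auth_headers` are assumptions made for this example, not names defined by Onyx Bazaar.

```python
import os
from typing import Mapping, Optional


def build_auth_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the Authorization and Accept headers from an environment mapping.

    ONYX_API_KEY is a hypothetical variable name chosen for this sketch.
    Passing a plain dict as env makes the function easy to test.
    """
    env = os.environ if env is None else env
    api_key = env.get("ONYX_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set the ONYX_API_KEY environment variable first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Accept": "application/json",
    }
```

With this helper, `get_dataset_info` and `query_dataset` could call `build_auth_headers()` instead of each constructing its own `headers` dictionary, which also removes the duplication between the two functions.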