Authentication overview
Apache Superset employs an extensible authentication framework to secure access to its data exploration and visualization features. The system is designed to allow administrators to integrate Superset with existing identity management solutions, ensuring that users are properly verified before interacting with dashboards and datasets. Authentication in Superset primarily relies on configurations within the superset_config.py file, which dictates the chosen method and its parameters. This approach enables a flexible security posture, from simple database authentication to complex enterprise-grade single sign-on (SSO) integrations (Superset security documentation).
Once authenticated, users are subject to Superset's Role-Based Access Control (RBAC) system, which governs permissions for data sources, dashboards, and features. Authentication verifies who a user is, while authorization (RBAC) determines what they can do within the application. This separation of concerns is fundamental to maintaining a secure and manageable analytics environment.
Supported authentication methods
Apache Superset supports several authentication backends, allowing administrators to choose the most suitable method for their organization's security policies and infrastructure. The primary methods include database authentication, LDAP, OAuth, OpenID Connect, and REMOTE_USER. Each method offers different levels of integration complexity and security guarantees.
| Method | When to Use | Security Level |
|---|---|---|
| Database (Built-in) | Default for small deployments, testing, or when no external identity provider is available. User credentials are stored directly in Superset's metadata database. | Moderate (relies on database security and strong password policies) |
| LDAP (Lightweight Directory Access Protocol) | For organizations with existing LDAP or Active Directory services, enabling centralized user management. | High (integrates with established enterprise identity management) |
| OAuth / OpenID Connect | For integrating with modern identity providers (IdPs) like Google, Azure AD, Okta, Auth0, or custom OAuth 2.0 servers, providing Single Sign-On (SSO) capabilities. | High (leverages industry-standard protocols, supports MFA and advanced IdP features) |
| REMOTE_USER | When authentication is handled by a reverse proxy (e.g., Nginx, Apache HTTP Server) before the request reaches Superset, which then trusts the proxy to supply the authenticated username as REMOTE_USER. | High (security relies heavily on robust proxy configuration) |
Database authentication
The simplest authentication method involves storing user credentials directly within Superset's metadata database. This is the default setup and is suitable for initial deployments or environments where external identity providers are not required. User accounts are managed directly through the Superset UI or command-line interface. To configure database authentication, ensure the AUTH_TYPE in superset_config.py is set to AUTH_DB (Superset authentication documentation). While convenient, this method means Superset itself manages user passwords, necessitating strong password policies and secure database practices.
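A minimal `superset_config.py` sketch for this mode. It assumes the constants come from Flask-AppBuilder, the framework Superset's security layer is built on. Because database authentication is the default, setting it explicitly mainly documents intent.

```python
# superset_config.py (sketch)
from flask_appbuilder.security.manager import AUTH_DB

# Authenticate against user accounts stored in Superset's metadata database.
AUTH_TYPE = AUTH_DB
```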
LDAP authentication
For enterprise environments, LDAP integration allows Superset to authenticate users against an existing LDAP or Active Directory server. This centralizes user management and avoids duplicating user accounts. Configuration involves specifying the LDAP server details, bind DN, search parameters, and attribute mappings in superset_config.py. Superset supports both plain LDAP and LDAPS (secure LDAP) for encrypted communication (Superset LDAP configuration). With the appropriate settings, LDAP groups can be mapped to Superset roles, which simplifies authorization management.
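The settings described above might look roughly like the following in `superset_config.py`. This is a sketch under assumptions, not a drop-in configuration: the setting names are Flask-AppBuilder's LDAP options, while every hostname, DN, group, and the environment variable name are placeholders to replace with your directory's values.

```python
# superset_config.py (sketch; hostnames, DNs, and groups are placeholders)
import os

from flask_appbuilder.security.manager import AUTH_LDAP

AUTH_TYPE = AUTH_LDAP

# Use ldaps:// so credentials are encrypted in transit.
AUTH_LDAP_SERVER = "ldaps://ldap.example.com:636"

# Service account used to search the directory. The password is read from a
# hypothetical environment variable instead of being stored in this file.
AUTH_LDAP_BIND_USER = "cn=superset,ou=service,dc=example,dc=com"
AUTH_LDAP_BIND_PASSWORD = os.environ.get("SUPERSET_LDAP_BIND_PASSWORD", "")

# Where and how to look up the user who is logging in.
AUTH_LDAP_SEARCH = "ou=people,dc=example,dc=com"
AUTH_LDAP_UID_FIELD = "uid"  # often "sAMAccountName" on Active Directory

# Create a Superset account on first successful login.
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Gamma"

# Map directory groups to Superset roles and refresh them at every login.
AUTH_LDAP_GROUP_FIELD = "memberOf"
AUTH_ROLES_MAPPING = {
    "cn=analysts,ou=groups,dc=example,dc=com": ["Alpha"],
    "cn=superset-admins,ou=groups,dc=example,dc=com": ["Admin"],
}
AUTH_ROLES_SYNC_AT_LOGIN = True
```

Syncing roles at login keeps Superset's role assignments aligned with the directory, at the cost of overwriting role changes made by hand in the Superset UI.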
OAuth and OpenID Connect
Superset can be configured to use OAuth 2.0 and OpenID Connect to delegate authentication to external identity providers (IdPs). This enables Single Sign-On (SSO) and allows Superset to integrate with services like Google, Azure AD, Okta, Auth0, and others. OpenID Connect adds an identity layer on top of OAuth 2.0, so that the client can verify who the user is rather than only obtain delegated access (OpenID Connect specification). Configuration in superset_config.py involves defining OAuth providers, client IDs, client secrets, authorization URLs, token URLs, and scope parameters. This method is highly recommended for production deployments requiring robust security and integration with enterprise identity solutions (Superset OAuth setup).
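As an illustration of the provider definition described above, here is a sketch for Google using Flask-AppBuilder's `OAUTH_PROVIDERS` structure. The environment variable names are hypothetical, and the endpoint URLs and scopes should be checked against your IdP's current documentation before use.

```python
# superset_config.py (sketch; verify endpoints and scopes with your IdP)
import os

from flask_appbuilder.security.manager import AUTH_OAUTH

AUTH_TYPE = AUTH_OAUTH

# Create a Superset account the first time a user signs in through the IdP.
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Gamma"

OAUTH_PROVIDERS = [
    {
        "name": "google",             # provider name shown on the login page
        "icon": "fa-google",          # Font Awesome icon for the login button
        "token_key": "access_token",  # field holding the token in the IdP response
        "remote_app": {
            # Client ID and secret issued by the IdP, read from hypothetical
            # environment variables rather than hardcoded here.
            "client_id": os.environ.get("SUPERSET_OAUTH_CLIENT_ID", ""),
            "client_secret": os.environ.get("SUPERSET_OAUTH_CLIENT_SECRET", ""),
            "api_base_url": "https://www.googleapis.com/oauth2/v2/",
            "client_kwargs": {"scope": "email profile"},
            "request_token_url": None,
            "access_token_url": "https://accounts.google.com/o/oauth2/token",
            "authorize_url": "https://accounts.google.com/o/oauth2/auth",
        },
    }
]
```

Flask-AppBuilder knows how to read user details for a set of common provider names. A custom IdP generally needs a custom security manager that overrides `oauth_user_info` to translate the IdP's profile response into a Superset user.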
REMOTE_USER authentication
The REMOTE_USER authentication method is used when an upstream proxy (e.g., Nginx, Apache HTTP Server) handles user authentication and passes the authenticated username on to Superset. Superset reads the username from the REMOTE_USER variable of the WSGI request environment, which is typically populated from a header that the proxy sets, and trusts it as the authenticated user. This method is suitable for deployments where a security gateway or identity proxy is already in place. Because Superset implicitly trusts the value, the proxy must strip any client-supplied copy of that header and Superset must not be reachable except through the proxy; otherwise a user could forge an identity (Superset REMOTE_USER details).
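Since the username has to end up in the WSGI environment, a header set by the proxy usually needs to be copied there. The sketch below shows one way to do that with a small WSGI middleware registered through Superset's `ADDITIONAL_MIDDLEWARE` setting. The header name `X-Proxy-Remote-User` is an assumption; use whatever header your proxy is configured to set.

```python
class RemoteUserMiddleware:
    """WSGI middleware that copies a proxy-set header into REMOTE_USER."""

    # WSGI exposes the HTTP header "X-Proxy-Remote-User" under this key.
    # The header name is an assumption; match it to your proxy configuration.
    HEADER_KEY = "HTTP_X_PROXY_REMOTE_USER"

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        user = environ.pop(self.HEADER_KEY, None)
        if user:
            environ["REMOTE_USER"] = user
        else:
            # Treat the proxy header as the only source of identity: without
            # it, no REMOTE_USER value is passed on to Superset.
            environ.pop("REMOTE_USER", None)
        return self.app(environ, start_response)


# In superset_config.py, alongside AUTH_TYPE = AUTH_REMOTE_USER
# (imported from flask_appbuilder.security.manager):
ADDITIONAL_MIDDLEWARE = [RemoteUserMiddleware]
```

The middleware only relays what it receives. Protection against forged identities still depends on the proxy overwriting or removing the header on every incoming request.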
Getting your credentials
The process of obtaining and managing credentials in Apache Superset depends entirely on the chosen authentication method:
- Database Authentication: For the built-in database authentication, administrators initially create user accounts and set passwords via the Superset command-line interface (`superset fab create-admin` for the initial admin user) or within the Superset UI under the "Security" > "List Users" section. Users can typically reset their passwords if enabled.
- LDAP Authentication: Credentials are managed by your organization's LDAP or Active Directory server. Users use their existing enterprise network credentials to log into Superset. Superset does not store these credentials but authenticates against the LDAP server.
- OAuth / OpenID Connect: User credentials are managed by the configured Identity Provider (IdP) (e.g., Google, Azure AD, Okta). Superset acts as a client, redirecting users to the IdP for authentication. The IdP issues an access token and potentially an ID token back to Superset, which then verifies the user's identity. Configuration involves obtaining a Client ID and Client Secret from your IdP, which are then configured in Superset's `superset_config.py` file (OAuth 2.0 specification).
- REMOTE_USER Authentication: User credentials are not directly managed by Superset but by the upstream proxy that performs the authentication. Superset expects the proxy to supply the authenticated username as the `REMOTE_USER` value. The credentials themselves are managed by the proxy's authentication mechanism (e.g., basic auth, client certificates, Kerberos).
Regardless of the method, ensure that all sensitive credentials (like client secrets for OAuth, LDAP bind passwords) are stored securely and not hardcoded directly into source control. Environment variables or a secure secret management system are recommended.
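One way to follow this advice is a small helper in `superset_config.py` that reads each secret from the environment and fails at startup when it is missing, instead of silently running with an empty value. This is a sketch; the helper and the environment variable names are hypothetical.

```python
import os


def require_env(name, env=None):
    """Return a required setting from the environment, failing fast if unset."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Required environment variable {name!r} is not set")
    return value


# Hypothetical variable names; in superset_config.py this might look like:
#   SECRET_KEY = require_env("SUPERSET_SECRET_KEY")
#   AUTH_LDAP_BIND_PASSWORD = require_env("SUPERSET_LDAP_BIND_PASSWORD")
```

Failing fast makes a misconfigured deployment visible immediately, rather than at the first login attempt.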
Authenticated request example
Apache Superset primarily authenticates user sessions through a web browser. After a successful login via one of the configured methods, Superset issues a session cookie to the user's browser. Subsequent requests to the Superset API or UI are authenticated by the presence of this session cookie.
For programmatic access, such as calling the Superset REST API from a script or another application, the client first has to authenticate (e.g., OAuth 2.0 Authorization Code Flow for third-party apps, or direct login for internal scripts) and then attach the resulting credential to every subsequent request. Browser-based clients send the session cookie. Headless clients typically use token-based authentication, provided the chosen auth backend supports it directly or through an integration layer.
Here's a conceptual example using curl, assuming a prior login has established a session and you've extracted the Superset session cookie (named session by default, following Flask's SESSION_COOKIE_NAME setting; adjust the name if your deployment overrides it):
# Assuming you have an active session cookie 'your_superset_session_cookie_value'
# This cookie would be obtained after a successful login through a browser or an automated process.
curl -X GET \
  -H "Accept: application/json" \
  -H "Cookie: session=your_superset_session_cookie_value" \
  "https://your-superset-instance.com/api/v1/chart/"
This example demonstrates fetching a list of charts using a previously acquired session cookie. The actual mechanism for obtaining this cookie or an API token will vary based on your selected authentication backend and whether you are integrating a browser or a headless client.
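For headless clients, Superset's REST API also offers a token-based path: a `POST` to `/api/v1/security/login` exchanges a username and password for a JWT access token, which is then sent as a bearer token. This applies to the database and LDAP backends. The sketch below uses only the Python standard library; the function names and the base URL are illustrative, not part of Superset.

```python
import json
import urllib.request


def build_login_request(base_url, username, password, provider="db"):
    """Build the POST that exchanges credentials for a JWT access token."""
    payload = {
        "username": username,
        "password": password,
        "provider": provider,  # "db" or "ldap"
        "refresh": True,       # also request a refresh token
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/security/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_chart_list_request(base_url, access_token):
    """Build an authenticated GET for the chart list endpoint."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/chart/",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="GET",
    )


def send_json(request, timeout=10):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A script would call `send_json(build_login_request(...))`, read `access_token` from the result, and pass it to `build_chart_list_request`. Building the requests separately from sending them keeps the credential handling easy to inspect and test.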
Security best practices
Implementing strong authentication in Apache Superset is critical for protecting sensitive data and preventing unauthorized access. Adhere to these best practices:
- Choose the Strongest Authentication Method: Prioritize OAuth/OpenID Connect or LDAP integration over database authentication for production environments. These methods leverage established identity providers, often supporting multi-factor authentication (MFA) and centralized user management.
- Enable HTTPS/SSL/TLS: Always deploy Superset behind an HTTPS-enabled web server (e.g., Nginx, Apache HTTP Server). This encrypts all communication between users and Superset, protecting credentials and data in transit.
- Secure `superset_config.py`: The `superset_config.py` file contains sensitive configuration details, including database connection strings, secret keys, and OAuth client secrets. Ensure this file has restrictive file permissions and is not publicly accessible.
- Use Environment Variables for Secrets: Avoid hardcoding sensitive information directly into `superset_config.py`. Instead, use environment variables to inject secrets (e.g., the database URI, OAuth client secrets, the secret key) during deployment. This follows the Twelve-Factor App guidance on configuration.
- Rotate Secret Key: The `SECRET_KEY` in `superset_config.py` is crucial for session security. This key should be a strong, randomly generated string and rotated periodically to mitigate the risk of compromise.
- Implement Role-Based Access Control (RBAC): Beyond authentication, meticulously configure Superset's RBAC to ensure users only have access to the data sources, datasets, and dashboards necessary for their roles. Regularly review user permissions.
- Audit Logs: Enable and regularly review Superset's audit logs to monitor user activities, failed login attempts, and changes to critical configurations.
- Keep Superset Updated: Regularly update your Apache Superset instance to the latest stable version. Updates often include critical security patches and bug fixes that address newly discovered vulnerabilities.
- Strong Password Policies: If using database authentication, enforce strong password policies (complexity, length, expiration) for Superset users.
- Protect API Keys/Tokens: If programmatic access to the Superset API is used, treat API keys or session tokens as sensitive credentials. Store them securely and transmit over encrypted channels. Implement token expiration and rotation where possible.
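For the secret key recommendation above, a strong random value can be generated with Python's `secrets` module. This sketch produces 42 random bytes encoded as base64, comparable to running `openssl rand -base64 42`; the function name is illustrative.

```python
import base64
import secrets


def generate_secret_key(num_bytes=42):
    """Return a random, base64-encoded string suitable for use as SECRET_KEY."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Print a fresh key to store in a secret manager or environment variable.
    print(generate_secret_key())
```

Superset also uses `SECRET_KEY` to encrypt sensitive values in its metadata database, so rotating the key involves more than swapping the value. Consult the Superset documentation on key rotation for your version before changing it on a running deployment.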