Getting Started with on-demand Kubernetes

This guide will help you get started with Safespring's on-demand Kubernetes service.

Overview

On-demand Kubernetes lets users create, scale, and use Kubernetes clusters. This functionality is available through a portal and an API, where users manage their clusters.

It is built on top of the Safespring Compute service.

Cluster Management

Clusters can be created using the Safespring portal or API. Once created, clusters can be scaled up or down as needed.

We support clusters with 3 or 5 control plane nodes. If Kubernetes API uptime is critical to your business operations, we recommend 5 control plane nodes.

Worker nodes can use L2 or B2 flavors; see flavors for details and trade-offs. In general, B2 provides better uptime, while L2 with local disk provides better disk performance.

We make use of Talos Linux and follow its Support Matrix with the following currently supported versions:

Talos Version       Supported Kubernetes Version
1.10.7              1.33.5
1.11.2              1.34.1
1.11.5              1.34.1
1.12.1 (planned)    1.35.0 (planned)

Access and Authentication

Access to the clusters is handled through the accounts you were provided during onboarding in the self-service portal.

Clusters are integrated with the OIDC-compatible identity provider that backs the portal. See the following instructions on how to authenticate to a Kubernetes cluster with your account.
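As an illustration, an OIDC user entry in a kubeconfig commonly uses the kubelogin exec plugin (kubectl oidc-login). This is a sketch, not the exact configuration for this service; the issuer URL and client ID below are placeholders, so substitute the values shown for your cluster in the portal:

```yaml
# Hypothetical kubeconfig user entry using the kubelogin exec plugin;
# replace the issuer URL and client ID with the values from the portal.
users:
- name: oidc-user
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: kubectl
      args:
      - oidc-login
      - get-token
      - --oidc-issuer-url=https://auth.example.com/realms/example  # placeholder
      - --oidc-client-id=kubernetes                                # placeholder
```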

Networking

Cilium is used as the default CNI (Container Network Interface). Cilium is configured with the following settings:

  • Gateway API enabled
  • host routing
  • kube-proxy replacement
  • VXLAN encapsulation
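With the Gateway API enabled in Cilium, HTTP traffic can be exposed through a Gateway and an attached HTTPRoute. A minimal sketch, assuming a hypothetical backend Service and hostname:

```yaml
# Minimal sketch: a Gateway using Cilium's Gateway API support and an
# HTTPRoute attached to it. Hostname and Service name are placeholders.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: web-gateway
spec:
  gatewayClassName: cilium
  listeners:
  - name: http
    protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: web-route
spec:
  parentRefs:
  - name: web-gateway
  hostnames:
  - app.example.com          # placeholder hostname
  rules:
  - backendRefs:
    - name: web-service      # placeholder Service
      port: 8080
```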

Load balancing

Dedicated load balancers managed by Safespring direct traffic to both the control plane API and the worker nodes. By default, the cluster is provisioned with a single dedicated IPv4 address shared across control plane and worker traffic.

When nodes are added to or removed from the cluster, the load balancers are updated automatically.
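Since the OpenStack Cloud Controller Manager provides load balancer support (see Cluster Components below), a workload can typically be exposed with a Service of type LoadBalancer. A minimal sketch, with illustrative names and ports:

```yaml
# Sketch of exposing a workload through a LoadBalancer Service;
# the selector, name, and ports are illustrative placeholders.
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
```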

Storage

Clusters are configured with Cinder CSI (Container Storage Interface) during creation. The following storage classes are available for Cinder CSI:

  • large - HDD based block storage
  • fast - NVMe based block storage

Currently all available storage classes are based on networked storage.
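For example, a PersistentVolumeClaim against the fast class might look like the following sketch (the claim name and size are illustrative):

```yaml
# Requests a 10 GiB NVMe-backed block volume via the fast storage class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-fast
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: 10Gi
```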

SLA and Availability

Safespring on-demand Kubernetes is a highly available, reliable Kubernetes service. Provisioned clusters have a managed control plane with a 99.5% uptime SLA, measured against Kubernetes API availability.

Worker nodes in the cluster are covered by the default Compute service SLA and are not included in the on-demand Kubernetes SLA.

For any questions regarding the SLA please reach out to support.

Cluster Components

Core Components in a Control Plane

We consider the following components core to a control plane:

  1. API Server (kube-apiserver)
  2. etcd
  3. Controller Manager (kube-controller-manager)
  4. Scheduler (kube-scheduler)
  5. Cloud Controller Manager

Additionally, we consider the Cilium CNI necessary for running the Kubernetes cluster, and we do not recommend replacing it.

Additional Components

  • OpenStack Cloud Controller Manager: Integrates with OpenStack to provide node metadata, load balancers, and storage support.
  • Cert Manager: Automates the management and issuance of TLS certificates for Kubernetes workloads. For the Gateway API, a cluster issuer needs to be created.
  • Traffic Management: Cilium API Gateway, an eBPF-based ingress solution with advanced traffic management. We provide the GatewayClass cilium.
  • Cinder CSI (optional): Container Storage Interface (CSI) driver for provisioning and managing OpenStack Cinder volumes, used for persistent volumes.
  • Cilium: eBPF-based networking, security, and observability for Kubernetes clusters, providing advanced features like network policies and load balancing.
  • NVIDIA Device Plugin: Enables Kubernetes workloads to request and use GPUs for machine learning, AI, and high-performance compute applications. Only available if worker nodes have GPU flavors; see how to run GPU workloads.