Prerequisites

Ensure you have the following prerequisites in place before starting the installation:

  • A CAPI image built on OpenStack using the image-builder repository
  • A running management cluster with CAPI installed on it. Official documentation can be found here.
  • A valid clouds.yaml file in the ~/.config/openstack directory
  • Helm installed on your machine
  • Terraform/OpenTofu installed on your machine
  • Bootstrap resources deployed inside your OpenStack project: a network, subnet, router, security group, and floating IP. To create these resources you can apply the following Terraform file:
locals {
  resources_base_name = "capi"
  dns_servers = [
    "8.8.8.8"
  ]
  cidr = "192.168.0.0/24"
  public_key = "ssh-ed25519 xx" # Replace with your SSH public key
}

data "openstack_networking_network_v2" "external" {
  name = "public"
}

resource "openstack_networking_network_v2" "capi-net" {
  name           = local.resources_base_name
  admin_state_up = true
  mtu            = 1442
}

resource "openstack_networking_subnet_v2" "capi-subnet" {
  name            = local.resources_base_name
  network_id      = openstack_networking_network_v2.capi-net.id
  cidr            = local.cidr
  ip_version      = 4
  dns_nameservers = local.dns_servers
}

resource "openstack_networking_router_v2" "capi-router" {
  name                = local.resources_base_name
  admin_state_up      = true
  external_network_id = data.openstack_networking_network_v2.external.id
}

resource "openstack_networking_router_interface_v2" "capi-iface" {
  router_id = openstack_networking_router_v2.capi-router.id
  subnet_id = openstack_networking_subnet_v2.capi-subnet.id
}

resource "openstack_networking_floatingip_v2" "capi-api-server-floating" {
  pool = data.openstack_networking_network_v2.external.name
}

# Optional but recommended if you want to access the control plane nodes.
resource "openstack_compute_keypair_v2" "capi-keypair" {
  name       = local.resources_base_name
  public_key = local.public_key
}

output "api-server-floating-ip" {
  value       = openstack_networking_floatingip_v2.capi-api-server-floating.address
  description = "The floating IP of the API server"
}

output "router-id" {
  value       = openstack_networking_router_v2.capi-router.id
  description = "The ID of the router"
}

output "network-id" {
  value       = openstack_networking_network_v2.capi-net.id
  description = "The ID of the network"
}

output "external-network-id" {
  value       = data.openstack_networking_network_v2.external.id
  description = "The ID of the external network"
}

Create a working control plane

Create a secret to store cloud credentials

Using the credentials from ~/.config/openstack/clouds.yaml, create a secret to store the cloud credentials:

apiVersion: v1
stringData:
  cacert: ""
  clouds.yaml: |
    clouds:
      <region_name>:
        auth:
          auth_url: <auth_url>
          application_credential_id: <application_credential_id>
          application_credential_secret: <application_credential_secret>
        region_name: <region_name>
        interface: public
        identity_api_version: 3
        auth_type: v3applicationcredential
kind: Secret
metadata:
  labels:
    clusterctl.cluster.x-k8s.io/move: "true"
  name: openstack-creds

Once you have written the secret manifest to a file, apply it by running the following command:

kubectl apply -f <file_name>
secret/openstack-creds created
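If you would rather generate this manifest than indent clouds.yaml by hand, here is a minimal sketch; the build_secret_manifest helper is hypothetical, not part of any official tooling:

```python
import textwrap


def build_secret_manifest(clouds_yaml: str, name: str = "openstack-creds") -> str:
    """Embed the contents of a clouds.yaml file into a Secret manifest."""
    # Indent the file body so it nests under the `clouds.yaml: |` block scalar.
    indented = textwrap.indent(clouds_yaml.rstrip("\n"), "    ")
    return (
        "apiVersion: v1\n"
        "kind: Secret\n"
        "metadata:\n"
        "  labels:\n"
        '    clusterctl.cluster.x-k8s.io/move: "true"\n'
        f"  name: {name}\n"
        "stringData:\n"
        '  cacert: ""\n'
        "  clouds.yaml: |\n"
        f"{indented}\n"
    )


# Usage (reads the default location from the prerequisites):
#   from pathlib import Path
#   print(build_secret_manifest((Path.home() / ".config/openstack/clouds.yaml").read_text()))
```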

Create the OpenStackMachineTemplate

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: capi-cp-example
spec:
  template:
    spec:
      sshKeyName: capi # Specify the keypair name
      flavor: boost-c4m6144 # Specify the flavor
      image:
        filter:
          name: ubuntu-2404-1-29 # The image name depends on what you built with the image-builder repository
      ports:
        - disablePortSecurity: true
      rootVolume:
        sizeGiB: 50 # Specify the volume size
        type: silver # Specify the volume type, valid options are: wood, silver, gold.
        availabilityZone:
          name: nova
      tags:
        - capi
        - kubernetes

This OpenStackMachineTemplate will be used to create the control plane nodes: sshKeyName references the keypair created by the Terraform file, flavor selects the instance flavor, image selects the image you built with image-builder, rootVolume sets the size and type of the boot volume, and tags lists the tags attached to each instance.

Create cluster base resources

Copy and edit the following files to create the base resources for the cluster:

KubeadmControlPlane

---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: capi-example
  annotations:
    controlplane.cluster.x-k8s.io/skip-coredns: ""
    controlplane.cluster.x-k8s.io/skip-kube-proxy: ""
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      controllerManager:
        extraArgs:
          cloud-provider: external
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: external
          provider-id: openstack:///'{{ instance_id }}'
          node-labels: nvidia.com/gpu.deploy.driver=false # Optional: set this depending on whether you want the NVIDIA GPU driver deployed on control plane nodes
          register-with-taints: "" # Optional field
        name: '{{ local_hostname }}'
    joinConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: external
          provider-id: openstack:///'{{ instance_id }}'
          node-labels: nvidia.com/gpu.deploy.driver=false # Optional: set this depending on whether you want the NVIDIA GPU driver deployed on control plane nodes
          register-with-taints: "" # Optional field
        name: '{{ local_hostname }}'
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: OpenStackMachineTemplate
      name: capi-cp-example
  replicas: <replicas> # Specify the number of control plane nodes, recommended value is 3
  version: v1.29.6 # Here it depends on the version of the image you built

This KubeadmControlPlane drives the creation of the control plane nodes: kubeadmConfigSpec holds the kubeadm configuration applied to each node, machineTemplate references the OpenStackMachineTemplate created above, replicas sets the number of control plane nodes, and version pins the Kubernetes version, which must match the image you built.

OpenStackCluster

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: capi-example
spec:
  apiServerLoadBalancer:
    enabled: true
  apiServerFloatingIP: <floating_ip> # Specify the floating IP that will be used to access the control plane nodes
  network:
    id: <network_id> # You can get the network id from the command `openstack network list` 
  router:
    id: <router_id> # You can get the router id from the command `openstack router list`
  externalNetwork:
    id: <external_network_id> # You can get the external network id from the command `openstack network list`
  identityRef:
    cloudName: <region_name> # Specify the region name
    name: openstack-creds

This OpenStackCluster describes the infrastructure side of the cluster: apiServerLoadBalancer enables a load balancer in front of the API server, apiServerFloatingIP is the floating IP created by the Terraform file, the network, router, and externalNetwork fields reference the bootstrap resources (their IDs are available as Terraform outputs), and identityRef points to the openstack-creds secret created earlier.

Cluster

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: capi-example
  annotations:
    controlplane.cluster.x-k8s.io/skip-coredns: ""
    controlplane.cluster.x-k8s.io/skip-kube-proxy: ""
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 10.72.0.0/16 # Specify the CIDR block for the pods
    services:
      cidrBlocks:
      - 10.73.0.0/16 # Specify the CIDR block for the services
    serviceDomain: cluster.local
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: capi-example
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: OpenStackCluster
    name: capi-example

This Cluster resource ties everything together: clusterNetwork defines the pod and service CIDR blocks and the service domain, controlPlaneRef points to the KubeadmControlPlane, and infrastructureRef points to the OpenStackCluster.
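The three CIDR ranges in play — the node subnet from the Terraform file, the pod CIDR, and the service CIDR — must not overlap. A quick sanity check with Python's standard ipaddress module, using the values from this guide:

```python
import ipaddress
from itertools import combinations

# Ranges used in this guide: node subnet (Terraform `local.cidr`),
# pod and service CIDRs (Cluster `clusterNetwork`).
cidrs = {
    "nodes": ipaddress.ip_network("192.168.0.0/24"),
    "pods": ipaddress.ip_network("10.72.0.0/16"),
    "services": ipaddress.ip_network("10.73.0.0/16"),
}

# Compare every pair of ranges and fail loudly on any overlap.
for (a_name, a), (b_name, b) in combinations(cidrs.items(), 2):
    if a.overlaps(b):
        raise SystemExit(f"CIDR conflict: {a_name} ({a}) overlaps {b_name} ({b})")
print("no CIDR overlaps")
```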

Apply the resources

Finally, apply the resources by running the following command:

kubectl apply -f <file_name>
openstackmachinetemplate.infrastructure.cluster.x-k8s.io/capi-cp-example created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/capi-example created
openstackcluster.infrastructure.cluster.x-k8s.io/capi-example created
cluster.cluster.x-k8s.io/capi-example created

Enroll the control plane nodes

Some setup is needed after the resources are created before the control plane nodes can be enrolled. First, monitor the load balancer creation by running the following command:

kubectl describe openstackcluster capi-example

It should output something like this:

[...]
Events:
  Type    Reason                         Age    From                  Message
  ----    ------                         ----   ----                  -------
  Normal  Successfulcreateloadbalancer   2m16s  openstack-controller  Created load balancer k8s-clusterapi-cluster-default-capi-example-kubeapi with id d5f2c8c6-0d75-46d7-a9fe-60a991b31c36
  Normal  Successfulassociatefloatingip  60s    openstack-controller  Associated floating IP <redacted> with port 00ec6f1d-3bfd-4971-81fa-83d6b7c51924
  Normal  Successfulcreatelistener       53s    openstack-controller  Created listener k8s-clusterapi-cluster-default-capi-example-kubeapi-6443 with id 558ef75d-6897-48f2-a8d0-5bd87c695438
  Normal  Successfulcreatepool           51s    openstack-controller  Created pool k8s-clusterapi-cluster-default-capi-example-kubeapi-6443 with id 4ac6bd78-b295-4c37-9c9e-98d10e3ad773
  Normal  Successfulcreatemonitor        47s    openstack-controller  Created monitor k8s-clusterapi-cluster-default-capi-example-kubeapi-6443 with id a0503aef-8cb2-4755-9a70-3e4b99c86623

If it seems stuck in the Pending state, inspect the load balancer directly; you may get more information by running the following command:

openstack loadbalancer list

Then you can check the control plane nodes by running the following command:

kubectl get machines # Optionally add the -w flag to watch the machines

It should output something like this:

NAME                 CLUSTER        NODENAME             PROVIDERID                                          PHASE     AGE     VERSION
capi-example-fwvqv   capi-example   capi-example-fwvqv   openstack:///66df7920-1b3a-4ec5-919b-990cc8909777   Running   4m19s   v1.29.6

If a machine seems stuck in the Provisioning phase for more than ~5 minutes, you can inspect it by running the following command:

kubectl describe openstackmachines
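If you want to script this check, here is a sketch (the stuck_machines helper is hypothetical) that flags machines outside the Running phase from `kubectl get machines -o json` output:

```python
import json


def stuck_machines(machines_json: str) -> list:
    """Return the names of machines whose phase is not Running,
    given the output of `kubectl get machines -o json`."""
    doc = json.loads(machines_json)
    return [
        m["metadata"]["name"]
        for m in doc.get("items", [])
        if m.get("status", {}).get("phase") != "Running"
    ]


# Example input with the shape kubectl produces:
sample = json.dumps({
    "items": [
        {"metadata": {"name": "capi-example-fwvqv"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "capi-example-abcde"}, "status": {"phase": "Provisioning"}},
    ]
})
print(stuck_machines(sample))  # prints ['capi-example-abcde']
```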

Once at least one machine is provisioned, with its Phase set to Running and its NodeName set to the name of the machine, you can extract the kubeconfig file by running the following command:

clusterctl get kubeconfig capi-example > capi-example.kubeconfig

From now on, either append --kubeconfig capi-example.kubeconfig to the commands below or export the KUBECONFIG environment variable by running the following command:

export KUBECONFIG=capi-example.kubeconfig

We will assume that you have exported the KUBECONFIG environment variable.

Now you can remove the taint node.cloudprovider.kubernetes.io/uninitialized from the control plane nodes by running the following command:

kubectl taint nodes <node_name> node.cloudprovider.kubernetes.io/uninitialized:NoSchedule-

Now we need to set up a CNI (Container Network Interface) plugin to provide pod networking in the cluster. We recommend Flannel; you can install it by running the following commands:

helm repo add flannel https://flannel-io.github.io/flannel/
# podCidr must match the pods CIDR of the Cluster resource; flannel.mtu must match the network MTU
helm install flannel --set podCidr="10.72.0.0/16" --set flannel.mtu=1442 --namespace kube-system flannel/flannel
NAME: flannel
LAST DEPLOYED: Tue Jan 21 17:39:03 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None

You can check the pods by running the following command:

kubectl get pods -n kube-system -l app=flannel

It should output something like this:

NAME                    READY   STATUS    RESTARTS   AGE
kube-flannel-ds-dq8zw   1/1     Running   0          4m21s

Now we need to set up the OpenStack Cloud Controller Manager so the cluster can communicate with the OpenStack API. First, edit the following values file:

cloudConfig:
  global:
    auth-url: <auth_url>
    application-credential-id: <application_credential_id>
    application-credential-secret: <application_credential_secret>
    region: <region_name>
  loadBalancer:
    floating-network-id: <floating_network_id> # You can get the floating network id from the command `openstack network list`
    floating-subnet-id: <floating_subnet_id> # You can get the floating subnet id from the command `openstack network show <floating_network_id>`

Now you can install the OpenStack Cloud Controller Manager by running the following commands:

helm repo add cpo https://kubernetes.github.io/cloud-provider-openstack
helm install -n openstack --create-namespace openstack-ccm cpo/openstack-cloud-controller-manager --values <values_file>
NAME: openstack-ccm
LAST DEPLOYED: Tue Jan 21 17:48:39 2025
NAMESPACE: openstack
STATUS: deployed
REVISION: 1
TEST SUITE: None

You can check the pods by running the following command:

kubectl get pods -n openstack 

It should output something like this:

NAME                                       READY   STATUS    RESTARTS   AGE
openstack-cloud-controller-manager-nbj8k   1/1     Running   0          3m18s

Now wait until your control plane has the expected number of nodes and they are all Ready; you can check them by running the following command:

kubectl get nodes
# Alternatively you can add the -w flag to watch the nodes

Once everything is done, your KubeadmControlPlane should report Initialized and API Server Available as true, and the replica counts should equal the number of control plane nodes:

kubectl get kubeadmcontrolplane

It should output something like this:

NAME           CLUSTER        INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE   VERSION
capi-example   capi-example   true          true                   3          3       3         0             35m   v1.29.6