Kubernetes Cluster API Provider Azure

Kubernetes-native declarative infrastructure for Azure.

What is the Cluster API Provider Azure

The Cluster API brings declarative, Kubernetes-style APIs to cluster creation, configuration and management.

The API itself is shared across multiple cloud providers allowing for true Azure hybrid deployments of Kubernetes.

Quick Start

Check out the Cluster API Quick Start to create your first Kubernetes cluster on Azure using Cluster API.

Flavors

See the flavors documentation to learn which cluster templates are provided by CAPZ.


Support Policy

This provider’s versions are compatible with the following versions of Cluster API:

  • Azure Provider v0.3.x → Cluster API v1alpha2 (v0.2.x)
  • Azure Provider v0.4.x → Cluster API v1alpha3 (v0.3.x)
  • Azure Provider v0.5.x → Cluster API v1alpha4 (v0.4.x)

This provider’s versions (Azure Provider v0.3.x, v0.4.x, and v0.5.x) are able to install and manage the following versions of Kubernetes, with each provider version supporting a subset of these releases:

  • Kubernetes 1.15
  • Kubernetes 1.16
  • Kubernetes 1.17
  • Kubernetes 1.18
  • Kubernetes 1.19
  • Kubernetes 1.20
  • Kubernetes 1.21
  • Kubernetes 1.22

Each version of Cluster API for Azure will attempt to support at least two Kubernetes versions, e.g., Cluster API for Azure v0.1 may support Kubernetes 1.13 and Kubernetes 1.14.

NOTE: As the versioning for this project is tied to the versioning of Cluster API, future modifications to this policy may be made to more closely align with other providers in the Cluster API ecosystem.


Documentation

Please see our Book for in-depth user documentation.

Additional docs can be found in the /docs directory, and the index is here.

Getting involved and contributing

Are you interested in contributing to cluster-api-provider-azure? We, the maintainers and community, would love your suggestions, contributions, and help! Also, the maintainers can be contacted at any time to learn more about how to get involved.

To set up your environment, check out the development guide.

In the interest of getting more new people involved, we tag issues with good first issue. These are typically issues that have smaller scope but are good ways to start to get acquainted with the codebase.

We also encourage ALL active community participants to act as if they are maintainers, even if you don’t have “official” write permissions. This is a community effort; we are here to serve the Kubernetes community. If you have an active interest and you want to get involved, you have real power! Don’t assume that the only people who can get things done around here are the “maintainers”.

We also would love to add more “official” maintainers, so show us what you can do!

This repository uses the Kubernetes bots. See a full list of the commands here.

Office hours

The community holds office hours every two weeks, with sessions open to all users and developers.

Office hours are hosted on a Zoom video chat every other Thursday at 08:00 (PT) / 11:00 (ET) / 16:00 (UTC), and are published on the Kubernetes community meetings calendar.

Other ways to communicate with the contributors

Please check in with us in the #cluster-api-azure channel on Slack.

Github issues

Bugs

If you think you have found a bug please follow the instructions below.

  • Please spend a small amount of time giving due diligence to the issue tracker. Your issue might be a duplicate.
  • Get the logs from the cluster controllers and paste them into your issue.
  • Open a bug report.
  • Remember users might be searching for your issue in the future, so please give it a meaningful title to help others.
  • Feel free to reach out to the cluster-api community on kubernetes slack.

Tracking new features

We also use the issue tracker to track features. If you have an idea for a feature, or think you can help Cluster API Provider Azure become even more awesome, then follow the steps below.

  • Open a feature request.
  • Remember users might be searching for your issue in the future, so please give it a meaningful title to help others.
  • Clearly define the use case, using concrete examples, e.g., “I type this and cluster-api-provider-azure does that.”
  • Some of our larger features will require some design. If you would like to include a technical design for your feature, please include it in the issue.
  • After the new feature is well understood and the design agreed upon, we can start coding the feature. We would love for you to code it, so please open up a WIP (work in progress) pull request. Happy coding!

Cluster API Azure Roadmap

This roadmap is a constant work in progress, subject to frequent revision. Dates are approximations. Features are listed in no particular order.

v0.5 (v1alpha4) ~ Q1 2021

Area        Description                                  Issue/Proposal
OS          Windows worker nodes                         #153
Identity    Multi-tenancy within one manager instance    #586
UX          Bootstrap failure detection                  #603
UX          Add tracing and metrics                      #311

v1beta1/v1 ~ TBD

Area        Description                                  Issue/Proposal
Network     Allow multiple subnets of role “node”        #664
Network     Azure Bastion hosts                          #165
Identity    AAD Support                                  #481

Backlog

Items within this category have been identified as potential candidates for the project and can be moved up into a milestone if there is enough interest.

Area            Description                                                     Issue/Proposal
Network         Peering of Cluster VNet to Existing VNet                        #532
OS              Flatcar Support                                                 #629
Compute         SGX-enabled VMs                                                 #488
Compute         Azure Dedicated hosts                                           #675
AKS             Integrate AzureMachine with AzureManagedControlPlane            #826
Cloud Provider  Use Out of Tree cloud-controller-manager and Storage Drivers    #715

Topics

This section contains information about enabling and configuring various Azure features with Cluster API Provider Azure.

Getting started with cluster-api-provider-azure

Prerequisites

Requirements

Setting up your Azure environment

An Azure Service Principal is needed for deploying Azure resources. The below instructions utilize environment-based authentication.

  1. Log in with the Azure CLI.
az login
  2. List your Azure subscriptions.
az account list -o table
  3. If more than one account is present, select the account that you want to use.
az account set -s <SubscriptionId>
  4. Save your Subscription ID in an environment variable.
export AZURE_SUBSCRIPTION_ID="<SubscriptionId>"
  5. Create an Azure Service Principal by running the following command, or skip this step and use a previously created Azure Service Principal. NOTE: the “owner” role is required to be able to create role assignments for system-assigned managed identity.
az ad sp create-for-rbac --role contributor
  6. Save the output from the above command in environment variables.
export AZURE_TENANT_ID="<Tenant>"
export AZURE_CLIENT_ID="<AppId>"
export AZURE_CLIENT_SECRET='<Password>'
export AZURE_LOCATION="eastus" # this should be an Azure region that your subscription has quota for.

Building your first cluster

Check out the Cluster API Quick Start to create your first Kubernetes cluster on Azure using Cluster API. Make sure to select the “Azure” tabs.

Documentation

Please see the CAPZ book for in-depth user documentation.

Troubleshooting Guide

Common issues users might run into when using Cluster API Provider for Azure. This list is work-in-progress. Feel free to open a PR to add to it if you find that useful information is missing.

Examples of troubleshooting real-world issues

No Azure resources are getting created

This is likely due to missing or invalid Azure credentials.

Check the CAPZ controller logs on the management cluster:

kubectl logs deploy/capz-controller-manager -n capz-system manager

If you see an error similar to this:

azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/123/providers/Microsoft.Compute/skus?%24filter=location+eq+%27eastus2%27&api-version=2019-04-01: StatusCode=401 -- Original Error: adal: Refresh request failed. Status Code = '401'. Response body: {\"error\":\"invalid_client\",\"error_description\":\"AADSTS7000215: Invalid client secret is provided.

Make sure the provided Service Principal client ID and client secret are correct and that the password has not expired.
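One quick way to verify the credentials outside of CAPZ is to try logging in with them directly (a minimal sketch, assuming the same environment variables set during environment setup):

az login --service-principal \
  --username "${AZURE_CLIENT_ID}" \
  --password "${AZURE_CLIENT_SECRET}" \
  --tenant "${AZURE_TENANT_ID}"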

The AzureCluster infrastructure is provisioned but no virtual machines are coming up

Your Azure subscription might have no quota for the requested VM size in the specified Azure location.

Check the CAPZ controller logs on the management cluster:

kubectl logs deploy/capz-controller-manager -n capz-system manager

If you see an error similar to this:

"error"="failed to reconcile AzureMachine: failed to create virtual machine: failed to create VM capz-md-0-qkg6m in resource group capz-fkl3tp: compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=\u003cnil\u003e Code=\"OperationNotAllowed\" Message=\"Operation could not be completed as it results in exceeding approved standardDSv3Family Cores quota.

Follow these steps to request a quota increase. Alternatively, you can specify another Azure location and/or VM size during cluster creation.

A virtual machine is running but the k8s node did not join the cluster

Check the AzureMachine (or AzureMachinePool if using a MachinePool) status:

kubectl get azuremachines -o wide

If you see an output like this:

NAME                                       READY   STATE
default-template-md-0-w78jt                false   Updating

This indicates that the bootstrap script has not yet succeeded. Check the AzureMachine status.conditions field for more information.
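For example, a minimal way to inspect those conditions (the machine name below is taken from the sample output above):

kubectl get azuremachine default-template-md-0-w78jt -o jsonpath='{.status.conditions}' | jq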

Take a look at the cloud-init logs for further debugging.

One or more control plane replicas are missing

Take a look at the KubeadmControlPlane controller logs and look for any potential errors:

kubectl logs deploy/capi-kubeadm-control-plane-controller-manager -n capi-kubeadm-control-plane-system manager

In addition, make sure all pods on the workload cluster are healthy, including pods in the kube-system namespace.

Nodes are in NotReady state

Make sure you have installed a CNI on the workload cluster and that all the pods on the workload cluster are in running state.
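A quick, hedged way to spot the problem is to list the nodes and any pods that are not running on the workload cluster:

kubectl get nodes -o wide
kubectl get pods -A --field-selector=status.phase!=Running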

Load Balancer service fails to come up

Check the cloud-controller-manager logs on the workload cluster.

If running the Azure cloud provider in-tree:

kubectl logs kube-controller-manager-<control-plane-node-name> -n kube-system 

If running the Azure cloud provider out-of-tree:

kubectl logs cloud-controller-manager -n kube-system 

Watching Kubernetes resources

To watch progression of all Cluster API resources on the management cluster you can run:

kubectl get cluster-api

Looking at controller logs

To check the CAPZ controller logs on the management cluster, run:

kubectl logs deploy/capz-controller-manager -n capz-system manager

Checking cloud-init logs (Ubuntu)

Cloud-init logs can provide more information on any issues that happened when running the bootstrap script.

Option 1: Using the Azure Portal

Located in the virtual machine blade, the boot diagnostics option is under the Support and Troubleshooting section in the Azure portal.

For more information, see here.

Option 2: Using the Azure CLI

az vm boot-diagnostics get-boot-log --name MyVirtualMachine --resource-group MyResourceGroup

For more information, see here.

Option 3: With SSH

Using the ssh information provided during cluster creation (environment variable AZURE_SSH_PUBLIC_KEY_B64):

# connect to the first control plane node - capi is the default linux user created by the deployment
API_SERVER=$(kubectl get azurecluster capz-cluster -o jsonpath='{.spec.controlPlaneEndpoint.host}')
ssh capi@${API_SERVER}
# list nodes
kubectl get azuremachines
NAME                               READY   STATE
capz-cluster-control-plane-2jprg   true    Succeeded
capz-cluster-control-plane-ck5wv   true    Succeeded
capz-cluster-control-plane-w4tv6   true    Succeeded
capz-cluster-md-0-s52wb            false   Failed
capz-cluster-md-0-w8xxw            true    Succeeded
# pick a node name from the output above:
node=$(kubectl get azuremachine capz-cluster-md-0-s52wb -o jsonpath='{.status.addresses[0].address}')
ssh -J capi@${API_SERVER} capi@${node}
# look at cloud-init logs

less /var/log/cloud-init-output.log

Automated log collection

As part of CI there is a log collection script which you can also leverage to pull all the logs for machines. It will dump logs to ${PWD}/_artifacts by default:

./hack/log/log-dump.sh

There are also some provided scripts that can help automate a few common tasks.

AAD Integration

CAPZ can be configured to use Azure Active Directory (AD) for user authentication. In this configuration, you can log into a CAPZ cluster using an Azure AD token. Cluster operators can also configure Kubernetes role-based access control (Kubernetes RBAC) based on a user’s identity or directory group membership.

Create Azure AD server component

Create the Azure AD application

export CLUSTER_NAME=my-aad-cluster
export AZURE_SERVER_APP_ID=$(az ad app create \
    --display-name "${CLUSTER_NAME}Server" \
    --identifier-uris "https://${CLUSTER_NAME}Server" \
    --query appId -o tsv)

Update the application group membership claims

az ad app update --id ${AZURE_SERVER_APP_ID} --set groupMembershipClaims=All

Create a service principal

az ad sp create --id ${AZURE_SERVER_APP_ID}

Create Azure AD client component

AZURE_CLIENT_APP_ID=$(az ad app create \
    --display-name "${CLUSTER_NAME}Client" \
    --native-app \
    --reply-urls "https://${CLUSTER_NAME}Client" \
    --query appId -o tsv)

Create a service principal

az ad sp create --id ${AZURE_CLIENT_APP_ID}

Grant the application API permissions

oAuthPermissionId=$(az ad app show --id ${AZURE_SERVER_APP_ID} --query "oauth2Permissions[0].id" -o tsv)
az ad app permission add --id ${AZURE_CLIENT_APP_ID} --api ${AZURE_SERVER_APP_ID} --api-permissions ${oAuthPermissionId}=Scope
az ad app permission grant --id ${AZURE_CLIENT_APP_ID} --api ${AZURE_SERVER_APP_ID}

Create the cluster

To deploy a cluster with support for AAD, use the aad flavor.

Make sure that AZURE_SERVER_APP_ID is set to the ID of the server AD application created above.
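For example, a minimal sketch of generating and applying the manifest with the aad flavor (the Kubernetes version and machine counts are placeholders; adjust them to your setup):

export AZURE_SERVER_APP_ID   # set to the server application ID created above
clusterctl generate cluster ${CLUSTER_NAME} \
  --flavor aad \
  --kubernetes-version v1.21.2 \
  --control-plane-machine-count=3 \
  --worker-machine-count=3 > aad-cluster.yaml
kubectl apply -f aad-cluster.yaml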

Get the admin kubeconfig

clusterctl get kubeconfig ${CLUSTER_NAME} > ./kubeconfig
export KUBECONFIG=./kubeconfig

Create Kubernetes RBAC binding

Get the user principal name (UPN) for the user currently logged in using the az ad signed-in-user show command. This user account is enabled for Azure AD integration in the next step:

az ad signed-in-user show --query objectId -o tsv

Create a YAML manifest my-azure-ad-binding.yaml:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-cluster-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: your_objectId

Create the ClusterRoleBinding using the kubectl apply command and specify the filename of your YAML manifest:

kubectl apply -f my-azure-ad-binding.yaml

Accessing the cluster

Install kubelogin

kubelogin is a client-go credential (exec) plugin implementing Azure authentication. Follow the setup instructions here.

Set the config user context

kubectl config set-credentials ad-user \
  --exec-command kubelogin \
  --exec-api-version=client.authentication.k8s.io/v1beta1 \
  --exec-arg=get-token \
  --exec-arg=--environment --exec-arg=$AZURE_ENVIRONMENT \
  --exec-arg=--server-id --exec-arg=$AZURE_SERVER_APP_ID \
  --exec-arg=--client-id --exec-arg=$AZURE_CLIENT_APP_ID \
  --exec-arg=--tenant-id --exec-arg=$AZURE_TENANT_ID
kubectl config set-context ${CLUSTER_NAME}-ad-user@${CLUSTER_NAME} --user ad-user --cluster ${CLUSTER_NAME}

To verify it works, run:

kubectl config use-context ${CLUSTER_NAME}-ad-user@${CLUSTER_NAME}
kubectl get pods -A

You will receive a sign in prompt to authenticate using Azure AD credentials using a web browser. After you’ve successfully authenticated, the kubectl command should display the pods in the CAPZ cluster.

Adding AAD Groups

To add a group to the admin role run:

AZURE_GROUP_OID=<Your Group ObjectID>
kubectl create clusterrolebinding aad-group-cluster-admin-binding --clusterrole=cluster-admin --group=${AZURE_GROUP_OID}

Adding users

To add another user, create an additional role binding for that user:

USER_OID=<Your User ObjectID or UserPrincipalName>
kubectl create clusterrolebinding aad-user-binding --clusterrole=cluster-admin --user ${USER_OID}

You can update the cluster role bindings to suit your needs for that user or group. See the default role bindings for more details, and the general guide to Kubernetes RBAC.

Known Limitations

  • The user must not be a member of more than 200 groups.

API Server Endpoint

This document describes how to configure your clusters’ api server load balancer and IP.

Load Balancer Type

CAPZ supports two load balancer types, Public and Internal. Public, which is also the default, means that your API Server Load Balancer will have a publicly accessible IP address. Internal, also known as a “private cluster”, means that the API Server endpoint will only be accessible from within the cluster’s virtual network (or peered VNets).

A Public cluster will have an Azure public load balancer load balancing internet traffic to the control plane nodes.

A Private cluster will have an Azure internal load balancer load balancing traffic inside the VNet to the control plane nodes.

For more information on Azure load balancing, see Load Balancer documentation.

Here is an example of configuring the API Server LB type:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: my-private-cluster
  namespace: default
spec:
  location: eastus
  networkSpec:
    apiServerLB:
      type: Internal

Private IP

When using an api server load balancer of type Internal, the default private IP address associated with that load balancer will be 10.0.0.100. If also specifying a custom virtual network, make sure you provide a private IP address that is in the range of your control plane subnet and not in use.

For example:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: my-private-cluster
  namespace: default
spec:
  location: eastus
  networkSpec:
    vnet:
      name: my-vnet
      cidrBlocks: 
        - 172.16.0.0/16
    subnets:
      - name: my-subnet-cp
        role: control-plane
        cidrBlocks: 
          - 172.16.0.0/24
      - name: my-subnet-node
        role: node
        cidrBlocks: 
          - 172.16.2.0/24
    apiServerLB:
      type: Internal
      frontendIPs:
        - name: lb-private-ip-frontend
          privateIP: 172.16.0.100

Public IP

When using an api server load balancer of type Public, a dynamic public IP address will be created, along with a unique FQDN.

You can also choose to provide your own public api server IP. To do so, specify the existing public IP as follows:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: my-cluster
  namespace: default
spec:
  location: eastus
  networkSpec:
    apiServerLB:
      type: Public
      frontendIPs:
        - name: lb-public-ip-frontend
          publicIP:
            name: my-public-ip
            dns: my-cluster-986b4408.eastus.cloudapp.azure.com

Note that dns is the FQDN associated to your public IP address (look for “DNS name” in the Azure Portal).

When you bring your own (BYO) API server IP, CAPZ does not manage its lifecycle, i.e. the IP will not get deleted as part of cluster deletion.
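For example, a hedged sketch of creating such a public IP up front with the Azure CLI (the resource group and names are placeholders):

az network public-ip create \
  --resource-group my-cluster-rg \
  --name my-public-ip \
  --sku Standard \
  --allocation-method Static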

Load Balancer SKU

At this time, CAPZ only supports Azure Standard Load Balancers. See SKU comparison for more information on Azure Load Balancer SKUs.

Configuring the Kubernetes Cloud Provider for Azure

The Azure cloud provider has a number of configuration options driven by a file on cluster nodes. This file canonically lives on a node at /etc/kubernetes/azure.json. The Azure cloud provider documentation details the configuration options exposed by this file.

CAPZ automatically generates this file based on user-provided values in AzureMachineTemplate and AzureMachine. All AzureMachines in the same MachineDeployment or control plane share a single cloud provider secret, while AzureMachines created individually have their own secret.

For AzureMachineTemplate and standalone AzureMachines, the generated secret will have the name “${RESOURCE}-azure-json”, where “${RESOURCE}” is the name of either the AzureMachineTemplate or AzureMachine. The secret will have two data fields: control-plane-azure.json and worker-node-azure.json, with the raw content for that file containing the control plane and worker node data respectively. When the secret ${RESOURCE}-azure-json already exists in the same namespace as an AzureCluster and does not have the label "${CLUSTER_NAME}": "owned", CAPZ will not generate the default described above. Instead it will directly use whatever the user provides in that secret.
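For example, a minimal sketch of inspecting the generated cloud provider config on the management cluster (the secret name assumes an AzureMachineTemplate called my-cluster-control-plane):

kubectl get secret my-cluster-control-plane-azure-json \
  -o jsonpath='{.data.control-plane-azure\.json}' | base64 -d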

Overriding Cloud Provider Config

While many of the cloud provider config values are inferred from the CAPZ infrastructure spec, there are other configuration parameters that cannot be inferred and hence default to the values set by the Azure cloud provider. To provide custom values for such configuration options through CAPZ, use spec.cloudProviderConfigOverrides in AzureCluster. The following example overrides the load balancer rate limit configuration:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: ${CLUSTER_NAME}
  namespace: default
spec:
  location: eastus
  networkSpec:
    vnet:
      name: ${CLUSTER_NAME}-vnet
  resourceGroup: cherry
  subscriptionID: ${AZURE_SUBSCRIPTION_ID}
  cloudProviderConfigOverrides:
    rateLimits:
      - name: "defaultRateLimit"
        config:
          cloudProviderRateLimit: true
          cloudProviderRateLimitBucket: 1
          cloudProviderRateLimitBucketWrite: 1
          cloudProviderRateLimitQPS: 1
          cloudProviderRateLimitQPSWrite: 1
      - name: "loadBalancerRateLimit"
        config:
          cloudProviderRateLimit: true
          cloudProviderRateLimitBucket: 2
          cloudProviderRateLimitBucketWrite: 2
          cloudProviderRateLimitQPS: 0
          cloudProviderRateLimitQPSWrite: 0

External Cloud Provider

To deploy a cluster using the external cloud provider, create a cluster configuration with the external cloud provider template.
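For example, a hedged sketch using clusterctl (this assumes the template is exposed as the external-cloud-provider flavor; adjust the flavor name and Kubernetes version to your setup):

clusterctl generate cluster ${CLUSTER_NAME} \
  --flavor external-cloud-provider \
  --kubernetes-version v1.21.2 > external-cloud-provider-cluster.yaml
kubectl apply -f external-cloud-provider-cluster.yaml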

After deploying the cluster, you should eventually see a set of pods like the following in a Running state:

kube-system   cloud-controller-manager                                            1/1     Running   0          41s
kube-system   cloud-node-manager-5pklx                                            1/1     Running   0          26s
kube-system   cloud-node-manager-hbbqt                                            1/1     Running   0          30s
kube-system   cloud-node-manager-mfsdg                                            1/1     Running   0          39s
kube-system   cloud-node-manager-qrz74                                            1/1     Running   0          24s

Storage Drivers

Azure File CSI Driver

To install the Azure File CSI driver please refer to the installation guide

Repository: https://github.com/kubernetes-sigs/azurefile-csi-driver

Azure Disk CSI Driver

To install the Azure Disk CSI driver please refer to the installation guide

Repository: https://github.com/kubernetes-sigs/azuredisk-csi-driver

Control Plane Outbound Load Balancer

This document describes how to configure your clusters’ control plane outbound load balancer.

Public Clusters

For public clusters, i.e. clusters with the API server load balancer type set to Public, CAPZ does not support adding a control plane outbound load balancer. This is because the API server load balancer already allows for outbound traffic in public clusters.

Private Clusters

For private clusters, i.e. clusters with the API server load balancer type set to Internal, CAPZ does not create a control plane outbound load balancer by default. To create a control plane outbound load balancer, include the controlPlaneOutboundLB section with the desired settings.

Here is an example of configuring a control plane outbound load balancer with 1 front end ip for a private cluster:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: my-private-cluster
  namespace: default
spec:
  location: eastus
  networkSpec:
    apiServerLB:
      type: Internal
    controlPlaneOutboundLB:
      frontendIPsCount: 1

Custom Private DNS Zone Name

It is possible to set the DNS zone name to a custom value by setting PrivateDNSZoneName in the NetworkSpec. By default the DNS zone name is ${CLUSTER_NAME}.capz.io.

This feature is enabled only if the apiServerLB.type is Internal.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: cluster-example
  namespace: default
spec:
  location: southcentralus
  networkSpec:
    privateDNSZoneName: "kubernetes.myzone.com"
    vnet:
      name: my-vnet
      cidrBlocks:
        - 10.0.0.0/16
    subnets:
      - name: my-subnet-cp
        role: control-plane
        cidrBlocks:
          - 10.0.1.0/24
      - name: my-subnet-node
        role: node
        cidrBlocks:
          - 10.0.2.0/24
    apiServerLB:
      type: Internal
      frontendIPs:
        - name: lb-private-ip-frontend
          privateIP: 10.0.1.100
  resourceGroup: cluster-example

Custom images

This document will help you get a CAPZ Kubernetes cluster up and running with your custom image.

Reference images

An image defines the operating system and Kubernetes components that will populate the disk of each node in your cluster.

By default, images offered by “capi” in the Azure Marketplace are used. You can list these reference images with this command:

az vm image list --publisher cncf-upstream --offer capi --all -o table

Note: These images are not updated for security fixes and it is recommended to always use the latest patch version for the Kubernetes version you wish to run. For production-like environments, and for more control over your nodes, it is highly recommended to build and use your own custom images.

Building a custom image

Cluster API uses the Kubernetes Image Builder tools. You should use the Azure images from that project as a starting point for your custom image.

The Image Builder Book explains how to build the images defined in that repository, with instructions for Azure CAPI Images in particular.

Operating system requirements

For your custom image to work with Cluster API, it must meet the operating system requirements of the bootstrap provider. For example, the default kubeadm bootstrap provider has a set of preflight checks that a VM is expected to pass before it can join the cluster.

Kubernetes version requirements

The reference images are each built to support a specific version of Kubernetes. When using your custom images based on them, take care to match the image to the version: field of the KubeadmControlPlane and MachineDeployment in the YAML template for your workload cluster.

Upgrading to a new Kubernetes release with custom images requires this preparation:

  • create a new custom image which supports the Kubernetes release version
  • copy the existing AzureMachineTemplate and change its image: section to reference the new custom image
  • create the new AzureMachineTemplate on the management cluster
  • modify the existing KubeadmControlPlane and MachineDeployment to reference the new AzureMachineTemplate and update the version: field to match

See Upgrading workload clusters for more details.
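As a rough sketch of the last step, you could patch a MachineDeployment to point at the new template and version (the resource names and version below are illustrative only):

kubectl patch machinedeployment my-cluster-md-0 --type merge -p \
  '{"spec":{"template":{"spec":{"version":"v1.22.1","infrastructureRef":{"name":"my-cluster-md-0-v1-22-1"}}}}}'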

Creating a cluster from a custom image

To use a custom image, it needs to be referenced in an image: section of your AzureMachineTemplate. See below for more specific examples.

Using Shared Image Gallery (Recommended)

To use an image from the Shared Image Gallery, fill in the resourceGroup, name, subscriptionID, gallery, and version fields:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachineTemplate
metadata:
  name: capz-shared-gallery-example
spec:
  template:
    spec:
      image:
        sharedGallery:
          resourceGroup: "cluster-api-images"
          name: "capi-1234567890"
          subscriptionID: "01234567-89ab-cdef-0123-4567890abcde"
          gallery: "ClusterAPI"
          version: "0.3.1234567890"

If you build Azure CAPI images with the make targets in Image Builder, these required values are printed after a successful build. For example:

$ make -C images/capi/ build-azure-sig-ubuntu-1804
# many minutes later...
==> sig-ubuntu-1804:
Build 'sig-ubuntu-1804' finished.

==> Builds finished. The artifacts of successful builds are:
--> sig-ubuntu-1804: Azure.ResourceManagement.VMImage:

OSType: Linux
ManagedImageResourceGroupName: cluster-api-images
ManagedImageName: capi-1234567890
ManagedImageId: /subscriptions/01234567-89ab-cdef-0123-4567890abcde/resourceGroups/cluster-api-images/providers/Microsoft.Compute/images/capi-1234567890
ManagedImageLocation: southcentralus
ManagedImageSharedImageGalleryId: /subscriptions/01234567-89ab-cdef-0123-4567890abcde/resourceGroups/cluster-api-images/providers/Microsoft.Compute/galleries/ClusterAPI/images/capi-ubuntu-1804/versions/0.3.1234567890

Please also see the replication recommendations for the Shared Image Gallery.

If the image you want to use is based on an image released by a third party publisher, such as Flatcar Linux by Kinvolk, then you need to specify the publisher, offer, and sku fields as well:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachineTemplate
metadata:
  name: capz-shared-gallery-example
spec:
  template:
    spec:
      image:
        sharedGallery:
          resourceGroup: "cluster-api-images"
          name: "capi-1234567890"
          subscriptionID: "01234567-89ab-cdef-0123-4567890abcde"
          gallery: "ClusterAPI"
          version: "0.3.1234567890"
          publisher: "kinvolk"
          offer: "flatcar-container-linux-free"
          sku: "stable"

This ensures that the API calls made to create Virtual Machines or Virtual Machine Scale Sets have the image Plan correctly set.

Using image ID

To use a managed image resource by ID, only the id field must be set:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachineTemplate
metadata:
  name: capz-image-id-example
spec:
  template:
    spec:
      image:
        id: "/subscriptions/01234567-89ab-cdef-0123-4567890abcde/resourceGroups/myResourceGroup/providers/Microsoft.Compute/images/myImage"

A managed image resource can be created from a Virtual Machine. Please refer to Azure documentation on creating a managed image for more detail.
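For reference, a hedged sketch of creating a managed image from an existing, generalized VM with the Azure CLI (the resource group and names are placeholders):

az image create \
  --resource-group cluster-api-images \
  --name myImage \
  --source myVirtualMachine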

Managed images support only 20 simultaneous deployments, so for most use cases Shared Image Gallery is recommended.

Using Azure Marketplace

To use an image from Azure Marketplace, populate the publisher, offer, sku, and version fields and, if this image is published by a third party publisher, set the thirdPartyImage flag to true so an image Plan can be generated for it. In the case of a third party image, you must accept the license terms with the Azure CLI before consuming it.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachineTemplate
metadata:
  name: capz-marketplace-example
spec:
  template:
    spec:
      image:
        marketplace:
          publisher: "example-publisher"
          offer: "example-offer"
          sku: "k8s-1dot18dot8-ubuntu-1804"
          version: "2020-07-25"
          thirdPartyImage: true
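For a third party image, accepting the license terms could look like the following sketch (the publisher, offer, and plan values are taken from the example above and are placeholders):

az vm image terms accept \
  --publisher example-publisher \
  --offer example-offer \
  --plan k8s-1dot18dot8-ubuntu-1804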

Data Disks

This document describes how to specify data disks to be provisioned and attached to VMs provisioned in Azure.

Azure Machine Data Disks

Azure Machines support optionally specifying a list of data disks to be attached to the virtual machine. Each data disk must have:

  • nameSuffix - the name suffix of the disk to be created. Each disk will be named <machineName>_<nameSuffix> to ensure uniqueness.
  • diskSizeGB - the disk size in GB.
  • managedDisk - (optional) the managed disk for a VM (see below)
  • lun - the logical unit number (see below)

Managed Disk Options

See Introduction to Azure managed disks for more information.

Disk LUN

The LUN specifies the logical unit number of the data disk, between 0 and 63. Its value is used to identify data disks within the VM and therefore must be unique for each data disk attached to a VM.

When adding data disks to a Linux VM, you may encounter errors if a disk does not exist at LUN 0. It is therefore recommended to ensure that the first data disk specified is always added at LUN 0.

See Attaching a disk to a Linux VM on Azure for more information.

IMPORTANT! The lun specified in the AzureMachine Spec must match the LUN used to refer to the device in Kubeadm diskSetup. See below for an example.

Ultra disk support for data disks

If you set StorageAccountType to UltraSSD_LRS in the managed disk options, ultra disk support will be enabled, provided the region and zone support the UltraSSDAvailable capability.

To check all available VM sizes in a given region and availability zone that support the UltraSSDAvailable capability, execute the following using the Azure CLI:

az vm list-skus -l <location> -z -s <VM-size>

See Ultra disk for ultra disk performance and GA scope.
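As an illustration, a minimal sketch of an AzureMachineTemplate data disk using UltraSSD_LRS (the template name and disk size are placeholders):

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachineTemplate
metadata:
  name: capz-ultra-disk-example
spec:
  template:
    spec:
      dataDisks:
        - nameSuffix: ultradisk
          diskSizeGB: 128
          managedDisk:
            storageAccountType: UltraSSD_LRS
          lun: 0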

Configuring partitions, file systems and mounts

KubeadmConfig makes it easy to partition, format, and mount your data disk so your Linux VM can use it. Use the diskSetup and mounts options to describe partitions, file systems and mounts.

You may refer to your device as /dev/disk/azure/scsi1/lun<i> where <i> is the LUN.

See cloud-init documentation for more information about cloud-init disk setup.

Example

The below example shows how to create and attach a custom data disk “my_disk” at LUN 1 for every control plane machine, in addition to the etcd data disk. NOTE: the same can be applied to worker machines.

kind: KubeadmControlPlane
apiVersion: controlplane.cluster.x-k8s.io/v1alpha4
metadata:
  name: "${CLUSTER_NAME}-control-plane"
spec:
    [...]
    diskSetup:
      partitions:
        - device: /dev/disk/azure/scsi1/lun0
          tableType: gpt
          layout: true
          overwrite: false
        - device: /dev/disk/azure/scsi1/lun1
          tableType: gpt
          layout: true
          overwrite: false
      filesystems:
        - label: etcd_disk
          filesystem: ext4
          device: /dev/disk/azure/scsi1/lun0
          extraOpts:
            - "-E"
            - "lazy_itable_init=1,lazy_journal_init=1"
        - label: ephemeral0
          filesystem: ext4
          device: ephemeral0.1
          replaceFS: ntfs
        - label: my_disk
          filesystem: ext4
          device: /dev/disk/azure/scsi1/lun1
    mounts:
      - - LABEL=etcd_disk
        - /var/lib/etcddisk
      - - LABEL=my_disk
        - /var/lib/mydir
---
kind: AzureMachineTemplate
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
metadata:
  name: "${CLUSTER_NAME}-control-plane"
spec:
  template:
    spec:
      [...]
      dataDisks:
        - nameSuffix: etcddisk
          diskSizeGB: 256
          managedDisk:
            storageAccountType: Standard_LRS
          lun: 0
        - nameSuffix: mydisk
          diskSizeGB: 128
          lun: 1

OS Disk

This document describes how to configure the OS disk for VMs provisioned in Azure.

Managed Disk Options

Storage Account Type

By default, Azure will pick the supported storage account type for your AzureMachine based on the specified VM size. If you’d like to specify a specific storage type, you can do so by specifying a storageAccountType:

        managedDisk:
          storageAccountType: Premium_LRS

Supported values are Premium_LRS, Standard_LRS, and StandardSSD_LRS. Note that UltraSSD_LRS can only be used with data disks; it cannot be used for the OS disk.

Also, note that not all Azure VM sizes support Premium storage. To learn more about which sizes are premium storage-compatible, see Sizes for virtual machines in Azure.

See Azure documentation on disk types to learn more about the different storage types.

See Introduction to Azure managed disks for more information on managed disks.

If the optional field diskSizeGB is not provided, it will default to 30GB.

Ephemeral OS

Ephemeral OS uses local VM storage for changes to the OS disk. Storage devices local to the VM host will not be bound by normal managed disk SKU limits. Instead they will always be capable of saturating the VM level limits. This can significantly improve performance on the OS disk. Ephemeral storage used for the OS will not persist between maintenance events and VM redeployments. This is ideal for stateless base OS disks, where any stateful data is kept elsewhere.

There are a few kinds of local storage devices available on Azure VMs. Each VM size will have a different combination. For example, some sizes support premium storage caching, some sizes have a temp disk while others do not, and some sizes have local nvme devices with direct access. Ephemeral OS uses the cache for the VM size, if one exists. Otherwise it will try to use the temp disk if the VM has one. These are the only supported options, and we do not expose the ability to manually choose between these two disks (the default behavior is typically most desirable). This corresponds to the placement property in the Azure Compute REST API.

See the Azure documentation for full details.

Azure Machine DiffDiskSettings

Azure Machines support optionally specifying a field called diffDiskSettings. This mirrors the Azure Compute REST API.

When diffDiskSettings.option is set to Local, ephemeral OS will be enabled. We use the API shape provided by compute directly as they expose other options, although this is the main one relevant at this time.

Known Limitations

Not all SKU sizes support ephemeral OS. CAPZ will query Azure’s resource SKUs API to check if the requested VM size supports ephemeral OS. If not, the azuremachine controller will log an event with the corresponding error on the AzureMachine object.
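You can also check this yourself before creating a machine; for example, a hedged sketch that lists the SKU capabilities for a given size and location (look for EphemeralOSDiskSupported in the output):

az vm list-skus -l eastus --size Standard_DS3_v2 --query "[].capabilities" -o json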

Example

The below example shows how to enable ephemeral OS for a machine template. For control plane nodes, we strongly recommend using etcd data disks to avoid data loss.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
  namespace: default
spec:
  template:
    spec:
      location: ${AZURE_LOCATION}
      osDisk:
        diffDiskSettings:
          option: Local
        diskSizeGB: 30
        managedDisk:
          storageAccountType: Standard_LRS
        osType: Linux
      sshPublicKey: ${AZURE_SSH_PUBLIC_KEY_B64:=""}
      vmSize: ${AZURE_NODE_MACHINE_TYPE}

Failure Domains

The Azure provider includes support for failure domains, introduced as part of v1alpha3.

Failure domains in Azure

A failure domain in the Azure provider maps to an availability zone within an Azure region. In Azure, an availability zone is a separate data center within a region that offers redundancy and separation from the other availability zones in that region.

To ensure a cluster (or any application) is resilient to failure, it is best to spread instances across all the availability zones within a region. If a zone goes down, your cluster will continue to run because the other zones are physically separate.

Full details of availability zones and regions can be found in the Azure docs.

How to use failure domains

Default Behaviour

By default, only control plane machines get automatically spread to all cluster zones. A workaround for spreading worker machines is to create N MachineDeployments for your N failure domains, scaling them independently. Resiliency to failures comes through having multiple MachineDeployments (see below).

apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachineDeployment
metadata:
  name: ${CLUSTER_NAME}-md-0
  namespace: default
spec:
  clusterName: ${CLUSTER_NAME}
  replicas: ${WORKER_MACHINE_COUNT}
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha4
          kind: KubeadmConfigTemplate
          name: ${CLUSTER_NAME}-md-0
      clusterName: ${CLUSTER_NAME}
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: AzureMachineTemplate
        name: ${CLUSTER_NAME}-md-0
      version: ${KUBERNETES_VERSION}
      failureDomain: "1"
---
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachineDeployment
metadata:
  name: ${CLUSTER_NAME}-md-1
  namespace: default
spec:
  clusterName: ${CLUSTER_NAME}
  replicas: ${WORKER_MACHINE_COUNT}
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha4
          kind: KubeadmConfigTemplate
          name: ${CLUSTER_NAME}-md-1
      clusterName: ${CLUSTER_NAME}
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: AzureMachineTemplate
        name: ${CLUSTER_NAME}-md-1
      version: ${KUBERNETES_VERSION}
      failureDomain: "2"
---
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachineDeployment
metadata:
  name: ${CLUSTER_NAME}-md-2
  namespace: default
spec:
  clusterName: ${CLUSTER_NAME}
  replicas: ${WORKER_MACHINE_COUNT}
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha4
          kind: KubeadmConfigTemplate
          name: ${CLUSTER_NAME}-md-2
      clusterName: ${CLUSTER_NAME}
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: AzureMachineTemplate
        name: ${CLUSTER_NAME}-md-2
      version: ${KUBERNETES_VERSION}
      failureDomain: "3"

The Cluster API controller will look for the FailureDomains status field and will set the FailureDomain field in a Machine if a value hasn’t already been explicitly set. It will try to ensure that the machines are spread across all the failure domains.

The AzureMachine controller looks for a failure domain (i.e. availability zone) to use from the Machine first, before falling back to the AzureMachine. This failure domain is then used when provisioning the virtual machine.
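To see which failure domains the controllers discovered for your cluster, you can inspect the Cluster status (a minimal sketch; the cluster name is a placeholder):

kubectl get cluster my-cluster -o jsonpath='{.status.failureDomains}' | jq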

Explicit Placement

If you would rather control the placement of virtual machines into a failure domain (i.e. availability zones) then you can explicitly state the failure domain. The best way is to specify this using the FailureDomain field within the Machine (or MachineDeployment) spec.

DEPRECATION NOTE: Failure domains were introduced in v1alpha3. Prior to this you might have used the AvailabilityZone on the AzureMachine. This has been deprecated in v1alpha3, and now removed in v1alpha4. Please update your definitions and use FailureDomain instead.

For example:

apiVersion: cluster.x-k8s.io/v1alpha4
kind: Machine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: my-cluster
    cluster.x-k8s.io/control-plane: "true"
  name: controlplane-0
  namespace: default
spec:
  version: "v1.21.2"
  clusterName: my-cluster
  failureDomain: "1"
  bootstrap:
    configRef:
        apiVersion: bootstrap.cluster.x-k8s.io/v1alpha4
        kind: KubeadmConfigTemplate
        name: my-cluster-md-0
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
    kind: AzureMachineTemplate
    name: my-cluster-md-0

Using Virtual Machine Scale Sets

You can use an AzureMachinePool object to deploy a Virtual Machine Scale Set which automatically distributes VM instances across the configured availability zones. Set the FailureDomains field to the list of availability zones that you want to use. Be aware that not all regions have the same availability zones. You can use az vm list-skus -l <location> --zone -o table to list all the available zones per vm size in that location/region.

apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachinePool
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: my-cluster
  name: ${CLUSTER_NAME}-vmss-0
  namespace: default
spec:
  clusterName: my-cluster
  failureDomains:
    - "1"
    - "3"
  replicas: 3
  template:
    spec:
      clusterName: my-cluster
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
          name: ${CLUSTER_NAME}-vmss-0
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AzureMachinePool
        name: ${CLUSTER_NAME}-vmss-0
      version: ${KUBERNETES_VERSION}
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AzureMachinePool
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: my-cluster
  name: ${CLUSTER_NAME}-vmss-0
  namespace: default
spec:
  location: westeurope
  template:
    osDisk:
      diskSizeGB: 30
      osType: Linux
    vmSize: Standard_D2s_v3

Availability sets when there are no failure domains

Although failure domains provide protection against datacenter failures, not all Azure regions support availability zones. In such cases, Azure availability sets can be used to provide redundancy and high availability.

When Cluster API detects that the region has no failure domains, it creates availability sets for different groups of virtual machines. The virtual machines, when created, are assigned an availability set based on the group they belong to.

The availability sets created are as follows:

  1. For control plane VMs, an availability set will be created and suffixed with the string “control-plane”.
  2. For worker node VMs, an availability set will be created for each machine deployment, and suffixed with the machine deployment name.

Consider the following cluster configuration:

apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  labels:
    cni: calico
  name: ${CLUSTER_NAME}
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: KubeadmControlPlane
    name: ${CLUSTER_NAME}-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AzureCluster
    name: ${CLUSTER_NAME}
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  name: ${CLUSTER_NAME}-md-0
  namespace: default
spec:
  clusterName: ${CLUSTER_NAME}
  replicas: ${WORKER_MACHINE_COUNT}
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
          name: ${CLUSTER_NAME}-md-0
      clusterName: ${CLUSTER_NAME}
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AzureMachineTemplate
        name: ${CLUSTER_NAME}-md-0
      version: ${KUBERNETES_VERSION}
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  name: ${CLUSTER_NAME}-md-1
  namespace: default
spec:
  clusterName: ${CLUSTER_NAME}
  replicas: ${WORKER_MACHINE_COUNT}
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
          name: ${CLUSTER_NAME}-md-1
      clusterName: ${CLUSTER_NAME}
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AzureMachineTemplate
        name: ${CLUSTER_NAME}-md-1
      version: ${KUBERNETES_VERSION}
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  name: ${CLUSTER_NAME}-md-2
  namespace: default
spec:
  clusterName: ${CLUSTER_NAME}
  replicas: ${WORKER_MACHINE_COUNT}
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
          name: ${CLUSTER_NAME}-md-2
      clusterName: ${CLUSTER_NAME}
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AzureMachineTemplate
        name: ${CLUSTER_NAME}-md-2
      version: ${KUBERNETES_VERSION}

In the example above, there will be 4 availability sets created, 1 for the control plane, and 1 for each of the 3 machine deployments.
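One hedged way to verify this is to list the availability sets in the cluster's resource group with the Azure CLI (this assumes the default resource group named after the cluster):

az vm availability-set list --resource-group ${CLUSTER_NAME} -o table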

Flannel

This document describes how to use Flannel as your CNI solution. By default, the CNI plugin is not installed for self-managed clusters, so you have to install your own (e.g. Calico with VXLAN).

Modify the Cluster resources

Before deploying the cluster, change the KubeadmControlPlane value at spec.kubeadmConfigSpec.clusterConfiguration.controllerManager.extraArgs.allocate-node-cidrs to "true":

apiVersion: controlplane.cluster.x-k8s.io/v1alpha4
kind: KubeadmControlPlane
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      controllerManager:
        extraArgs:
          allocate-node-cidrs: "true"

Modify Flannel config

NOTE: This is based off of the instructions at: https://github.com/flannel-io/flannel#deploying-flannel-manually

You need to make an adjustment to the default flannel configuration so that the CIDR inside your CAPZ cluster matches the Flannel Network CIDR.

View your capi-cluster.yaml and make note of the Cluster Network CIDR Block. For example:

apiVersion: cluster.x-k8s.io/v1alpha4
kind: Cluster
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16

Download the file at https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml and modify the kube-flannel-cfg ConfigMap. Set the data.net-conf.json.Network value to match your Cluster Network CIDR Block.

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Edit kube-flannel.yml and change this section so that the Network value matches your Cluster CIDR:

kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
data:
  net-conf.json: |
    {
      "Network": "192.168.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }

Apply kube-flannel.yml

kubectl apply -f kube-flannel.yml
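To confirm Flannel is running, a quick hedged check on the workload cluster:

kubectl get pods -A | grep flannel
kubectl get nodes   # nodes should move to Ready once the CNI is up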

GPU-enabled clusters

Overview

With CAPZ you can create GPU-enabled Kubernetes clusters on Microsoft Azure.

Before you begin, be aware that:

  • Scheduling GPUs is a Kubernetes beta feature
  • NVIDIA GPUs are supported on Azure NC-series, NV-series, and NVv3-series VMs
  • NVIDIA GPU Operator allows administrators of Kubernetes clusters to manage GPU nodes just like CPU nodes in the cluster.

To deploy a cluster with support for GPU nodes, use the nvidia-gpu flavor.

An example GPU cluster

Let’s create a CAPZ cluster with an N-series node and run a GPU-powered vector calculation.

Generate an nvidia-gpu cluster template

Use the clusterctl generate cluster command to generate a manifest that defines your GPU-enabled workload cluster.

Remember to use the nvidia-gpu flavor with N-series nodes.

AZURE_CONTROL_PLANE_MACHINE_TYPE=Standard_D2s_v3 \
AZURE_NODE_MACHINE_TYPE=Standard_NC6s_v3 \
AZURE_LOCATION=southcentralus \
clusterctl generate cluster azure-gpu \
  --kubernetes-version=v1.21.2 \
  --worker-machine-count=1 \
  --flavor=nvidia-gpu > azure-gpu-cluster.yaml

Create the cluster

Apply the manifest from the previous step to your management cluster to have CAPZ create a workload cluster:

$ kubectl apply -f azure-gpu-cluster.yaml --server-side
cluster.cluster.x-k8s.io/azure-gpu serverside-applied
azurecluster.infrastructure.cluster.x-k8s.io/azure-gpu serverside-applied
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/azure-gpu-control-plane serverside-applied
azuremachinetemplate.infrastructure.cluster.x-k8s.io/azure-gpu-control-plane serverside-applied
machinedeployment.cluster.x-k8s.io/azure-gpu-md-0 serverside-applied
azuremachinetemplate.infrastructure.cluster.x-k8s.io/azure-gpu-md-0 serverside-applied
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/azure-gpu-md-0 serverside-applied
clusterresourceset.addons.cluster.x-k8s.io/crs-gpu-operator serverside-applied
configmap/nvidia-clusterpolicy-crd serverside-applied
configmap/nvidia-gpu-operator-components serverside-applied
clusterresourceset.addons.cluster.x-k8s.io/azure-gpu-crs-0 serverside-applied

Wait until the cluster and nodes are finished provisioning. The GPU nodes may take several minutes to provision, since each one must install drivers and supporting software.

$ kubectl get cluster azure-gpu
NAME        PHASE
azure-gpu   Provisioned
$ kubectl get machines
NAME                             PROVIDERID                                                                                                                                     PHASE     VERSION
azure-gpu-control-plane-t94nm    azure:////subscriptions/<subscription_id>/resourceGroups/azure-gpu/providers/Microsoft.Compute/virtualMachines/azure-gpu-control-plane-nnb57   Running   v1.21.2
azure-gpu-md-0-f6b88dd78-vmkph   azure:////subscriptions/<subscription_id>/resourceGroups/azure-gpu/providers/Microsoft.Compute/virtualMachines/azure-gpu-md-0-gcc8v            Running   v1.21.2

Install a CNI of your choice. Once the nodes are Ready, run the following commands against the workload cluster to check if all the gpu-operator resources are installed:

$ clusterctl get kubeconfig azure-gpu > azure-gpu-cluster.conf
$ export KUBECONFIG=azure-gpu-cluster.conf
$ kubectl get pods | grep gpu-operator
default                  gpu-operator-1612821988-node-feature-discovery-master-664dnsmww   1/1     Running                 0          107m
default                  gpu-operator-1612821988-node-feature-discovery-worker-64mcz       1/1     Running                 0          107m
default                  gpu-operator-1612821988-node-feature-discovery-worker-h5rws       1/1     Running                 0          107m
$ kubectl get pods -n gpu-operator-resources
NAME                                       READY   STATUS      RESTARTS   AGE
gpu-feature-discovery-66d4f                1/1     Running     0          2s
nvidia-container-toolkit-daemonset-lxpkx   1/1     Running     0          3m11s
nvidia-dcgm-exporter-wwnsw                 1/1     Running     0          5s
nvidia-device-plugin-daemonset-lpdwz       1/1     Running     0          13s
nvidia-device-plugin-validation            0/1     Completed   0          10s
nvidia-driver-daemonset-w6lpb              1/1     Running     0          3m16s

Then run the following commands against the workload cluster to verify that the NVIDIA device plugin has initialized and the nvidia.com/gpu resource is available:

$ kubectl -n kube-system get po | grep nvidia
kube-system   nvidia-device-plugin-daemonset-d5dn6                    1/1     Running   0          16m
$ kubectl get nodes
NAME                            STATUS   ROLES    AGE   VERSION
azure-gpu-control-plane-nnb57   Ready    master   42m   v1.21.2
azure-gpu-md-0-gcc8v            Ready    <none>   38m   v1.21.2
$ kubectl get node azure-gpu-md-0-gcc8v -o jsonpath={.status.allocatable} | jq
{
  "attachable-volumes-azure-disk": "12",
  "cpu": "6",
  "ephemeral-storage": "119716326407",
  "hugepages-1Gi": "0",
  "hugepages-2Mi": "0",
  "memory": "115312060Ki",
  "nvidia.com/gpu": "1",
  "pods": "110"
}

Run a test app

Let’s create a pod manifest for the cuda-vector-add example from the Kubernetes documentation and deploy it:

$ cat > cuda-vector-add.yaml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
EOF
$ kubectl apply -f cuda-vector-add.yaml

The container will download, run, and perform a CUDA calculation with the GPU.

$ kubectl get po cuda-vector-add
cuda-vector-add   0/1     Completed   0          91s
$ kubectl logs cuda-vector-add
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

If you see output like the above, your GPU cluster is working!

How to use Identities with CAPZ

CAPZ controller:

This is the identity used by the management cluster to provision infrastructure in Azure

  • Multi-tenant config via AzureClusterIdentity

    • AAD Pod Identity using Service Principals and Managed Identities: by default, the identity used by the workload cluster running on Azure is the same Service Principal assigned to the management cluster. If an identity is specified on the Azure Cluster Resource, that identity will be used when creating Azure resources related to that cluster. See Multi-tenancy page for details.
  • Env config (deprecated)

    • Service Principal: A service principal is an identity in AAD which is described by a TenantID, ClientID, and ClientSecret. The set of these three values will enable the holder to exchange the values for a JWT token to communicate with Azure. The values are normally stored in a file or environment variables.
    • Configuration:
      • Scope: Subscription
      • Role: Contributor since the controller is responsible for creating resource groups and cluster resources within the group. To create a resource group within a subscription, one must have subscription contributor rights. Note, this role’s scope can be reduced to Resource Group Contributor if all resource groups are created prior to cluster creation.
      • If the workload clusters are going to use system-assigned managed identities, then the role here should be Owner so that the controller can create role assignments for the system-assigned managed identity (see the sketch after this list). More details are in the Azure built-in roles documentation.
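
If the controller identity does need Owner rights as described above, a hedged Azure CLI sketch follows; the Service Principal client ID and subscription ID are placeholders to substitute with your own values:

az role assignment create \
  --assignee <capz-service-principal-client-id> \
  --role "Owner" \
  --scope "/subscriptions/<subscription-id>"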

Azure Host Identity:

The identity assigned to the Azure host. On the control plane, it provides the identity to the Azure Cloud Provider, and it can be used on all nodes to provide access to Azure services during cloud-init, etc.

  • User-assigned Managed Identity
  • System-assigned Managed Identity
  • Service Principal
  • See details about each type in the VM identity page

Pod Identity:

The identity used by pods running within the workload cluster to provide access to Azure services during runtime. For example, to access blobs stored in Azure Storage or to access Azure database services.

  • AAD Pod Identity: The workload cluster requires an identity to communicate with Azure. This identity can be either a managed identity (in the form of a system-assigned identity or a user-assigned identity) or a service principal. The AAD Pod Identity pod allows the cluster to use the identity referenced by the AzureCluster to access cloud resources securely with Azure Active Directory.

User Stories

Story 1 - Locked down with Service Principal Per Subscription

Alex is an engineer in a large organization which has a strict Azure account architecture. This architecture dictates that Kubernetes clusters must be hosted in dedicated Subscriptions with AAD identity having RBAC rights to provision the infrastructure only in the Subscription. The workload clusters must run with a System Assigned machine identity. The organization has adopted Cluster API in order to manage Kubernetes infrastructure, and expects ‘management’ clusters running the Cluster API controllers to manage ‘workload’ clusters in dedicated Azure Subscriptions with an AAD account which only has access to that Subscription.

The current configuration exists:

  • Subscription for each cluster
  • AAD Service Principals with Subscription Owner rights for each Subscription
  • A management Kubernetes cluster running Cluster API Provider Azure controllers

Alex can provision a new workload cluster in the specified Subscription with the corresponding AAD Service Principal by creating new Cluster API resources in the management cluster. Each of the workload cluster machines would run as the System Assigned identity described in the Cluster API resources. The CAPZ controller in the management cluster uses the Service Principal credentials when reconciling the AzureCluster so that it can create/use/destroy resources in the workload cluster.

Story 2 - Locked down by Namespace and Subscription

Alex is an engineer in a large organization which has a strict Azure account architecture. This architecture dictates that Kubernetes clusters must be hosted in dedicated Subscriptions with AAD identity having RBAC rights to provision the infrastructure only in the Subscription. The workload clusters must run with a System Assigned machine identity.

Erin is a security engineer in the same company as Alex. Erin is responsible for provisioning identities. Erin will create a Service Principal for use by Alex to provision the infrastructure in Alex’s cluster. The identity Erin creates should only be able to be used in a predetermined Kubernetes namespace where Alex will define the workload cluster. The identity should not be able to be used by CAPZ to provision workload clusters in other namespaces.

The organization has adopted Cluster API in order to manage Kubernetes infrastructure, and expects ‘management’ clusters running the Cluster API controllers to manage ‘workload’ clusters in dedicated Azure Subscriptions with an AAD account which only has access to that Subscription.

The current configuration exists:

  • Subscription for each cluster
  • AAD Service Principals with Subscription Owner rights for each Subscription
  • A management Kubernetes cluster running Cluster API Provider Azure controllers

Alex can provision a new workload cluster in the specified Subscription with the corresponding AAD Service Principal by creating new Cluster API resources in the management cluster in the predetermined namespace. Each of the workload cluster machines would run as the System Assigned identity described in the Cluster API resources. The CAPZ controller in the management cluster uses the Service Principal credentials when reconciling the AzureCluster so that it can create/use/destroy resources in the workload cluster.

Erin can provision an identity in a namespace of limited access and define the allowed namespaces, which will include the predetermined namespace for the workload cluster.

Story 3 - Using an Azure User Assigned Identity

Erin is an engineer working in a large organization. Erin does not want to be responsible for ensuring Service Principal secrets are rotated on a regular basis. Erin would like to use an Azure User Assigned Identity to provision workload cluster infrastructure. The User Assigned Identity will have the RBAC rights needed to provision the infrastructure in Erin’s subscription.

The current configuration exists:

  • Subscription for the workload cluster
  • A User Assigned Identity with RBAC with Subscription Owner rights for the Subscription
  • A management Kubernetes cluster running Cluster API Provider Azure controllers

Erin can provision a new workload cluster in the specified Subscription with the Azure User Assigned Identity by creating new Cluster API resources in the management cluster. The CAPZ controller in the management cluster uses the User Assigned Identity credentials when reconciling the AzureCluster so that it can create/use/destroy resources in the workload cluster.

Story 4 - Legacy Behavior Preserved

Dascha is an engineer in a smaller, less strict organization with a few Azure accounts intended to build all infrastructure. There is a single Azure Subscription named ‘dev’, and Dascha wants to provision a new cluster in this Subscription. An existing Kubernetes cluster is already running the Cluster API operators and managing resources in the dev Subscription. Dascha can provision a new cluster by creating Cluster API resources in the existing cluster, omitting the ProvisionerIdentity field in the AzureCluster spec. The CAPZ operator will use the Azure credentials provided in its deployment template.

IPv6 clusters

Overview

CAPZ enables you to create IPv6 Kubernetes clusters on Microsoft Azure.

  • IPv6 support is available for Kubernetes version 1.18.0 and later on Azure.
  • IPv6 support is in beta as of Kubernetes version 1.18 in Kubernetes community.

To deploy a cluster using IPv6, use the ipv6 flavor template.
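
As a minimal sketch, the template can be generated with the ipv6 flavor; the cluster name and Kubernetes version below are illustrative and assume your Azure environment variables are already exported:

clusterctl generate cluster my-ipv6-cluster --kubernetes-version v1.21.2 --flavor ipv6 > my-ipv6-cluster.yaml
kubectl apply -f my-ipv6-cluster.yaml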

Things to try out after the cluster is created:

  • Nodes are Kubernetes version 1.18.0 or later
  • Nodes have an IPv6 Internal-IP
kubectl get nodes -o wide
NAME                         STATUS   ROLES    AGE   VERSION   INTERNAL-IP              EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
ipv6-0-control-plane-8xqgw   Ready    master   53m   v1.18.8   2001:1234:5678:9abc::4   <none>        Ubuntu 18.04.5 LTS   5.3.0-1034-azure   containerd://1.3.4
ipv6-0-control-plane-crpvf   Ready    master   49m   v1.18.8   2001:1234:5678:9abc::5   <none>        Ubuntu 18.04.5 LTS   5.3.0-1034-azure   containerd://1.3.4
ipv6-0-control-plane-nm5v9   Ready    master   46m   v1.18.8   2001:1234:5678:9abc::6   <none>        Ubuntu 18.04.5 LTS   5.3.0-1034-azure   containerd://1.3.4
ipv6-0-md-0-7k8vm            Ready    <none>   49m   v1.18.8   2001:1234:5678:9abd::5   <none>        Ubuntu 18.04.5 LTS   5.3.0-1034-azure   containerd://1.3.4
ipv6-0-md-0-mwfpt            Ready    <none>   50m   v1.18.8   2001:1234:5678:9abd::4   <none>        Ubuntu 18.04.5 LTS   5.3.0-1034-azure   containerd://1.3.4
  • Nodes have 2 internal IPs, one from each IP family. IPv6 clusters on Azure run on dual-stack hosts. The IPv6 is the primary IP.
kubectl get nodes ipv6-0-md-0-7k8vm -o go-template --template='{{range .status.addresses}}{{printf "%s: %s \n" .type .address}}{{end}}'
Hostname: ipv6-0-md-0-7k8vm
InternalIP: 2001:1234:5678:9abd::5
InternalIP: 10.1.0.5
  • Nodes have an IPv6 PodCIDR
kubectl get nodes ipv6-0-md-0-7k8vm -o go-template --template='{{.spec.podCIDR}}'
2001:1234:5678:9a40:200::/72
  • Pods have an IPv6 IP
kubectl get pods nginx-f89759699-h65lt -o go-template --template='{{.status.podIP}}'
2001:1234:5678:9a40:300::1f
  • Able to reach other pods in cluster using IPv6
# inside the nginx-pod
# ifconfig eth0
  eth0      Link encap:Ethernet  HWaddr 3E:DA:12:82:4C:C2
            inet6 addr: fe80::3cda:12ff:fe82:4cc2/64 Scope:Link
            inet6 addr: 2001:1234:5678:9a40:100::4/128 Scope:Global
            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
            RX packets:15 errors:0 dropped:0 overruns:0 frame:0
            TX packets:20 errors:0 dropped:1 overruns:0 carrier:0
            collisions:0 txqueuelen:0
            RX bytes:1562 (1.5 KiB)  TX bytes:1832 (1.7 KiB)
# ping 2001:1234:5678:9a40::2
PING 2001:1234:5678:9a40::2 (2001:1234:5678:9a40::2): 56 data bytes
64 bytes from 2001:1234:5678:9a40::2: seq=0 ttl=62 time=1.690 ms
64 bytes from 2001:1234:5678:9a40::2: seq=1 ttl=62 time=1.009 ms
64 bytes from 2001:1234:5678:9a40::2: seq=2 ttl=62 time=1.388 ms
64 bytes from 2001:1234:5678:9a40::2: seq=3 ttl=62 time=0.925 ms
  • Kubernetes services have IPv6 ClusterIP and ExternalIP
kubectl get svc
NAME            TYPE           CLUSTER-IP   EXTERNAL-IP           PORT(S)          AGE
kubernetes      ClusterIP      fd00::1      <none>                443/TCP          94m
nginx-service   LoadBalancer   fd00::4a12   2603:1030:805:2::b    80:32136/TCP     40m
  • Able to reach the workload on IPv6 ExternalIP

NOTE: this will only work if your ISP has IPv6 enabled. Alternatively, you can connect from an Azure VM with IPv6.

curl [2603:1030:805:2::b] -v
* Rebuilt URL to: [2603:1030:805:2::b]/
*   Trying 2603:1030:805:2::b...
* TCP_NODELAY set
* Connected to 2603:1030:805:2::b (2603:1030:805:2::b) port 80 (#0)
> GET / HTTP/1.1
> Host: [2603:1030:805:2::b]
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.17.0
< Date: Fri, 18 Sep 2020 23:07:12 GMT
< Content-Type: text/html
< Content-Length: 612
< Last-Modified: Tue, 21 May 2019 15:33:12 GMT
< Connection: keep-alive
< ETag: "5ce41a38-264"
< Accept-Ranges: bytes

Known Limitations

The reference ipv6 flavor takes care of most of these for you, but it is important to be aware of these if you decide to write your own IPv6 cluster template, or use a different bootstrap provider.

  • Kubernetes version needs to be 1.18+

  • etcd needs to listen on 127.0.0.1:2379 in addition to IPv6 IPs to resolve an issue with the etcd health check as the dial transport is only doing IPv4. This is done by modifying the listen-client-urls etcd arg in postKubeadmCommands as follows:

    - sed -i '\#--listen-client-urls#s#$#,https://127.0.0.1:2379#' /etc/kubernetes/manifests/etcd.yaml
  • The :53 port needs to be free on the host so coredns can use it. On Ubuntu 18.04, systemd-resolved listens on port :53 on the host and is used by default for DNS. This causes the coredns pods to crash with "bind: address already in use" on single-stack IPv6, since the coredns pods run on hostNetwork to leverage the host routes for DNS resolution. The port is freed by running the following commands in postKubeadmCommands:
    - echo "DNSStubListener=no" >> /etc/systemd/resolved.conf
    - mv /etc/resolv.conf /etc/resolv.conf.OLD && ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf
    - systemctl restart systemd-resolved
  • The coredns pod needs to run on the host network, so it can leverage host routes for the v4 network to do the DNS resolution. The workaround is to edit the coredns deployment and add hostNetwork: true:
kubectl patch deploy/coredns -n kube-system --type=merge -p '{"spec": {"template": {"spec":{"hostNetwork": true}}}}'
  • When using Calico CNI, the selected pod’s subnet should be part of your Azure virtual network IP range.

MachinePools

  • Feature status: Experimental
  • Feature gate: MachinePool=true

In Cluster API (CAPI) v1alpha2, users can create MachineDeployment, MachineSet or Machine custom resources. When you create a MachineDeployment or MachineSet, Cluster API components react and eventually Machine resources are created. Cluster API’s current architecture mandates that a Machine maps to a single machine (virtual or bare metal) with the provider being responsible for the management of the underlying machine’s infrastructure.

Nearly all infrastructure providers have a way for their users to manage a group of machines (virtual or bare metal) as a single entity. Each infrastructure provider offers their own unique features, but nearly all are concerned with managing availability, health, and configuration updates.

A MachinePool is similar to a MachineDeployment in that they both define configuration and policy for how a set of machines is managed. They both define a common configuration, the number of desired machine replicas, and a policy for updates. Both types also combine information from Kubernetes as well as the underlying provider infrastructure to give a view of the overall health of the machines in the set.

MachinePool diverges from MachineDeployment in that the MachineDeployment controller uses MachineSets to achieve the aforementioned desired number of machines and to orchestrate updates to the Machines in the managed set, while MachinePool delegates the responsibility of these concerns to an infrastructure provider specific resource such as AWS Auto Scale Groups, GCP Managed Instance Groups, and Azure Virtual Machine Scale Sets.

MachinePool is optional and doesn’t replace the need for MachineSet/Machine since not every infrastructure provider will have an abstraction for managing multiple machines (i.e. bare metal). Users may always opt to choose MachineSet/Machine when they don’t see additional value in MachinePool for their use case.

Source: MachinePool API Proposal

AzureMachinePool

Cluster API Provider Azure (CAPZ) has experimental support for MachinePool through the infrastructure type AzureMachinePool and AzureMachinePoolMachine. An AzureMachinePool corresponds to an Azure Virtual Machine Scale Set, which provides the cloud provider specific resource for orchestrating a group of Virtual Machines. The AzureMachinePoolMachine corresponds to a virtual machine instance within the Virtual Machine Scale Set.

Safe Rolling Upgrades and Delete Policy

AzureMachinePools provide the ability to safely deploy new versions of Kubernetes, or more generally, changes to the Virtual Machine Scale Set model, e.g., updating the OS image run by the virtual machines in the scale set. For example, if a cluster operator wanted to change the Kubernetes version of the MachinePool, they would update the Version field on the MachinePool, and AzureMachinePool would respond by rolling out the new OS image for the specified Kubernetes version to each of the virtual machines in the scale set, progressively cordoning, draining, and then replacing each machine. This enables AzureMachinePools to upgrade the underlying pool of virtual machines with minimal interruption to the workloads running on them.
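
As a hedged example (the MachinePool name capz-mp-0 and the target version are placeholders, and the target version must be one supported by your CAPI/CAPZ releases), the upgrade could be triggered with a patch like:

kubectl patch machinepool capz-mp-0 --type merge -p '{"spec":{"template":{"spec":{"version":"v1.22.2"}}}}'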

AzureMachinePools also provide the ability to specify the order of virtual machine deletion.

Describing the Deployment Strategy

Below we see a partially described AzureMachinePool. The strategy field describes the AzureMachinePoolDeploymentStrategy. At the time of writing this, there is only one strategy type, RollingUpdate, which provides the ability to specify delete policy, max surge, and max unavailable.

  • deletePolicy: provides three options for the order of deletion: Oldest, Newest, and Random
  • maxSurge: provides the ability to specify how many machines can be added in addition to the current replica count during an upgrade operation. This can be a percentage, or a fixed number.
  • maxUnavailable: provides the ability to specify how many machines can be unavailable at any time. This can be a percentage, or a fixed number.
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachinePool
metadata:
  name: capz-mp-0
spec:
  strategy:
    rollingUpdate:
      deletePolicy: Oldest
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate

AzureMachinePoolMachines

AzureMachinePoolMachine represents a virtual machine in the scale set. AzureMachinePoolMachines are created by the AzureMachinePool controller and are used to track the life cycle of a virtual machine in the scale set. When an AzureMachinePool is created, each virtual machine instance will be represented as an AzureMachinePoolMachine resource. A cluster operator can delete an AzureMachinePoolMachine resource if they would like to delete a specific virtual machine from the scale set, which is useful if one would like to manually control upgrades and rollouts through CAPZ.
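
For example (a sketch; the resource name is illustrative), you could list the AzureMachinePoolMachines in the management cluster and delete a specific one to have that instance removed from the scale set:

kubectl get azuremachinepoolmachines
kubectl delete azuremachinepoolmachine <azuremachinepoolmachine-name>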

Using clusterctl to deploy

To deploy a MachinePool / AzureMachinePool via clusterctl generate, there is a flavor for that.

Make sure to set up your Azure environment as described here.
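
Because MachinePool is gated behind a feature flag (see the feature gate above), you will likely also need to enable it before running clusterctl, for example:

export EXP_MACHINE_POOL=true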

clusterctl generate cluster my-cluster --kubernetes-version v1.21.2 --flavor machinepool > my-cluster.yaml

The template used for this flavor is located here.

Example MachinePool, AzureMachinePool and KubeadmConfig Resources

Below is an example of the resources needed to create a pool of Virtual Machines orchestrated with a Virtual Machine Scale Set.

---
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachinePool
metadata:
  name: capz-mp-0
spec:
  clusterName: capz
  replicas: 2
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha4
          kind: KubeadmConfig
          name: capz-mp-0
      clusterName: capz
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: AzureMachinePool
        name: capz-mp-0
      version: v1.21.2
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachinePool
metadata:
  name: capz-mp-0
spec:
  location: westus2
  strategy:
    rollingUpdate:
      deletePolicy: Oldest
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    osDisk:
      diskSizeGB: 30
      managedDisk:
        storageAccountType: Premium_LRS
      osType: Linux
    sshPublicKey: ${YOUR_SSH_PUB_KEY}
    vmSize: Standard_D2s_v3
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha4
kind: KubeadmConfig
metadata:
  name: capz-mp-0
spec:
  files:
  - content: |
      {
        "cloud": "AzurePublicCloud",
        "tenantId": "tenantID",
        "subscriptionId": "subscriptionID",
        "aadClientId": "clientID",
        "aadClientSecret": "secret",
        "resourceGroup": "capz",
        "securityGroupName": "capz-node-nsg",
        "location": "westus2",
        "vmType": "vmss",
        "vnetName": "capz-vnet",
        "vnetResourceGroup": "capz",
        "subnetName": "capz-node-subnet",
        "routeTableName": "capz-node-routetable",
        "loadBalancerSku": "Standard",
        "maximumLoadBalancerRuleCount": 250,
        "useManagedIdentityExtension": false,
        "useInstanceMetadata": true
      }
    owner: root:root
    path: /etc/kubernetes/azure.json
    permissions: "0644"
  joinConfiguration:
    nodeRegistration:
      kubeletExtraArgs:
        cloud-config: /etc/kubernetes/azure.json
        cloud-provider: azure
      name: '{{ ds.meta_data["local_hostname"] }}'

Managed Clusters (AKS)

  • Feature status: Experimental
  • Feature gate: AKS=true,MachinePool=true

Cluster API Provider Azure (CAPZ) experimentally supports managing Azure Kubernetes Service (AKS) clusters. CAPZ implements this with three custom resources:

  • AzureManagedControlPlane
  • AzureManagedCluster
  • AzureManagedMachinePool

The combination of AzureManagedControlPlane/AzureManagedCluster corresponds to provisioning an AKS cluster. AzureManagedMachinePool corresponds one-to-one with AKS node pools. This also means that creating an AzureManagedControlPlane requires at least one AzureManagedMachinePool with spec.mode System, since AKS expects at least one system pool at creation time. For more documentation on system node pools, refer to the AKS docs.

Deploy with clusterctl

A clusterctl flavor exists to deploy an AKS cluster with CAPZ. This flavor requires the following environment variables to be set before executing clusterctl.

# Kubernetes values
export CLUSTER_NAME="my-cluster"
export WORKER_MACHINE_COUNT=2
export KUBERNETES_VERSION="v1.19.6"

# Azure values
export AZURE_LOCATION="southcentralus"
export AZURE_RESOURCE_GROUP="${CLUSTER_NAME}"
# set AZURE_SUBSCRIPTION_ID to the GUID of your subscription
# this example uses an sdk authentication file and parses the subscriptionId with jq
# this file may be created using
#
# `az ad sp create-for-rbac --role Contributor --sdk-auth > sp.json`
#
# when logged in with a service principal, it's also available using
#
# `az account show --sdk-auth`
#
# Otherwise, you can set this value manually.
#
export AZURE_SUBSCRIPTION_ID="$(cat ~/sp.json | jq -r .subscriptionId | tr -d '\n')"
export AZURE_NODE_MACHINE_TYPE="Standard_D2s_v3"

Managed clusters also require the following feature flags set as environment variables:

export EXP_MACHINE_POOL=true
export EXP_AKS=true

Execute clusterctl to template the resources, then apply to a management cluster:

clusterctl init --infrastructure azure
clusterctl generate cluster ${CLUSTER_NAME} --kubernetes-version ${KUBERNETES_VERSION} --flavor aks > cluster.yaml

# assumes an existing management cluster
kubectl apply -f cluster.yaml

# check status of created resources
kubectl get cluster-api -o wide

Specification

We’ll walk through an example to view available options.

apiVersion: cluster.x-k8s.io/v1alpha4
kind: Cluster
metadata:
  name: my-cluster
spec:
  clusterNetwork:
    services:
      cidrBlocks:
      - 192.168.0.0/16
  controlPlaneRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
    kind: AzureManagedControlPlane
    name: my-cluster-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
    kind: AzureManagedCluster
    name: my-cluster
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureManagedControlPlane
metadata:
  name: my-cluster-control-plane
spec:
  location: southcentralus
  resourceGroup: foo-bar
  sshPublicKey: ${AZURE_SSH_PUBLIC_KEY_B64:=""}
  subscriptionID: fae7cc14-bfba-4471-9435-f945b42a16dd # fake uuid
  version: v1.20.5
  networkPolicy: azure # or calico
  networkPlugin: azure # or kubenet
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureManagedCluster
metadata:
  name: my-cluster
spec:
  subscriptionID: fae7cc14-bfba-4471-9435-f945b42a16dd # fake uuid
---
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachinePool
metadata:
  name: agentpool0
spec:
  clusterName: my-cluster
  replicas: 2
  template:
    spec:
      clusterName: my-cluster
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: AzureManagedMachinePool
        name: agentpool0
        namespace: default
      version: v1.20.5
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureManagedMachinePool
metadata:
  name: agentpool0
spec:
  mode: System
  osDiskSizeGB: 512
  sku: Standard_D2s_v3
---
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachinePool
metadata:
  name: agentpool1
spec:
  clusterName: my-cluster
  replicas: 2
  template:
    spec:
      clusterName: my-cluster
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: AzureManagedMachinePool
        name: agentpool1
        namespace: default
      version: v1.20.5
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureManagedMachinePool
metadata:
  name: agentpool1
spec:
  mode: User
  osDiskSizeGB: 1024
  sku: Standard_D2s_v4

The main features for configuration today are networkPolicy and networkPlugin. Other configuration values like subscriptionId and node machine type should be fairly clear from context.

option           available values
networkPlugin    azure, kubenet
networkPolicy    azure, calico

Multitenancy

Multitenancy for managed clusters can be configured by using the aks-multi-tenancy flavor. The steps for creating an Azure managed identity and mapping it to an AzureClusterIdentity are similar to the ones described here. The AzureClusterIdentity object is then mapped to a managed cluster through the identityRef field in AzureManagedControlPlane.spec. The following is an example configuration:

apiVersion: cluster.x-k8s.io/v1alpha4
kind: Cluster
metadata:
  name: ${CLUSTER_NAME}
  namespace: default
spec:
  clusterNetwork:
    services:
      cidrBlocks:
      - 192.168.0.0/16
  controlPlaneRef:
    apiVersion: exp.infrastructure.cluster.x-k8s.io/v1alpha4
    kind: AzureManagedControlPlane
    name: ${CLUSTER_NAME}
  infrastructureRef:
    apiVersion: exp.infrastructure.cluster.x-k8s.io/v1alpha4
    kind: AzureManagedCluster
    name: ${CLUSTER_NAME}
---
apiVersion: exp.infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureManagedControlPlane
metadata:
  name: ${CLUSTER_NAME}
  namespace: default
spec:
  defaultPoolRef:
    name: agentpool0
  identityRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
    kind: AzureClusterIdentity
    name: ${CLUSTER_IDENTITY_NAME}
    namespace: ${CLUSTER_IDENTITY_NAMESPACE}
  location: ${AZURE_LOCATION}
  resourceGroupName: ${AZURE_RESOURCE_GROUP:=${CLUSTER_NAME}}
  sshPublicKey: ${AZURE_SSH_PUBLIC_KEY_B64:=""}
  subscriptionID: ${AZURE_SUBSCRIPTION_ID}
  version: ${KUBERNETES_VERSION}
---

Features

AKS clusters deployed from CAPZ currently only support a limited, “blessed” configuration. This was primarily done to keep the initial implementation simple. If you’d like to run a managed AKS cluster with CAPZ and need an additional feature, please open a pull request or issue with details. We’re happy to help!

Current limitations

  • DNS IP is hardcoded to x.x.x.10 inside the service CIDR.
    • primarily due to lack of validation, see https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/612
  • Only supports system managed identities.
    • We would like to support user managed identities where appropriate.
  • Only supports Standard load balancer (SLB).
    • We will not support Basic load balancer in CAPZ. SLB is generally the path forward in Azure.

Troubleshooting

If a user tries to delete the MachinePool which refers to the last system node pool, the AzureManagedMachinePool webhook will reject the deletion, so the deletion timestamp never gets set on the AzureManagedMachinePool. However, the timestamp will be set on the MachinePool, which will remain in a deleting state. To recover from this state, create a new MachinePool manually referencing the AzureManagedMachinePool, then edit the required references and finalizers to link the new MachinePool to the AzureManagedMachinePool. In the AzureManagedMachinePool, remove the owner reference to the old MachinePool and point it to the new MachinePool. Once the new MachinePool is pointing to the AzureManagedMachinePool, you can delete the old MachinePool by removing its finalizers.

Here is an Example:

# MachinePool deleted 
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachinePool
metadata:
  finalizers:             # remove finalizers once new object is pointing to the AzureManagedMachinePool
  - machinepool.cluster.x-k8s.io
  labels:
    cluster.x-k8s.io/cluster-name: capz-managed-aks
  name: agentpool0
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha4
    kind: Cluster
    name: capz-managed-aks
    uid: 152ecf45-0a02-4635-987c-1ebb89055fa2
  uid: ae4a235a-f0fa-4252-928a-0e3b4c61dbea
spec:
  clusterName: capz-managed-aks
  minReadySeconds: 0
  providerIDList:
  - azure:///subscriptions/9107f2fb-e486-a434-a948-52e2929b6f18/resourceGroups/MC_rg_capz-managed-aks_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool0-10226072-vmss/virtualMachines/0
  replicas: 1
  template:
    metadata: {}
    spec:
      bootstrap:
        dataSecretName: ""
      clusterName: capz-managed-aks
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: AzureManagedMachinePool
        name: agentpool0
        namespace: default
      version: v1.20.5

---
# New Machinepool
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachinePool
metadata:
  finalizers:
  - machinepool.cluster.x-k8s.io
  generation: 2
  labels:
    cluster.x-k8s.io/cluster-name: capz-managed-aks
  name: agentpool2    # change the name of the machinepool
  namespace: default 
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha4
    kind: Cluster
    name: capz-managed-aks
    uid: 152ecf45-0a02-4635-987c-1ebb89055fa2   
  # uid: ae4a235a-f0fa-4252-928a-0e3b4c61dbea     # remove the uid set for machinepool
spec:
  clusterName: capz-managed-aks
  minReadySeconds: 0
  providerIDList:
  - azure:///subscriptions/9107f2fb-e486-a434-a948-52e2929b6f18/resourceGroups/MC_rg_capz-managed-aks_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool0-10226072-vmss/virtualMachines/0  
  replicas: 1
  template:
    metadata: {}
    spec:
      bootstrap:
        dataSecretName: ""
      clusterName: capz-managed-aks
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: AzureManagedMachinePool
        name: agentpool0
        namespace: default
      version: v1.20.5

Multi-tenancy

To enable single controller multi-tenancy, a different Identity can be added to the Azure Cluster that will be used as the Azure Identity when creating Azure resources related to that cluster.

This is achieved using the aad-pod-identity library.

Service Principal Identity

Once a new SP Identity is created in Azure, the corresponding values should be used to create an AzureClusterIdentity resource:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureClusterIdentity
metadata:
  name: example-identity
  namespace: default
spec:
  type: ServicePrincipal
  tenantID: <azure-tenant-id>
  clientID: <client-id-of-SP-identity>
  clientSecret: {"name":"<secret-name-for-client-password>","namespace":"default"}
  allowedNamespaces: 
    list:
    - <cluster-namespace>

The password will need to be added in a secret similar to the following example:

apiVersion: v1
kind: Secret
metadata:
  name: <secret-name-for-client-password>
type: Opaque
data:
  clientSecret: <client-secret-of-SP-identity>

Alternatively, the password can also be added as a certificate:

apiVersion: v1
kind: Secret
metadata:
  name: <secret-name-for-client-password>
type: Opaque
data:
  certificate: CERTIFICATE
  password: PASSWORD

allowedNamespaces

AllowedNamespaces is used to identify the namespaces that clusters are allowed to use the identity from. Namespaces can be selected either with a list of namespaces or with a label selector. An empty allowedNamespaces object indicates that AzureClusters can use this identity from any namespace. If this object is nil, no namespaces are allowed (this is the default behaviour when the field is not provided). To use the identity, a namespace must either be in the NamespaceList or match the Selector. Please note that NamespaceList takes precedence over Selector if both are set.
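
As an illustrative sketch of the selector form, the following AzureClusterIdentity would be usable from any namespace carrying the given label; the label key and value are placeholders:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureClusterIdentity
metadata:
  name: example-identity
  namespace: default
spec:
  type: ServicePrincipal
  tenantID: <azure-tenant-id>
  clientID: <client-id-of-SP-identity>
  clientSecret: {"name":"<secret-name-for-client-password>","namespace":"default"}
  allowedNamespaces:
    selector:
      matchLabels:
        purpose: capz-clusters   # placeholder label; namespaces with this label may use the identity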

IdentityRef in AzureCluster

The identity can be added to an AzureCluster by using the identityRef field:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: example-cluster
  namespace: default
spec:
  location: eastus
  networkSpec:
    vnet:
      name: example-cluster-vnet
  resourceGroup: example-cluster
  subscriptionID: <AZURE_SUBSCRIPTION_ID>
  identityRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
    kind: AzureClusterIdentity
    name: <name-of-identity>
    namespace: <namespace-of-identity>

For more details on how aad-pod-identity works, please check the guide here.

User Assigned Identity

User-assigned identities will be supported in a future release.

Node Outbound

This document describes how to configure your clusters’ node outbound traffic.

Node Outbound Load Balancer

Public Clusters

For public clusters, i.e. clusters with the API server load balancer type set to Public, CAPZ automatically configures a node outbound load balancer with the default settings.

To provide custom settings for the node outbound load balancer, use the nodeOutboundLB section in cluster configuration.

The idleTimeoutInMinutes specifies the number of minutes to keep a TCP connection open for the outbound rule (defaults to 4). See here for more details.

Here is an example of a node outbound load balancer with frontendIPsCount set to 3. CAPZ will read this value and create 3 front end ips for this load balancer.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: my-public-cluster
  namespace: default
spec:
  location: eastus
  networkSpec:
    apiServerLB:
      type: Public
    nodeOutboundLB:
      frontendIPsCount: 3
      idleTimeoutInMinutes: 4

Private Clusters

For private clusters, i.e. clusters with the API server load balancer type set to Internal, CAPZ does not create a node outbound load balancer by default. To create a node outbound load balancer, include the nodeOutboundLB section with the desired settings.

Here is an example of configuring a node outbound load balancer with 1 front end ip for a private cluster:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: my-private-cluster
  namespace: default
spec:
  location: eastus
  networkSpec:
    apiServerLB:
      type: Internal
    nodeOutboundLB:
      frontendIPsCount: 1

Node Outbound Nat Gateway

You can configure a Nat Gateway in a subnet to enable outbound traffic in the cluster nodes by setting the Nat Gateway’s name in the subnet configuration. A Public IP will also be created for the Nat Gateway.

Using this configuration, a load balancer for the nodes’ outbound traffic won’t be created.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: cluster-natgw
  namespace: default
spec:
  location: southcentralus
  networkSpec:
    vnet:
      name: my-vnet
    subnets:
      - name: subnet-cp
        role: control-plane
      - name: subnet-node
        role: node
        natGateway:
          name: node-natgw
          NatGatewayIP:
            name: pip-cluster-natgw-subnet-node-natgw
  resourceGroup: cluster-natgw

You can also define the Public IP name that should be used when creating the Public IP for the Nat Gateway. If you don’t specify it, CAPZ will automatically generate a name for it.

Spot Virtual Machines

Azure Spot Virtual Machines allow users to reduce the costs of their compute resources by utilising Azure’s spare capacity for a lower price.

With this lower cost, comes the risk of preemption. When capacity within a particular Availability Zone is increased, Azure may need to reclaim Spot Virtual Machines to satisfy the demand on their data centres.

When should I use Spot Virtual Machines?

Spot Virtual Machines are ideal for workloads that can be interrupted. For example, short jobs or stateless services that can be rescheduled quickly, without data loss, and resume operation with limited degradation to a service.

How do I use Spot Virtual Machines?

To enable a Machine to be backed by a Spot Virtual Machine, add spotVMOptions to your AzureMachineTemplate:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachineTemplate
metadata:
  name: capz-md-0
spec:
  location: westus2
  template:
    osDisk:
      diskSizeGB: 30
      managedDisk:
        storageAccountType: Premium_LRS
      osType: Linux
    sshPublicKey: ${YOUR_SSH_PUB_KEY}
    vmSize: Standard_D2s_v3
    spotVMOptions: {}

You may also add a maxPrice to the options to limit the maximum spend for the instance. It is, however, recommended not to set a maxPrice: if this field is left empty, Azure will cap your spending at the on-demand price and you will experience fewer interruptions.

spec:
  template:
    spotVMOptions:
      maxPrice: 0.04 # Price in USD per hour (up to 5 decimal places)

The experimental MachinePool also supports using spot instances. To enable a MachinePool to be backed by spot instances, add spotVMOptions to your AzureMachinePool spec:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachinePool
metadata:
  name: capz-mp-0
spec:
  location: westus2
  template:
    osDisk:
      diskSizeGB: 30
      managedDisk:
        storageAccountType: Premium_LRS
      osType: Linux
    sshPublicKey: ${YOUR_SSH_PUB_KEY}
    vmSize: Standard_D2s_v3
    spotVMOptions: {}

Custom Virtual Networks

Pre-existing vnet and subnets

To deploy a cluster using a pre-existing vnet, modify the AzureCluster spec to include the name and resource group of the existing vnet, as well as the control plane and node subnets, as follows:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: cluster-byo-vnet
  namespace: default
spec:
  location: southcentralus
  networkSpec:
    vnet:
      resourceGroup: custom-vnet
      name: my-vnet
    subnets:
      - name: control-plane-subnet
        role: control-plane
      - name: node-subnet
        role: node
  resourceGroup: cluster-byo-vnet

When providing a vnet, it is required to also provide the two subnets that should be used for control planes and nodes.

If providing an existing vnet and subnets with existing network security groups, make sure that the control plane security group allows inbound to port 6443, as port 6443 is used by kubeadm to bootstrap the control planes. Alternatively, you can provide a custom control plane endpoint in the KubeadmConfig spec.
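
As a hedged sketch using the Azure CLI, an inbound rule allowing the API server port could be added to an existing control plane NSG; the resource group, NSG name, and priority are placeholders to adapt to your environment:

az network nsg rule create \
  --resource-group custom-vnet \
  --nsg-name my-control-plane-nsg \
  --name allow-apiserver \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --destination-port-ranges 6443 \
  --priority 2201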

The pre-existing vnet can be in the same resource group or a different resource group in the same subscription as the target cluster. When deleting the AzureCluster, the vnet and resource group will only be deleted if they are “managed” by capz, i.e. they were created during cluster deployment. Pre-existing vnets and resource groups will not be deleted.

Custom Network Spec

It is also possible to customize the vnet to be created without providing an already existing vnet. To do so, simply modify the AzureCluster NetworkSpec as desired. Here is an illustrative example of a cluster with a customized vnet address space (CIDR) and customized subnets:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: cluster-example
  namespace: default
spec:
  location: southcentralus
  networkSpec:
    vnet:
      name: my-vnet
      cidrBlocks: 
        - 10.0.0.0/16
    subnets:
      - name: my-subnet-cp
        role: control-plane
        cidrBlocks: 
          - 10.0.1.0/24
      - name: my-subnet-node
        role: node
        cidrBlocks: 
          - 10.0.2.0/24
  resourceGroup: cluster-example

If no CIDR block is provided, 10.0.0.0/8 will be used by default, with default internal LB private IP 10.0.0.100.

Whenever using custom vnet and subnet names and/or a different vnet resource group, please make sure to update the azure.json content in both the nodes’ and control planes’ kubeadmConfigSpec accordingly before creating the cluster (see the illustrative excerpt below).
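
For illustration only, and assuming the custom network spec above, the azure.json keys most likely to need updating would be along these lines (other keys omitted; the security group and route table names are placeholders):

  "vnetName": "my-vnet",
  "vnetResourceGroup": "cluster-example",
  "subnetName": "my-subnet-node",
  "securityGroupName": "<your-node-nsg-name>",
  "routeTableName": "<your-node-routetable-name>"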

Custom Security Rules

Security rules can also be customized as part of the subnet specification in a custom network spec. Note that ingress rules for the Kubernetes API server port (default 6443) and SSH (22) are automatically added to the control plane subnet only if security rules aren’t specified. If using custom rules, it is the responsibility of the user to supply those rules themselves.

Here is an illustrative example of customizing rules that builds on the one above by adding an egress rule to the control plane nodes:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: cluster-example
  namespace: default
spec:
  location: southcentralus
  networkSpec:
    vnet:
      name: my-vnet
      cidrBlocks: 
        - 10.0.0.0/16
    subnets:
      - name: my-subnet-cp
        role: control-plane
        cidrBlocks: 
          - 10.0.1.0/24
        securityGroup:
          name: my-subnet-cp-nsg
          securityRules:
            - name: "allow_ssh"
              description: "allow SSH"
              direction: "Inbound"
              priority: 2200
              protocol: "*"
              destination: "*"
              destinationPorts: "22"
              source: "*"
              sourcePorts: "*"
            - name: "allow_apiserver"
              description: "Allow K8s API Server"
              direction: "Inbound"
              priority: 2201
              protocol: "*"
              destination: "*"
              destinationPorts: "6443"
              source: "*"
              sourcePorts: "*"
            - name: "allow_port_50000"
              description: "allow port 50000"
              direction: "Outbound"
              priority: 2202
              protocol: "Tcp"
              destination: "*"
              destinationPorts: "50000"
              source: "*"
              sourcePorts: "*"
      - name: my-subnet-node
        role: node
        cidrBlocks: 
          - 10.0.2.0/24
  resourceGroup: cluster-example

Custom subnets

Sometimes it’s desirable to use different subnets for different node pools. Several subnets can be specified in the networkSpec to be later referenced by name from other CRs like AzureMachine or AzureMachinePool. When more than one node subnet is specified, the subnetName field in those other CRs becomes mandatory, because otherwise the controllers wouldn’t know which subnet to use.

The subnet used for the control plane must use the role control-plane while the subnets for the worker nodes must use the role node.

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: cluster-example
  namespace: default
spec:
  location: southcentralus
  networkSpec:
    subnets:
    - name: control-plane-subnet
      role: control-plane
    - name: subnet-mp-1
      role: node
    - name: subnet-mp-2
      role: node
    vnet:
      name: my-vnet
      cidrBlocks:
        - 10.0.0.0/16
  resourceGroup: cluster-example
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachinePool
metadata:
  name: mp1
  namespace: default
spec:
  location: southcentralus
  strategy:
    rollingUpdate:
      deletePolicy: Oldest
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    osDisk:
      diskSizeGB: 30
      managedDisk:
        storageAccountType: Premium_LRS
      osType: Linux
    sshPublicKey: ${YOUR_SSH_PUB_KEY}
    subnetName: subnet-mp-1
    vmSize: Standard_D2s_v3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachinePool
metadata:
  name: mp2
  namespace: default
spec:
  location: southcentralus
  strategy:
    rollingUpdate:
      deletePolicy: Oldest
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    osDisk:
      diskSizeGB: 30
      managedDisk:
        storageAccountType: Premium_LRS
      osType: Linux
    sshPublicKey: ${YOUR_SSH_PUB_KEY}
    subnetName: subnet-mp-2
    vmSize: Standard_D2s_v3

If you don’t specify any node subnets, one subnet with role node will be created and added to the networkSpec definition.

VM Identity

This document describes the available identities that can be configured on the Azure host. For example, this is what grants permissions to the Azure Cloud Provider to provision LB services in Azure on the control plane nodes.

Flavors of Identities in Azure

All identities used in Azure are owned by Azure Active Directory (AAD). An identity, or principal, in AAD will provide the basis for each of the flavors of identities we will describe.

Managed Identities

Managed identity is a feature of Azure Active Directory (AAD) and Azure Resource Manager (ARM) which assigns ARM Role-Based Access Control (RBAC) rights to AAD identities for use in Azure resources, like Virtual Machines. Each of the Azure services that support managed identities for Azure resources is subject to its own timeline. Make sure you review the availability status of managed identities for your resource and known issues before you begin.

Managed identity is used to create nodes which have an AAD identity provisioned onto the node by Azure Resource Manager (the Azure control plane) rather than providing credentials in the azure.json file. Managed identities are the preferred way to provide RBAC rights for a given resource in Azure as the lifespan of the identity is linked to the lifespan of the resource.

User-assigned managed identity (recommended)

A standalone Azure resource that is created by the user outside of the scope of this provider. The identity can be assigned to one or more Azure Machines. The lifecycle of a user-assigned identity is managed separately from the lifecycle of the Azure Machines to which it’s assigned.

This lifecycle allows you to separate your resource creation and identity administration responsibilities. User-assigned identities and their role assignments can be configured in advance of the resources that require them. Users who create the resources only require the access to assign a user-assigned identity, without the need to create new identities or role assignments.

Full details on how to create and manage user assigned identities using Azure CLI can be found in the Azure docs.
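
As a brief sketch (the resource group and identity names are placeholders), a user-assigned identity can be created and its ARM resource ID read back with the Azure CLI:

az identity create --resource-group my-identities-rg --name my-capz-node-identity
az identity show --resource-group my-identities-rg --name my-capz-node-identity --query id -o tsv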

System-assigned managed identity

A system-assigned identity is a managed identity which is tied to the lifespan of a resource in Azure. The identity is created by Azure in AAD for the resource it is applied upon and reaped when the resource is deleted. Unlike a service principal, a system assigned identity is available on the local resource through a local port service via the instance metadata service.

⚠️ When a Node is created with a System Assigned Identity, a role of Subscription Contributor is added to this generated identity.

How to use managed identity

User-assigned

  • In Machines
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
  namespace: default
spec:
  template:
    spec:
      identity: UserAssigned
      userAssignedIdentities:
      - providerID: ${USER_ASSIGNED_IDENTITY_PROVIDER_ID}
      ...

The CAPZ controller will look for the UserAssigned value in the identity field of the AzureMachineTemplate, and assign the user identities listed in userAssignedIdentities to the virtual machine.

  • In Machine Pool
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachinePool
metadata:
  name: ${CLUSTER_NAME}-mp-0
  namespace: default
spec:
  identity: UserAssigned
  userAssignedIdentities:
  - providerID: ${USER_ASSIGNED_IDENTITY_PROVIDER_ID}
  ...

The CAPZ controller will look for the UserAssigned value in the identity field of the AzureMachinePool, and assign the user identities listed in userAssignedIdentities to the virtual machine scale set.

Alternatively, you can use the user-assigned-identity and machinepool-user-assigned-identity flavors by setting {flavor} in clusterctl generate cluster --flavor {flavor} to use a user-assigned managed identity in a machine deployment or machine pool, respectively (see the example below).
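
For example (a sketch; the cluster name and Kubernetes version are placeholders):

clusterctl generate cluster my-cluster --kubernetes-version v1.21.2 --flavor user-assigned-identity > my-cluster.yaml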

System-assigned

  • In Machines
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
  namespace: default
spec:
  template:
    spec:
      identity: SystemAssigned
      ...

The CAPZ controller will look for the SystemAssigned value in the identity field of the AzureMachineTemplate, and enable system-assigned managed identity in the virtual machine.

  • In Machine Pool
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureMachinePool
metadata:
  name: ${CLUSTER_NAME}-mp-0
  namespace: default
spec:
  identity: SystemAssigned
  ...

The CAPZ controller will look for the SystemAssigned value in the identity field of the AzureMachinePool, and enable system-assigned managed identity in the virtual machine scale set.

Alternatively, you can also use the system-assigned-identity and machinepool-system-assigned-identity flavors by setting {flavor} in clusterctl generate cluster --flavor {flavor} to use a system-assigned managed identity in a machine deployment or machine pool, respectively.

Service Principal (not recommended)

A service principal is an identity in AAD which is described by a tenant ID and client (or “app”) ID. It can have one or more associated secrets or certificates. The set of these values will enable the holder to exchange the values for a JWT token to communicate with Azure. The user generally creates a service principal, saves the credentials, and then uses the credentials in applications. To read more about Service Principals and AD Applications see “Application and service principal objects in Azure Active Directory”.

To use a client id/secret for authentication for Cloud Provider, simply leave the identity empty, or set it to None. The autogenerated cloud provider config secret will contain the client id and secret used in your AzureClusterIdentity for AzureCluster creation as aadClientID and aadClientSecret.

To use a certificate/password for authentication, you will need to write the certificate file on the VM (for example using the files option if using CABPK/cloud-init) and mount it to the cloud-controller-manager, then refer to it as aadClientCertPath, along with aadClientCertPassword, in your cloud provider config. Please consider using a user-assigned identity instead before going down that route as they are more secure and flexible, as described above.

Creating a Service Principal

  • With the Azure CLI

    • Subscription level Scope

      az login
      az account set --subscription="${AZURE_SUBSCRIPTION_ID}"
      az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/${AZURE_SUBSCRIPTION_ID}"
      
    • Resource group level scope

      az login
      az account set --subscription="${AZURE_SUBSCRIPTION_ID}"
      az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourceGroups/${AZURE_RESOURCE_GROUP}"
      

    This will output your appId, password, name, and tenant. The name or appId is used for the AZURE_CLIENT_ID and the password is used for AZURE_CLIENT_SECRET.

    Confirm your service principal by opening a new shell and run the following commands substituting in name, password, and tenant:

    az login --service-principal -u NAME -p PASSWORD --tenant TENANT
    az vm list-sizes --location eastus
    

Windows clusters

Overview

CAPZ enables you to create Windows Kubernetes clusters on Microsoft Azure.

To deploy a cluster using Windows, use the Windows flavor template.

Deploy a workload

After your Windows VM is up and running, you can deploy a workload using the deployment file below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: iis-1809
  labels:
    app: iis-1809
spec:
  replicas: 1
  template:
    metadata:
      name: iis-1809
      labels:
        app: iis-1809
    spec:
      containers:
      - name: iis
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
        resources:
          limits:
            cpu: 1
            memory: 800Mi
          requests:
            cpu: .1
            memory: 300Mi
        ports:
          - containerPort: 80
      nodeSelector:
        "kubernetes.io/os": windows
  selector:
    matchLabels:
      app: iis-1809
---
apiVersion: v1
kind: Service
metadata:
  name: iis
spec:
  type: LoadBalancer
  ports:
  - protocol: TCP
    port: 80
  selector:
    app: iis-1809

Save this file to iis.yaml then deploy it:

kubectl apply -f .\iis.yaml

Get the Service endpoint and curl the website:

kubectl get services
NAME         TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
iis          LoadBalancer   10.0.9.47    <pending>     80:31240/TCP   1m
kubernetes   ClusterIP      10.0.0.1     <none>        443/TCP        46m


curl <EXTERNAL-IP>

Details

See the CAPI proposal for implementation details: https://github.com/kubernetes-sigs/cluster-api/blob/master/docs/proposals/20200804-windows-support.md

VM and VMSS naming

Azure does not support creating Windows VMs with names longer than 15 characters (see additional details on historical restrictions).

When creating a cluster with AzureMachine, if the AzureMachine name is longer than 15 characters, the first 9 characters of the cluster name are used and the last 5 characters of the machine name are appended to create a unique machine name.

When creating a cluster with a MachinePool, if the MachinePool name is longer than 9 characters, the machine pool uses the prefix win and appends the last 5 characters of the machine pool name.

VM password and access

The VM password is randomly generated by Cloudbase-init during provisioning of the VM. For access to the VM, you can use SSH, which will be configured with the SSH public key you provided during deployment.

To SSH:

ssh -t -i .sshkey -o 'ProxyCommand ssh -i .sshkey -W %h:%p capi@<api-server-ip>' capi@<windows-ip> powershell.exe

There is also a CAPZ kubectl plugin that automates the SSH connection using the management cluster.

To RDP you can proxy through the api server:

ssh -L 5555:10.1.0.4:3389 capi@20.69.66.232

And then open an RDP client on your local machine to localhost:5555

Image creation

The images are built using image-builder and published to the Azure Marketplace. They use Cloudbase-init to bootstrap the machines via Kubeadm.

Find the latest published images:

az vm image list --publisher cncf-upstream --offer capi-windows -o table --all  
Offer         Publisher      Sku                           Urn                                                                 Version
------------  -------------  ----------------------------  ------------------------------------------------------------------  ----------
capi-windows  cncf-upstream  k8s-1dot18dot13-windows-2019  cncf-upstream:capi-windows:k8s-1dot18dot13-windows-2019:2020.12.11  2020.12.11
capi-windows  cncf-upstream  k8s-1dot19dot5-windows-2019   cncf-upstream:capi-windows:k8s-1dot19dot5-windows-2019:2020.12.11   2020.12.11
capi-windows  cncf-upstream  k8s-1dot20dot0-windows-2019   cncf-upstream:capi-windows:k8s-1dot20dot0-windows-2019:2020.12.11   2020.12.11

If you would like to customize your images, please refer to the documentation on building your own custom images.

Kube-proxy and CNIs

Kube-proxy and Windows CNIs are deployed via Cluster Resource Sets. Windows does not have a kube-proxy image because Windows lacks privileged containers, which would provide access to the host. The current solution is using wins.exe, as demonstrated in the Kubeadm support for Windows.

Windows HostProcess Container support is in KEP form with plans to implement in 1.22. Kube-proxy and other CNIs will then be replaced with HostProcess containers.

Flannel is being used as the default CNI. An important note for Flannel vxlan deployments is that the MTU for the Linux nodes must be set to 1400.
This is because Azure’s VNET MTU is 1400, which can cause fragmentation of packets sent from a Linux node to a Windows node, resulting in dropped packets. To mitigate this, we set the Linux eth0 MTU to 1400, and Flannel automatically picks this up and subtracts 50 for the flannel network it creates.

SSH access to nodes

This document describes how to get SSH access to virtual machines that are part of a CAPZ cluster.

In order to get SSH access to a Virtual Machine on Azure, two requirements have to be met:

  • get network-level access to the SSH service;
  • get authentication sorted.

This document describes some possible strategies to fulfil both requirements.

Network Access

Default behavior

By default, control plane VMs have SSH access allowed from any source in their Network Security Groups. Also by default, VMs don’t have a public IP address assigned.

To get SSH access to one of the Control Plane VMs you can use the API Load Balancer‘s IP, because by default an Inbound NAT Rule is created to route traffic coming to the load balancer on TCP port 22 (the SSH port) to one of the nodes with role master in the workload cluster.

This of course works only for clusters that are using a Public Load Balancer.

In order to reach all other VMs, you can use the NATted control plane VM as a bastion host and use the private IP address for the other nodes.

For example, let’s consider this CAPZ cluster (using a Public Load Balancer) with two nodes:

NAME                        STATUS   ROLES    AGE    VERSION    INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
test1-control-plane-cn9lm   Ready    master   111m   v1.18.16   10.0.0.4      <none>        Ubuntu 18.04.5 LTS   5.4.0-1039-azure   containerd://1.4.3
test1-md-0-scctm            Ready    <none>   109m   v1.18.16   10.1.0.4      <none>        Ubuntu 18.04.5 LTS   5.4.0-1039-azure   containerd://1.4.3

You can SSH to the control plane node using the load balancer’s public DNS name:

$ kubectl get azurecluster test1 -o json | jq .spec.networkSpec.apiServerLB.frontendIPs[0].publicIP.dnsName
test1-21192f78.eastus.cloudapp.azure.com

$ ssh username@test1-21192f78.eastus.cloudapp.azure.com hostname
test1-control-plane-cn9lm

As you can see, the Load Balancer routed the request to node test1-control-plane-cn9lm, which is the only node with role master in this workload cluster.

In order to SSH to node ‘test1-md-0-scctm’, you can use the other node as a bastion:

$ ssh -J username@test1-21192f78.eastus.cloudapp.azure.com username@10.1.0.4 hostname
test1-md-0-scctm

Clusters using an Internal Load Balancer (private clusters) can’t use this approach. Network-level SSH access to those clusters has to be made via the private IP addresses of the VMs by first getting access to the Virtual Network. How to do that is out of the scope of this document. A possible alternative that works for private clusters as well is described in the next section.

Azure Bastion

A possible alternative to the process described above is to use the Azure Bastion feature. This approach works the same way for workload clusters using either type of Load Balancers.

In order to enable Azure Bastion on a CAPZ workload cluster, edit the AzureCluster CR and set the spec/bastionSpec/azureBastion field. It is enough to set the field’s value to the empty object {} and the default configuration settings will be used while deploying the Azure Bastion.

For example, this is an AzureCluster CR with the Azure Bastion feature enabled:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: test1
  namespace: default
spec:
  bastionSpec:
    azureBastion: {}
  ...
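
If you prefer not to edit the CR by hand, the same field can be set with a patch; a minimal sketch, assuming the cluster is named test1 in the default namespace:

kubectl patch azurecluster test1 --type merge -p '{"spec":{"bastionSpec":{"azureBastion":{}}}}'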

Once the Azure Bastion is deployed, it will be possible to SSH to any of the cluster VMs through the Azure Portal. Please follow the official documentation for a deeper explanation on how to do that.

Advanced settings

When the AzureBastion feature is enabled in a CAPZ cluster, 3 new resources will be deployed in the resource group:

  • The Azure Bastion resource;
  • A subnet named AzureBastionSubnet (the name is mandatory and can’t be changed);
  • A public IP address.

The default values for the new resources should work for most use cases, but if you need to customize them you can provide your own values. Here is a detailed example:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AzureCluster
metadata:
  name: test1
  namespace: default
spec:
  bastionSpec:
    azureBastion:
      name: "..." # The name of the Azure Bastion; defaults to '<cluster name>-azure-bastion'.
      subnet:
        name: "..." # The name of the subnet. The only supported name is 'AzureBastionSubnet' (this is an Azure limitation).
        securityGroup: {} # No security group is assigned by default. You can choose to have one created and assigned by defining it.
      publicIP:
        name: "..." # The name of the public IP; defaults to '<cluster name>-azure-bastion-pip'.

If you specify a security group to be associated with the Azure Bastion subnet, it needs to have some networking rules defined or the Azure Bastion resource creation will fail. Please refer to the documentation for more details.

Authentication

With the networking part sorted, we still have to work out a way of authenticating to the VMs via SSH.

Provisioning SSH keys using Machine Templates

In order to add an SSH authorized key for user username and provide sudo access to the control plane VMs, you can adjust the KubeadmControlPlane CR as in the following example:

apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
...
spec:
  ...
  kubeadmConfigSpec:
    ...
    users:
    - name: username
      sshAuthorizedKeys:
      - "ssh-rsa AAAA..."
    files:
    - content: "username ALL = (ALL) NOPASSWD: ALL"
      owner: root:root
      path: /etc/sudoers.d/username
      permissions: "0440"
    ...

Similarly, you can achieve the same result for Machine Deployments by customizing the KubeadmConfigTemplate CR:

apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: KubeadmConfigTemplate
metadata:
  name: test1-md-0
  namespace: default
spec:
  template:
    spec:
      files:
      ...
      - content: "username ALL = (ALL) NOPASSWD: ALL"
        owner: root:root
        path: /etc/sudoers.d/username
        permissions: "0440"
      ...
      users:
      - name: username
        sshAuthorizedKeys:
        - "ssh-rsa AAAA..."

Setting SSH keys or passwords using the Azure Portal

An alternative way of gaining SSH access to VMs on Azure is to set the password or authorized key via the Azure Portal. In the Portal, navigate to the Virtual Machine details page and find the Reset password function in the left pane.
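
The same reset can also be done with the Azure CLI; a minimal sketch, assuming hypothetical resource group, VM, and user names:

az vm user update \
  --resource-group test1 \
  --name test1-control-plane-cn9lm \
  --username username \
  --ssh-key-value "$(cat ~/.ssh/id_rsa.pub)"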

Developing Cluster API Provider Azure

Contents

Setting up

Base requirements

  1. Install go
    • Get the latest patch version for go v1.16.
  2. Install jq
    • brew install jq on macOS.
    • chocolatey install jq on Windows.
    • sudo apt install jq on Ubuntu Linux.
  3. Install gettext package
    • brew install gettext && brew link --force gettext on macOS.
    • install instructions on Windows.
    • sudo apt install gettext on Ubuntu Linux.
  4. Install KIND
    • GO111MODULE="on" go get sigs.k8s.io/kind@v0.9.0.
  5. Install Kustomize
    • brew install kustomize on macOS.
    • choco install kustomize on Windows.
    • install instructions on Linux
  6. Install Python 3.x or 2.7.x, if neither is already installed.
  7. Install make.
  8. Install timeout
    • brew install coreutils on macOS.

Get the source

go get -d sigs.k8s.io/cluster-api-provider-azure
cd "$(go env GOPATH)/src/sigs.k8s.io/cluster-api-provider-azure"

Get familiar with basic concepts

This provider is modeled after the upstream Cluster API project. To get familiar with Cluster API resources, concepts and conventions (such as CAPI and CAPZ), refer to the Cluster API Book.

Dev manifest files

Part of running cluster-api-provider-azure is generating manifests to run. Generating dev manifests allows you to test dev images instead of the default releases.

Dev images

Container registry

Any public container registry can be leveraged for storing cluster-api-provider-azure container images.

Developing

Change some code!

Modules and dependencies

This repository uses Go Modules to track and vendor dependencies.

To pin a new dependency:

  • Run go get <repository>@<version>.
  • (Optional) Add a replace statement in go.mod.
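
For example, pinning a hypothetical dependency example.com/some/module at v1.2.3:

go get example.com/some/module@v1.2.3

# Optional: add a replace directive in go.mod, e.g. to point at a fork.
# replace example.com/some/module => github.com/my-org/some-module v1.2.3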

Makefile targets and scripts are offered to work with go modules:

  • make verify-modules checks whether go module files are out of date.
  • make modules runs go mod tidy to ensure proper vendoring.
  • hack/ensure-go.sh checks that the Go version and environment variables are properly set.

Setting up the environment

Your environment must have the Azure credentials as outlined in the getting started prerequisites section.

Using Tilt

Both of the Tilt setups below will get you started developing CAPZ in a local kind cluster. The main difference is the number of components you will build from source and the scope of the changes you’d like to make. If you only want to make changes in CAPZ, then follow CAPZ instructions. This will save you from having to build all of the images for CAPI, which can take a while. If the scope of your development will span both CAPZ and CAPI, then follow the CAPI and CAPZ instructions.

Tilt for dev in CAPZ

If you want to develop in CAPZ and get a local development cluster working quickly, this is the path for you.

From the root of the CAPZ repository and after configuring the environment variables, you can run the following to generate your tilt-settings.json file:

cat <<EOF > tilt-settings.json
{
  "kustomize_substitutions": {
      "AZURE_SUBSCRIPTION_ID_B64": "$(echo "${AZURE_SUBSCRIPTION_ID}" | tr -d '\n' | base64 | tr -d '\n')",
      "AZURE_TENANT_ID_B64": "$(echo "${AZURE_TENANT_ID}" | tr -d '\n' | base64 | tr -d '\n')",
      "AZURE_CLIENT_SECRET_B64": "$(echo "${AZURE_CLIENT_SECRET}" | tr -d '\n' | base64 | tr -d '\n')",
      "AZURE_CLIENT_ID_B64": "$(echo "${AZURE_CLIENT_ID}" | tr -d '\n' | base64 | tr -d '\n')"
  }
}
EOF

To build a kind cluster and start Tilt, just run:

make tilt-up

By default, the Cluster API components deployed by Tilt have experimental features turned off. If you would like to enable these features, add extra_args as specified in The Cluster API Book.

Once your kind management cluster is up and running, you can deploy a workload cluster.

You can also deploy a flavor cluster as a local tilt resource.

To tear down the kind cluster built by the command above, just run:

make kind-reset

Tilt for dev in both CAPZ and CAPI

If you want to develop in both CAPI and CAPZ at the same time, then this is the path for you.

To use Tilt for a simplified development workflow, follow the instructions in the cluster-api repo. The instructions will walk you through cloning the Cluster API (CAPI) repository and configuring Tilt to use kind to deploy the cluster api management components.

You may wish to check out the correct version of CAPI to match the version used in CAPZ.

Note that tilt up will be run from the cluster-api repository directory and the tilt-settings.json file will point back to the cluster-api-provider-azure repository directory. Any changes you make to the source code in the cluster-api or cluster-api-provider-azure repositories will automatically be redeployed to the kind cluster.

After you have cloned both repositories, your folder structure should look like:

|-- src/cluster-api-provider-azure
|-- src/cluster-api (run `tilt up` here)

After configuring the environment variables, run the following to generate your tilt-settings.json file:

cat <<EOF > tilt-settings.json
{
  "default_registry": "${REGISTRY}",
  "provider_repos": ["../cluster-api-provider-azure"],
  "enable_providers": ["azure", "docker", "kubeadm-bootstrap", "kubeadm-control-plane"],
  "kustomize_substitutions": {
      "AZURE_SUBSCRIPTION_ID_B64": "$(echo "${AZURE_SUBSCRIPTION_ID}" | tr -d '\n' | base64 | tr -d '\n')",
      "AZURE_TENANT_ID_B64": "$(echo "${AZURE_TENANT_ID}" | tr -d '\n' | base64 | tr -d '\n')",
      "AZURE_CLIENT_SECRET_B64": "$(echo "${AZURE_CLIENT_SECRET}" | tr -d '\n' | base64 | tr -d '\n')",
      "AZURE_CLIENT_ID_B64": "$(echo "${AZURE_CLIENT_ID}" | tr -d '\n' | base64 | tr -d '\n')"
  }
}
EOF

$REGISTRY should be in the format docker.io/<dockerhub-username>
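
For example (hypothetical Docker Hub username):

export REGISTRY="docker.io/your-dockerhub-username"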

The cluster-api management components that are deployed are configured in the /config folder of each repository respectively. Making changes to those files will trigger a redeploy of the management cluster components.

Deploying a workload cluster

After your kind management cluster is up and running with Tilt, you can configure workload cluster settings and deploy a workload cluster with the following:

make create-workload-cluster

To delete the cluster:

make delete-workload-cluster

Check out the troubleshooting guide for common errors you might run into.

Viewing Telemetry

The CAPZ controller emits tracing and metrics data. When run in Tilt, the KinD cluster is provisioned with a development deployment of OpenTelemetry for distributed tracing, and Prometheus for metrics scraping and visualization.

The OpenTelemetry and Prometheus deployments are for development purposes only. These illustrate the hooks for tracing and metrics, but lack the robustness of production cluster deployments.

After the Tilt cluster has been initialized, if you followed the tracing documentation, you can view distributed traces in the Azure Portal. Open the App Insights resource identified by the AZURE_INSTRUMENTATION_KEY you specified, choose “Transaction search” from the “Investigate” menu on the left, and click “Refresh” or make a specific query of the trace data.

To view metrics, run kubectl port-forward -n capz-system prometheus-prometheus-0 9090 and open http://localhost:9090 to see the Prometheus UI.

Manual Testing

Creating a dev cluster

The steps below are provided in a convenient script in hack/create-dev-cluster.sh. Be sure to set AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_SUBSCRIPTION_ID, and AZURE_TENANT_ID before running. Optionally, you can override the different cluster configuration variables. For example, to override the workload cluster name:

CLUSTER_NAME=<my-capz-cluster-name> ./hack/create-dev-cluster.sh

NOTE: CLUSTER_NAME can only include letters, numbers, and hyphens and can’t be longer than 44 characters.

Building and pushing dev images
  1. To build images with custom tags, run the make docker-build target as follows:

    export REGISTRY="<container-registry>"
    export MANAGER_IMAGE_TAG="<image-tag>" # optional - defaults to `dev`.
    PULL_POLICY=IfNotPresent make docker-build
    
  2. (optional) Push your docker images:

    2.1. Login to your container registry using docker login.

    e.g., docker login quay.io

    2.2. Push to your custom image registry:

    REGISTRY="<container-registry>" MANAGER_IMAGE_TAG="<image-tag>" make docker-push
    

    NOTE: make create-cluster will fetch the manager image locally and load it onto the kind cluster if it is present.

Customizing the cluster deployment

Here is a list of required configuration parameters (the full list is available in templates/cluster-template.yaml):

# Cluster settings.
export CLUSTER_NAME="capz-cluster"
export AZURE_VNET_NAME=${CLUSTER_NAME}-vnet

# Azure settings.
export AZURE_LOCATION="southcentralus"
export AZURE_RESOURCE_GROUP=${CLUSTER_NAME}
export AZURE_SUBSCRIPTION_ID_B64="$(echo -n "$AZURE_SUBSCRIPTION_ID" | base64 | tr -d '\n')"
export AZURE_TENANT_ID_B64="$(echo -n "$AZURE_TENANT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_ID_B64="$(echo -n "$AZURE_CLIENT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_SECRET_B64="$(echo -n "$AZURE_CLIENT_SECRET" | base64 | tr -d '\n')"

# Machine settings.
export CONTROL_PLANE_MACHINE_COUNT=3
export AZURE_CONTROL_PLANE_MACHINE_TYPE="Standard_D2s_v3"
export AZURE_NODE_MACHINE_TYPE="Standard_D2s_v3"
export WORKER_MACHINE_COUNT=2
export KUBERNETES_VERSION="v1.21.2"

# Generate SSH key.
# If you want to provide your own key, skip this step and set AZURE_SSH_PUBLIC_KEY_B64 to your existing file.
SSH_KEY_FILE=.sshkey
rm -f "${SSH_KEY_FILE}" 2>/dev/null
ssh-keygen -t rsa -b 2048 -f "${SSH_KEY_FILE}" -N '' 1>/dev/null
echo "Machine SSH key generated in ${SSH_KEY_FILE}"
# For Linux, the SSH key needs to be base64 encoded because we use the Azure API to set it.
# Windows doesn't support setting SSH keys via the Azure API, so we use cloudbase-init to set the key, which doesn't require base64 encoding.
export AZURE_SSH_PUBLIC_KEY_B64=$(cat "${SSH_KEY_FILE}.pub" | base64 | tr -d '\r\n')
export AZURE_SSH_PUBLIC_KEY=$(cat "${SSH_KEY_FILE}.pub" | tr -d '\r\n')

⚠️ Please note that the generated templates include default values and therefore require the use of clusterctl to create the cluster, or the use of envsubst to replace these values.
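
A minimal sketch of the envsubst approach, assuming the variables above are exported and the default template path is used:

envsubst < templates/cluster-template.yaml > cluster.yaml
kubectl apply -f cluster.yaml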

Creating the cluster

⚠️ Make sure you followed the previous two steps to build the dev image and set the required environment variables before proceeding.

Ensure dev environment has been reset:

make clean kind-reset

Create the cluster:

make create-cluster

Check out the troubleshooting guide for common errors you might run into.

Instrumenting Telemetry

Telemetry is the key to operational transparency. We strive to provide insight into the internal behavior of the system through observable traces and metrics.

Distributed Tracing

Distributed tracing provides a hierarchical view of how and why an event occurred. CAPZ is instrumented to trace each controller reconcile loop. When the reconcile loop begins, a trace span begins and is stored in loop context.Context. As the context is passed on to functions below, new spans are created, tied to the parent span by the parent span ID. The spans form a hierarchical representation of the activities in the controller.

These spans can also be propagated across service boundaries. The span context can be passed on through metadata such as HTTP headers. By propagating span context, it creates a distributed, causal relationship between services and functions.

For tracing, we use OpenTelemetry.

Here is an example of starting a span at the beginning of a controller reconcile.

ctx, span := tele.Tracer().Start(ctx, "controllers.AzureMachineReconciler.Reconcile",
    trace.WithAttributes(
        attribute.String("namespace", req.Namespace),
        attribute.String("name", req.Name),
        attribute.String("kind", "AzureMachine"),
    ))
defer span.End()

The code above creates a context with a new span stored in the context.Context value bag. If a span already existed in the ctx argument, then the new span takes on the parent span ID of the existing span; otherwise the new span becomes a “root span”, one that does not have a parent. The span is also created with labels, or tags, which provide metadata about the span and can be used to query in many distributed tracing systems.

You should consider adding tracing if your func accepts a context.

Metrics

Metrics provide quantitative data about the operations of the controller. This includes cumulative data like counters, single numerical values like gauges, and distributions of counts/samples like histograms and summaries.

In CAPZ we expose metrics using the Prometheus client. The Kubebuilder project provides a guide for metrics and for exposing new ones.
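
As a hedged sketch (not the project’s actual metrics) of what registering a new metric with the Prometheus client can look like, here is a hypothetical counter registered with the controller-runtime metrics registry so it is served on the same /metrics endpoint Prometheus scrapes:

package controllers

import (
    "github.com/prometheus/client_golang/prometheus"
    "sigs.k8s.io/controller-runtime/pkg/metrics"
)

// reconcileErrors counts reconcile errors per controller (hypothetical example metric).
var reconcileErrors = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "capz_example_reconcile_errors_total",
        Help: "Total number of reconcile errors, labeled by controller.",
    },
    []string{"controller"},
)

func init() {
    // Register with the controller-runtime registry so the metric is exposed
    // by the manager's metrics endpoint.
    metrics.Registry.MustRegister(reconcileErrors)
}

A reconcile loop would then call reconcileErrors.WithLabelValues("azuremachine").Inc() when an error occurs.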

Submitting PRs and testing

Pull requests and issues are highly encouraged! If you’re interested in submitting PRs to the project, please be sure to run some initial checks prior to submission:

make lint # Runs a suite of quick scripts to check code structure
make test # Runs tests on the Go code

Executing unit tests

make test executes the project’s unit tests. These tests do not stand up a Kubernetes cluster, nor do they have external dependencies.

Automated Testing

Mocks

Mocks for the services tests are generated using GoMock.

To generate the mocks, you can run:

make generate-go

E2E Testing

To run E2E locally, set AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_SUBSCRIPTION_ID, AZURE_TENANT_ID, and run:

./scripts/ci-e2e.sh

You can optionally set the following variables:

  • E2E_CONF_FILE: The path of the E2E configuration file. Default: ${GOPATH}/src/sigs.k8s.io/cluster-api-provider-azure/test/e2e/config/azure-dev.yaml
  • SKIP_CLEANUP: Set to true if you do not want the bootstrap and workload clusters to be cleaned up after running E2E tests. Default: false
  • SKIP_CREATE_MGMT_CLUSTER: Skip management cluster creation. If skipping management cluster creation, you must specify KUBECONFIG and SKIP_CLEANUP. Default: false
  • LOCAL_ONLY: Use the kind local registry and run the subset of tests which don’t require a remotely pushed controller image. Default: true
  • REGISTRY: Registry to push the controller image to. Default: capzci.azurecr.io/ci-e2e
  • CLUSTER_NAME: Name of an existing workload cluster; specs will run against this existing workload cluster. Use in conjunction with SKIP_CREATE_MGMT_CLUSTER, GINKGO_FOCUS and KUBECONFIG. You must specify exactly one e2e spec to run against with GINKGO_FOCUS, such as export GINKGO_FOCUS=Creating.a.VMSS.cluster.with.a.single.control.plane.node.
  • KUBECONFIG: Used with SKIP_CREATE_MGMT_CLUSTER set to true. Location of the kubeconfig for the management cluster you would like to use. Use kind get kubeconfig --name capz-e2e > kubeconfig.capz-e2e to get the capz e2e kind cluster config. Default: ~/.kube/config
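
For example, a sketch of running a single spec against an existing workload cluster, using only the variables listed above (cluster name is hypothetical):

kind get kubeconfig --name capz-e2e > kubeconfig.capz-e2e
export KUBECONFIG="$(pwd)/kubeconfig.capz-e2e"
export SKIP_CREATE_MGMT_CLUSTER="true"
export SKIP_CLEANUP="true"
export CLUSTER_NAME="my-existing-workload-cluster"
export GINKGO_FOCUS="Creating.a.VMSS.cluster.with.a.single.control.plane.node"
./scripts/ci-e2e.sh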

You can also customize the configuration of the CAPZ cluster created by the E2E tests (except for CLUSTER_NAME, AZURE_RESOURCE_GROUP, AZURE_VNET_NAME, CONTROL_PLANE_MACHINE_COUNT, and WORKER_MACHINE_COUNT, since they are generated by individual test cases). See Customizing the cluster deployment for more details.

Conformance Testing

To run the Kubernetes Conformance test suite locally, you can run

./scripts/ci-conformance.sh

Optional settings are:

  • WINDOWS: Run conformance against Windows nodes. Default: false
  • CONFORMANCE_NODES: Number of parallel ginkgo nodes to run. Default: 1
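
For example, to run the suite with four parallel ginkgo nodes (a sketch using only the variables above):

export CONFORMANCE_NODES="4"
./scripts/ci-conformance.sh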

With the following environment variables defined, you can build a CAPZ cluster from the HEAD of Kubernetes main branch or release branch, and run the Conformance test suite against it. This is not enabled for Windows currently.

  • E2E_ARGS: -kubetest.use-ci-artifacts
  • KUBERNETES_VERSION: latest - extract the Kubernetes version from https://dl.k8s.io/ci/latest.txt (main’s HEAD), or latest-1.21 - extract the Kubernetes version from https://dl.k8s.io/ci/latest-1.21.txt (the release branch’s HEAD)
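
For example, a sketch that builds the cluster from the CI artifacts of main’s HEAD and then runs conformance:

export E2E_ARGS="-kubetest.use-ci-artifacts"
export KUBERNETES_VERSION="latest"
./scripts/ci-conformance.sh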

With the following environment variables defined, CAPZ runs ./scripts/ci-build-kubernetes.sh as part of ./scripts/ci-conformance.sh, which allows developers to build Kubernetes from source and run the Kubernetes Conformance test suite against a CAPZ cluster based on the custom build:

  • AZURE_STORAGE_ACCOUNT: Your Azure storage account name
  • AZURE_STORAGE_KEY: Your Azure storage key
  • JOB_NAME: test (an environment variable used by CI; it can be any non-empty string)
  • LOCAL_ONLY: false
  • REGISTRY: Your registry
  • TEST_K8S: true

Running custom test suites on CAPZ clusters

To run a custom test suite on a CAPZ cluster locally, set AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_SUBSCRIPTION_ID, AZURE_TENANT_ID and run:

./scripts/ci-entrypoint.sh bash -c "cd ${GOPATH}/src/github.com/my-org/my-project && make e2e"

You can optionally set the following variables:

  • AZURE_SSH_PUBLIC_KEY_FILE: Use your own SSH key.
  • SKIP_CLEANUP: Skip deleting the cluster after the tests finish running.
  • KUBECONFIG: Provide your existing cluster kubeconfig filepath. If no kubeconfig is provided, ./kubeconfig will be used.
  • USE_CI_ARTIFACTS: Use a CI version of Kubernetes, i.e. not a released version (e.g. v1.19.0-alpha.1.426+0926c9c47677e9).
  • CI_VERSION: Provide a custom CI version of Kubernetes. By default, the latest master commit will be used.
  • TEST_CCM: Build a cluster that uses custom versions of the Azure cloud-provider cloud-controller-manager and node-controller-manager images.
  • EXP_MACHINE_POOL: Use Machine Pool for worker machines.
  • REGISTRY: Registry to push any custom k8s images or cloud provider images built.
  • CLUSTER_TEMPLATE: Use a custom cluster template. By default, the script will choose the appropriate cluster template based on existing environment variables.
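
For example, a hedged sketch of running a custom suite against a Machine Pool based cluster and keeping the cluster afterwards (the project path is hypothetical):

export EXP_MACHINE_POOL="true"
export SKIP_CLEANUP="1"
export AZURE_SSH_PUBLIC_KEY_FILE="${HOME}/.ssh/id_rsa.pub"
./scripts/ci-entrypoint.sh bash -c "cd ${GOPATH}/src/github.com/my-org/my-project && make e2e"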

You can also customize the configuration of the CAPZ cluster (assuming that SKIP_CREATE_WORKLOAD_CLUSTER is not set). See Customizing the cluster deployment for more details.

For Kubernetes Developers

If you are working on Kubernetes upstream, you can use the Cluster API Azure Provider to test your build of Kubernetes in an Azure environment.

Kubernetes 1.17+

Kubernetes has removed the make WHAT=cmd/hyperkube command, so you will have to build the individual Kubernetes components and deploy them separately. That includes:

  • Run the following commands to build Kubernetes and upload artifacts to a registry and Azure blob storage:
export AZURE_STORAGE_ACCOUNT=<AzureStorageAccount>
export AZURE_STORAGE_KEY=<AzureStorageKey>
export REGISTRY=<Registry>
export TEST_K8S="true"
export JOB_NAME="test" # an environment variable used by CI, can be any non-empty string

source ./scripts/ci-build-kubernetes.sh

A template is provided that enables building clusters from custom built Kubernetes components:

export CLUSTER_TEMPLATE="test/dev/cluster-template-custom-builds.yaml"
./hack/create-dev-cluster.sh

Testing the out-of-tree cloud provider

To test changes made to the Azure cloud provider, first build and push images for cloud-controller-manager and/or cloud-node-manager from the root of the cloud-provider-azure repo.

Then, use the external-cloud-provider flavor to create a cluster:

AZURE_CLOUD_CONTROLLER_MANAGER_IMG=myrepo/my-ccm:v0.0.1 \
AZURE_CLOUD_NODE_MANAGER_IMG=myrepo/my-cnm:v0.0.1 \
CLUSTER_TEMPLATE=cluster-template-external-cloud-provider.yaml \
make create-workload-cluster

Release Process

Change milestone

  • Create a new GitHub milestone for the next release
  • Change milestone applier so new changes can be applied to the appropriate release
    • Open a PR in https://github.com/kubernetes/test-infra to change this line
      • Example PR: https://github.com/kubernetes/test-infra/pull/16827

Prepare branch, tag and release notes

  • Identify a known good commit on the main branch
  • Fast-forward the release branch to the selected commit. :warning: Always release from the release branch and not from master!
    • git checkout release-0.x
    • git fetch upstream
    • git merge --ff-only upstream/master
    • git push
  • Create tag with git
    • export RELEASE_TAG=v0.4.6 (the tag of the release to be cut)
    • git tag -s ${RELEASE_TAG} -m "${RELEASE_TAG}"
    • git push upstream ${RELEASE_TAG}
  • Update the file metadata.yaml if it is a major or minor release
  • Run make release from the repo; this will create the release artifacts in the out/ folder
  • Install the release-notes tool according to instructions
  • Export GITHUB_TOKEN
  • Run the release-notes tool with the appropriate commits. Commits range from the first commit after the previous release to the new release commit.
release-notes --github-org kubernetes-sigs --github-repo cluster-api-provider-azure \
--start-sha 1cf1ec4a1effd9340fe7370ab45b173a4979dc8f  \
--end-sha e843409f896981185ca31d6b4a4c939f27d975de
  • Manually format and categorize the release notes

Promote image to prod repo

Promote image

  • Images are built by the post push images job
  • Create a PR in https://github.com/kubernetes/k8s.io to add the image and tag
    • Example PR: https://github.com/kubernetes/k8s.io/pull/1030/files
  • Location of image: https://console.cloud.google.com/gcr/images/k8s-staging-cluster-api-azure/GLOBAL/cluster-api-azure-controller?rImageListsize=30

Release in GitHub

Create the GitHub release in the UI

  • Create a draft release in GitHub and associate it with the tag that was created
  • Copy and paste the release notes
  • Upload artifacts from the out/ folder
  • Publish release
  • Announce the release

Versioning

cluster-api-provider-azure follows the semantic versioning specification.

Example versions:

  • Pre-release: v0.1.1-alpha.1
  • Minor release: v0.1.0
  • Patch release: v0.1.1
  • Major release: v1.0.0

Expected artifacts

  1. A release yaml file infrastructure-components.yaml containing the resources needed to deploy to Kubernetes
  2. A cluster-templates.yaml for each supported flavor
  3. A metadata.yaml which maps release series to cluster-api contract version
  4. Release notes

Communication

Patch Releases

  1. Announce the release in Kubernetes Slack on the #cluster-api-azure channel.

Minor/Major Releases

  1. Follow the communications process for pre-releases
  2. An announcement email is sent to kubernetes-sig-azure@googlegroups.com and kubernetes-sig-cluster-lifecycle@googlegroups.com with the subject [ANNOUNCE] cluster-api-provider-azure <version> has been released