CI/CD

Published at: 21 May 2026

trinity gitops argo-cd github-actions pulumi oidc

CI/CD and GitOps flow

Phase 2 - GitOps #

The second phase changes the operating model.

Phase 1 used Pulumi for everything: cloud resources, Kubernetes provider setup, the hello namespace, the deployment, and the load balancer service. That was useful for proving that each cluster could run a workload, but it is not the delivery model I want for the platform. Once Kubernetes exists, application and platform state should be reconciled inside Kubernetes from Git.

So Phase 2 is about drawing a clean ownership line:

Pulumi owns cloud infrastructure and cluster bootstrap.
Argo CD owns Kubernetes application and platform manifests.
Git owns the desired state.
Manual kubectl apply is for debugging only.

I am choosing Argo CD for this exercise.

Argo CD is a GitOps controller for Kubernetes. It runs inside the cluster, watches a Git repository, compares the manifests in Git with the objects that exist in Kubernetes, and reconciles the cluster back to the declared state. Git becomes the source of truth; Argo CD becomes the control loop that keeps each cluster aligned with that source.

Flux would also fit. It is small, composable, and very natural if the whole workflow is meant to stay close to Kubernetes APIs. Argo CD is a better fit for this particular exercise because the demo matters. It gives a clear application model, a visible sync status, a useful UI, and an easy way to show drift, reconciliation, and rollback to someone reviewing the platform.

The ownership boundary #

The first rule for this phase is that Pulumi and Argo CD should not fight over the same Kubernetes objects.

Right now the hello workload is created by the shared Pulumi component:

const hello = deployHelloApp("aws", environment, cluster.provider, [
  cluster,
  nodeGroup,
]);

That was fine for Phase 1. In Phase 2, that component needs to disappear from the cloud stacks or become a temporary bootstrap-only proof that is disabled once Argo CD is in place. The same namespace, deployment, and service cannot be managed by both Pulumi and Argo CD without creating a confusing source of truth.

The new boundary should be:

Pulumi:
  - EKS / GKE / AKS
  - node pools
  - kubeconfig outputs
  - Argo CD installation
  - minimal Argo CD bootstrap application

Argo CD:
  - namespaces
  - platform add-ons
  - application manifests
  - environment overlays

This keeps Pulumi responsible for the machinery needed to create and reach the clusters. After that, Argo CD takes over the Kubernetes state that should evolve through Git.
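
One way to keep that boundary honest is to write it down as data and check it. The snippet below is only a sketch: the `ownership` map and the `findOwnershipConflicts` helper are hypothetical names that do not exist in the repository, and the entries simply mirror the list above.

```typescript
// Hypothetical ownership map; the entries mirror the boundary described in the text.
type Owner = "pulumi" | "argocd";

const ownership: Record<Owner, string[]> = {
  pulumi: [
    "managed cluster (EKS / GKE / AKS)",
    "node pools",
    "kubeconfig outputs",
    "Argo CD installation",
    "Argo CD bootstrap application",
  ],
  argocd: [
    "namespaces",
    "platform add-ons",
    "application manifests",
    "environment overlays",
  ],
};

// Returns every item that is claimed by more than one owner.
function findOwnershipConflicts(map: Record<Owner, string[]>): string[] {
  const seen = new Map<string, Owner>();
  const conflicts: string[] = [];
  for (const owner of Object.keys(map) as Owner[]) {
    for (const item of map[owner]) {
      const previous = seen.get(item);
      if (previous !== undefined && previous !== owner) {
        conflicts.push(item);
      }
      seen.set(item, owner);
    }
  }
  return conflicts;
}
```

An empty result means no object class has two owners. If the hello namespace were ever added back to the Pulumi side, the helper would report it.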

Repository shape #

The repo also needs to start looking like the platform it is becoming.

Phase 1 only needed this:

infra/
  pulumi/
    aws/
    gcp/
    azure/
    components/

Phase 2 needs separate places for cluster infrastructure, platform configuration, and applications:

infra/
  pulumi/
    aws/
    gcp/
    azure/
    components/

platform/
  argocd/
    projects/
    applications/
    clusters/

apps/
  hello/
    base/
    overlays/
      aws/
      gcp/
      azure/

The hello app is still deliberately boring. It is not the final sample application. Eventually, I will need something more realistic: a frontend, an API, a datastore or stateful substitute, and health endpoints. But before building that, I want the delivery path to be correct. A tiny app is enough to prove that Argo CD can reconcile into all three clusters.

That gives a practical migration path:

Move the hello namespace, deployment, and service into apps/hello/base.
Add one overlay per cloud.
Install Argo CD into each cluster.
Register an Argo CD application for each cluster overlay.
Remove the Pulumi-managed hello component from the cloud stacks.
Prove that changing Git changes the cluster without a manual apply.

The base app should contain the shared Kubernetes intent:

apps/hello/base/
  namespace.yaml
  deployment.yaml
  service.yaml
  kustomization.yaml

The overlays should carry the cloud-specific identity:

apps/hello/overlays/
  aws/
    kustomization.yaml
  gcp/
    kustomization.yaml
  azure/
    kustomization.yaml

At first, the overlays may only add labels or name suffixes. That is enough. The point is to establish the pattern before there are real differences. Later, ingress annotations, DNS names, secret references, resource sizing, and cloud-specific integrations will have a clear place to live.
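
To make the "labels only" starting point concrete, here is a minimal sketch of what each overlay could contain, expressed as a TypeScript object so it stays in the same language as the Pulumi code. The `helloOverlay` helper and the `trinity.example/cloud` label key are assumptions for illustration. The real overlays are plain kustomization.yaml files and may look different.

```typescript
type Cloud = "aws" | "gcp" | "azure";

interface OverlayKustomization {
  apiVersion: "kustomize.config.k8s.io/v1beta1";
  kind: "Kustomization";
  resources: string[];
  labels: { pairs: Record<string, string>; includeSelectors: boolean }[];
}

// Builds the content of apps/hello/overlays/<cloud>/kustomization.yaml.
function helloOverlay(cloud: Cloud): OverlayKustomization {
  return {
    apiVersion: "kustomize.config.k8s.io/v1beta1",
    kind: "Kustomization",
    // Relative path from apps/hello/overlays/<cloud> to apps/hello/base.
    resources: ["../../base"],
    labels: [
      {
        // Hypothetical label key; any cloud-identifying label would do.
        pairs: { "trinity.example/cloud": cloud },
        // Leave selectors unchanged: a Deployment's selector is immutable once created.
        includeSelectors: false,
      },
    ],
  };
}

// JSON is a subset of YAML, so this string should be usable as file content as-is.
const awsOverlayJson = JSON.stringify(helloOverlay("aws"), null, 2);
```

Labels rather than name suffixes keep the object names identical across clouds, which makes the three clusters easier to compare.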

Bootstrapping Argo CD #

There is a small chicken-and-egg problem with GitOps: something has to install the GitOps controller before the GitOps controller can reconcile anything.

For this exercise, Pulumi should do that first installation. Each cloud stack already creates a Kubernetes provider. That provider can install Argo CD with the Helm chart after the cluster and node pool are ready.

The bootstrap component creates the namespace first, then installs the Argo CD Helm chart into it:

const namespaceName = "argocd";

const namespace = new k8s.core.v1.Namespace(
  `${rootApplicationName}-argocd-namespace`,
  {
    metadata: {
      name: namespaceName,
      labels,
    },
  },
  { provider, dependsOn },
);

const release = new k8s.helm.v3.Release(
  `${rootApplicationName}-argocd`,
  {
    chart: "argo-cd",
    version: "9.5.2",
    namespace: namespace.metadata.name,
    repositoryOpts: {
      repo: "https://argoproj.github.io/argo-helm",
    },
    values: {
      server: {
        service: {
          type: "ClusterIP",
        },
      },
    },
  },
  {
    provider,
    dependsOn: [namespace],
  },
);

The exact dependency differs by cloud. AWS should wait for the managed node group. GCP should wait for the node pool. Azure should wait for AKS. The intent is the same: install Argo CD only after the cluster can schedule pods.

After Argo CD exists, Pulumi should create the smallest useful bootstrap object: an Argo CD Application pointing at the platform Git repo. That application can then manage the rest of the platform and app definitions.

In the implementation, that bootstrap application points at the cloud-specific root:

const rootApplication = new k8s.apiextensions.CustomResource(
  rootApplicationName,
  {
    apiVersion: "argoproj.io/v1alpha1",
    kind: "Application",
    metadata: {
      name: rootApplicationName,
      namespace: namespace.metadata.name,
      annotations: {
        "pulumi.com/skipAwait": "true",
      },
      labels: {
        ...labels,
        "app.kubernetes.io/part-of": "trinity",
      },
    },
    spec: {
      project: "default",
      source: {
        repoURL: repositoryUrl,
        targetRevision: revision,
        path: `platform/argocd/clusters/${cloud}`,
      },
      destination: {
        server: "https://kubernetes.default.svc",
        namespace: namespace.metadata.name,
      },
      syncPolicy: {
        automated: {
          prune: true,
          selfHeal: true,
        },
        syncOptions: ["ApplyOutOfSyncOnly=true"],
      },
    },
  },
  {
    provider,
    dependsOn: [release],
    deleteBeforeReplace: true,
  },
);

There are two reasonable bootstrap options:

One Argo CD application per cluster overlay.
An app-of-apps root that points Argo CD at platform/argocd/clusters/<cloud>.

For this exercise, I prefer the second option. The app-of-apps pattern gives the platform a visible root. It also scales better when the next phases add ingress, observability, secrets, and policy. Argo CD gets one root application per cloud, and that root application points at the cloud-specific cluster root under platform/argocd/clusters.
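
For the app-of-apps chain to work, the cluster root has to contain at least one child Application. The sketch below shows roughly what the hello child could look like, again as a TypeScript object. The application name and overlay path follow the naming used in this post. The `helloApplication` helper, the `default` project, and the sync policy copied from the root application are assumptions, and the repository URL is left as a parameter.

```typescript
type Cloud = "aws" | "gcp" | "azure";

interface ArgoApplication {
  apiVersion: "argoproj.io/v1alpha1";
  kind: "Application";
  metadata: { name: string; namespace: string };
  spec: {
    project: string;
    source: { repoURL: string; targetRevision: string; path: string };
    destination: { server: string; namespace: string };
    syncPolicy: { automated: { prune: boolean; selfHeal: boolean } };
  };
}

// Builds the child Application that the root application would reconcile.
function helloApplication(
  cloud: Cloud,
  repoURL: string,
  revision: string,
): ArgoApplication {
  return {
    apiVersion: "argoproj.io/v1alpha1",
    kind: "Application",
    metadata: {
      name: `trinity-hello-${cloud}`,
      // Application objects live next to Argo CD itself.
      namespace: "argocd",
    },
    spec: {
      project: "default", // assumption: same project as the root application
      source: {
        repoURL,
        targetRevision: revision,
        path: `apps/hello/overlays/${cloud}`,
      },
      destination: {
        // In-cluster API server: each Argo CD instance manages its own cluster.
        server: "https://kubernetes.default.svc",
        namespace: "hello",
      },
      // Assumption: mirror the automated sync policy of the root application.
      syncPolicy: { automated: { prune: true, selfHeal: true } },
    },
  };
}
```

With this shape, adding a platform add-on later means adding another Application under the cluster root rather than touching Pulumi.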

AWS - Argo CD bootstrap #

The first Phase 2 run was AWS again.

After updating the AWS stack, Pulumi still owned the EKS cluster and managed node group. It also installed Argo CD into the cluster and created the bootstrap root application. The hello workload itself was no longer created by Pulumi.

The control plane came up in the argocd namespace:

KUBECONFIG=./kubeconfig.aws.yaml kubectl -n argocd get pods

NAME                                                              READY   STATUS    RESTARTS   AGE
trinity-dev-aws-root-argocd-c291b796-application-con-0            1/1     Running   0          2m42s
trinity-dev-aws-root-argocd-c291b796-applicationset-contronnrc8   1/1     Running   0          2m43s
trinity-dev-aws-root-argocd-c291b796-dex-server-5554bb9769272pg   1/1     Running   0          2m43s
trinity-dev-aws-root-argocd-c291b796-notifications-controlx8cf5   1/1     Running   0          2m43s
trinity-dev-aws-root-argocd-c291b796-redis-5fd7f5f6-5jvv5         1/1     Running   0          2m44s
trinity-dev-aws-root-argocd-c291b796-repo-server-b6f6bfbb966mdg   1/1     Running   0          2m44s
trinity-dev-aws-root-argocd-c291b796-server-ccdb554f6-f5vgq       1/1     Running   0          2m43s

Then Argo CD showed the app-of-apps chain in the cluster:

KUBECONFIG=./kubeconfig.aws.yaml kubectl -n argocd get applications

NAME                   SYNC STATUS   HEALTH STATUS
trinity-dev-aws-root   Synced        Healthy
trinity-hello-aws      Synced        Healthy

That is the important GitOps signal. trinity-dev-aws-root is the bootstrap application created by Pulumi. trinity-hello-aws is created by the root application from the manifests under platform/argocd/applications/aws, and it points at the AWS overlay under apps/hello/overlays/aws.

The workload itself appeared in Kubernetes with stable names from the Git manifests:

KUBECONFIG=./kubeconfig.aws.yaml kubectl -n hello get deployment,service,pods

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/hello   1/1     1            1           3m21s

NAME            TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)        AGE
service/hello   LoadBalancer   10.100.82.252   af251fa7c5d874caa8853c62919bfbc6-1554061041.us-east-1.elb.amazonaws.com   80:32580/TCP   3m21s

NAME                        READY   STATUS    RESTARTS   AGE
pod/hello-794f9b6d9-8r6pb   1/1     Running   0          3m21s

That small naming change is useful. In Phase 1, Pulumi generated names like trinity-dev-aws-hello-... for Kubernetes resources. In Phase 2, the GitOps-managed application owns the simple Kubernetes objects: deployment.apps/hello and service/hello in the hello namespace.

The load balancer answered through AWS:

curl -I http://af251fa7c5d874caa8853c62919bfbc6-1554061041.us-east-1.elb.amazonaws.com

HTTP/1.1 200 OK
Server: nginx/1.27.5
Date: Mon, 04 May 2026 10:23:14 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Wed, 16 Apr 2025 12:55:34 GMT
Connection: keep-alive
ETag: "67ffa8c6-267"
Accept-Ranges: bytes

That completes the AWS leg of Phase 2: EKS was still provisioned by Pulumi, Argo CD was bootstrapped by Pulumi, and the application state was reconciled from Git.

GCP - Argo CD bootstrap #

The GCP run followed the same path.

Pulumi created the GKE cluster, the explicit node pool, the Kubernetes provider, the Argo CD namespace, the Argo CD Helm release, and the root Argo CD application:

Type                                            Name                                   Status
+   pulumi:pulumi:Stack                             trinity-gcp-dev                        created (879s)
+   ├─ pulumi:providers:gcp                         gcp-provider                           created (0.27s)
+   ├─ gcp:container:Cluster                        trinity-dev-gcp-cluster                created (635s)
+   ├─ gcp:container:NodePool                       trinity-dev-gcp-nodepool               created (98s)
+   ├─ pulumi:providers:kubernetes                  trinity-dev-gcp-cluster-k8s-provider   created (0.25s)
+   ├─ kubernetes:core/v1:Namespace                 trinity-dev-gcp-root-argocd-namespace  created (1s)
+   ├─ kubernetes:helm.sh/v3:Release                trinity-dev-gcp-root-argocd            created (134s)
+   └─ kubernetes:argoproj.io/v1alpha1:Application  trinity-dev-gcp-root                   created (1s)

Outputs:
    argocdNamespace          : "argocd"
    argocdRootApplicationName: "trinity-dev-gcp-root"
    name                     : "trinity-dev-gcp-cluster"
    nodePoolNameOutput       : "trinity-dev-gcp-nodepool"

Resources:
    + 8 created

Duration: 14m41s

The Argo CD pods came up cleanly:

KUBECONFIG=./kubeconfig.gcp.yaml kubectl -n argocd get pods

NAME                                                              READY   STATUS    RESTARTS   AGE
trinity-dev-gcp-root-argocd-e61cb5ce-application-con-0            1/1     Running   0          4m29s
trinity-dev-gcp-root-argocd-e61cb5ce-applicationset-controg7tbh   1/1     Running   0          4m32s
trinity-dev-gcp-root-argocd-e61cb5ce-dex-server-7c7fdf547bxkhqg   1/1     Running   0          4m32s
trinity-dev-gcp-root-argocd-e61cb5ce-notifications-controlwbzc9   1/1     Running   0          4m32s
trinity-dev-gcp-root-argocd-e61cb5ce-redis-6c95dc6448-cph8h       1/1     Running   0          4m33s
trinity-dev-gcp-root-argocd-e61cb5ce-repo-server-6bbb97dbffnx6l   1/1     Running   0          4m32s
trinity-dev-gcp-root-argocd-e61cb5ce-server-5db94564c8-rcpdx      1/1     Running   0          4m31s

The app-of-apps status matched AWS:

KUBECONFIG=./kubeconfig.gcp.yaml kubectl -n argocd get applications

NAME                   SYNC STATUS   HEALTH STATUS
trinity-dev-gcp-root   Synced        Healthy
trinity-hello-gcp      Synced        Healthy

And the GitOps-managed hello workload had a public GCP load balancer IP:

KUBECONFIG=./kubeconfig.gcp.yaml kubectl -n hello get deployment,service,pods

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/hello   1/1     1            1           4m45s

NAME            TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
service/hello   LoadBalancer   34.118.231.69   35.222.146.79   80:30872/TCP   4m45s

NAME                        READY   STATUS    RESTARTS   AGE
pod/hello-794f9b6d9-7q4nx   1/1     Running   0          4m46s

The public endpoint answered:

curl -I http://35.222.146.79

HTTP/1.1 200 OK
Server: nginx/1.27.5
Date: Mon, 04 May 2026 10:58:32 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Wed, 16 Apr 2025 12:55:34 GMT
Connection: keep-alive
ETag: "67ffa8c6-267"
Accept-Ranges: bytes

That gives the second Phase 2 proof: GKE was created by Pulumi, Argo CD was bootstrapped by Pulumi, and the application state was reconciled from Git in the same shape as AWS.

Azure - Argo CD bootstrap #

Azure completed the third leg.

The Azure stack created the AKS cluster, then used the generated kubeconfig to install Argo CD and create the root application:

Outputs:
    argocdNamespace          : "argocd"
    argocdRootApplicationName: "trinity-dev-azure-root"
    name                     : "trinity-dev-azure-cluster"
    resourceGroupNameOutput  : "trinity-dev-azure-rg"

Resources:
    + 7 created

The kubeconfig output is deliberately omitted here because the AKS user credential includes client certificate, client key, and token material.

Argo CD came up in the cluster:

KUBECONFIG=./kubeconfig.azure.yaml kubectl -n argocd get pods

NAME                                                              READY   STATUS    RESTARTS   AGE
trinity-dev-azure-root-argocd-4c69f4f8-application-c-0            1/1     Running   0          9m15s
trinity-dev-azure-root-argocd-4c69f4f8-applicationset-contw6nmj   1/1     Running   0          9m16s
trinity-dev-azure-root-argocd-4c69f4f8-dex-server-597b5d8c2hbnq   1/1     Running   0          9m16s
trinity-dev-azure-root-argocd-4c69f4f8-notifications-contrhxpqm   1/1     Running   0          9m16s
trinity-dev-azure-root-argocd-4c69f4f8-redis-7b4b5b89db-mkk8s     1/1     Running   0          9m16s
trinity-dev-azure-root-argocd-4c69f4f8-repo-server-6658ddc72pkd   1/1     Running   0          9m16s
trinity-dev-azure-root-argocd-4c69f4f8-server-76b69cb86d-w8f7l    1/1     Running   0          9m16s

The root application and cloud-specific hello application were both reconciled:

KUBECONFIG=./kubeconfig.azure.yaml kubectl -n argocd get applications

NAME                     SYNC STATUS   HEALTH STATUS
trinity-dev-azure-root   Synced        Healthy
trinity-hello-azure      Synced        Healthy

The hello workload matched the other two clouds:

KUBECONFIG=./kubeconfig.azure.yaml kubectl -n hello get deployment,service,pods

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/hello   1/1     1            1           9m8s

NAME            TYPE           CLUSTER-IP   EXTERNAL-IP      PORT(S)        AGE
service/hello   LoadBalancer   10.0.34.55   52.249.202.216   80:31033/TCP   9m9s

NAME                        READY   STATUS    RESTARTS   AGE
pod/hello-794f9b6d9-x9z49   1/1     Running   0          9m9s

And the Azure load balancer responded:

curl -I http://52.249.202.216

HTTP/1.1 200 OK
Server: nginx/1.27.5
Date: Mon, 04 May 2026 11:32:14 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Wed, 16 Apr 2025 12:55:34 GMT
Connection: keep-alive
ETag: "67ffa8c6-267"
Accept-Ranges: bytes

That gives the third Phase 2 proof. EKS, GKE, and AKS now all follow the same ownership model: Pulumi creates the cluster and bootstraps Argo CD; Argo CD reconciles the application from Git.

Exit #

The exit criteria for Phase 2 are:

install Argo CD
move app deployment into GitOps
define environment overlays per cloud
changes applied from Git only
cluster state reconciled automatically

For this platform, that means Phase 2 is done when each cluster has Argo CD running and the hello workload is no longer created by Pulumi. The proof now exists for all three clouds:

KUBECONFIG=./kubeconfig.aws.yaml kubectl -n argocd get pods
KUBECONFIG=./kubeconfig.gcp.yaml kubectl -n argocd get pods
KUBECONFIG=./kubeconfig.azure.yaml kubectl -n argocd get pods

Then the application state should be visible through Argo CD:

argocd app list
argocd app get trinity-hello-aws
argocd app get trinity-hello-gcp
argocd app get trinity-hello-azure

And Kubernetes should still show the same basic service in each cloud:

KUBECONFIG=./kubeconfig.aws.yaml kubectl -n hello get deployment,service,pods
KUBECONFIG=./kubeconfig.gcp.yaml kubectl -n hello get deployment,service,pods
KUBECONFIG=./kubeconfig.azure.yaml kubectl -n hello get deployment,service,pods

The important difference is the path that created the workload. In Phase 1, the workload came from Pulumi. In Phase 2, the workload comes from Git through Argo CD.

That is the operating model the rest of the exercise depends on. Ingress, observability, secrets, policy, and rollout strategy all become easier to reason about once the clusters are already reconciling from Git.

CI/CD checkpoint #

The next practical improvement is to stop treating the laptop as the normal control plane for this exercise.

Local commands are still useful while building and recovering, but the shared path should move into GitHub Actions:

pull requests wait for the existing ci-pr-approval environment
approved pull requests run TypeScript checks, manifest validation, Kustomize rendering, and pulumi preview
merges to main start the deployment workflow
deployments wait for an infra-deploy-approval environment before pulumi up
cloud credentials come from GitHub OIDC instead of long-lived local credentials

That gives the project a cleaner operating model:

Pull request:
  environment approval
  CI checks
  Pulumi preview

Merge to main:
  environment approval
  cloud OIDC login
  pulumi up

The pull request workflow starts with the GitHub events and permissions needed for checks, PR comments, and OIDC credentials:

name: CI

on:
  pull_request:
  workflow_dispatch:

permissions:
  contents: read
  id-token: write
  pull-requests: write

concurrency:
  group: ci-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

The deploy workflow should still stay deliberately gated. A merge to main can start the deployment, but the environment approval is the point where a human confirms that the previewed infrastructure change should be applied. That keeps the happy path automated without letting every merge immediately mutate EKS, GKE, and AKS.

The first piece is a separate bootstrap stack:

infra/
  pulumi/
    bootstrap/

That stack creates the cloud identities GitHub Actions will use:

AWS: a GitHub Actions IAM role trusted through the account's GitHub OIDC provider
GCP: a Workload Identity Federation pool, provider, and service account
Azure: a user-assigned managed identity with federated credentials

The bootstrap stack is intentionally separate from the cluster stacks. It is run once with local cloud-admin credentials, then the outputs are copied into both GitHub environments as variables:

AWS_GITHUB_ACTIONS_ROLE_ARN
GCP_WORKLOAD_IDENTITY_PROVIDER
GCP_SERVICE_ACCOUNT
AZURE_CLIENT_ID
AZURE_TENANT_ID
AZURE_SUBSCRIPTION_ID

PULUMI_ACCESS_TOKEN remains a GitHub environment secret. Pulumi Cloud still owns the stack state; GitHub Actions only needs enough access to run previews and updates.

AWS had one small account-level constraint. IAM allows only one OIDC provider for https://token.actions.githubusercontent.com in an account. I already had one from another project, so the bootstrap stack has to reuse the existing provider and create only the Trinity-specific role. That is a useful reminder that some "bootstrap" resources are account primitives, not project primitives.
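
The Trinity-specific role then only needs a trust policy that points at the existing provider. The following is a sketch of such a policy, not the exact document the bootstrap stack creates. The `githubActionsTrustPolicy` helper, the account ID, and the repository name used with it are placeholders. Because both workflows run inside GitHub environments, the sketch restricts the `sub` claim to those environments.

```typescript
interface TrustPolicyInput {
  accountId: string; // AWS account that already has the GitHub OIDC provider
  repository: string; // "owner/repo"
  environments: string[]; // GitHub environments allowed to assume the role
}

const GITHUB_OIDC_HOST = "token.actions.githubusercontent.com";

// Builds an IAM trust policy that lets GitHub Actions jobs running in the
// given environments assume the role through the account's OIDC provider.
function githubActionsTrustPolicy(input: TrustPolicyInput) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: {
          // Reuses the existing account-level provider instead of creating one.
          Federated: `arn:aws:iam::${input.accountId}:oidc-provider/${GITHUB_OIDC_HOST}`,
        },
        Action: "sts:AssumeRoleWithWebIdentity",
        Condition: {
          StringEquals: {
            [`${GITHUB_OIDC_HOST}:aud`]: "sts.amazonaws.com",
            // Jobs that reference an environment get a subject of the form
            // repo:<owner>/<repo>:environment:<name>.
            [`${GITHUB_OIDC_HOST}:sub`]: input.environments.map(
              (env) => `repo:${input.repository}:environment:${env}`,
            ),
          },
        },
      },
    ],
  };
}
```

Called with `environments: ["ci-pr-approval", "infra-deploy-approval"]`, the policy would accept tokens only from jobs that have already passed one of the two approval gates.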

The CI workflow is a three-cloud matrix. Each pull request produces one job per cloud stack, and every job waits on the ci-pr-approval environment:

jobs:
  check-and-preview:
    name: Check and Preview ${{ matrix.cloud }}
    runs-on: ubuntu-latest
    environment:
      name: ci-pr-approval

    strategy:
      fail-fast: false
      matrix:
        include:
          - cloud: aws
            stack: maxgherman/trinity-aws/dev
            workdir: infra/pulumi/aws
          - cloud: gcp
            stack: maxgherman/trinity-gcp/dev
            workdir: infra/pulumi/gcp
          - cloud: azure
            stack: maxgherman/trinity-azure/dev
            workdir: infra/pulumi/azure

Inside each matrix job, the local checks and render checks are deliberately provider-neutral:

npm run check
npm run check:manifests
kubectl kustomize apps/hello/overlays/aws
kubectl kustomize apps/hello/overlays/gcp
kubectl kustomize apps/hello/overlays/azure
kubectl kustomize platform/argocd/clusters/aws
kubectl kustomize platform/argocd/clusters/gcp
kubectl kustomize platform/argocd/clusters/azure

Then the job authenticates to exactly one cloud and runs pulumi preview against that cloud's stack. That keeps the output readable: AWS, GCP, and Azure report independently, and a failure in one provider does not hide the result from the others.

- name: Configure AWS credentials
  if: matrix.cloud == 'aws'
  uses: aws-actions/configure-aws-credentials@v6
  with:
    aws-region: us-east-1
    role-to-assume: ${{ vars.AWS_GITHUB_ACTIONS_ROLE_ARN }}
    role-session-name: trinity-pulumi-preview-${{ github.run_id }}

- name: Configure GCP credentials
  if: matrix.cloud == 'gcp'
  uses: google-github-actions/auth@v3
  with:
    project_id: trinity-k8s
    workload_identity_provider: ${{ vars.GCP_WORKLOAD_IDENTITY_PROVIDER }}
    service_account: ${{ vars.GCP_SERVICE_ACCOUNT }}

- name: Configure Azure credentials
  if: matrix.cloud == 'azure'
  uses: azure/login@v2
  with:
    client-id: ${{ vars.AZURE_CLIENT_ID }}
    tenant-id: ${{ vars.AZURE_TENANT_ID }}
    subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

- name: Pulumi preview
  uses: pulumi/actions@v6
  with:
    command: preview
    stack-name: ${{ matrix.stack }}
    work-dir: ${{ matrix.workdir }}
    comment-on-pr: ${{ github.event_name == 'pull_request' }}
    github-token: ${{ secrets.GITHUB_TOKEN }}
  env:
    PULUMI_ACCESS_TOKEN: ${{ secrets.PULUMI_ACCESS_TOKEN }}

The deployment workflow uses the same matrix after merge. Pushes to main start pulumi up for all three cloud stacks, but the jobs wait at infra-deploy-approval before cloud credentials are configured. The same workflow can also be run manually with:

operation: up | destroy
cloud: all | aws | gcp | azure

That is encoded directly in the deployment workflow inputs:

on:
  push:
    branches:
      - main
    paths:
      - ".github/workflows/ci.yml"
      - ".github/workflows/pulumi-deploy.yml"
      - "infra/pulumi/aws/**"
      - "infra/pulumi/gcp/**"
      - "infra/pulumi/azure/**"
      - "infra/pulumi/components/**"
      - "apps/**"
      - "platform/**"
      - "package.json"
      - "package-lock.json"
      - "tsconfig.json"
  workflow_dispatch:
    inputs:
      operation:
        required: true
        default: up
        type: choice
        options:
          - up
          - destroy
      cloud:
        required: true
        default: all
        type: choice
        options:
          - all
          - aws
          - gcp
          - azure
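
The deploy job reads its command and matrix from a plan job that is not reproduced in this post. Its logic is small enough to sketch. The version below is an assumption about how such a job could resolve the inputs, written as a TypeScript function rather than the actual workflow step. The stack names and working directories are the ones from the CI matrix, and the `plan` function and its types are hypothetical.

```typescript
type Cloud = "aws" | "gcp" | "azure";
type Operation = "up" | "destroy";

interface MatrixEntry {
  cloud: Cloud;
  stack: string;
  workdir: string;
}

interface PlanInput {
  eventName: "push" | "workflow_dispatch";
  operation?: Operation; // only set for manual runs
  cloud?: Cloud | "all"; // only set for manual runs
}

const CLOUDS: Cloud[] = ["aws", "gcp", "azure"];

// Resolves the Pulumi command and the cloud matrix for one workflow run.
function plan(input: PlanInput): { command: Operation; matrix: MatrixEntry[] } {
  const manual = input.eventName === "workflow_dispatch";
  // A push to main always means "up" for every cloud; inputs apply to manual runs.
  const command: Operation = manual ? (input.operation ?? "up") : "up";
  const selected: Cloud | "all" = manual ? (input.cloud ?? "all") : "all";
  const clouds: Cloud[] = selected === "all" ? CLOUDS : [selected];
  const matrix = clouds.map((cloud) => ({
    cloud,
    stack: `maxgherman/trinity-${cloud}/dev`,
    workdir: `infra/pulumi/${cloud}`,
  }));
  return { command, matrix };
}
```

Ignoring the inputs on push events is deliberate in this sketch: a merge can never trigger a destroy, only a manual run can.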

The deploy job uses the selected operation and waits on the protected deployment environment:

deploy:
  name: Pulumi ${{ needs.plan.outputs.command }} ${{ matrix.cloud }}
  needs: plan
  runs-on: ubuntu-latest
  environment:
    name: infra-deploy-approval

  strategy:
    fail-fast: false
    matrix:
      include: ${{ fromJson(needs.plan.outputs.matrix) }}

  steps:
    - name: Render application overlay
      run: kubectl kustomize apps/hello/overlays/${{ matrix.cloud }} >/tmp/hello-${{ matrix.cloud }}.yaml

    - name: Render Argo CD root
      run: kubectl kustomize platform/argocd/clusters/${{ matrix.cloud }} >/tmp/argocd-${{ matrix.cloud }}.yaml

    - name: Pulumi ${{ needs.plan.outputs.command }}
      uses: pulumi/actions@v6
      with:
        command: ${{ needs.plan.outputs.command }}
        stack-name: ${{ matrix.stack }}
        work-dir: ${{ matrix.workdir }}
        args: --yes
      env:
        PULUMI_ACCESS_TOKEN: ${{ secrets.PULUMI_ACCESS_TOKEN }}

That manual destroy path matters. These clusters are exercises, not permanent shared infrastructure, and managed Kubernetes clusters cost money while they sit idle. Teardown should be as reproducible as creation.

GCP exposed the most useful CI/CD-specific failure. The first GitHub deployment created the GKE cluster, but failed when Pulumi tried to create Kubernetes resources:

getting credentials: exec: executable gke-gcloud-auth-plugin not found

The kubeconfig generated for GKE uses the gke-gcloud-auth-plugin executable. Local machines often have that installed already; fresh GitHub runners do not. The fix was to install the Google Cloud SDK component in the GCP preview and deploy jobs before Pulumi touches the Kubernetes API.
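
A cheap guard against this class of failure is a preflight check that fails with a clearer message before any Kubernetes resource is touched. This is a sketch of the idea, not something the repository does today. `findExecutable` and `requireGkeAuthPlugin` are hypothetical helpers, and the file-existence check is injected so the logic stays easy to test. The sketch assumes a POSIX-style PATH, which matches the ubuntu-latest runners.

```typescript
// Looks for an executable by name in a PATH-style string.
// `exists` is injected so the lookup can be tested without touching the disk.
function findExecutable(
  name: string,
  pathEnv: string,
  exists: (candidate: string) => boolean,
  separator = ":",
): string | undefined {
  for (const dir of pathEnv.split(separator)) {
    if (dir === "") continue;
    // Strip trailing slashes so "/usr/bin/" and "/usr/bin" behave the same.
    const candidate = `${dir.replace(/\/+$/, "")}/${name}`;
    if (exists(candidate)) return candidate;
  }
  return undefined;
}

// Throws a descriptive error when the GKE exec plugin is missing.
function requireGkeAuthPlugin(
  pathEnv: string,
  exists: (candidate: string) => boolean,
): string {
  const found = findExecutable("gke-gcloud-auth-plugin", pathEnv, exists);
  if (found === undefined) {
    throw new Error(
      "gke-gcloud-auth-plugin not found on PATH; install the Google Cloud SDK " +
        "component before running Pulumi against GKE",
    );
  }
  return found;
}
```

In a Node.js program, the two arguments would typically be `process.env.PATH ?? ""` and `fs.existsSync`.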

After that fix, the deployments completed from GitHub Actions. The GCP stack showed the expected shape:

gcp:container:Cluster                        trinity-dev-gcp-cluster                created
gcp:container:NodePool                       trinity-dev-gcp-nodepool               created
kubernetes:helm.sh/v3:Release                trinity-dev-gcp-root-argocd            created
kubernetes:argoproj.io/v1alpha1:Application  trinity-dev-gcp-root                   created

The local validation path then worked across all three clusters. Each kubeconfig could see its nodes, Argo CD applications, and the application namespace:

KUBECONFIG=./kubeconfig.aws.yaml kubectl get nodes
KUBECONFIG=./kubeconfig.gcp.yaml kubectl get nodes
KUBECONFIG=./kubeconfig.azure.yaml kubectl get nodes

KUBECONFIG=./kubeconfig.aws.yaml kubectl -n argocd get applications
KUBECONFIG=./kubeconfig.gcp.yaml kubectl -n argocd get applications
KUBECONFIG=./kubeconfig.azure.yaml kubectl -n argocd get applications

That finished the automation checkpoint. The normal path is now pull request, preview, approval, GitHub OIDC, and pulumi up.

The final cross-cluster GitOps check showed every Argo CD application synced and healthy:

cloud  root  hello
aws    ok    ok
gcp    ok    ok
azure  ok    ok
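
A summary like this can be derived from the `kubectl -n argocd get applications` output shown earlier. The parser below is a minimal sketch of that step, assuming the default three-column table and single-word status values. `parseApplications` and `allSyncedAndHealthy` are hypothetical helpers, not part of the repository.

```typescript
interface AppStatus {
  name: string;
  sync: string;
  health: string;
}

// Parses the default table printed by `kubectl -n argocd get applications`.
function parseApplications(output: string): AppStatus[] {
  return output
    .split("\n")
    .map((line) => line.trim())
    // Skip blank lines and the header row.
    .filter((line) => line !== "" && !line.startsWith("NAME"))
    .map((line) => {
      // Data rows are whitespace-separated: name, sync status, health status.
      const [name = "", sync = "", health = ""] = line.split(/\s+/);
      return { name, sync, health };
    });
}

// True only when at least one application exists and all are Synced and Healthy.
function allSyncedAndHealthy(apps: AppStatus[]): boolean {
  return (
    apps.length > 0 &&
    apps.every((app) => app.sync === "Synced" && app.health === "Healthy")
  );
}
```

Requiring at least one application avoids reporting "ok" for a cluster where the root application never appeared.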

With the GitOps and CI/CD path in place, the next checkpoint can move beyond hello by adding a workload that is useful enough to exercise traffic, health, observability, traces, and rollout behavior.

Source code #

Reference implementation