Deploying a New EKS Service: GitOps, Internal ALB, and Cloudflare DNS

A new Spring Boot service was healthy inside EKS but still needed a working path through an internal ALB, Kubernetes Ingress, Route53, and Cloudflare DNS. The service was not meant to be directly public. With the AWS Load Balancer Controller, an Ingress can configure an ALB to route HTTP or HTTPS traffic to workloads in the cluster.^[1]

The final health check looked like this:

https://app-api-dev.example.com/actuator/health/readiness

Response:

{"status":"UP"}

The checks below follow the dependency path from the container to public DNS.

Terms used here

Term	Meaning
EKS	AWS managed Kubernetes. The service runs as Kubernetes workloads inside an EKS cluster.
ECR	AWS container image registry. CI pushes the new service image there before GitOps deploys it.
ArgoCD	GitOps controller that applies Kubernetes manifests from a Git repository into the cluster.
Ingress	Kubernetes routing rule for HTTP host/path traffic. Here it maps the new host to the app service.
Internal ALB	An AWS Application Load Balancer with private IPs only. It is reachable from networks that have a path into its VPC, not directly from the public internet.^[2]
Cloudflare DNS	The authoritative DNS provider for the example domain in this note. DNS records had to be added there instead of only in Route53.

Existing service boundary

The backend system already had several services:

Service	Access pattern
Main API	Reached through a dev API domain
Workers	Internal to the cluster
New app service	Needed a separate dev API domain

The new service was an independent Spring Boot module on port 8082. The main API calls it inside the cluster:

http://app-service:8082

So the first step was not DNS. It was to get the service running inside EKS behind a stable Kubernetes Service name.

Step 1: Add the ECR Repository

CI needs somewhere to push the new image, so the dev ECR repository list gained one more repository:

ecr_repos = [
  "dev-api",
  "dev-app-service",
  "dev-worker-core",
  "dev-worker-growth"
]

Then run a plan against the dev ECR stack:

terragrunt plan -no-color

The useful result was:

Plan: 2 to add, 0 to change, 0 to destroy.

That proved the Terraform change was additive: one ECR repository and its lifecycle policy, with no changes to existing services.
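The same check can be automated in CI. The sketch below is an illustration rather than part of the original pipeline: the function name `plan_is_additive` is hypothetical, and it assumes the plan was run with `-no-color` so the summary line contains no ANSI escape codes.

```python
import re

# Matches the counts in the summary line printed at the end of a plan,
# for example "Plan: 2 to add, 0 to change, 0 to destroy."
PLAN_COUNTS = re.compile(r"(\d+) to add, (\d+) to change, (\d+) to destroy")


def plan_is_additive(plan_output: str) -> bool:
    """Return True when the plan adds resources and changes or destroys none."""
    match = PLAN_COUNTS.search(plan_output)
    if match is None:
        # No summary line (for example "No changes."), so nothing is added.
        return False
    added, changed, destroyed = (int(n) for n in match.groups())
    return added > 0 and changed == 0 and destroyed == 0
```

A CI job could pipe the plan output into this function and stop before `apply` when it returns False.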

After that, apply it:

terragrunt apply -auto-approve -no-color

Step 2: Add the GitOps Deployment Directory

The environment already used ArgoCD and Kustomize, so the new service followed the same layout:

dev/app-service/
  application.yaml
  external-secret.yaml
  k8s-app-service-dev.yaml
  kustomization.yaml

The Deployment had to match the service port:

containers:
  - name: app-service
    image: <ecr>/dev-app-service:dev-REPLACE_ME
    ports:
      - containerPort: 8082
    readinessProbe:
      httpGet:
        path: /actuator/health/readiness
        port: 8082

The Kubernetes Service kept the same DNS name the main API expected:

apiVersion: v1
kind: Service
metadata:
  name: app-service
spec:
  ports:
    - port: 8082
      targetPort: 8082

Do not expose the Service on port 80 here just because 80 is the conventional HTTP port. The application configuration calls:

http://app-service:8082

If the Service only exposes 80, the in-cluster call will fail even though the Pod is running.
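One way to catch this mismatch before deploying is to compare the Service manifest with the URL the caller is configured to use. This is a minimal sketch: `service_exposes_url_port` is a hypothetical helper, the manifest is assumed to be already loaded into a dict, and only the short in-namespace name (`app-service`) is handled, not the fully qualified cluster DNS form.

```python
from urllib.parse import urlsplit


def service_exposes_url_port(service: dict, url: str) -> bool:
    """Check that a Service manifest exposes the port used by a caller's URL."""
    parts = urlsplit(url)
    # Fall back to the scheme default when the URL has no explicit port.
    port = parts.port or {"http": 80, "https": 443}.get(parts.scheme)
    same_name = service["metadata"]["name"] == parts.hostname
    exposed = {entry["port"] for entry in service["spec"]["ports"]}
    return same_name and port in exposed


# The Service from this step, written as a dict for illustration.
service = {
    "metadata": {"name": "app-service"},
    "spec": {"ports": [{"port": 8082, "targetPort": 8082}]},
}
```

With the manifest above, `service_exposes_url_port(service, "http://app-service:8082")` is True, while the same call against a Service that only exposes port 80 is False.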

Step 3: Reuse the Existing Backend Configuration

The first idea was to create a separate Secrets Manager entry for the new service:

dev/app-service/application-secrets.yaml

After checking the Java configuration, the better choice was to reuse the same runtime configuration used by the main API. The ExternalSecret still creates a separate Kubernetes Secret, but its remoteRef points to the existing backend configuration:

apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: app-service-secrets
spec:
  target:
    name: app-service-secrets
  data:
    - secretKey: application.yaml
      remoteRef:
        key: dev/api/application-secrets.yaml

The app service reads the same database variables:

spring:
  datasource:
    url: ${DB_URL}
    username: ${DB_USERNAME}
    password: ${DB_PASSWORD}

This avoids duplicating environment-specific configuration and reduces the chance that the API and the new service drift apart.

Step 4: Update GitLab CI

The new module existed in the code repository, but CI only built the main API and worker images. Three updates were needed:

1. Add an app service ECR repository variable.
2. Add Docker build and push steps.
3. Update the GitOps image tag for dev/app-service/kustomization.yaml.

The CI change looked like this:

variables:
  DEV_ECR_APP_SERVICE_REPO: "dev-app-service"

build_push:
  script:
    # A literal block scalar lets the shell, not YAML, handle the line continuation.
    - |
      docker build -f app-service-app/Dockerfile \
        -t "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$DEV_ECR_APP_SERVICE_REPO:$TAG" .
    - docker push "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$DEV_ECR_APP_SERVICE_REPO:$TAG"

The GitOps tag update also had to include the new path:

update_newtag(
  f"{prefix}/app-service/kustomization.yaml",
  os.environ["APP_SERVICE_IMAGE"],
  tag
)

Without this, the ECR repository and Kubernetes manifests can be correct, but no new image will actually be deployed.
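The body of `update_newtag` is not shown in this note. As a rough illustration of what such a helper has to do, the sketch below rewrites the `newTag` value for one image in the text of a kustomization file. The name `set_new_tag` is hypothetical, and it assumes the usual `images:` layout with an unquoted `- name:` line followed by `newTag:`.

```python
def set_new_tag(kustomization: str, image: str, tag: str) -> str:
    """Return the kustomization text with newTag replaced for one image."""
    lines = kustomization.splitlines()
    in_entry = False
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("- name:"):
            # A new list entry starts; remember whether it is the target image.
            in_entry = stripped.split(":", 1)[1].strip() == image
        elif in_entry and stripped.startswith("newTag:"):
            indent = line[: len(line) - len(line.lstrip())]
            lines[i] = f"{indent}newTag: {tag}"
            replaced = True
    if not replaced:
        # Failing loudly avoids a pipeline that "succeeds" without deploying.
        raise ValueError(f"no newTag entry found for image {image!r}")
    return "\n".join(lines) + "\n"
```

Raising when nothing matches is a deliberate choice here: a silent no-op is exactly the failure mode described above, where every manifest is correct but no new image is deployed.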

Step 5: Create the ArgoCD Application

The app-of-apps file got one more ArgoCD Application:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: dev-app-service
spec:
  source:
    repoURL: <gitops-repo>
    targetRevision: HEAD
    path: dev/app-service
  destination:
    server: https://kubernetes.default.svc
    namespace: dev
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

After applying it, check ArgoCD and Kubernetes:

kubectl -n argocd get application dev-app-service
kubectl -n dev get deploy,svc,pod -l app=app-service -o wide

The expected state:

dev-app-service   Synced   Healthy

deployment/app-service   1/1
pod/app-service-...       Running
service/app-service       8082/TCP
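For a scripted version of the ArgoCD check, the Application resource can be read with `-o json` and inspected. This is a sketch: `application_is_synced_and_healthy` is a hypothetical helper, and it relies on the `status.sync.status` and `status.health.status` fields of the Application resource.

```python
def application_is_synced_and_healthy(app: dict) -> bool:
    """Check an ArgoCD Application (parsed from `kubectl ... -o json`)."""
    status = app.get("status", {})
    sync = status.get("sync", {}).get("status")
    health = status.get("health", {}).get("status")
    return sync == "Synced" and health == "Healthy"
```

The input would come from `kubectl -n argocd get application dev-app-service -o json`, parsed with `json.loads`.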

Also check ExternalSecret:

kubectl -n dev get externalsecret app-service-secrets

Expected:

SecretSynced   True
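The `True` in that output is the ExternalSecret's `Ready` condition, and `SecretSynced` is its reason. A script can read the same condition from the JSON form of the resource. The helper name below is hypothetical, and the sketch assumes the resource was fetched with `kubectl -n dev get externalsecret app-service-secrets -o json`.

```python
def external_secret_is_ready(resource: dict) -> bool:
    """Return True when the ExternalSecret reports a Ready condition of True."""
    for condition in resource.get("status", {}).get("conditions", []):
        if condition.get("type") == "Ready":
            # Kubernetes condition statuses are the strings "True" or "False".
            return condition.get("status") == "True"
    # No Ready condition yet, for example right after creation.
    return False
```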

At this point the service is working inside the cluster.

Step 6: Expose a New Host Through the ALB

The existing dev API already used an internal ALB. The new service was added to the same Ingress with a new host rule:

rules:
  - host: api-dev.example.com
    http:
      paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: api
              port:
                number: 80

  - host: app-api-dev.example.com
    http:
      paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: app-service
              port:
                number: 8082
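To make the host-based routing concrete, the sketch below looks up which backend a request would reach for a given host and path. It is a simplified model of `Prefix` matching (segment by segment, longest prefix wins), not the controller's actual rule evaluation; `backend_for_host` and `RULES` are names introduced only for this example.

```python
def _segments(path: str) -> list:
    """Split a URL path into its non-empty segments."""
    return [part for part in path.split("/") if part]


def backend_for_host(rules: list, host: str, path: str = "/"):
    """Return (service name, port) for the longest matching Prefix path."""
    best = None
    for rule in rules:
        if rule.get("host") != host:
            continue
        for entry in rule["http"]["paths"]:
            if entry.get("pathType") != "Prefix":
                continue  # Exact and ImplementationSpecific are out of scope.
            prefix = _segments(entry["path"])
            if _segments(path)[: len(prefix)] == prefix:
                if best is None or len(prefix) > best[0]:
                    service = entry["backend"]["service"]
                    best = (len(prefix), service["name"], service["port"]["number"])
    return None if best is None else (best[1], best[2])


def _rule(host: str, name: str, port: int) -> dict:
    """Build one host rule with a single "/" Prefix path."""
    backend = {"service": {"name": name, "port": {"number": port}}}
    return {"host": host, "http": {"paths": [
        {"path": "/", "pathType": "Prefix", "backend": backend}]}}


# The two host rules from the Ingress above.
RULES = [
    _rule("api-dev.example.com", "api", 80),
    _rule("app-api-dev.example.com", "app-service", 8082),
]
```

Here `backend_for_host(RULES, "app-api-dev.example.com")` returns `("app-service", 8082)`, and an unknown host returns `None`.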

After applying it:

kubectl -n dev describe ingress dev-apps

The Rules section should include:

app-api-dev.example.com
  /   app-service:8082 (<pod-ip>:8082)

Then check the ALB target group:

aws elbv2 describe-target-health \
  --target-group-arn <app-service-target-group-arn>

Expected:

State: healthy

This confirmed the Ingress rule, ALB listener rule, target group, and Pod readiness.
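When there are several targets, it helps to list only the ones that are not healthy. The following sketch works on the JSON that `aws elbv2 describe-target-health` prints; `unhealthy_targets` is a hypothetical helper name.

```python
def unhealthy_targets(response: dict) -> list:
    """Return (target id, port, state) for every target that is not healthy."""
    bad = []
    for description in response.get("TargetHealthDescriptions", []):
        state = description["TargetHealth"]["State"]
        if state != "healthy":
            target = description["Target"]
            bad.append((target["Id"], target.get("Port"), state))
    return bad
```

An empty list does not by itself mean the rule works: a target group with no registered targets also produces an empty list, so it is worth checking that the response contains at least one target.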

Step 7: The DNS Record Was Added in the Wrong Place

DNS was the easy part to misread.

At first, a record was added in Route53:

app-api-dev.example.com CNAME <internal-alb-dns-name>

But local access still failed:

Could not resolve host

Check the name:

dig +short app-api-dev.example.com

No result.

At that point, do not keep debugging ALB or Kubernetes. Check the authoritative name servers for the root domain:

dig +short NS example.com

The result showed that the domain was delegated to Cloudflare, not Route53.

In practice, a same-name hosted zone in Route53 does not necessarily control the real domain. Public DNS follows the authoritative NS delegation.
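The NS answer can be classified mechanically. This sketch assumes the common naming patterns for the two providers (`*.ns.cloudflare.com` for Cloudflare and `ns-N.awsdns-NN.<tld>` for Route53); `dns_provider` is a hypothetical helper, and custom or vanity name servers would come back as "unknown".

```python
def dns_provider(nameservers: list) -> str:
    """Guess the authoritative DNS provider from `dig +short NS` output lines."""
    hosts = [ns.strip().rstrip(".").lower() for ns in nameservers if ns.strip()]
    if hosts and all(h.endswith(".ns.cloudflare.com") for h in hosts):
        return "cloudflare"
    if hosts and all(".awsdns-" in h for h in hosts):
        return "route53"
    return "unknown"
```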

The correct fix was to add the record in Cloudflare:

Type: CNAME
Name: app-api-dev
Target: <internal-alb-dns-name>
Proxy status: DNS only

The record was kept DNS only. This setup had no Cloudflare Tunnel or other private-network integration that could carry proxy traffic into the VPC.

Step 8: An Internal ALB Still Requires Internal Network Access

After DNS was fixed:

dig +short app-api-dev.example.com

The result looked like:

<internal-alb-dns-name>.
10.x.x.x
10.x.x.x

Seeing 10.x.x.x is expected because an internal load balancer resolves to private addresses.^[2]

Callers still need VPN, VPC access, or another network path to those private addresses. DNS resolution alone does not provide access to the VPC.^[2]
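A quick way to tell "DNS works but the addresses are private" from "DNS returns public addresses" is to classify the `dig +short` lines. The sketch below uses the standard-library `ipaddress` module and the RFC 1918 ranges; `split_dig_answers` is a name introduced for this example.

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
RFC1918 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def split_dig_answers(lines: list) -> tuple:
    """Split `dig +short` output into (private, public) address lists."""
    private, public = [], []
    for line in lines:
        try:
            address = ipaddress.ip_address(line.strip())
        except ValueError:
            continue  # CNAME targets such as the ALB DNS name are not IPs.
        if any(address in network for network in RFC1918):
            private.append(str(address))
        else:
            public.append(str(address))
    return private, public
```

If every address lands in the private list, a timeout from outside the VPC points to a missing network path, not to a DNS problem.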

Step 9: Separate 200, 401, and Network Failures

Final health check:

curl -i https://app-api-dev.example.com/actuator/health/readiness

Response:

HTTP/2 200

Body:

{"status":"UP"}

Root path:

curl -i https://app-api-dev.example.com/

Response:

HTTP/2 401

That is not a service outage. It means the request reached the application and authentication is required.

Symptom	Meaning
`Could not resolve host`	DNS is not resolving
`Connection timed out`	Network path to the internal ALB is unavailable
`502`	ALB cannot reach the backend target, or the app is failing
`401`	Request reached the app, but authentication is required
`/actuator/health/readiness` returns `UP`	Backend readiness is healthy
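The table can be expressed as a small triage function. This is only a sketch of the mapping above: `diagnose` and its return strings are hypothetical, and the error text is assumed to be the message curl prints.

```python
def diagnose(status=None, error=None) -> str:
    """Map a curl error message or HTTP status to the layer to investigate."""
    if error:
        text = error.lower()
        if "could not resolve host" in text:
            return "dns"
        if "timed out" in text:
            return "network path to the internal ALB"
    if status == 502:
        return "backend target or application"
    if status == 401:
        return "authentication"
    if status == 200:
        return "ok"
    return "unclassified"
```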

Troubleshooting order

Do not start with browser symptoms and guess. A better order is:

1. Check Pod, Service, and ExternalSecret.
2. Check whether Ingress generated the ALB rule.
3. Check whether the target group is healthy.
4. Check whether the certificate covers the new host.
5. Check the authoritative DNS and the actual record.

When Route53 and Cloudflare both exist, confirm which provider is authoritative before adding or debugging records. A record in Route53 has no effect on public resolution if the domain is delegated to Cloudflare.

Verified request path

The working path became:

browser
  -> Cloudflare DNS (DNS only)
  -> internal ALB
  -> Kubernetes Ingress host rule
  -> app-service Service:8082
  -> app-service Pod:8082

Health check:

{"status":"UP"}

At that point the backend path is working. If a business endpoint returns 401, the next investigation should be authentication, tokens, or authorization scope, not ALB or DNS.

References

[1] Amazon Web Services, “Route application and HTTP traffic with Application Load Balancers.”

[2] Amazon Web Services, “How Elastic Load Balancing works.”
