Your pipeline is a wall of red, failed with “dial tcp: lookup gitlab.com: no such host” right after you pushed a critical hotfix at 2 AM, and your CI/CD is dead in the water. You’re mad at Docker DNS because it worked fine 20 minutes ago, and you’ve wasted the night searching for “gitlab runner dial tcp lookup no such host dns error” and wading through half-baked forum posts that don’t help. This is the post I wish I had found that night.
Quick Summary
- With the Docker executor, the most common cause is that the container’s /etc/resolv.conf does not contain reachable nameservers.
- Three things commonly cause this:
  - The Docker daemon overriding the default DNS.
  - Alpine’s musl libc resolver.
  - Network namespace isolation in the runner.
- Fixes include setting DNS explicitly in the daemon configuration or the runner’s config.toml, or injecting extra hosts.
- Corporate proxy configurations and broken bridge networks are other potential causes.
- A quick health-check stage in your pipeline prevents late-night surprises.
Understanding the gitlab runner dial tcp lookup no such host dns error
The GitLab runner can’t resolve the hostname, so every git clone, apt-get, and Docker registry pull times out, burying a simple DNS problem under generic timeout errors.
Identifying the Pipeline Network Timeout
When you examine the job’s log, you’ll find lines resembling these:
Running with gitlab-runner 15.10.0~beta.1 (abc123)
on docker-executor xyz-789
Preparing the "docker" executor
Using Docker executor with image alpine:latest ...
Pulling docker image alpine:latest ...
ERROR: Job failed: execution took longer than 1h0m0s seconds
... stderr ...
dial tcp: lookup gitlab.com on 192.168.1.1:53: no such host <-- This line
That single red line is the real clue. The container tried to use a nameserver (most likely inherited from the host) and failed, either because that DNS server is unreachable from inside the Docker network or because it holds no record for the requested domain. And since nothing in the pipeline can start without name resolution, everything downstream fails.
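A quick way to test that theory is to point a throwaway container at the very nameserver the log names. This is a sketch: 192.168.1.1 is the address from the example log above, so substitute whatever yours reports.
$ docker run --rm alpine:latest nslookup gitlab.com 192.168.1.1
If this hangs or returns “no such host” while the same command works on the host, that nameserver is unreachable from inside Docker’s network.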
Prerequisites for Debugging
Before you inspect the runner at all, confirm that DNS resolution works on the host machine itself; that’s your baseline.
$ nslookup gitlab.com 8.8.8.8
Server: 8.8.8.8
Address: 8.8.8.8#53
Non-authoritative answer:
Name: gitlab.com
Address: 172.65.251.78
If DNS resolves from the host but not from containers, the problem lies in Docker’s network layer, not in upstream DNS. That narrows the search considerably.
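The container-side counterpart of that check, using a public resolver to take your local DNS out of the equation (assuming outbound UDP/53 is allowed from your network):
$ docker run --rm alpine:latest nslookup gitlab.com 8.8.8.8
If the host resolves but this fails, the container can’t reach even a public nameserver, which points squarely at Docker’s bridge or NAT setup.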
Root Causes of DNS Failures in Docker Executors
Containers started by Docker run in their own isolated network namespace. That isolation is good for security, but it means DNS resolution inside a container depends entirely on the configuration the daemon plumbs in, or a stripped-down version of it.
Default Docker Daemon resolv.conf Override Rules
By default, the Docker daemon copies the host’s /etc/resolv.conf into each container, but filters out any nameservers the container cannot reach, loopback addresses in particular. If your only nameserver is your local router (say 192.168.1.1) and the container can’t route to it, nothing usable survives the filtering: Docker then falls back to its own defaults, or, on user-defined networks, hands the container its embedded stub resolver. What does the container actually receive?
$ docker run --rm alpine:latest cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.0.11
options ndots:0
Here the container has only Docker’s embedded stub resolver (127.0.0.11), which forwards queries to whatever upstream servers the daemon knows about; if those upstreams are unreachable, every lookup inside the container dies. The Alpine quirk covered next compounds the problem.
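To see the stub resolver in action, you can create a throwaway user-defined network (dns-demo is just an illustrative name); Docker’s embedded DNS at 127.0.0.11 is always present on user-defined networks and forwards to the daemon’s upstream servers:
$ docker network create dns-demo
$ docker run --rm --network dns-demo alpine:latest nslookup gitlab.com 127.0.0.11
$ docker network rm dns-demo
If the daemon’s upstreams are dead, this lookup fails even though the stub resolver itself answers.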
The Alpine Image DNS Resolution CI/CD Quirk
If you receive a DNS lookup failure with your Alpine image, it is likely due to musl libc.
musl’s resolver behaves differently from glibc’s: it queries all configured nameservers in parallel, ignores several resolv.conf options that glibc honors, skips /etc/nsswitch.conf entirely, and (before musl 1.2.4) had no TCP fallback for truncated responses. Any of these differences can turn a marginal stub-resolver setup into hard failures and timeouts. Take a job like the following:
build:
  image: alpine:latest
  script:
    - apk add --no-cache curl
    - curl https://gitlab.example.com/api/v4/projects
On Alpine, this job may hang or fail on DNS where the identical job on a debian:bullseye image succeeds. This Alpine DNS resolution quirk continues to plague CI/CD pipelines.
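One way to compare the two resolvers against identical Docker DNS plumbing is getent, which uses each image’s own libc resolver (a sketch; both images ship a getent binary):
$ docker run --rm alpine:latest getent hosts gitlab.com          # musl
$ docker run --rm debian:bullseye-slim getent hosts gitlab.com   # glibc
When only the Alpine line fails, you’re looking at the musl quirk rather than a dead nameserver.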
What didn’t work for me
I first tried baking a nameserver into the image by writing “nameserver 8.8.8.8” into /etc/resolv.conf from a line in my Dockerfile. Dead end: Docker mounts a fresh, daemon-managed /etc/resolv.conf into every container at start, so anything written at build time is silently discarded.
I also tried --dns=8.8.8.8 with docker run directly on the host. That fixed one-off tests but had no effect on the containers the runner launches for jobs, since the runner starts those through the Docker API with its own settings. The problem has to be solved at the daemon or runner configuration level.
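You can see why the Dockerfile approach was doomed: resolv.conf is a mount managed by the daemon, not a file that belongs to the image.
$ docker run --rm alpine:latest grep resolv /proc/mounts
The mount entry this prints shadows whatever was baked into the image layer.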
Step-by-Step Resolution Paths for Host Lookup Failures
Now for the configuration changes that give the Docker executor consistent, predictable name resolution during builds.
Modifying GitLab CI Docker Executor DNS Settings
To have every Docker container receive the same nameserver configuration, the broadest fix is to set the DNS servers in the Docker daemon itself. Edit /etc/docker/daemon.json in an editor of your choice (create the file if it doesn’t exist, and use sudo if you aren’t root):
{
  "dns": ["8.8.8.8", "1.1.1.1"],
  "dns-search": ["default.svc.cluster.local"]
}
Once you’ve saved the file, restart Docker with sudo systemctl restart docker. Every container the daemon starts from then on uses these DNS servers, regardless of what the host resolver would have handed out. This is the right choice when all of your runners need the same DNS; if they don’t, set the values per runner in config.toml instead, per the GitLab Runner advanced configuration docs.
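A quick sanity check that the change took effect, assuming the daemon.json above; a fresh container on the default bridge should now list the daemon-level nameservers:
$ sudo systemctl restart docker
$ docker run --rm alpine:latest cat /etc/resolv.conf
You should see 8.8.8.8 and 1.1.1.1 (plus the search domain) rather than whatever the host resolver uses.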
Updating gitlab-runner config.toml dns_search Parameters
In some cases you need different DNS settings per runner, for example when runners launch containers in multiple network zones. Edit /etc/gitlab-runner/config.toml and add dns and dns_search arrays to the [runners.docker] section:
[[runners]]
  name = "docker-runner"
  url = "https://gitlab.example.com/"
  token = "abcdef1234567890"
  executor = "docker"
  [runners.docker]
    image = "alpine:latest"
    dns = ["8.8.8.8", "1.1.1.1"]
    dns_search = ["internal.mycompany.com"]
These parameters write the listed DNS servers and search domains directly into each job container’s resolv.conf, overriding whatever the daemon would otherwise supply. I personally prefer this approach: it keeps the daemon’s default configuration uncluttered while still giving you granular, per-runner control over DNS.
Restarting and Verifying the Runner Service
After changing config.toml (or any runner configuration file), restart the runner and verify it’s healthy:
$ sudo gitlab-runner restart
$ sudo gitlab-runner verify
Runtime platform arch=amd64 os=linux pid=2341 revision=abc123 version=15.10.0
Verifying runner... is alive runner=docker-runner
Once the runner reports healthy, run a test pipeline: a job that just runs nslookup gitlab.com should return a valid IP address instead of a timeout.
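If you’d rather not push a commit just to test, you can approximate the runner’s behavior from a shell on the runner host; --dns and --dns-search are the docker run equivalents of the config.toml keys above:
$ docker run --rm --dns 8.8.8.8 --dns 1.1.1.1 --dns-search internal.mycompany.com alpine:latest nslookup gitlab.com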
Handling Edge Cases and Network Namespace Quirks
Even if you have set the correct DNS servers for your runners, there are still situations where Docker networks can become stale or broken.
Resolving Network Namespace Issues in GitLab Runner Environments
Sometimes the runner’s default bridge network (docker0) goes stale or broken, taking DNS resolution down with it. Inspect the bridge to check its state:
$ docker network inspect bridge
...
"IPAM": {
    "Config": [
        {
            "Subnet": "172.17.0.0/16",
            "Gateway": "172.17.0.1"
        }
    ]
},
"Options": {
    "com.docker.network.bridge.name": "docker0",
    "com.docker.network.driver.mtu": "1500"
}
Notice there are no DNS entries in the inspect output; the bridge network itself is not a DNS server. If containers on the bridge can’t even ping the gateway, no nameserver reached through that bridge will work either. The fix: prune old networks with docker network prune and let the runner create a fresh network stack. If you run a custom bridge with its own subnet, also verify that no iptables rules block traffic from the bridge to the nameserver you configured.
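A short recovery sequence under those assumptions (172.17.0.1 is the gateway from the inspect output above; prune only removes networks with no attached containers):
$ docker run --rm alpine:latest ping -c 2 172.17.0.1
$ docker network prune -f
$ sudo systemctl restart docker
If the ping fails before the prune and succeeds after, a stale network stack was your culprit.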
Corporate Proxy and Custom DNS Server Conflicts
Locked-down corporate environments often force all traffic through an HTTP forward proxy. Typically you set the relevant environment variables under the GitLab project > Settings > CI/CD > Variables:
HTTP_PROXY=http://proxy.corp.local:3128
HTTPS_PROXY=http://proxy.corp.local:3128
NO_PROXY=localhost,127.0.0.1,.internal.corp
This routes your HTTP traffic through the corporate proxy, but DNS resolution does not pass through an HTTP proxy. If your corporate DNS server blocks external forwarding, the container’s resolv.conf must point at that internal DNS server: add a dns entry in config.toml plus a dns_search entry for your corporate domain, so that the proxy’s hostname itself resolves to an IP address.
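Before blaming the proxy, check that the proxy hostname actually resolves against your internal DNS server. A sketch, with 10.0.0.53 standing in for your corporate resolver’s real address:
$ nslookup proxy.corp.local 10.0.0.53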
Proactive Prevention and Infrastructure Best Practices
Rather than reactively putting out fires, create resilience in your configuration from the start.
Hardcoding Host Aliases for Resilience
For destinations that absolutely must resolve, add the hostname as an extra host in your config.toml, bypassing DNS entirely.
[runners.docker]
  extra_hosts = ["gitlab.example.com:192.168.100.50"]
The container’s /etc/hosts then carries a static entry, so clones and pushes to your GitLab instance or private container registry survive even a total DNS outage. This works best when your self-hosted GitLab sits on a static IP.
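To preview exactly what the runner will inject, --add-host is the docker run equivalent of extra_hosts (same hostname and IP as the config above):
$ docker run --rm --add-host gitlab.example.com:192.168.100.50 alpine:latest cat /etc/hosts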
Automating DNS Health Checks in Pipelines
Create a minimal stage that checks DNS resolution before any expensive work begins, so failures surface in seconds rather than after a long build:
stages:
  - preflight
  - build

dns-check:
  stage: preflight
  script:
    - nslookup gitlab.example.com || (echo "DNS borked" && exit 1)
    - nslookup registry.example.com
If preflight fails, the pipeline halts immediately with a clear message instead of a confusing timeout twenty minutes later. This has saved me from more than one 2 a.m. wake-up call.
Frequently Asked Questions
Why does this DNS error only happen with Alpine Linux base images?
Alpine uses musl libc, whose resolver behaves differently from glibc’s: it queries all nameservers in parallel, ignores several resolv.conf options, and historically lacked TCP fallback for truncated responses. Behind Docker’s stub resolver (127.0.0.11), those differences can turn a slow or slightly misconfigured upstream into hard failures. glibc-based images like Debian tolerate the same setup, which is why they appear unaffected.
How do I force the Docker executor to use my host machine’s DNS?
Set the dns option in config.toml to your host’s upstream nameserver IPs, taken from /etc/resolv.conf; note that if the host runs systemd-resolved, the 127.0.0.53 stub listed there is unreachable from inside a container, so you need the real upstream addresses. Alternatively, network_mode = "host" gives job containers the host’s exact network stack and DNS, but it removes network isolation and is generally discouraged.
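On systemd-resolved hosts, you can pull out the real upstream servers hiding behind the 127.0.0.53 stub (a sketch for such hosts):
$ resolvectl status | grep -A2 'DNS Servers'
Feed those upstream addresses, not 127.0.0.53, into the dns array.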
Will modifying network modes in Docker fix runner resolution drops?
Switching the runner’s network mode to “bridge” or “host” changes how the container sees the network, but it won’t magically fix DNS when the daemon’s forwarding is broken. In my experience it can help when the default bridge network was misconfigured, but the real fix is to set the DNS servers explicitly. Leaning on Docker’s implicit network modes only hides the true source of the problem.
Fix DNS once at the daemon or runner level, add a preflight health check to your pipelines, and the next network hiccup becomes a clear, early failure instead of a 2 AM mystery.