Varnish HTTP Cache is a dedicated open-source application for caching the HTTP responses a server sends back to its clients.
It sits in the middle, between the clients and the web server, as a reverse proxy. From there it can also do load balancing and take some of the load off the web server for static content (JavaScript, images, etc.).

A common use of this tool is to regionalize part of the web content (static/immutable files), for example by serving it from another country or continent, which improves the overall user experience.

Another scenario is when a local server is placed in a company’s branch (or a big client’s office) to reduce the data traffic over the internet (excellent for metered or slow connections).

INITIAL SETUP

sudo apt update && sudo apt upgrade -y
sudo apt install debian-archive-keyring curl gnupg apt-transport-https -y
curl -fsSL https://packagecloud.io/varnishcache/varnish70/gpgkey|sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/varnish.gpg

IF: Ubuntu 20.04 LTS

. /etc/os-release
sudo tee /etc/apt/sources.list.d/varnishcache_varnish70.list > /dev/null <<-EOF
deb https://packagecloud.io/varnishcache/varnish70/$ID/ $VERSION_CODENAME main
deb-src https://packagecloud.io/varnishcache/varnish70/$ID/ $VERSION_CODENAME main
EOF

OR IF: Ubuntu 22.04 LTS

sudo tee /etc/apt/sources.list.d/varnishcache_varnish70.list > /dev/null <<-EOF
deb https://packagecloud.io/varnishcache/varnish70/ubuntu/ focal main
deb-src https://packagecloud.io/varnishcache/varnish70/ubuntu/ focal main
EOF
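
The two cases above differ only in the codename written into the repository line. As a rough sketch, that choice can be wrapped in a pair of shell functions. The names varnish_repo_codename and varnish_repo_lines are made up for illustration, and the jammy-to-focal mapping simply mirrors the 22.04 step above:

```shell
# Hypothetical helper: map the running release codename to the codename
# written into the varnish70 repository line.
varnish_repo_codename() {
    case "$1" in
        jammy) echo "focal" ;;   # Ubuntu 22.04: use the focal packages, as above
        *)     echo "$1" ;;      # any other release: keep its own codename
    esac
}

# Hypothetical helper: print both repository lines for a distro ID and codename.
varnish_repo_lines() {
    distro="$1"
    codename="$(varnish_repo_codename "$2")"
    echo "deb https://packagecloud.io/varnishcache/varnish70/$distro/ $codename main"
    echo "deb-src https://packagecloud.io/varnishcache/varnish70/$distro/ $codename main"
}
```

It could be used like this: . /etc/os-release && varnish_repo_lines "$ID" "$VERSION_CODENAME" | sudo tee /etc/apt/sources.list.d/varnishcache_varnish70.list > /dev/null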

INSTALLING VARNISH

sudo apt update && sudo apt install varnish -y

Configuring the service:

sudo cp /lib/systemd/system/varnish.service /etc/systemd/system/
sudo nano /etc/systemd/system/varnish.service

In the editor, set the ExecStart line to:

ExecStart=/usr/sbin/varnishd \
	  -a :6081 \
	  -a localhost:8443,PROXY \
	  -p feature=+http2 \
	  -f /etc/varnish/default.vcl \
	  -s malloc,1g

Note: this is where you can adjust the port Varnish listens on and the amount of memory allocated to the cache (e.g. 256m or 2g). Setting a limit is highly recommended, because malloc storage without a size is unlimited.
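
After editing, it can help to double-check which listen (-a) and storage (-s) arguments actually ended up in the unit file. A minimal sketch, assuming the ExecStart arguments are written one per line as above; show_varnish_args is a hypothetical helper name:

```shell
# Hypothetical helper: print the -a (listen) and -s (storage) arguments
# found in a systemd unit file, one per line.
show_varnish_args() {
    grep -E '^[[:space:]]*-(a|s)[[:space:]]' "$1" |
        sed -E 's/^[[:space:]]+//; s/[[:space:]]*\\$//'   # trim indent and trailing backslash
}
```

For the file edited above: show_varnish_args /etc/systemd/system/varnish.service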

Configuring the Varnish Configuration Language (VCL):

sudo nano /etc/varnish/default.vcl

In the editor, point the default backend at your web server:

backend default {
    .host = "127.0.0.1";
    .port = "8000";
}

OR (for multiple backends)

import directors;

backend backend1 {
    .host = "srv1.example.com";
    .port = "80";
}

backend backend2 {
    .host = "srv2.example.com";
    .port = "80";
}

sub vcl_init {
    new vdir = directors.round_robin();
    vdir.add_backend(backend1);
    vdir.add_backend(backend2);
}

sub vcl_recv {
    set req.backend_hint = vdir.backend();
}
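
Before starting the service, the VCL can be compiled once to catch syntax errors. A minimal sketch using varnishd -C, which compiles the VCL, prints the generated C code and exits; check_vcl is a hypothetical wrapper name, and on some systems the command may need sudo:

```shell
# Hypothetical helper: compile a VCL file to check its syntax without
# touching the running cache. stdout (the generated C code) is discarded;
# compiler errors are still shown and the exit status is kept.
check_vcl() {
    vcl_file="${1:-/etc/varnish/default.vcl}"
    if ! command -v varnishd >/dev/null 2>&1; then
        echo "varnishd not found in PATH" >&2
        return 127
    fi
    if varnishd -C -f "$vcl_file" >/dev/null; then
        echo "VCL OK: $vcl_file"
    else
        echo "VCL has errors: $vcl_file" >&2
        return 1
    fi
}
```

Calling check_vcl with no argument checks /etc/varnish/default.vcl.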

Firing it up:

sudo systemctl daemon-reload
sudo systemctl start varnish
sudo systemctl enable varnish

TESTING

Monitor the traffic between Varnish and the web server in one terminal:

sudo tcpdump -i lo dst host 127.0.0.1 and port 8000

In another terminal, make multiple HTTP requests to Varnish and check whether each one still reaches the web server. Once the response is cached, repeated requests should no longer show up in tcpdump.

curl "http://localhost:6081/"
curl "http://localhost:6081/"
curl "http://localhost:6081/"
curl "http://localhost:6081/"
curl "http://localhost:6081/"
curl "http://localhost:6081/"

Also, print the headers for additional information:

curl -I "http://localhost:6081/"
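
Among those headers, X-Varnish is a quick hit/miss indicator: Varnish normally puts one transaction ID there when it had to fetch from the backend and two IDs when the object came from the cache. A small sketch that reads headers on stdin and reports which case it sees; classify_response is a hypothetical name:

```shell
# Hypothetical helper: read HTTP response headers on stdin and print
# "hit", "miss", or "unknown" based on the X-Varnish header.
# Assumption: one ID means fetched from the backend, two IDs mean cache hit.
classify_response() {
    tr -d '\r' | awk '
        tolower($1) == "x-varnish:" { found = 1; print (NF >= 3 ? "hit" : "miss") }
        END { if (!found) print "unknown" }
    '
}
```

Typical use: curl -sI "http://localhost:6081/" | classify_response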

BONUS

In the default VCL configuration file (/etc/varnish/default.vcl), you can overwrite response headers or define how long Varnish should cache the content:

# Changes are applied to the response Varnish receives from the web server, before it is cached.
sub vcl_backend_response {
    # Change Header
    unset beresp.http.Cache-Control;
    set beresp.http.Cache-Control = "max-age=1209600";

    # Define Varnish Retention
    set beresp.ttl = 2w;

    # Defining Conditional Retention
    if (bereq.url == "/VoD") {
        set beresp.ttl = 2d;
    }
}
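
The max-age value is in seconds, while beresp.ttl takes a duration with a unit suffix, so the two have to be kept in sync by hand: 2w is 14 x 24 x 3600 = 1209600 seconds. A small sketch of that conversion for whole-number durations with an s, m, h, d or w suffix; duration_to_seconds is a hypothetical helper, not part of Varnish:

```shell
# Hypothetical helper: convert a duration such as 2w or 6h to seconds,
# to keep a max-age value in sync with beresp.ttl.
duration_to_seconds() {
    value="${1%?}"          # everything but the last character, e.g. "2"
    unit="${1#"$value"}"    # the last character, e.g. "w"
    case "$unit" in
        s) echo "$value" ;;
        m) echo $(($value * 60)) ;;
        h) echo $(($value * 3600)) ;;
        d) echo $(($value * 86400)) ;;
        w) echo $(($value * 604800)) ;;
        *) echo "unsupported duration: $1" >&2; return 1 ;;
    esac
}
```

For example, duration_to_seconds 2w prints 1209600 and duration_to_seconds 2d prints 172800.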

Serving multiple sites with Virtual Hosts:

sub vcl_recv {
  if (server.ip == "10.10.10.10")
  {
    include "/etc/varnish/site-one.vcl";
  }
  elsif (server.ip == "10.20.30.40")
  {
    include "/etc/varnish/site-two.vcl";
  }
}

OR

sub vcl_recv {
  if (! req.http.Host)
  {
    return (synth(404, "Need a host header"));
  }
  set req.http.Host = regsub(req.http.Host, "^www\.", "");
  set req.http.Host = regsub(req.http.Host, ":80$", "");

  if (req.http.Host == "site-one.com")
  {
    include "/etc/varnish/site-one.vcl";
  }
  elsif (req.http.Host == "site-two.com")
  {
    include "/etc/varnish/site-two.vcl";
  }
}

Then create the respective files (/etc/varnish/site-one.vcl and /etc/varnish/site-two.vcl) to specify the individual configuration per IP/domain.

To cache everything:

sub vcl_recv {
    unset req.http.Cookie;
}

sub vcl_backend_response {
    # Force caching of all objects for 6 hours
    set beresp.ttl = 6h;
    set beresp.http.Cache-Control = "max-age=21600";
}

sub vcl_deliver {
    unset resp.http.Cache-Control;
}

To cache based on URL:

sub vcl_recv {
    if (req.url ~ "^/path/(A|B|C)$") {
        unset req.http.Cookie;
        return (hash);
    }
}
sub vcl_backend_response {
    # Only set TTL and cache headers for specific URLs
    if (bereq.url ~ "^/path/(A|B|C)$") {
        set beresp.ttl = 6h;
        set beresp.http.Cache-Control = "max-age=21600";
    }
}
sub vcl_deliver {
    if (req.url ~ "^/path/(A|B|C)$") {
        unset resp.http.Cache-Control;
    }
}

WATCH STATISTICS OF FORWARDED AND CACHED REQUESTS

sudo watch -n 1 'varnishstat -1 | grep -E "client_req|cache_hit"'
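
To turn those counters into a single number, the hit ratio can be computed from the same output. A minimal sketch, assuming the usual varnishstat -1 layout (counter name in the first column, value in the second) and counting only MAIN.cache_hit and MAIN.cache_miss, so passed requests are ignored; cache_hit_ratio is a hypothetical name:

```shell
# Hypothetical helper: read `varnishstat -1` output on stdin and print
# hits / (hits + misses) as a percentage.
cache_hit_ratio() {
    awk '
        $1 == "MAIN.cache_hit"  { hit = $2 }
        $1 == "MAIN.cache_miss" { miss = $2 }
        END {
            total = hit + miss
            if (total == 0) { print "n/a"; exit }
            printf "%.1f%%\n", 100 * hit / total
        }
    '
}
```

Typical use: sudo varnishstat -1 | cache_hit_ratio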
