Varnish HTTP Cache is a dedicated open-source application for caching the responses to HTTP requests that clients make to a server.
It sits in the middle as a reverse proxy, where it can also do load balancing and take some of the load for static content (JavaScript, images, etc.) off the web server.
A common use of this tool is to regionalize part of the web content (static/immutable files), for example in another country or continent, which improves the overall user experience.
Another scenario is when a local server is placed in a company’s branch (or a big client’s office) to reduce the data traffic over the internet (excellent for metered or slow connections).
INITIAL SETUP
sudo apt update && sudo apt upgrade -y
sudo apt install debian-archive-keyring curl gnupg apt-transport-https -y
curl -fsSL https://packagecloud.io/varnishcache/varnish70/gpgkey | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/varnish.gpg
IF: Ubuntu 20.04 LTS
# Load $ID and $VERSION_CODENAME for the running release
. /etc/os-release
sudo tee /etc/apt/sources.list.d/varnishcache_varnish70.list > /dev/null <<-EOF
deb https://packagecloud.io/varnishcache/varnish70/$ID/ $VERSION_CODENAME main
deb-src https://packagecloud.io/varnishcache/varnish70/$ID/ $VERSION_CODENAME main
EOF
OR IF: Ubuntu 22.04 LTS
sudo tee /etc/apt/sources.list.d/varnishcache_varnish70.list > /dev/null <<-EOF
deb https://packagecloud.io/varnishcache/varnish70/ubuntu/ focal main
deb-src https://packagecloud.io/varnishcache/varnish70/ubuntu/ focal main
EOF
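The two cases above differ only in the release codename written to the repository file. A minimal sketch of that choice follows, assuming (as the 22.04 case implies) that Ubuntu 22.04 reuses the focal packages. The function names repo_codename and repo_lines are hypothetical helpers, not part of any Varnish tooling.

```shell
#!/bin/sh
# Hypothetical helper: choose the codename to use in the varnish70 repository.
# Assumption taken from the two cases above: jammy (22.04) falls back to focal.
repo_codename() {
    case "$1" in
        jammy) echo "focal" ;;
        *)     echo "$1" ;;
    esac
}

# Hypothetical helper: print the deb and deb-src lines for a distro ID and codename.
repo_lines() {
    id="$1"
    codename="$(repo_codename "$2")"
    base="https://packagecloud.io/varnishcache/varnish70/$id/"
    printf 'deb %s %s main\n' "$base" "$codename"
    printf 'deb-src %s %s main\n' "$base" "$codename"
}
```

One possible use, after loading /etc/os-release: repo_lines "$ID" "$VERSION_CODENAME" | sudo tee /etc/apt/sources.list.d/varnishcache_varnish70.list > /dev/null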
INSTALLING VARNISH
sudo apt update && sudo apt install varnish -y
Configuring the service:
sudo cp /lib/systemd/system/varnish.service /etc/systemd/system/
sudo nano /etc/systemd/system/varnish.service
ExecStart=/usr/sbin/varnishd \
          -a :6081 \
          -a localhost:8443,PROXY \
          -p feature=+http2 \
          -f /etc/varnish/default.vcl \
          -s malloc,1g
Note: this is where you can adjust the port Varnish listens on (-a) and the amount of memory allocated to the cache (-s, e.g. 256m or 2g). Setting a limit is highly recommended, because malloc storage without a size is unlimited.
Configuring the Varnish Configuration Language (VCL):
sudo nano /etc/varnish/default.vcl
backend default {
    .host = "127.0.0.1";
    .port = "8000";
}
OR (for multiple backends)
import directors;

backend backend1 {
    .host = "srv1.example.com";
    .port = "80";
}

backend backend2 {
    .host = "srv2.example.com";
    .port = "80";
}

sub vcl_init {
    new vdir = directors.round_robin();
    vdir.add_backend(backend1);
    vdir.add_backend(backend2);
}

sub vcl_recv {
    set req.backend_hint = vdir.backend();
}
Firing it up:
sudo systemctl daemon-reload
sudo systemctl start varnish
sudo systemctl enable varnish
TESTING
Monitor the traffic between Varnish and the web server in one terminal:
tcpdump -i lo dst host 127.0.0.1 and port 8000
In another terminal, make multiple HTTP requests to Varnish and check whether each one reaches the web server:
curl "http://localhost:6081/"
curl "http://localhost:6081/"
curl "http://localhost:6081/"
curl "http://localhost:6081/"
curl "http://localhost:6081/"
curl "http://localhost:6081/"
Also, print the headers for additional information:
curl -I "http://localhost:6081/"
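The printed headers can also tell a cache hit from a miss. Varnish adds an X-Varnish header to its responses, which carries one transaction ID on a miss and two on a hit. The sketch below reads headers on standard input and classifies them on that basis. It assumes the X-Varnish header has not been removed or rewritten in VCL, and classify_response is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on stdin, print "hit" or "miss".
# Assumes Varnish's X-Varnish header is still present and unmodified.
classify_response() {
    awk '
        BEGIN { result = "miss" }
        tolower($1) == "x-varnish:" {
            gsub(/\r/, "")              # drop the trailing CR of the header line
            if (NF >= 3) result = "hit" # header name plus two transaction IDs
        }
        END { print result }
    '
}
```

For example: curl -sI "http://localhost:6081/" | classify_response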
BONUS
In the default VCL configuration file (/etc/varnish/default.vcl), you can overwrite headers on the response Varnish receives from the web server or define how long Varnish should cache the content:
# Changes will be applied to the internal traffic (between Varnish and the web server).
sub vcl_backend_response {
    # Change header
    unset beresp.http.Cache-Control;
    set beresp.http.Cache-Control = "max-age=1209600";

    # Define Varnish retention
    set beresp.ttl = 2w;

    # Define conditional retention
    if (bereq.url == "/VoD") {
        set beresp.ttl = 2d;
    }
}
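The max-age value in this example (1209600) is the 2w TTL expressed in seconds, so the two settings are easier to keep consistent if the conversion is computed rather than typed by hand. A small sketch follows, assuming whole-number durations with the units s, m, h, d and w only. The function name vcl_duration_to_seconds is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a whole-number VCL duration (e.g. 2w, 6h) to seconds.
# Only the units s, m, h, d and w are handled in this sketch.
vcl_duration_to_seconds() {
    value="${1%[smhdw]}"    # numeric part
    unit="${1#"$value"}"    # trailing unit letter
    case "$value" in
        ''|*[!0-9]*) echo "unsupported duration: $1" >&2; return 1 ;;
    esac
    case "$unit" in
        s) factor=1 ;;
        m) factor=60 ;;
        h) factor=3600 ;;
        d) factor=86400 ;;
        w) factor=604800 ;;
        *) echo "unsupported duration: $1" >&2; return 1 ;;
    esac
    echo $(( $value * $factor ))
}
```

For example, vcl_duration_to_seconds 2w gives the value to put after max-age= when beresp.ttl is set to 2w.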
Serving multiple sites with Virtual Hosts:
sub vcl_recv {
    if (server.ip == "10.10.10.10") {
        include "/etc/varnish/site-one.vcl";
    } elsif (server.ip == "10.20.30.40") {
        include "/etc/varnish/site-two.vcl";
    }
}
OR
sub vcl_recv {
    if (!req.http.Host) {
        # Varnish 4 and later use synth() instead of the old "error" statement
        return (synth(404, "Need a host header"));
    }

    # Normalize the Host header: drop a leading "www." and a trailing ":80"
    set req.http.Host = regsub(req.http.Host, "^www\.", "");
    set req.http.Host = regsub(req.http.Host, ":80$", "");

    if (req.http.Host == "site-one.com") {
        include "/etc/varnish/site-one.vcl";
    } elsif (req.http.Host == "site-two.com") {
        include "/etc/varnish/site-two.vcl";
    }
}
Then create the respective files (/etc/varnish/site-one.vcl and /etc/varnish/site-two.vcl) to specify the individual configuration per IP/domain.
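To see which file a given Host header would end up in, the normalization from the Host-based variant can be mimicked outside Varnish. This is only an illustrative sketch in shell, not how Varnish evaluates VCL. It applies the same two substitutions with sed, and the function names normalize_host and route_host are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: strip a leading "www." and a trailing ":80",
# mirroring the two regsub() calls in the Host-based example.
normalize_host() {
    printf '%s\n' "$1" | sed -e 's/^www\.//' -e 's/:80$//'
}

# Hypothetical helper: print the per-site VCL file a Host header maps to.
route_host() {
    case "$(normalize_host "$1")" in
        site-one.com) echo "/etc/varnish/site-one.vcl" ;;
        site-two.com) echo "/etc/varnish/site-two.vcl" ;;
        *)            echo "none" ;;
    esac
}
```

For example, route_host "www.site-one.com:80" prints /etc/varnish/site-one.vcl, while a host that matches neither site prints none, meaning no per-site file is included.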
To cache everything:
sub vcl_recv {
    unset req.http.Cookie;
}

sub vcl_backend_response {
    # Force caching of all objects for 6 hours
    set beresp.ttl = 6h;
    set beresp.http.Cache-Control = "max-age=21600";
}

sub vcl_deliver {
    unset resp.http.Cache-Control;
}
To cache based on URL:
sub vcl_recv {
    if (req.url ~ "^/path/(A|B|C)$") {
        unset req.http.Cookie;
        return (hash);
    }
}

sub vcl_backend_response {
    # Only set TTL and cache headers for specific URLs
    if (bereq.url ~ "^/path/(A|B|C)$") {
        set beresp.ttl = 6h;
        set beresp.http.Cache-Control = "max-age=21600";
    }
}

sub vcl_deliver {
    if (req.url ~ "^/path/(A|B|C)$") {
        unset resp.http.Cache-Control;
    }
}
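Because req.url includes the query string, the anchored pattern above matches /path/A but not /path/A?x=1. A quick way to check candidate URLs against the same expression before deploying is sketched below with grep -E. The function name is_cached_url is hypothetical, and grep's extended regular expressions are only assumed to behave like Varnish's regex engine for a simple pattern such as this one.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the URL matches the pattern used in the VCL above.
# grep -E is assumed to agree with Varnish's regex engine for this simple pattern.
is_cached_url() {
    printf '%s\n' "$1" | grep -Eq '^/path/(A|B|C)$'
}

# Print a short report for a list of URLs.
report_urls() {
    for url in "$@"; do
        if is_cached_url "$url"; then
            echo "$url: cached"
        else
            echo "$url: not matched"
        fi
    done
}
```

For example: report_urls /path/A "/path/A?x=1" /path/D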
WATCH STATISTICS OF FORWARDED AND CACHED REQUESTS
sudo watch -n 1 'varnishstat -1 | grep -E "client_req|cache_hit"'
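The raw counters can be turned into a single hit ratio. The sketch below reads the output of varnishstat -1 on standard input and divides MAIN.cache_hit by the sum of MAIN.cache_hit and MAIN.cache_miss. It assumes the usual column layout (counter name first, current value second), and hit_ratio is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: compute the cache hit ratio from `varnishstat -1` output.
# Assumes the counter name is in column 1 and its current value in column 2.
hit_ratio() {
    awk '
        $1 == "MAIN.cache_hit"  { hit = $2 }
        $1 == "MAIN.cache_miss" { miss = $2 }
        END {
            total = hit + miss
            if (total == 0) { print "n/a"; exit }
            printf "%.1f%%\n", 100 * hit / total
        }
    '
}
```

For example: varnishstat -1 | hit_ratio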