Intro On What & Why
This has been (and still is…) an interesting and somewhat challenging project. Challenging mostly in that it involves things I’ve never done before.
I learned DNS stuff back when “bind” was the only choice and there was no encryption nor certificate validation. All that is either “new to me” in DNS, or requires some “unlearning” of things I’ve done for decades. Plus, at the best of times, I only did DNS “at a distance”. There’s a lot of “stuff” in it that I’ve always just ignored as either I didn’t need it to make my tablet talk to the Telco router at home; or at work “I had a guy” working for me who was good at it on the industrial scale.
So y’all get to watch me flounder around a little bit on some of this ;-)
First, a little background on the set-up. I’ve got PiHole running on 2 Raspberry Pi boards, plus I’ve got the Telco Boundary Router (that they like to call a modem as it talks to phone wires). These sometimes like to argue with each other.
The Telco Router is SURE it is in charge, plus the Phone Company can fiddle with the config if they like (meaning anything I do can evaporate overnight or be diddled in an “update”). Sometimes that’s a feature, often it isn’t. IT wants to be the DHCP server (for handing out IP numbers and domain names) and it wants to be the DNS Server target (though the actual Telco DNS Server is upstream somewhere).
Now it’s not that I don’t love and trust AT&T, it is just that they have been in bed with various Government Agencies since at least the 1960s, never saw any information they didn’t want to hand over to TLAs at the drop of a hint, and are not very efficient. So DNS lookups to their server have a few negative properties: Slow. Subject to Government Edict so who knows who is dropped from it. Subject to all manner of Government Snoops requesting records (it is now just Standard Operating Procedure to get / hand over all Telco information). A constant target of Black Hats. And a few more minor things…
So I choose to use other upstream DNS providers.
Which means I run my own DNS servers AND often do a manual configuration of various equipment to point the gear at those servers, or sometimes at the upstream providers.
Traditional DNS requests are sent “in the clear”, meaning that anyone snooping on the wire (OR just running your upstream DNS server and looking in the logs) can see what sites you are visiting via their DNS resolution. They can also then attempt to feed you bogus lookup returns to send you to malware sites (Man In The Middle attack and / or Data Leakage). This is done by Black Hats and by Agencies (with full Telco cooperation) when convenient for them. Basically it is a HUGE privacy and security hole that is a common exploit.
So for a decade or two folks have been trying to work out ways to “fix that” while not breaking the internet in the process. For this reason there are a few competing ways to do some of these things. In particular, “bind” has become a legacy bloated beast with attempts to do it all. DNSCrypt vs TLS vs SSL vs DNSSEC vs… So some folks just did a clean rewrite and named it “unbound” (bind, unbound, yeah, yet another cutesy crap name, but the software is good).
Sidebar On Squid Proxy Server & History
Oh, and installing a Squid Proxy Server is also a nice protective measure. It can cache some data locally speeding things up, plus any attack that tries to crawl back down the wire hits the Proxy Server and not your important personal machine.
First step was something I did years back. Install a local DNS Server. I ran one on Alpine Linux (linux router) using dnsmasq (yet another DNS server, but cut way back from “bind” to something a mortal can configure…) and some custom ban lists. Then along came PiHole. A marvelous little malware and advertisement blocking DNS server.
That was good for a while. The local DNS server does ONE lookup on a name then caches it locally. Subsequent DNS lookups from any machine in the house don’t need to talk out the Telco Wire so information leakage is reduced some. IF I get 100 DNS lookups for chiefio.wordpress.com in a day, only ONE goes to the upstream DNS provider, and that isn’t the TELCO once you set this up. Instead of being in their logs, they need to ‘tap the wire’ with a snoop to get that information (which they likely are doing anyway … firewalls now doing Deep Packet Inspection and all)
And yes, putting this all in a VPN hides all of it from the Telco / TLA@telco. But just moves the question / exposure to the VPN provider.
The local cache of lookups makes things MUCH faster. I was really surprised how much faster. Some lookups go from a couple of seconds to a couple of milliseconds. IF you have a dozen of those in a web page, well, it adds up. Removing some of the high page weight crap that advertisers shove at you speeds things up a lot too.
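If you want to see that speed-up for yourself, one way is to run the same lookup twice and compare the query time that dig reports. This is only a sketch: the helper name query_msec is mine, and it assumes dig prints its usual ";; Query time: N msec" stats line.

```shell
# query_msec (hypothetical helper): read dig output on stdin and print
# the number from the ";; Query time: N msec" line.
query_msec() {
    sed -n 's/^;; Query time: \([0-9][0-9]*\) msec.*$/\1/p'
}

# Usage (needs dig and a working local resolver, so left commented out):
#   dig chiefio.wordpress.com @127.0.0.1 | query_msec   # first ask, goes upstream
#   dig chiefio.wordpress.com @127.0.0.1 | query_msec   # second ask, from cache
```

The second number ought to be a lot smaller than the first if the local cache is doing its job.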
BUT it is still going “in the clear” and it is still subject to Man In The Middle DNS spoofing / capture redirecting your MyBank.com to their “FraudBank.ch”…
In this posting I’m going to use DNSSEC to help block MITM attacks. Encrypting the tunnel will be left for later (because it looks complicated and I’ve not worked it out yet…). So this will still be “in the clear” DNS, but authenticated with cryptographic signatures. You will be getting good and proven DNS lookups and, incidentally, blocking crap sites that are bogus.
First thing to do is get a Raspberry Pi (or similar SBC Single Board Computer) and install Linux on it. Armbian works fine and I ran it for a good many years. But I’ve also had odd problems from System D, so I now run Devuan. It can be a slightly long process to get the current Devuan installed, but IMHO is worth it. The first parts of this posting, when the OS is updated to current, but skipping the i2p bits, gives you a clean Beowulf Devuan 3.0:
Then toss on a copy of PiHole:
Into The Present With “unbound”
And I also installed “unbound”. It is trivial to install, though harder to figure out what to configure. After the usual apt-get update and apt-get upgrade:
apt-get install unbound binutils
You may not need the binutils stuff, but it is good to have anyway.
At that point, the fun begins…
I followed a couple of models in doing this. One is:
It is pretty good, but a bit SystemD centric (so you get “systemctl restart foo” instead of “service foo restart”…)
Details Of My config File
The config file for “unbound” is in /etc/unbound/unbound.conf but the bulk of what you want in it is not.
root@XU4uDevuan3:/etc/unbound# head unbound.conf
# Unbound configuration file for Debian.
# See the unbound.conf(5) man page.
# See /usr/share/doc/unbound/examples/unbound.conf for a commented
# reference config file.
# The following line includes additional configuration files from the
# /etc/unbound/unbound.conf.d directory.
It wants you to put stuff in a gaggle of individual files in a conf.d directory for local mods. It also points you at a place where it has installed a copy of a FULL configuration file… “See /usr/share/doc…”
So I just go to the bottom of the config file and in vi it is easy to copy in another file. Copy that file in as a template for your configuration. For those unsure how to do this in an editor, you could just concatenate the first and second files with a linux command:
cat /usr/share/doc/unbound/examples/unbound.conf >> /etc/unbound/unbound.conf
Then you get to slog through a thousand lines of example comments and options trying to figure out which ones to change / turn on or off. No, really, 1000 lines:
root@XU4uDevuan3:/etc/unbound# wc -l /usr/share/doc/unbound/examples/unbound.conf
987 /usr/share/doc/unbound/examples/unbound.conf
987 + what was already in the /etc/unbound/unbound.conf
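To see how many of those lines actually set something, you can filter out the comment-only and blank lines before counting. A minimal sketch; strip_comments is just a name I made up for the helper.

```shell
# strip_comments (hypothetical helper): print a config file without
# blank lines or lines that hold only a comment.
strip_comments() {
    grep -v -E '^[[:space:]]*(#|$)' "$1"
}

# Usage:
#   strip_comments /usr/share/doc/unbound/examples/unbound.conf | wc -l
#   strip_comments /etc/unbound/unbound.conf
```

Note that this drops every comment, including the ones that explain a setting, so it is better for a quick look than for making the file you keep.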
The good news is that almost all of it is just comments telling you what things do, and pointing out the default that is already set the way you want it. I stripped out the comments and blank lines and made a minimal list for this posting. Do note that I make NO CLAIM that this is best or even substantially correct. Just that it seems to work right. It is 186 lines, but I’m going to inject comments along the way. The rest of it was left as-is in the model:
root@headless1:/home/chiefio# cat SHORT.conf
include: "/etc/unbound/unbound.conf.d/*.conf"
server:
    verbosity: 1
    # specify the interfaces to answer queries from by ip-address.
    # The default is to listen to localhost (127.0.0.1 and ::1).
    # specify 0.0.0.0 and ::0 to bind to all available interfaces.
I left it as 127.0.0.1 on one of my servers, added an interface line for the actual IP on another. Both seem to work fine. In debugging I had to turn verbosity up to 2 to find out what I was doing wrong. It is now set to 0 on the other server. This prevents big log files…
    # port to answer queries from
    # port: 53
    port: 5335
    # specify the interfaces to send outgoing queries to authoritative
    # server from by ip-address. If none, the default (all) interface
    # is used.
    outgoing-interface: 192.168.16.252
I don’t know why 5335 was chosen, it was just in the model I was following so I kept it. 53 is the usual DNS port. IIRC 853 is used by DNS over TLS. I’ve seen 5353 used for some other one. Whatever. It just needs to match what you set in the PiHole config as the PiHole upstream IP#port.
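Since the port in unbound.conf and the port in the PiHole upstream entry have to agree, it can help to pull the number out of the config rather than retype it. A rough sketch, with helper names (unbound_port, pihole_upstream) that are my own invention; it only looks at an uncommented "port:" line and assumes 53 when there is none.

```shell
# unbound_port (hypothetical helper): print the active "port:" value from
# an unbound config file, or 53 if no uncommented port line is present.
unbound_port() {
    port=$(sed -n 's/^[[:space:]]*port:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1)
    echo "${port:-53}"
}

# pihole_upstream (hypothetical helper): format an address and port the
# way the PiHole custom upstream field expects, as IP#port.
pihole_upstream() {
    echo "$1#$2"
}

# Usage:
#   pihole_upstream 127.0.0.1 "$(unbound_port /etc/unbound/unbound.conf)"
#   dig chiefio.wordpress.com @127.0.0.1 -p "$(unbound_port /etc/unbound/unbound.conf)"
```

The dig line talks to "unbound" directly on that port, which is a handy test before the PiHole gets involved at all.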
You likely don’t need to explicitly call out the upstream interface, but I just wanted it constrained as the Pi M3 does have WiFi too.
In this next segment, I figured I was not going to have huge spikes like a full corporation, but 1 M was not that much memory to burn to assure a nice buffer.
    # buffer size for UDP port 53 incoming (SO_RCVBUF socket option).
    # 0 is system default. Use 4m to catch query spikes for busy servers.
    # so-rcvbuf: 0
    so-rcvbuf: 1m
    # buffer size for UDP port 53 outgoing (SO_SNDBUF socket option).
    # 0 is system default. Use 4m to handle spikes on very busy servers.
    # so-sndbuf: 0
    so-sndbuf: 1m
    # EDNS reassembly buffer to advertise to UDP peers (the actual buffer
    # is set with msg-buffer-size). 1472 can solve fragmentation (timeouts)
    # edns-buffer-size: 4096
    edns-buffer-size: 1472
I set that to the value that “solves fragmentation” just because “why not?”
I raised TTL minimums as my goal is maximum DNS lookups kept on site and I don’t care if I miss some site changing their TTL as they are doing maintenance or changing their IP Address. In professional operations I have set TTL down to a few minutes so as to drain cached values prior to an IP swap, but for home use? Do I care if I can’t get to the National Bank Of Ickystan for 8 hours as their IP change happens?
    # the time to live (TTL) value lower bound, in seconds. Default 0.
    # If more than an hour could easily give trouble due to stale data.
    # cache-min-ttl: 0
    cache-min-ttl: 3600
    # minimum wait time for responses, increase if uplink is long. In msec.
    # infra-cache-min-rtt: 50
    infra-cache-min-rtt: 350
    # Enable IPv4, "yes" or "no".
    # do-ip4: yes
    # Enable IPv6, "yes" or "no".
    # do-ip6: yes
    do-ip6: no
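What the cache-min-ttl floor does to a record can be pictured as simple clamping. This is a simplified sketch and not how "unbound" is written internally: effective_ttl is a made-up name, and the ceiling assumes the stock cache-max-ttl of 86400 seconds, which is not set anywhere in my file.

```shell
# effective_ttl (hypothetical helper): clamp a record's TTL between a
# floor (cache-min-ttl) and a ceiling (cache-max-ttl, assumed 86400).
effective_ttl() {
    ttl=$1
    floor=$2
    ceiling=${3:-86400}
    if [ "$ttl" -lt "$floor" ]; then
        ttl=$floor
    fi
    if [ "$ttl" -gt "$ceiling" ]; then
        ttl=$ceiling
    fi
    echo "$ttl"
}

# Usage:
#   effective_ttl 300 3600     # a 5 minute TTL gets held for an hour
#   effective_ttl 7200 3600    # a 2 hour TTL is left alone
```

So a site that publishes a 5 minute TTL ahead of an IP swap would still be cached here for an hour, which is exactly the trade I am choosing to make.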
I’m not ready to let these guys do IPv6 yet. I like having all my traffic behind a NAT Firewall boundary router. Just one more hurdle for an intruder to get past. At some point, yeah, I’ll set up a minimal IPv6 subnet with the DNS servers in it and let them talk straight to the Upstream DNS provider over IPv6. But not right now.
Then you must tell it what IP ranges are local and allowed to talk to it. By using the “non-routing” private ranges only (and no IPv6…) that limits the potential interactions with it.
    # control which clients are allowed to make (recursive) queries
    # to this server. Specify classless netblocks with /size and action.
    # By default everything is refused, except for localhost.
    # Choose deny (drop message), refuse (polite error reply),
    # allow (recursive ok), allow_setrd (recursive ok, rd bit is forced on),
    # allow_snoop (recursive and nonrecursive ok)
    # deny_non_local (drop queries unless can be answered from local-data)
    # refuse_non_local (like deny_non_local but polite error reply).
    # access-control: 0.0.0.0/0 refuse
    # access-control: 127.0.0.0/8 allow
    # access-control: ::0/0 refuse
    # access-control: ::1 allow
    # access-control: ::ffff:127.0.0.1 allow
    access-control: 10.0.0.0/8 allow
    access-control: 172.16.0.0/12 allow
    access-control: 192.168.0.0/16 allow
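As a sanity check on which client addresses those three allow lines cover, here is a small sketch that tests a dotted-quad IPv4 address against the same private ranges. The function name is_private_ipv4 is mine, and it does no more input checking than it needs for the example.

```shell
# is_private_ipv4 (hypothetical helper): succeed if the address falls in
# 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16.
is_private_ipv4() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    case "$1" in
        10)  return 0 ;;
        172) [ "$2" -ge 16 ] && [ "$2" -le 31 ] ;;
        192) [ "$2" -eq 168 ] ;;
        *)   return 1 ;;
    esac
}

# Usage:
#   is_private_ipv4 192.168.16.252 && echo "would be allowed"
#   is_private_ipv4 172.32.0.1 || echo "outside the /12, would be refused"
```

The 172.16.0.0/12 block is the one that is easy to get wrong, as it only runs from 172.16 through 172.31.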
“unbound” will not make the needed directory for the log file, but will make the logfile itself. So you need to do a:
mkdir /var/log/unbound
chown unbound:unbound /var/log/unbound
Otherwise at launch it will toss an error and not start. I also chose ascii time over decimal time…
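If you want that step in a setup script that can be run more than once, something like the following would do. It is a sketch only; make_log_dir is a name I picked, and the owner is optional so it can be tried out on a scratch directory without being root.

```shell
# make_log_dir (hypothetical helper): create the log directory if it is
# missing and, when an owner is given, hand it to that user and group.
make_log_dir() {
    dir=$1
    owner=${2:-}
    mkdir -p "$dir" || return 1
    if [ -n "$owner" ]; then
        chown "$owner:$owner" "$dir" || return 1
    fi
}

# Usage (as root):
#   make_log_dir /var/log/unbound unbound
```

Using mkdir -p means a second run does not complain that the directory already exists.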
    # the log file, "" means log to stderr.
    # Use of this option sets use-syslog to "no".
    # logfile: ""
    logfile: "/var/log/unbound/unbound.log"
    # Log to syslog(3) if yes. The log facility LOG_DAEMON is used to
    # log to. If yes, it overrides the logfile.
    # use-syslog: yes
    use-syslog: no
    # print UTC timestamp in ascii to logfile, default is epoch in seconds.
    # log-time-ascii: no
    log-time-ascii: yes
This ‘grab root hints’ is supposed to be a periodic task. Once every few months. I just did it long hand and will automate later, maybe. Then I turned on maybe 1/2 of the “harden” options.
Then I moved that file (and renamed it) to /etc/unbound/root.hints
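For anyone wanting to automate the fetch, here is one possible shape for it. This is a sketch under assumptions: it uses wget (any downloader would do), pulls from the internic.net named.cache URL that the "unbound" config comments point at, and the helper names are mine. It only installs the new file if it looks like a hints file, so a failed download does not clobber a good copy.

```shell
# looks_like_root_hints (hypothetical helper): succeed if the file has at
# least one A record for a root-servers.net name.
looks_like_root_hints() {
    grep -i -E -q 'root-servers\.net\.[[:space:]]+[0-9]+[[:space:]]+(IN[[:space:]]+)?A[[:space:]]+[0-9.]+' "$1"
}

# fetch_root_hints (hypothetical helper): download to a temp file, then
# move it into place only if it passes the check above.
fetch_root_hints() {
    dest=${1:-/etc/unbound/root.hints}
    tmp=$(mktemp) || return 1
    if wget -q -O "$tmp" https://www.internic.net/domain/named.cache \
        && looks_like_root_hints "$tmp"; then
        mv "$tmp" "$dest" && chmod 644 "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage (as root, then restart unbound):
#   fetch_root_hints /etc/unbound/root.hints
```

Dropped into a cron job that runs every few months, that would cover the periodic part.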
    # file to read root hints from.
    # get one from https://www.internic.net/domain/named.cache
    # root-hints: ""
    root-hints: "/etc/unbound/root.hints"
    # Harden against out of zone rrsets, to avoid spoofing attempts.
    harden-glue: yes
    # Harden against receiving dnssec-stripped data. If you turn it
    # off, failing to validate dnskey data for a trustanchor will
    # trigger insecure mode for that zone (like without a trustanchor).
    # Default on, which insists on dnssec data for trust-anchored zones.
    harden-dnssec-stripped: yes
We’re getting near the end, hang in there ;-)
We need to tell it some addresses are not for sharing:
    # Enforce privacy of these addresses. Strips them away from answers.
    # It may cause DNSSEC validation to additionally mark it as bogus.
    # Protects against 'DNS Rebinding' (uses browser as network proxy).
    # Only 'private-domain' and 'local-data' names are allowed to have
    # these private addresses. No default.
    private-address: 10.0.0.0/8
    private-address: 172.16.0.0/12
    private-address: 192.168.0.0/16
    private-address: 169.254.0.0/16
    private-address: fd00::/8
    private-address: fe80::/10
    private-address: ::ffff:0:0/96
    # Allow the domain (and its subdomains) to contain private addresses.
    # local-data statements are allowed to contain private addresses too.
    # private-domain: "example.com"
    private-domain: "chiefio.lab"
    private-domain: "chiefio.home"
One of the fun things about running your own “authoritative” DNS server for your own private domains is that you need not comply with the “rules” of whoever thinks they are in control of the name space. There’s a LOT of “rogue” domains in use (and a wiki on it) and it is more than just the .onion domain.
I’ve chosen to accept that I may have “issues” if any formal .lab or .home network high level qualifier is put into production. Though I’ve not set up a DNSSEC key set and all, so I have to declare them outside the DNSSEC authentication:
    # if yes, perform prefetching of almost expired message cache entries.
    # prefetch: no
    prefetch: yes
    # File with trusted keys, kept uptodate using RFC5011 probes,
    # initial file like trust-anchor-file, then it stores metadata.
    # Use several entries, one per domain name, to track multiple zones.
    #
    # If you want to perform DNSSEC validation, run unbound-anchor before
    # you start unbound (i.e. in the system boot scripts). And enable:
    # Please note usage of unbound-anchor root anchor is at your own risk
    # and under the terms of our LICENSE (see that file in the source).
    # auto-trust-anchor-file: "/etc/unbound/root.key"
    # Ignore chain of trust. Domain is treated as insecure.
    # domain-insecure: "example.com"
    domain-insecure: "chiefio.lab"
    domain-insecure: "chiefio.home"
Again I’m setting things to “give me a number” first, and worry about it having changed last. Almost always that will work fine, and when it doesn’t, odds are I’m not interested anyway. It isn’t like the numbers change a lot for most places I go. (Though folks doing round robin and dynamic DNS can cause grief, but I’m not going to care if I get a miss, I’ll just come back in a few minutes and try again.)
    # Serve expired responses from cache, with TTL 0 in the response,
    # and then attempt to fetch the data afresh.
    # serve-expired: no
    serve-expired: yes
    # Limit serving of expired responses to configured seconds after
    # expiration. 0 disables the limit.
    # serve-expired-ttl: 0
    serve-expired-ttl: 60
    # Set the TTL of expired records to the serve-expired-ttl value after a
    # failed attempt to retrieve the record from upstream. This makes sure
    # that the expired records will be served as long as there are queries
    # for it.
    # serve-expired-ttl-reset: no
    serve-expired-ttl-reset: yes
Then there’s a section where you can put in a table of the stuff in your location, so you can serve your own IP numbers by name, if desired.
    # You can add locally served data with
    # local-zone: "local." static
    # local-data: "mycomputer.local. IN A 192.0.2.51"
    # local-data: 'mytext.local TXT "content of text record"'
    local-data: "Netgear.chiefio.home. IN A 192.168.16.251"
    local-data: "netgear.chiefio.home. IN A 192.168.16.251"
    local-data: "PiHole.chiefio.home. IN A 192.168.16.252"
    local-data: "pihole.chiefio.home. IN A 192.168.16.252"
    local-data: "PiOne.chiefio.home. IN A 192.168.16.253"
    local-data: "pione.chiefio.home. IN A 192.168.16.253"
    local-data: "Telco.chiefio.home. IN A 192.168.16.254"
    local-data: "telco.chiefio.home. IN A 192.168.16.254"
    local-data: "Netgear.chiefio.lab. IN A 10.1.1.254"
    local-data: "netgear.chiefio.lab. IN A 10.1.1.254"
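Typing those lines by hand gets old once the list grows, so one option is to generate them from a plain list of names and addresses. A small sketch, assuming a "hostname ip" pair per line on standard input; local_data_lines is a name I made up.

```shell
# local_data_lines (hypothetical helper): read "hostname ip" pairs on
# stdin and print one unbound local-data line per pair for the domain.
local_data_lines() {
    domain=$1
    while read -r host ip; do
        [ -n "$host" ] || continue    # skip blank lines
        printf 'local-data: "%s.%s. IN A %s"\n' "$host" "$domain" "$ip"
    done
}

# Usage:
#   printf 'pihole 192.168.16.252\npione 192.168.16.253\n' | local_data_lines chiefio.home
```

The output can be pasted into the server section or kept in its own file under the unbound.conf.d directory.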
Zones… there must be zones…
I likely have this configured somewhat wrongly. It’s a bastard 1/2 way between a recursive and an authoritative. The first “forward-zone” says to just forward requests to a couple of DNS providers. In this case, the “filtering” DNS provided by OpenDNS (now owned by CISCO, so I’ll likely change it later).
Not that I don’t trust CISCO, but they signed up for the Prism Program to give all your data to the TLAs, so even though that program is supposedly scrapped, they stay on the Asshole List…
Then I’ve got an auth-zone set up that looks at the root servers (using that root.hints file) and tries to start at the top of authority and work down to where the actual source of authority is for any given identity, then send the DNS request directly to the authoritative server. In theory, this means that should I do a lookup on bobsmachine.someco.uk, only the authoritative server for someco.uk would see that I was asking about bobsmachine… Everyone else in the food chain just gets “he wants something from auth server for someco.uk”. A BIG plus.
But I’m not sure if “unbound” chooses forward first or auth-zone first, or what. So this part needs some work. I’ll likely try to just shut off the forward-zone and see if it all still works. But OTOH, for initial bring up, using a forward-zone worked while I figured out the root.hints and such.
forward-zone:
    name: "."
    forward-addr: 220.127.116.11
    forward-addr: 18.104.22.168
auth-zone:
    name: "."
    master: 22.214.171.124 # b.root-servers.net
    master: 126.96.36.199 # c.root-servers.net
    master: 188.8.131.52 # d.root-servers.net
    master: 184.108.40.206 # f.root-servers.net
    master: 220.127.116.11 # g.root-servers.net
    master: 18.104.22.168 # k.root-servers.net
    master: 22.214.171.124 # xfr.cjr.dns.icann.org
    master: 126.96.36.199 # xfr.lax.dns.icann.org
    fallback-enabled: yes
    for-downstream: no
    for-upstream: yes
Once all that is working, you point the PiHole at it instead of at a regular forwarding / recursive DNS upstream. This is done in one panel on the PiHole and is nearly trivial.
But first, a word about /etc/resolv.conf. Due to various food fights over how to do DNS, the present process is a layer cake of folks all thinking they are in charge. DHCP, Wicd, Network Manager, etc. Various things try to “help” by constantly changing the contents of /etc/resolv.conf. I chose to just take a crowbar to it and lock it down so I KNOW where my DNS is resolving. “chattr +i /etc/resolv.conf” sets the “immutable” attribute and even root can’t edit the file. Yes, it means I also get to do a “chattr -i /etc/resolv.conf” any time I want to change it. OTOH, I no longer suddenly find myself in the att.net domain with my DNS resolution going back to the Telco…
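The unlock, edit, lock again dance is easy to forget half of, so it can be wrapped up. This is only a sketch with made-up helper names (run, edit_locked); setting DRY_RUN=1 prints the commands instead of running them, which is a safe way to see what it would do.

```shell
# run (hypothetical helper): execute a command, or just print it when
# DRY_RUN=1 is set in the environment.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# edit_locked (hypothetical helper): clear the immutable flag, open the
# file in an editor, then set the flag again whatever the editor did.
edit_locked() {
    run chattr -i "$1" || return 1
    run "${EDITOR:-vi}" "$1"
    run chattr +i "$1"
}

# Usage (as root):
#   edit_locked /etc/resolv.conf
#   lsattr /etc/resolv.conf      # an "i" in the flags means it is locked
```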
I left in the “nag” about somebody ELSE thinking they were in charge… just as a nose tweak.
I’ve pointed DNS on this machine to itself and set the search and domain names to my choice… and LOCKED IT DOWN with chattr. I’ve also left in 2 nameserver entries though commented them out. This is useful in debugging. When getting “No Joy” on DNS and you really really want to check something in the browser on a web page… just swap them in and 127.0.0.1 out. Then lookups local to the machine go out to a recursive DNS provider, your browser works again, and you can get what you wanted, then try again…
root@headless1:/# cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND — YOUR CHANGES WILL BE OVERWRITTEN
Here’s a screen shot of the PiHole config page:
You just “uncheck” the “BIG NAMES” from the left hand column and “check” the custom names on the right, then enter your DNS server names. Because I have 2 of them, I have one entry for “self” and one that points at the other one. Likely a bit pointless as if “self” is having issues it is unlikely to be able to talk well to the other one, but who knows. Also note that in the above long config file, I do not have the interface IP defined as a DNS target for the .252 PiHole, only the 127.0.0.1 target. So a bit of an asymmetry. The .253 one can see requests both on 127.0.0.1 (from the local PiHole) and on the ethernet interface (for the other PiHole and testing), but the .252 can only see itself on 127.0.0.1 so only the PiHole or folks on the computer can talk to “unbound”.
Something else to decide which way to go…
Resource Usage & Swap Management
Here’s a ssh login / htop from the Pi Model 1 showing that with the LXDE Login out of the way, there’s very little resource usage for this thing.
Yeah, 98 MB of memory and 3.9% of CPU. CPU rises when the squid proxy is in use, but not by a lot. DNS load is nearly nothing. And yes, that says 8 GB of swap on it. Why?
Because an early attempt had swap use rise to about 1.6 GB over a couple of days. Not sure exactly why, but decided to change swappiness and such.
ems@devuan:/etc$ tail sysctl.conf
[...]
net.ipv6.conf.all.disable_ipv6=1
vm.swappiness=0
vm.vfs_cache_pressure=100
#vm.vfs_cache_pressure=500
vm.dirty_background_ratio=10
vm.dirty_ratio=20
So I shut off ipv6, set vm.swappiness to 0 so the kernel avoids pushing program memory out to swap unless it has to, put cache_pressure up some (and prepared for if more was needed) to encourage dumping old cached crap sooner, and then set the background write-out of dirty cache pages to start at 10% and to start blocking applications to go do a fast clean up at 20% dirty (that only gets reached if the background cleanup is having a slow day…)
Now it does look like swap is basically unused (that 20 ish on swap was from when I was logged in using a full desktop and it is a ‘left over’)
So some swap tuning on low resource machines might be in order ;-)
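A typo in sysctl.conf is easy to miss, since as far as I know a line without an equals sign just gets a warning at load time and is skipped. Here is a rough sketch of a check for that; sysctl_lint is a made-up name and it only looks at the shape of each line, not at whether the key really exists.

```shell
# sysctl_lint (hypothetical helper): print every line that is neither a
# comment, blank, nor of the form key=value. Fails if any are found.
sysctl_lint() {
    bad=$(grep -v -E '^[[:space:]]*([#;]|$)' "$1" \
        | grep -v -E '^[[:space:]]*[A-Za-z0-9_./-]+[[:space:]]*=[[:space:]]*[^[:space:]]' \
        || true)
    if [ -n "$bad" ]; then
        echo "$bad"
        return 1
    fi
}

# Usage:
#   sysctl_lint /etc/sysctl.conf && sysctl -p
#   cat /proc/sys/vm/swappiness    # confirm the live value afterwards
```

Reading the value back from /proc afterwards is the real proof that a setting took.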
There’s a nice manual page on unbound here:
So if you want to know what those 1000 lines are doing, look there.
Next up for me is attempting to sort out that whole encrypting tunnel thing. TLS vs HTTPS and all. What do I want to do and how to do it.
I published this 1/2 way step (really more like 3/4 of the way…) as it gets a LOT of DNS Bogosity and risk and snooping out of the way. Plus, if you are already using a VPN, you can stuff your traffic inside it anyway.
Also on my ToDo List, is to see about turning on HTTPS DNS in my FireFox browser but pointed at MY DNS servers. Sort of a ‘two fer’. Blocking FireFox from sending my browser DNS lookups off site, and at the same time getting HTTPS DNS working on my servers.
There’s 2 directions for encrypted links ( upstream to servers and downstream to clients) plus at least 2 major protocols (TLS / HTTPS) and so it’s a bit of a 4-way puzzle to sort out to set it up.
That’s next on the ToDo list.
This is an interesting page at a German site. Lists the test case too:
Test validation
dig sigok.verteiltesysteme.net @127.0.0.1 (should return A record)
dig sigfail.verteiltesysteme.net @127.0.0.1 (should return SERVFAIL)
If DNSSEC validation does not seem to work, check whether you’re using more than one DNS resolver and whether each of them has DNSSEC validation enabled. The most common configuration error is to use a secondary DNS resolver without DNSSEC validation. Upon validation error, the operating system will fall back to the secondary resolver and the security checks of the primary resolver will be moot.
I found it easiest to see the “SERVFAIL” notice in the PiHole Admin / logfile page where the things are a lot easier to read.
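If you would rather check from the command line, the status can be picked out of the dig header instead of reading the whole output. A minimal sketch, assuming dig's usual "->>HEADER<<-" line; dig_status is a name of my own.

```shell
# dig_status (hypothetical helper): read dig output on stdin and print
# the response status (NOERROR, SERVFAIL, ...) from the header line.
dig_status() {
    sed -n 's/^.*->>HEADER<<-.*status: \([A-Z]*\),.*$/\1/p' | head -n 1
}

# Usage (needs dig and the resolver under test):
#   dig sigok.verteiltesysteme.net @127.0.0.1 | dig_status     # hoping for NOERROR
#   dig sigfail.verteiltesysteme.net @127.0.0.1 | dig_status   # hoping for SERVFAIL
```

Add a "-p 5335" to the dig lines to test "unbound" directly rather than going through the PiHole on port 53.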
There’s an “unbound” tutorial here:
Where I’ll be spending some quality time trying to get a better handle on the encrypting bit and how forwarding interacts with auth-zone. Zones in general really.
Here’s a how-to for PiHole DNS over HTTPS (DoH), but using Cloudflare and SystemD stuff.
I’m not a big fan of Cloudflare, but they are OK. I may set up one of mine this way as a first immersion… but needing a ‘special daemon’ seems a bit much.
Here’s another model. Just doing DNSSEC from PiHole is not as hard as going through unbound, but I’m doing it that way as unbound lets me get the encryption step too, plus move to auth-zone tree traversal and a lot more:
And some docs:
Then some folks talking about it with hints:
And another POV:
So folks wanting to “Dig Here!” some more, y’all got your pointers.