shore.co.il infrastructure - May 2021 edition

Hardware

The hardware I'm using consists of:

  • Netgate SG-2440 running OpenBSD.
  • Linksys EA6350 running OpenWrt.
  • ASrock N3150-NUC running Debian.
  • An online.net (now Scaleway) Dedibox running Debian.
  • An ADSL modem provided by my ISP.
  • A purpose built PC in the living room running Debian.
  • APC UPS (I don't remember the exact model and I don't feel like getting up and checking).

The OpenBSD box is ns1.shore.co.il. It's the router for the home network, the primary DNS server for the shore.co.il zone, the DNS resolver for the network and the DHCP server, and it runs HAProxy.

The local Debian box has a 0.5TB drive and hosts the LDAP server, the mail server, Nextcloud and GitLab. I store all of my private information on it, encrypted. I take off-site backups about every 2 weeks and store them at my mom's house (also encrypted).

The living room PC runs Kodi, Transmission and my podcast downloader. It has an 8TB magnetic drive that holds music, movies and other such media. The OpenWrt box is the wireless access point.

Lastly, the Dedibox is a bare metal instance with a faster internet connection than the one I have locally. It runs this blog, a few other web sites, the container registry and the secondary DNS server for shore.co.il, and most GitLab CI jobs run on it. It also runs my workbench (a container that has all of the tools I need and use), which benefits from the faster internet connection, faster drives, abundant memory and beefy CPU. The data drive is also encrypted, but not backed up (everything on it can be recreated in less than a day).

Deployments

Initial setup is done using Ansible. Services on the different Debian boxes run in Docker containers and are deployed using GitLab. The other OSes are maintained using just Ansible. There's no redundancy since it would take more money than I would like to spend. There's no infrastructure-as-code (no Terraform) since the only thing I could code is the single Dedibox instance (and it wasn't supported by the Scaleway provider the last time I checked).

All of the Debian instances run a GitLab Runner in a Docker container with access to the dockerd socket, so the runners can create containers, run jobs in them and build and push images. I know this is a security risk, but since I'm the only one using this GitLab instance I'm not worried about it. The deployments are pretty consistent: projects have a docker-compose.yaml file, and the GitLab runner runs docker-compose build, docker-compose pull and docker-compose up to deploy services. All of the code is in the Shore group in my GitLab instance. The templates for the CI pipelines that I use are also in my GitLab instance.
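As a rough illustration of that deploy step, here is a minimal Python sketch that runs the three docker-compose commands in order. It is not the actual pipeline code (that lives in the CI templates). The function names, the default file name argument and the --detach flag on up are assumptions made for the example.

```python
import subprocess
from typing import Callable, List


def deploy_commands(compose_file: str = "docker-compose.yaml") -> List[List[str]]:
    """Return the docker-compose invocations for one deployment, in order."""
    base = ["docker-compose", "--file", compose_file]
    return [
        base + ["build"],
        base + ["pull"],
        # --detach is an assumption: a CI job shouldn't stay attached
        # to the containers it has just started.
        base + ["up", "--detach"],
    ]


def deploy(
    compose_file: str = "docker-compose.yaml",
    run: Callable = subprocess.run,
) -> None:
    """Run each command, stopping at the first one that fails."""
    for command in deploy_commands(compose_file):
        # check=True raises CalledProcessError on a non-zero exit code.
        run(command, check=True)
```

The run parameter exists only so the sequence can be exercised without Docker installed. A real job would simply call deploy().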

Security

Most services are only available over SSL (apart from services that don't support it like DNS, DHCP, etc.). I'm using Let's Encrypt to issue globally valid certificates. In my home network this presented a problem. I wanted to have multiple hosts using SSL behind a single IP address without relying on the router to decrypt the traffic. I wanted the traffic to remain encrypted until it reaches the host and the certificate to be globally valid. For that I run HAProxy, which uses SNI to identify the requested host and forwards the traffic accordingly.
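To show why SNI makes this possible without decrypting anything, here is a minimal Python sketch that reads the requested host name out of the first (unencrypted) TLS ClientHello record and looks up a backend for it. HAProxy does this natively and this is not its implementation. The host names, the addresses and the function names are made up for the example, and the parser assumes the whole ClientHello arrives in a single record.

```python
import struct
from typing import Dict, Optional, Tuple

Backend = Tuple[str, int]


def extract_sni(data: bytes) -> Optional[str]:
    """Return the host name in the SNI extension of a ClientHello, or None."""
    try:
        if data[0] != 0x16:  # record type: handshake
            return None
        pos = 5  # skip record header: type (1), version (2), length (2)
        if data[pos] != 0x01:  # handshake type: ClientHello
            return None
        pos += 4  # handshake header: type (1), length (3)
        pos += 2 + 32  # client version and random
        pos += 1 + data[pos]  # session id
        (suites_len,) = struct.unpack_from("!H", data, pos)
        pos += 2 + suites_len  # cipher suites
        pos += 1 + data[pos]  # compression methods
        (ext_total,) = struct.unpack_from("!H", data, pos)
        pos += 2
        end = min(pos + ext_total, len(data))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", data, pos)
            pos += 4
            if ext_type == 0:  # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = data[pos + 2]
                (name_len,) = struct.unpack_from("!H", data, pos + 3)
                if name_type != 0:  # only host_name entries are defined
                    return None
                return data[pos + 5 : pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None


def pick_backend(
    hello: bytes,
    backends: Dict[str, Backend],
    default: Optional[Backend] = None,
) -> Optional[Backend]:
    """Choose where to forward the still-encrypted TCP stream."""
    name = extract_sni(hello)
    return backends.get(name, default) if name else default


# Hypothetical routing table: names and addresses are placeholders.
BACKENDS: Dict[str, Backend] = {
    "git.example.com": ("192.0.2.10", 443),
    "cloud.example.com": ("192.0.2.11", 443),
}
```

The proxy only peeks at the ClientHello and then relays the bytes untouched, so the TLS session is terminated on the backend host, which holds the certificate and private key.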

The SSH servers all have rate limits and only allow public key authentication. I rely on SSH keys to authenticate and log in instead of LDAP. I prefer it since if the LDAP server is down I'm not locked out entirely. Furthermore, the only services that use the LDAP server are on the same host and connect to it via a Unix socket, further securing the access. The LDAP server is not available on the network.

For regular tasks like renewing the SSL certificates or updating the hosts I have written Ansible playbooks. I routinely rotate the SSL keys and also the DH parameters. I run the playbooks manually from my laptop because I don't want the hosts to update themselves automatically. Also, rebooting the NUC and the Dedibox requires a manual step to unlock the encrypted drives. I can update all of the hosts, rebuild all of the container images and deploy in about 2 hours with very little interaction, and I do so every few weeks.

Changes from the previous iteration

There are a few changes in the infrastructure since the last version. First, I used to use an EC2 instance instead of the Dedibox. The Dedibox costs more, but the time savings from the beefier instance and the CI improvements, plus being able to run VMs on it, are worth it.

I used to use Ansible for everything, including deploying services, which I now do with Docker Compose and GitLab runners. The development workflow with Docker is easier and faster than using Ansible along with Vagrant and Molecule. I can honestly say that I'm surprised more people don't do this more often. It's really easy, simple, secure and reliable. I find that this approach is useful for simple setups like mine (or dev or QA environments), especially since the same Docker Compose setup can be used for local development.

The addition of the Nextcloud and GitLab services has made me entirely self-hosted. I use online.net and a DNS registrar, but apart from that I don't rely on any external service and I hold all of my private data (except that most of the emails I send end up on Google's servers anyway). My sites are indexed by Google, Bing and Yandex, but I'm not sure what I can do about that.

Future improvements

The UPS isn't supported by NUT or anything else available on Linux or OpenBSD. I will replace it sometime in the future with one that is, so that I can trigger a clean shutdown when there's a power outage and the battery is running low. Also, I would like to replace the NUC with a newer and faster one.
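The shutdown rule itself is simple. Here is a small Python sketch of the decision, assuming the status flags NUT reports in ups.status (OL for online, OB for on battery, LB for low battery) and a charge percentage like battery.charge. The function name and the 20% threshold are made up for the example, and with a supported UPS this would normally be left to NUT's own monitoring rather than custom code.

```python
def should_shut_down(status: str, charge_percent: float, threshold: float = 20.0) -> bool:
    """Decide whether to start a clean shutdown.

    status is a space separated list of flags, e.g. "OL" or "OB LB".
    The threshold is an arbitrary example value, not a recommendation.
    """
    flags = set(status.split())
    on_battery = "OB" in flags
    battery_low = "LB" in flags or charge_percent <= threshold
    # Only shut down when mains power is out AND the battery is nearly drained.
    return on_battery and battery_low
```

A short outage with a healthy battery is ridden out, and a low reading while still on mains power (for example right after an outage, while recharging) doesn't trigger anything.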

I have external monitoring on services but I've yet to set up internal log aggregation or metrics collection. I plan on setting up an EFK stack (I have some POC code lying around but I need to update it and bring it in line with the rest of the infrastructure). I also want to investigate Sensu for running checks locally (a Nagios replacement), and I have my eye on Testinfra for the host checks.

I'm using Z-Push along with Nextcloud for ActiveSync but it doesn't work with my phone, so I want to evaluate SOGo as a replacement. I want to try the Dropbear-initramfs integration so I can unlock encrypted drives remotely over SSH. I also want to replace the Docker-based workbench with the toolbox project.

I've avoided using a VPN so far and I don't want to go down that route. But I've been in very closed networks, so I want to set up a WebSocket proxy to my SSH server on the Dedibox so I can connect over port 443 and tunnel out from such networks.

Lastly, I see XMPP, Matrix or Mastodon (or maybe 2 of them) in the future for secure and self-hosted chatting with friends.