Dev Ups

Published 2022-10-03 in ci-cd

RackNerd's Ubuntu VPS and its missing adm group

This post describes shortcomings I found in RackNerd's Ubuntu installations.

I compared the same infrastructure as code (IaC) on local Vagrant VMs and on cloud VPSs. Fail2ban tolerates a lot of variation in logs, but it does depend on the installation being "to-spec". RackNerd seem to have stripped down their Ubuntu images for some economy. The one thing I can't script as IaC is the operating system installation itself.

My IaC worked perfectly on Ubuntu and Debian 11 on Azure.

Happy path: Ubuntu 20.04 on RackNerd VPS

I reverted my RackNerd deployment to Ubuntu 20.04 because the rsyslog service appears to work on that version. I say "appears" because that version is out of date now, and I did have problems banning hosts in the FORWARD chain of iptables. Here is the diagnosis:

root@racknerd-89c14a:~# systemctl status rsyslog
● rsyslog.service - System Logging Service
     Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2022-09-29 13:10:22 BST; 1min 3s ago
TriggeredBy:  syslog.socket
       Docs: man:rsyslogd(8)
             https://www.rsyslog.com/doc/
   Main PID: 378 (rsyslogd)
      Tasks: 4 (limit: 1714)
     Memory: 2.8M
     CGroup: /system.slice/rsyslog.service
             └─378 /usr/sbin/rsyslogd -n -iNONE

Sep 29 13:10:22 racknerd-89c14a systemd[1]: Starting System Logging Service...
Sep 29 13:10:22 racknerd-89c14a rsyslogd[378]: imuxsock: Acquired UNIX socket '/run/systemd/journal/syslog' (fd 3) from systemd.  [v8.2001.0]
Sep 29 13:10:22 racknerd-89c14a rsyslogd[378]: rsyslogd's groupid changed to 106
Sep 29 13:10:22 racknerd-89c14a rsyslogd[378]: rsyslogd's userid changed to 102
Sep 29 13:10:22 racknerd-89c14a rsyslogd[378]: [origin software="rsyslogd" swVersion="8.2001.0" x-pid="378" x-info="https://www.rsyslog.com"] start
Sep 29 13:10:22 racknerd-89c14a systemd[1]: Started System Logging Service.
anonuser@racknerd-89c14a:~$ ll /var/log
total 2560
drwxrwxr-x   9 root   syslog             4096 Sep 29 18:19 ./
drwxr-xr-x  11 root   root               4096 Apr 30  2018 ../
-rw-r--r--   1 root   root               9444 Sep 29 13:19 alternatives.log
drwxr-xr-x   2 root   root               4096 Sep 29 15:00 apt/
-rw-r-----   1 syslog adm              143830 Sep 29 20:40 auth.log
-rw-rw----   1 root   utmp              13056 Sep 29 20:17 btmp
drwxr-xr-x   2 root   root               4096 Apr 29  2020 dist-upgrade/
-rw-r--r--   1 root   adm               42380 Sep 29 18:19 dmesg
-rw-r--r--   1 root   adm               42555 Sep 29 17:45 dmesg.0
-rw-r--r--   1 root   adm               12877 Sep 29 14:31 dmesg.1.gz
-rw-r--r--   1 root   adm               12567 Sep 29 13:10 dmesg.2.gz
-rw-r--r--   1 root   root              46084 Sep 29 15:00 dpkg.log
-rw-r-----   1 root   adm              109609 Sep 29 20:38 fail2ban.log
drwxr-xr-x   2 root   root               4096 Apr 30  2018 installer/
drwxr-sr-x+  3 root   systemd-journal    4096 Sep 29 13:10 journal/
-rw-r-----   1 syslog adm              583453 Sep 29 20:41 kern.log
-rw-rw-r--   1 root   utmp             292292 Sep 29 20:39 lastlog
drwxr-xr-x   2 root   root               4096 Sep 29 13:21 nginx/
drwx------   2 root   root               4096 Apr 29  2020 private/
-rw-r-----   1 syslog adm             1038580 Sep 29 20:41 syslog
-rw-r-----   1 syslog adm              345215 Sep 29 20:41 ufw.log
drwxr-xr-x   2 root   root               4096 Apr 29  2020 upgrade/
-rw-rw-r--   1 root   utmp             117504 Sep 29 20:39 wtmp

Sad path: default Ubuntu 22.04

I can still enjoy this distribution after applying my patches to fix rsyslog and the missing adm group. I had already needed to add the adm group in the past in order to install Nginx, and that one modification allowed Nginx to run stably for months.

Here's my assessment of rsyslog immediately after installing Ubuntu 22.04:

root@racknerd-89c14a:~# systemctl status rsyslog
× rsyslog.service - System Logging Service
     Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2022-09-29 12:04:39 UTC; 2min 31s ago
TriggeredBy: × syslog.socket
       Docs: man:rsyslogd(8)
             man:rsyslog.conf(5)
             https://www.rsyslog.com/doc/
    Process: 587 ExecStart=/usr/sbin/rsyslogd -n -iNONE (code=exited, status=1/FAILURE)
   Main PID: 587 (code=exited, status=1/FAILURE)
        CPU: 9ms

Sep 29 12:04:39 racknerd-89c14a systemd[1]: rsyslog.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 12:04:39 racknerd-89c14a systemd[1]: rsyslog.service: Failed with result 'exit-code'.
Sep 29 12:04:39 racknerd-89c14a systemd[1]: Failed to start System Logging Service.
Sep 29 12:04:39 racknerd-89c14a systemd[1]: rsyslog.service: Scheduled restart job, restart counter is at 5.
Sep 29 12:04:39 racknerd-89c14a systemd[1]: Stopped System Logging Service.
Sep 29 12:04:39 racknerd-89c14a systemd[1]: rsyslog.service: Start request repeated too quickly.
Sep 29 12:04:39 racknerd-89c14a systemd[1]: rsyslog.service: Failed with result 'exit-code'.
Sep 29 12:04:39 racknerd-89c14a systemd[1]: Failed to start System Logging Service.


root@racknerd-89c14a:~# ll /var/log
total 56
drwxrwxr-x   9 root      syslog            4096 Sep 29 12:04 ./
drwxr-xr-x  13 root      root              4096 Apr 21 01:01 ../
drwxr-xr-x   2 root      root              4096 May 13 17:41 apt/
-rw-rw----   1 root      utmp               384 Sep 29 12:05 btmp
drwxr-xr-x   2 root      root              4096 Apr 18 17:47 dist-upgrade/
-rw-r--r--   1 root      root                37 Sep 29 12:04 dmesg
drwxr-x---   2 root                    4   4096 May 13 17:41 installer/
drwxr-sr-x+  3 root      systemd-journal   4096 Sep 29 12:04 journal/
drwxr-xr-x   2 landscape landscape         4096 May 13 16:02 landscape/
-rw-rw-r--   1 root      utmp            292292 Sep 29 12:06 lastlog
drwx------   2 root      root              4096 Apr 21 01:00 private/
drwxr-x---   2 root                    4   4096 May 13 09:13 unattended-upgrades/
-rw-rw-r--   1 root      utmp              2688 Sep 29 12:05 wtmp
root@racknerd-89c14a:~#

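Several entries in this listing of /var/log show a bare numeric group ID (4) where a group name should be, and the files rsyslog normally writes (auth.log, kern.log, syslog) are absent altogether. That suggests rsyslog is broken because of the missing adm group.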
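One way to confirm that suspicion before changing anything is to ask the group database directly and to list the entries whose group ID has no name. The following is a minimal sketch, not part of my provisioning script; the function names and the LOG_DIR variable are hypothetical, and it assumes getent and a find that supports -maxdepth and -nogroup (as on Ubuntu).

```shell
#!/bin/sh
# LOG_DIR is a hypothetical variable for this sketch; it defaults to /var/log.
LOG_DIR="${LOG_DIR:-/var/log}"

# Succeeds if the named group exists in the group database.
group_exists() {
    getent group "$1" >/dev/null 2>&1
}

# Lists entries directly under a directory whose GID maps to no group name.
list_ungrouped() {
    find "$1" -maxdepth 1 -nogroup
}

if group_exists adm; then
    echo "adm group: present"
else
    echo "adm group: missing"
fi

if [ -d "$LOG_DIR" ]; then
    list_ungrouped "$LOG_DIR" || echo "could not scan $LOG_DIR" >&2
fi
```

On an image like the one above, I would expect this to report the group as missing and to list installer/ and unattended-upgrades/, but I have not captured that output.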

Solution

rsyslog is required for ssh logs to be made available to Fail2ban. These logs are available on RackNerd's default Ubuntu 20.04 image. Even so, for some reason (probably permissions and ownership), Fail2ban fails to enforce bans when my provisioning script targets that image. The exception is the sshd jail, which is the only jail that doesn't use the FORWARD chain in iptables (the chain that routes traffic to containers once the required provisioning has finished).

Although fairly simple, the solution is not without downtime. I have only tested it at the start of provisioning, not on a system running a production workload.

Purge rsyslog and reinstall, after adding the missing adm group:

addgroup adm
apt-get purge rsyslog -y
apt-get install rsyslog -y

That's it. We now have auth.log, which the sshd jail picks up automatically. All my jails are working and enforcing exactly as they would on any other host (Vagrant, AWS, and Azure).
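For provisioning scripts that may run more than once, the three commands can be wrapped so that the group is only created when it is missing. This is a sketch under assumptions rather than the script I actually use: DRY_RUN, run and fix_rsyslog are hypothetical names, and DRY_RUN defaults to 1 so that nothing is changed unless it is explicitly set to 0 and the script is run as root.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch (not from the original commands).
# It defaults to 1, so the script only prints what it would do.
DRY_RUN="${DRY_RUN:-1}"

# Either print a command or execute it, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

fix_rsyslog() {
    # Only create the group if it is missing, so a second run is harmless.
    if ! getent group adm >/dev/null 2>&1; then
        run addgroup adm
    fi
    # Note: this always purges and reinstalls, even if rsyslog is healthy.
    run apt-get purge -y rsyslog
    run apt-get install -y rsyslog
}

fix_rsyslog
```

Running it with DRY_RUN=0 as root should be equivalent to typing the commands by hand, with the same downtime caveat as before.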

I recently repeated the above on a fresh 22.04 install. It still didn't come with an adm group, so I just repeated all three commands:

:~$ ll /var/log/
total 452
drwxrwxr-x   9 root      syslog            4096 Nov 16 13:34 ./
drwxr-xr-x  13 root      root              4096 Apr 21  2022 ../
drwxr-xr-x   2 root      root              4096 Nov 16 13:27 apt/
-rw-r-----   1 syslog    adm               3952 Nov 16 13:43 auth.log
-rw-rw----   1 root      utmp              1920 Nov 16 13:15 btmp
drwxr-xr-x   2 root      root              4096 Apr 18  2022 dist-upgrade/
-rw-r-----   1 root      adm              44198 Nov 16 13:34 dmesg
-rw-r-----   1 root      adm              44151 Nov 16 13:27 dmesg.0
-rw-r-----   1 root      root                64 Nov 16 13:32 dmesg.1.gz
-rw-r-----   1 root      root                64 Nov 16 13:05 dmesg.2.gz
-rw-r--r--   1 root      root              1841 Nov 16 13:27 dpkg.log
drwxr-x---   2 root                    4   4096 May 13  2022 installer/
drwxr-sr-x+  3 root      systemd-journal   4096 Nov 16 13:05 journal/
-rw-r-----   1 syslog    adm             110770 Nov 16 13:34 kern.log
drwxr-xr-x   2 landscape landscape         4096 May 13  2022 landscape/
-rw-rw-r--   1 root      utmp            292292 Nov 16 13:43 lastlog
drwx------   2 root      root              4096 Apr 21  2022 private/
-rw-r-----   1 syslog    adm             171062 Nov 16 13:43 syslog
-rw-r--r--   1 root      root               157 Nov 16 12:50 ubuntu-advantage-timer.log
drwxr-x---   2 root                    4   4096 May 13  2022 unattended-upgrades/
-rw-rw-r--   1 root      utmp             10752 Nov 16 13:43 wtmp

Conclusion

This is the simplest fix I have found for RackNerd's Ubuntu 22.04 (and possibly 20.04, too). Even without auth.log or ssh*.log, I could still see logged lines in the output of systemctl status ssh, so they would probably have been readable through journalctl too.
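I have not verified the journalctl idea on the broken image, but checking it should only take one command. The sketch below wraps it in a hypothetical helper, recent_ssh_journal, which assumes the unit is named ssh (as it is in the systemctl status ssh output I mentioned) and does nothing useful where journalctl is not installed.

```shell
#!/bin/sh
# Print the most recent journal entries for the ssh unit.
# The optional first argument is the number of lines (default 20).
recent_ssh_journal() {
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -u ssh -n "${1:-20}" --no-pager
    else
        echo "journalctl not available" >&2
        return 1
    fi
}
```

Calling recent_ssh_journal 50 as root (or as a user allowed to read the journal) would show whether failed logins are recorded there even while auth.log is missing.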

Fun fact: in 2019, about 20 VPS providers, all using servers from RackNerd's parent company, ColoCrossing (CC), were shut down at once, with no compensation to their customers. It blackened the company's reputation. CC still exists, so I guess this episode was "legal", for what that's worth. I've had only good experiences with RackNerd, but I don't yet trust any original content to any remote server. While I learn stuff, I appreciate the cost saving.

Here are the ownerships in /var/log under the fully functional, patched Ubuntu 22.04:

anonuser@racknerd-89c14a:~$ ll /var/log/
total 692
drwxrwxr-x  10 root      syslog            4096 Sep 30 06:15 ./
drwxr-xr-x  13 root      root              4096 Apr 21 01:01 ../
drwxr-xr-x   2 root      root              4096 Sep 30 06:12 apt/
-rw-r-----   1 syslog               1001 120377 Sep 30 16:57 auth.log
-rw-rw----   1 root      utmp              6912 Sep 30 15:17 btmp
drwxr-xr-x   2 root      root              4096 Apr 18 17:47 dist-upgrade/
-rw-r-----   1 root                 1001  50992 Sep 30 06:10 dmesg
-rw-r-----   1 root      root                37 Sep 30 06:03 dmesg.0
-rw-r--r--   1 root      root              7499 Sep 30 06:12 dpkg.log
-rw-r-----   1 root      adm              43946 Sep 30 16:56 fail2ban.log
drwxr-x---   2 root      adm               4096 May 13 17:41 installer/
drwxr-sr-x+  3 root      systemd-journal   4096 Sep 30 06:03 journal/
-rw-r-----   1 syslog               1001  73454 Sep 30 07:03 kern.log
drwxr-xr-x   2 landscape landscape         4096 May 13 16:02 landscape/
-rw-rw-r--   1 root      utmp            292292 Sep 30 16:51 lastlog
drwxr-xr-x   2 root      root              4096 Sep 30 06:16 nginx/
drwx------   2 root      root              4096 Apr 21 01:00 private/
-rw-r-----   1 syslog               1001 241943 Sep 30 16:51 syslog
-rw-r--r--   1 root      root               237 Sep 30 12:41 ubuntu-advantage-timer.log
drwxr-x---   2 root      adm               4096 May 13 09:13 unattended-upgrades/
-rw-rw-r--   1 root      utmp             77184 Sep 30 16:51 wtmp

This differs from the listing for RackNerd's stock Ubuntu 20.04, where Fail2ban was almost functional, in that many files have less privileged groups. I should probably look at fixing the missing groups, but most were like that before rsyslog was reinstalled. Whilst not optimally secure, it is no less secure than it was before I added the adm group.

I chose a simple addgroup adm, which has sufficed for me so far. The adm group typically has a group ID of 4, which I learned I didn't need to specify; it became that anyway. I could probably get away with setting adm as the group wherever the group is missing above, even for files owned by root. If it isn't broke, don't fix it.
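If I ever do tidy this up, it may be worth confirming which GID the adm group actually received and previewing the ownership changes before applying any of them. The following is a cautious sketch with hypothetical helper names (group_gid, plan_chgrp); it only prints chgrp commands and never runs them, and it assumes getent, cut and a find that supports -maxdepth and -nogroup.

```shell
#!/bin/sh
# Print the numeric GID of a named group (empty output if it does not exist).
group_gid() {
    getent group "$1" | cut -d: -f3
}

# Print, without running them, the chgrp commands that would hand
# ungrouped entries directly under a directory to the given group.
plan_chgrp() {
    find "$1" -maxdepth 1 -nogroup -exec echo chgrp "$2" {} \;
}

echo "adm gid: $(group_gid adm)"

if [ -d /var/log ]; then
    plan_chgrp /var/log adm || echo "could not scan /var/log" >&2
fi
```

The printed lines can be reviewed and then pasted into a root shell one at a time, which keeps the "if it isn't broke" principle intact until a change is clearly wanted.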