File System and Btrfs – Advanced Features

Btrfs maintenance

One of the special features of Btrfs is that all data is checksummed. Schedule a scrub job once a week: it reads all data on an HDD/SSD/virtual disk, recalculates the checksums and compares them with the stored ones. A mismatch can mean that the drive is about to fail, so you can replace it in time. In addition, if you use e.g. RAID1 or RAID10, Btrfs can automatically repair the bad copy from the good copy on another disk.

Run check and balance from time to time as well, especially after adding or removing a drive in a Btrfs RAID1/10. Balance can be a long-running task if you have multiple terabytes of data.

Example scrub systemd service:

$ sudo nano /usr/local/bin/rf-scrub.sh
#!/bin/bash
# -B keeps scrub in the foreground, so the status below reports the finished run
btrfs scrub start -B / | systemd-cat
btrfs scrub status / | systemd-cat
$ sudo chmod 700 /usr/local/bin/rf-scrub.sh
$ sudo nano /etc/systemd/system/rf-scrub.service
[Unit]
Description=rf-scrub.service for Btrfs

[Service]   
Type=oneshot
ExecStart=/usr/local/bin/rf-scrub.sh
User=root  
Group=root
$ sudo nano /etc/systemd/system/rf-scrub.timer
[Unit]
Description=Run rf-scrub weekly, start immediately if system was off

[Timer]
#OnCalendar=weekly
OnCalendar=Fri *-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
$ sudo systemctl enable rf-scrub.timer && sudo systemctl restart timers.target && systemctl list-timers

IO Scheduler (Desktop)

The default I/O schedulers on Arch Linux now support multi-queue. The boot parameter “scsi_mod.use_blk_mq=1” is no longer required. The list of available schedulers for a specific disk (replace X) should look like this:

$ cat /sys/block/sdX/queue/scheduler
[mq-deadline] kyber bfq none

The active one for the disk is marked in brackets.
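
A minimal sketch of reading that marker from a script, assuming the sysfs line has the format shown above. The function name active_scheduler is made up for this example:

```shell
#!/bin/bash
# Hypothetical helper: print the scheduler name enclosed in brackets.
# Reads one line such as "[mq-deadline] kyber bfq none" from stdin.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}
```

It could be called as: active_scheduler < /sys/block/sdX/queue/scheduler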

By default mq-deadline is activated for an SSD and bfq for a normal HDD. For USB sticks bfq should be enabled manually as well, and for an HDD with SMR use mq-deadline (see http://lkml.iu.edu/hypermail/linux/kernel/1810.0/03048.html; to avoid using the wrong scheduler, especially for SMR devices, and to improve USB performance, there might be patches for bfq in Linux > 4.21).

Bfq in low latency mode can considerably improve desktop interactivity and shorten application startup times on NVMe devices, read https://www.phoronix.com/scan.php?page=article&item=linux-420-io&num=1.

bfq

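With bfq (https://wiki.archlinux.org/index.php/Improving_performance#Changing_I/O_scheduler) you can improve startup times. To check which scheduler is active, run the command below. Once bfq is active, its low latency mode is reported in /sys/block/sdX/queue/iosched/low_latency (1 means enabled).

cat /sys/block/sd*/queue/scheduler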

You can enable bfq via a udev rule on startup.

/etc/udev/rules.d/60-ioschedulers.rules
# set scheduler for non-rotating disks
ACTION=="add|change", KERNEL=="sd[a-z]|mmcblk[0-9]*|nvme[0-9]*", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="mq-deadline"
# set scheduler for rotating disks
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"
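
To see which scheduler these two rules would pick for a given disk, the decision can be sketched as a small shell function. This is only an illustration of the rule logic; the name scheduler_for is hypothetical and not part of udev:

```shell
#!/bin/bash
# Hypothetical helper mirroring the two udev rules above:
# rotational flag 0 (SSD, NVMe, eMMC) -> mq-deadline, 1 (spinning HDD) -> bfq.
scheduler_for() {
    case "$1" in
        0) echo "mq-deadline" ;;
        1) echo "bfq" ;;
        *) echo "unknown rotational flag: $1" >&2; return 1 ;;
    esac
}
```

For a real disk the flag can be read from sysfs, e.g. scheduler_for "$(cat /sys/block/sdX/queue/rotational)".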

File System Feature Matrix

So far most of my machines run Btrfs, because all data is checksummed. This makes it possible to detect drive errors earlier, and in RAID configurations Btrfs can detect and correct bit rot.

                               Btrfs              GlusterFS            IPFS
Implementation Level           kernel             user space            P2P
Disk Layout              raw/gpt/dos/partition    partition             any
Additional FS required           -              ext4/xfs/btrfs          any
Disk Encryption¹                 -                    -                  -
Secure Network               not required            SSL³
Data Check Sum                   X                    X                  X
Local RAID                       X                    X                 any
Distributed RAID²                -                    X                 any
Heterogen. SSD/HDD aware RAID    -                    -                 any
RAID growth plus 1 disk          X           - (only by replica number) any
Bit Rot Correction               X                    X
De-duplication               service job              -                 auto
Master Server                not required         every host             No
Snapshots                        X                   100           no delete
Geo Replication                  -                    X                 auto
Performance                      ?                    ?                 P2P

¹ Only possible via LUKS (cryptsetup); native encryption may be added to Btrfs in the future

² By distributing you get higher availability

³ Encryption not enabled by default, requires either setup on all hosts or devices on the network must be trusted

GlusterFS issues

GlusterFS is designed to work on a server infrastructure with a homogeneous disk storage setup. It doesn’t like different types of disks (SSD, HDD, NVMe, USB, …), as the slowest disk determines the write speed. The sizes of paired disks should also be the same if you use parity like in a RAID; combining a 1TB drive with a 4TB drive doesn’t work well automatically. And if a node goes down, the data is not automatically replicated to other nodes to restore the RAID level.

Tmpfs (usually not required)

If you want to compile a Linux kernel or e.g. Kodi and run out of temporary disk space while compiling, and you have plenty of RAM, or if you want to reduce writes to an SSD, create a tmpfs file system in RAM of at least 6G on Arch Linux with an /etc/fstab entry such as:

tmpfs /tmp tmpfs defaults,noatime,size=8G,mode=1777 0 0

Tools

HDD Alignment Check

$ blockdev --getalignoff /dev/sda1

Btrfs

Btrfs RAID Replace

If you replaced a failed device and the following command still shows errors, you have to reset the stats manually:

$ btrfs device stats /mountpoint
[/dev/mapper/mountpoint1].write_io_errs 0 
[/dev/mapper/mountpoint1].read_io_errs 0 
[/dev/mapper/mountpoint1].flush_io_errs 0 
[/dev/mapper/mountpoint1].corruption_errs 0 
[/dev/mapper/mountpoint1].generation_errs 0 
[/dev/mapper/mountpoint2].write_io_errs 75734255 
[/dev/mapper/mountpoint2].read_io_errs 396608381 
[/dev/mapper/mountpoint2].flush_io_errs 1827 
[/dev/mapper/mountpoint2].corruption_errs 0 
[/dev/mapper/mountpoint2].generation_errs 0

If the replace worked without any issue and scrub doesn’t show errors, you can reset the stats with:

$ btrfs device stats -z /mountpoint

Then run scrub for the mount point and check the results of scrub and stats.
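
With many devices the stats output gets long. As a rough sketch, assuming the output format shown above (one counter name and one value per line), a small filter can print only the counters that are not zero. The function name nonzero_stats is made up for this example:

```shell
#!/bin/bash
# Hypothetical helper: print only the counters that are not zero.
# Expects "btrfs device stats" output on stdin, one "<counter> <value>" pair per line.
nonzero_stats() {
    awk 'NF == 2 && $2 + 0 != 0 { print $1, $2 }'
}
```

Usage would be something like: sudo btrfs device stats /mountpoint | nonzero_stats. Empty output means all counters are zero.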

Btrfs repair

Repairing an unmounted Btrfs drive should always be the last option when nothing else has worked. A backup of your data is required, because damaged data will be deleted. Do it only if you can afford to lose the data; some of it might not be recoverable, especially if you didn’t use a RAID. Read https://btrfs.wiki.kernel.org/index.php/Btrfsck.

Btrfs RAID

Currently only RAID1 and RAID10 are stable. To calculate the available space in a RAID you can use the calculator at http://carfax.org.uk/btrfs-usage/index.html.

Btrfs snapshots

If you want to run Btrfs snapshots before pacman updates, follow the guide https://wiki.archlinux.org/index.php/Snapper#Wrapping_pacman_transactions_in_snapshots

Hardware Health Checks

If you buy new hardware or have installed a new machine, you should always check its health and prepare for failures.

RAM

New RAM should be fully tested for failures once. In case you need to test RAM ad hoc, add Memtest to your boot manager.

Memtest86 – UEFI

Attention: Memtest86 is proprietary software! Recommendation: Use the free and open source Memtest86+ on a normal Arch Linux USB stick in CSM mode (non-UEFI boot) instead of installing memtest86-efi.

yay memtest86-efi 
memtest86-efi --install

If you use systemd-boot you can add a boot menu entry by selecting option 4.

Memtest86+ – GRUB

Memtest86+ (license GPL) doesn’t support DDR4. Recommendation: Use a normal Arch Linux USB Stick in CSM mode (non UEFI boot) instead of installing it.

pacman -Sy memtest86+
grub-mkconfig -o /boot/grub/grub.cfg

ECC RAM

ECC RAM is used in servers and workstations to detect bit errors, but where it is available, as on most Ryzen mainboards, it is recommended for desktops as well (at least Linus Torvalds recommends it: https://plus.google.com/+LinusTorvalds/posts/VdLMbfmgmGJ). It can usually detect and correct single-bit errors; multi-bit errors can only be detected but NOT corrected.

sudo pacman -Sy edac-utils

Now check whether ECC is activated by the memory controller and whether it reports any issues.

$ edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc0: csrow0: 0 Uncorrected Errors
mc0: csrow0: mc#0csrow#0channel#0: 0 Corrected Errors
edac-util: No errors to report.

In this case ECC is working. Next check:

dmesg | grep -i edac
[ 0.586752] EDAC MC: Ver: 3.0.0
[ 4.108749] EDAC amd64: Node 0: DRAM ECC enabled.
[ 4.108751] EDAC amd64: F17h detected (node 0).
[ 4.108802] EDAC MC: UMC0 chip selects:
[ 4.108803] EDAC amd64: MC: 0: 8192MB 1: 0MB
[ 4.108804] EDAC amd64: MC: 2: 0MB 3: 0MB
[ 4.108805] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 4.108805] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 4.108808] EDAC MC: UMC1 chip selects:
[ 4.108808] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 4.108809] EDAC amd64: MC: 2: 0MB 3: 0MB
[ 4.108809] EDAC amd64: MC: 4: 0MB 5: 0MB
[ 4.108810] EDAC amd64: MC: 6: 0MB 7: 0MB
[ 4.108810] EDAC amd64: using x8 syndromes.
[ 4.108811] EDAC amd64: MCT channel count: 1
[ 4.108915] EDAC MC0: Giving out device to module amd64_edac controller F17h: DEV 0000:00:18.3 (INTERRUPT)
[ 4.108929] EDAC PCI0: Giving out device to module amd64_edac controller EDAC PCI controller: DEV 0000:00:18.0 (POLLED)
[ 4.108929] AMD64 EDAC driver v3.5.0

It states ECC is enabled and that the extra bit for ECC is available via x8 syndromes.
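
The same counters can also be read directly from sysfs, which is handy for a cron job or a monitoring script. The following is a sketch only: it assumes the EDAC driver exposes ce_count and ue_count files per memory controller below /sys/devices/system/edac/mc, and the function name edac_totals is made up for this example:

```shell
#!/bin/bash
# Hypothetical helper: sum the corrected (ce_count) and uncorrected (ue_count)
# error counters of all memory controllers below an EDAC sysfs directory.
# The default directory is an assumption about the kernel's sysfs layout.
edac_totals() {
    local root="${1:-/sys/devices/system/edac/mc}"
    local ce=0 ue=0 f
    for f in "$root"/mc*/ce_count; do
        [ -r "$f" ] && ce=$((ce + $(cat "$f")))
    done
    for f in "$root"/mc*/ue_count; do
        [ -r "$f" ] && ue=$((ue + $(cat "$f")))
    done
    echo "corrected=$ce uncorrected=$ue"
}
```

Called without an argument it reads the live counters; a non-zero uncorrected count deserves immediate attention.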

Ryzen supports ECC; for more details read http://www.hardwarecanucks.com/forum/hardware-canucks-reviews/75030-ecc-memory-amds-ryzen-deep-dive.html. However, it will not halt the system in case of an ECC error, so there is definitely still an open task for ECC on Ryzen somewhere in Linux, the microcode or another component.

SSD/HDD

Smartctl tests

HDD/SSD health should be checked regularly, especially on file servers.

Install Smartmontools

pacman -Sy smartmontools

Graphical interface:

pacman -Sy gsmartcontrol

or from AUR the KDE disKmonitor

yay diskmonitor

Smartd

Enable automatic smart checks and mail service (very important on real hardware servers). For details read https://wiki.archlinux.org/index.php/S.M.A.R.T..

sudo systemctl enable smartd.service

If you need you can modify the settings in:

$ nano /etc/smartd.conf

To get the complete overview about a drive, run:

$ smartctl -a /dev/sda

To get just a short health status, run:

$ smartctl -H /dev/sda
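
In scripts it is easier to evaluate the exit status of smartctl than to parse its text output. The smartctl man page describes the exit status as a bit mask in which bit 3 (value 8) means "SMART status check returned DISK FAILING". A minimal sketch of testing that bit follows; the function name smart_disk_failing is made up for this example:

```shell
#!/bin/bash
# Hypothetical helper: succeed when a smartctl exit status has bit 3 (value 8) set,
# which the smartctl man page documents as "DISK FAILING".
smart_disk_failing() {
    [ $(( $1 & 8 )) -ne 0 ]
}
```

A possible use: run smartctl -H /dev/sda, then smart_disk_failing $? && echo "replace the disk".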

To run the short, conveyance or long self checks (not every check is provided by every disk, and sadly it doesn’t work for USB drives), execute:

$ smartctl -t long /dev/sda

To view test results:

smartctl -l selftest /dev/sda

Set up automatic mails about drive health (OPEN)

# -m Send a Mail to this address, -M send a mail after each start of the service so that you know which drive will be monitored

DEVICESCAN -m address@domain.com -M test

Do not spin up disk in standby:

DEVICESCAN -n standby,15,q

Apache 2.4 Webserver

Certificate

Pin domain to certification authority https://blog.qualys.com/ssllabs/2017/03/13/caa-mandated-by-cabrowser-forum

TLS1.3 Support on OpenSSL (in development)

A good explanation why TLS1.3 is great https://blog.cloudflare.com/tls-1-3-overview-and-q-and-a

Test connection in your browser with a TLS1.3 only server https://tls13.crypto.mozilla.org

Man in the Middle Attacks (MITM)

So far it is not clear whether TLS1.3 will effectively warn about and prevent man-in-the-middle interception by so-called security boxes. These boxes are mostly outdated security voodoo and make web traffic insecure, without warning users that the security box admin can read all web traffic, including passwords.

Firewall

Open port check see https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/3/html/Security_Guide/s1-server-ports.html

Cockpit – PC Monitoring and Management

Cockpit 217 (license LGPL2.1, http://cockpit-project.org, https://github.com/cockpit-project/cockpit) is a web-based server monitoring and management tool. As soon as you run multiple computers and servers 24 hours a day, 365 days a year, you wonder how you can monitor them. Cockpit doesn’t have a lot of dependencies; basically you just need a Linux system that runs systemd as its system and process manager:

sudo pacman -Sy cockpit

The required dependencies are installed automatically. Optionally you can install udisks2 (to be replaced with storaged in the future, not yet available in Arch Linux) to show storage information, firewalld to configure the open ports of the firewall and packagekit to show installed packages and updates.

sudo pacman -Sy udisks2 firewalld packagekit

If you want to enable the web interface permanently on a central access point server then run the following commands, but the second with your normal user:

sudo pacman -Sy cockpit-dashboard
systemctl enable --now cockpit.socket

The cockpit service/web interface is only required on one central server. All other remote servers with cockpit installed can be managed from this central instance. There is no need to enable cockpit.socket there as well. If you connect from the central instance to a remote server, it starts and stops the cockpit functionality on demand after creating the SSH tunnel.

The web interface is accessible in the browser:

localhost:9090

You can replace localhost with the IP address or a domain name. Connecting via an IP address or a domain name automatically switches the connection to https.

If you use an SSH key with a password, make sure it is identical to the password of the user you log in with on the central cockpit web front end; otherwise single sign-on will not work and you always have to type the password manually to connect to remote machines.

If you have linked multiple servers to your cockpit instance, you should backup the following file:

/etc/cockpit/machines.d/99-webui.json

Let’s Encrypt – Certbot Certificate – Manual

sudo certbot --manual --preferred-challenges dns certonly

Manual procedure example:

  • Get a temporary fixed IP address with all ports (80/443) open and reachable from the internet
  • Add this fixed IP to all domains
  • Add a TXT record for every domain, wait 1 minute and then press enter; the certificate is saved in /etc/letsencrypt/live/DOMAINNAME.com/
  • root: rm /etc/cockpit/ws-certs.d/DOMAINNAME.com.cert && cat /etc/letsencrypt/live/DOMAINNAME.com/fullchain.pem /etc/letsencrypt/live/DOMAINNAME.com/privkey.pem > /etc/cockpit/ws-certs.d/DOMAINNAME.com.cert
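
The last step has to be repeated after every renewal, so it may be worth wrapping it in a small function. This is a sketch under the assumption that the directory contains fullchain.pem and privkey.pem as certbot writes them; the function name build_cockpit_cert and its two arguments are made up for this example:

```shell
#!/bin/bash
# Hypothetical helper: concatenate certificate chain and private key into the
# single .cert file that the last step above writes for Cockpit.
# $1 = directory with fullchain.pem and privkey.pem, $2 = output file.
build_cockpit_cert() {
    local live_dir="$1" out="$2"
    [ -r "$live_dir/fullchain.pem" ] && [ -r "$live_dir/privkey.pem" ] || return 1
    # Write to a temporary file first so a failure never leaves a half-written bundle.
    cat "$live_dir/fullchain.pem" "$live_dir/privkey.pem" > "$out.tmp" && mv "$out.tmp" "$out"
}
```

As root it could be called as: build_cockpit_cert /etc/letsencrypt/live/DOMAINNAME.com /etc/cockpit/ws-certs.d/DOMAINNAME.com.cert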

Increase Security of Connections

By default cockpit uses gnutls with 128-bit AES for compatibility. To increase security, do the following (https://gnutls.org/manual/html_node/Priority-Strings.html#Priority-Strings):

sudo mkdir /etc/systemd/system/cockpit.service.d
sudo nano /etc/systemd/system/cockpit.service.d/ssl.conf
[Service]
# %% is a literal % in unit files; a bare %C would be expanded as a systemd specifier
Environment=G_TLS_GNUTLS_PRIORITY=SECURE256:-VERS-ALL:+VERS-TLS1.3:-KX-ALL:%%COMPAT

TLS 1.3 is available since gnutls 3.6. To check whether gnutls works correctly, run:

gnutls-cli --priority SECURE256:-VERS-ALL:+VERS-TLS1.2:-KX-ALL:+ECDHE-RSA:+ECDHE-ECDSA:%COMPAT -l
Cipher suites for SECURE256:-VERS-ALL:+VERS-TLS1.2:-KX-ALL:+ECDHE-RSA:+ECDHE-ECDSA:%COMPAT
TLS_ECDHE_RSA_AES_256_GCM_SHA384                        0xc0, 0x30      TLS1.2
TLS_ECDHE_RSA_CAMELLIA_256_GCM_SHA384                   0xc0, 0x8b      TLS1.2
TLS_ECDHE_RSA_CHACHA20_POLY1305                         0xcc, 0xa8      TLS1.2
TLS_ECDHE_ECDSA_AES_256_GCM_SHA384                      0xc0, 0x2c      TLS1.2
TLS_ECDHE_ECDSA_CAMELLIA_256_GCM_SHA384                 0xc0, 0x87      TLS1.2
TLS_ECDHE_ECDSA_CHACHA20_POLY1305                       0xcc, 0xa9      TLS1.2
TLS_ECDHE_ECDSA_AES_256_CCM                             0xc0, 0xad      TLS1.2

Certificate types: CTYPE-X.509
Protocols: VERS-TLS1.2
Compression: COMP-NULL
Elliptic curves: CURVE-SECP384R1, CURVE-SECP521R1
PK-signatures: SIGN-RSA-SHA384, SIGN-ECDSA-SHA384, SIGN-RSA-SHA512, SIGN-ECDSA-SHA512

Even though the above setup looks correct, cockpit did not apply these settings. Is this a bug?

2 Factor Authentication

If you want to make the cockpit web interface accessible from the internet, then you should use 2 factor authentication.
https://scottlinux.com/2017/05/13/enable-two-factor-auth-for-cockpit-with-google-authenticator

Install google-authenticator from AUR:

yay libpam-google-authenticator

Then, on the central cockpit instance server, run the following command as a normal user (not as root or via sudo) to generate a secret key.

google-authenticator

It is recommended to answer all security questions with yes.

You will now see a barcode in the terminal. Scan it with your mobile phone.

Write down all the shown security codes in a secure location.

Single Machine Monitoring Tools

There are some system tools that can be really helpful to measure and understand your system.

CPU: htop

GPU: radeontop

HDD/SSD: hdparm or iotop

hdparm --direct -tT /dev/XXX

Network: nmap

nmap -sP 192.168.1.0/24