Linux System Administration

  • fix lvm device excluded by a filter

    Resolving LVM Error: Device Excluded by a Filter During pvcreate

    The Problem: The issue arose while provisioning a 2TB SAN LUN on a customer’s bare-metal Oracle Linux 8 server. The disk appeared as /dev/sdb, but running pvcreate /dev/sdb failed with the error “Device /dev/sdb excluded by a filter.” Nothing had been changed in the filter rules, but something on this specific drive caused the LVM filter to block access…

  • fix ssh connection closed by remote host timeout

    Fixing Intermittent SSH Connection Closed by Remote Host Dropouts

    The Problem: I would SSH into a staging environment, start a long-running database migration, and switch over to another terminal, only to find later that the session had dropped with “Connection closed by remote host”, with no warning and no chance to recover the migration that was midway through.…

  • debug linux oom killer dmesg

    Debugging Out of Memory (OOM) Killer Terminations via dmesg Logs

    The first time I lost my MariaDB instance was around 3 AM. I woke up to see the monitoring dashboard completely flat. The server logs showed no sign of repeated access attempts; in fact, they showed no activity at all. The only thing recorded was that the database server had been “panting” right before it was killed by the OOM (out-of-memory)…

  • troubleshoot systemd-journald high cpu usage

    Diagnosing High CPU Usage and File Corruption in systemd-journald

    I distinctly recall the morning I got to work at 7 a.m. and found a production server with all 16 cores pinned at 100 percent. The monitoring dashboard for this server was solid red, and my on-call phone had already gone off twice before I arrived. Logging in to this…

  • fix kernel panic vfs unable to mount root fs

    Resolving Kernel Panic: VFS Unable to Mount Root File System After Update

    You just ran apt upgrade. The server reboots and halts at “Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)”. By now your Slack is ablaze, and the console shows a wall of messages, none of which make sense. You know the problem lies somewhere between the bootloader and the initramfs, and time is of the essence.…

  • configure linux auditd rules monitoring

    Deploying Auditd Rules for System Call Monitoring and Compliance

    The moment I realized that we had been “flying blind” came just after I received a ping from a junior admin about unusual CPU usage on one of our production web app servers. There was no other indication of an issue, such as anomalous network traffic or a suspicious process running in “top”; all that was there were…

  • configure chroot jail sftp linux

    Implementing Secure Chroot Jails for SFTP Users via sshd_config

    I’m haunted by the memory of that frantic phone call from a small business owner who needed help. His contract with a freelance developer was coming to an end in two weeks. The freelance developer requested access to the shared staging environment via SFTP to upload files, but the client wanted to limit their access and prevent them from snooping…

  • automate lvm expansion ansible playbook

    Automating Logical Volume Management (LVM) Disk Expansion via Ansible Playbooks

    I remember looking at a dashboard at midnight and seeing a critical application down because /var/lib/mysql had reached 100% disk usage. I had put together a Bash LVM resize script months earlier, and it had failed silently because I had mistakenly omitted the pvresize command for the newly added disk. The result of that night was…

  • setup haproxy keepalived high availability

    Architecting a High-Availability Load Balancer with HAProxy and Keepalived

    A few months ago, my self-hosted Nextcloud instance went offline. It wasn’t due to a malfunctioning hard drive or a poorly designed network loop. The issue was caused by an HAProxy VM that became unresponsive. Until I went to the server and performed a hard reboot, I couldn’t access any of my services. That experience led me to implement redundancy…

  • configure rsyslog tls centralized logging

    Configuring Secure Centralized Logging Using Rsyslog and TLS Certificates

    I clearly remember having three separate SSH sessions open to three different virtual machines (VMs) to look through their ‘/var/log/auth.log’ files. I was combing through the logs of all three VMs to determine what was behind a failed brute-force attack on SSH. As…
