Error Medic

Ultimate Linux Troubleshooting Guide: Fixing 'Permission Denied', 'Connection Refused', '502 Bad Gateway', and 'OOM Killer'

Comprehensive SRE guide to resolving common Linux system errors: permission denied, port 22 connection refused, 502 Bad Gateway across stacks, and Linux OOM.

Key Takeaways
  • 'Permission denied' errors usually stem from incorrect file ownership (chown), missing execution bits (chmod +x), or SELinux/AppArmor enforcing policies.
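  • 'Connection refused' on port 22 or 80 usually means nothing is accepting connections there: the service (sshd, nginx/apache) is stopped, it is bound to localhost instead of 0.0.0.0, or a firewall rule (iptables, ufw) explicitly rejects the traffic. Rules that drop traffic silently, such as AWS Security Groups, typically produce a timeout instead.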
  • '502 Bad Gateway' indicates a reverse proxy (Nginx/OpenResty/Apache) cannot communicate with the upstream application server (Node.js, Gunicorn, PHP-FPM) due to the upstream crashing, timing out, or listening on the wrong socket/port.
  • Resource exhaustion manifests as 'OOM killer' terminating processes to free memory, or database 'Too many connections' when connection pools run dry. Monitoring and resource limits are key fixes.
Common System Errors and Initial Triage Methods
Error Symptom | Primary Root Cause Category | Diagnostic Tool | First Remediation Step
permission denied linux | File System / Security Context | ls -la, getfacl, dmesg (for SELinux) | Check user execution bit (chmod +x) or chown to current user.
port 22 connection refused | Networking / Daemon State | systemctl status sshd, netstat -tulpn | Verify SSH daemon is running and Security Groups/Firewall allow port 22.
502 bad gateway | Reverse Proxy / Upstream | tail -f /var/log/nginx/error.log | Restart upstream service (e.g., systemctl restart php8.1-fpm or pm2 restart all).
oom killer linux | Memory / Resource Limits | dmesg -T (filter with grep -i oom) | Add swap space or optimize application memory footprint.
1040 too many connections | Database Connection Pooling | SHOW PROCESSLIST; (MySQL) | Increase max_connections or implement a connection pooler (ProxySQL for MySQL, PgBouncer for PostgreSQL).

Comprehensive Linux System Diagnostics

As a DevOps engineer or SRE, encountering errors is a daily reality. However, the vast majority of server outages, deployment failures, and application crashes boil down to a handful of fundamental categories: permission issues, network connectivity failures, reverse proxy upstream errors, and resource exhaustion. This guide provides a deep dive into diagnosing and resolving these critical failure modes.

1. Resolving 'Permission Denied' Errors

Permission denied is perhaps the most common error developers and system administrators hit on Linux. Linux relies on a Discretionary Access Control (DAC) system of owners, groups, and mode bits, often augmented by Mandatory Access Control (MAC) such as SELinux or AppArmor.

Shell Scripts and Executables

When running a script fails with a message such as bash: ./script.sh: Permission denied (or zsh: permission denied: ./script.sh on Kali Linux, where zsh is the default shell), the cause is almost always a missing execute bit. Newly created or downloaded files do not receive execute permission automatically, for security reasons.

Diagnosis: Run ls -la script.sh. If the output looks like -rw-r--r--, there is no x (execute) flag.

Resolution:

chmod +x script.sh
./script.sh

Manually installed packages can fail in a similar-looking way, for example a permission denied error while installing google-chrome-stable_current_amd64.deb. A .deb file is read by the package manager rather than executed, so it does not need the execute bit. Instead, run the installation with elevated privileges (sudo dpkg -i package.deb) and make sure the file itself is readable.
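The execute-bit diagnosis and fix for scripts can be tried safely in a scratch directory. The following is a minimal sketch, not a definitive procedure; the file name demo.sh and the temporary directory are assumptions made only for illustration.

```shell
# Sketch: reproduce and fix a missing execute bit in a temporary directory.
# demo.sh is an illustrative file name, not part of any tool.

workdir=$(mktemp -d)
printf '#!/bin/sh\necho "script ran"\n' > "$workdir/demo.sh"
chmod 644 "$workdir/demo.sh"      # typical mode of a freshly created file

# [ -x FILE ] is true only if the current user may execute FILE.
if [ ! -x "$workdir/demo.sh" ]; then
    echo "no execute bit: running the script directly would fail"
fi

chmod +x "$workdir/demo.sh"       # the fix described in the text
result=$("$workdir/demo.sh")
echo "$result"
```

If the script still fails after chmod +x, the filesystem may be mounted with the noexec option, which is worth checking with mount or findmnt.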

Repository and Configuration Files

Errors like bash: /etc/yum.repos.d/kubernetes.repo: Permission denied usually happen when a standard user tries to write to a system directory. Running sudo echo "..." > /etc/... fails as well, because the redirection (>) is performed by your own unprivileged shell before sudo starts; only echo runs as root.

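Resolution: Pipe the content to sudo tee, which opens the protected file with root privileges. The repository line here is only an illustration (the legacy apt.kubernetes.io repository has since been retired in favor of pkgs.k8s.io), so take the actual definition from the project's current documentation: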

echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
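The tee pattern can be wrapped in a small helper. This is a sketch under stated assumptions: write_root_file, the SUDO override variable, and the path /etc/example.conf are hypothetical names introduced here for illustration only.

```shell
# Sketch: write a line to a root-owned file without relying on shell redirection.
# write_root_file and the SUDO override are illustrative, not standard tools.

# Usage: write_root_file PATH CONTENT
# tee is started by sudo, so it is tee (running as root) that opens PATH.
# Setting SUDO to an empty string runs tee unprivileged, which is handy for
# trying the function on a file you already own.
write_root_file() {
    local path=$1 content=$2
    printf '%s\n' "$content" | ${SUDO-sudo} tee "$path" > /dev/null
}

# Example (not executed here because it needs root):
#   write_root_file /etc/example.conf "key = value"
# To append instead of overwrite, tee -a can be used in the same way.
```

Redirecting tee's own output to /dev/null only suppresses the echo to the terminal; the file is still written by tee itself.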
Advanced Permissions: Azure, NTFS, and Kali
  • azure cloud shell permission denied: Often relates to mounted storage. Ensure the storage account linked to your Cloud Shell hasn't had its IAM roles modified.
  • ntfs deny permissions: When mounting Windows NTFS drives in Linux, standard chmod won't work. You must specify the uid, gid, dmask, and fmask options in your /etc/fstab or mount command.
  • kali linux permission denied / ubuntu permission denied: When operating in highly secure or hardened distributions, always verify if AppArmor or SELinux is actively blocking an action, even if standard file permissions look correct. Check dmesg or /var/log/audit/audit.log.
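For the NTFS case, the ownership and mask options can be assembled as follows. This is a sketch, assuming the ntfs-3g driver and the current user as the intended owner; ntfs_mount_opts, /dev/sdX1, and /mnt/windows are placeholders, not values from a real system.

```shell
# Sketch: build the ownership/permission options for mounting an NTFS volume.
# ntfs_mount_opts, /dev/sdX1 and /mnt/windows are illustrative placeholders.

# Prints the current user's uid/gid plus masks that yield
# directories rwxr-xr-x (dmask=022) and files rw-r--r-- (fmask=133).
ntfs_mount_opts() {
    printf 'uid=%s,gid=%s,dmask=022,fmask=133\n' "$(id -u)" "$(id -g)"
}

opts=$(ntfs_mount_opts)
echo "mount options: $opts"

# Example commands (not executed here; they need root and a real device):
#   sudo mount -t ntfs-3g -o "$opts" /dev/sdX1 /mnt/windows
# Comparable /etc/fstab entry, with the numeric IDs filled in by hand:
#   /dev/sdX1  /mnt/windows  ntfs-3g  uid=1000,gid=1000,dmask=022,fmask=133  0  0
```

The masks work like umask: each bit set in dmask or fmask is removed from the permissions shown for directories or files respectively.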

2. SSH and Network 'Connection Refused' Errors

A connection refused error means the connection attempt reached the destination host and was actively turned away: either no service was listening on that port, so the kernel answered with a TCP reset, or a firewall rule explicitly rejected the packet. Firewalls more often drop packets silently, which shows up as a timeout; only an explicit REJECT rule produces connection refused.
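The difference between a refusal and a timeout can be observed from the client side. The sketch below assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command are available; probe_port is a hypothetical helper name.

```shell
# Sketch: tell "refused" apart from "timeout" for a TCP port.
# probe_port is an illustrative helper; it relies on bash's /dev/tcp
# pseudo-device and the coreutils timeout command.

# Usage: probe_port HOST PORT [SECONDS]
# Prints one of: open, timeout, refused.
probe_port() {
    local host=$1 port=$2 secs=${3:-3}
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
    case $? in
        0)   echo "open" ;;      # handshake completed: something is listening
        124) echo "timeout" ;;   # no answer in time: packets are likely dropped
        *)   echo "refused" ;;   # immediate failure: usually a reset or REJECT;
                                 # other instant errors (e.g. unresolvable host)
                                 # also land here
    esac
}

# Port 1 on the loopback interface is almost never in use, so this normally
# demonstrates the "refused" case without touching the network.
probe_port 127.0.0.1 1
```

A "timeout" result points toward a dropping firewall or routing problem, while "refused" points toward the service itself or an explicit reject rule.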

SSH: Port 22 Connection Refused

Whether it's ec2 port 22 connection refused, cpanel port 22 connection refused, or a generic port 22 connection refused, being locked out of your server is a critical emergency.

Diagnosis & Resolution:

  1. Is the service running? If you have out-of-band access (like AWS Systems Manager or a hypervisor console), run sudo systemctl status sshd (the unit is named ssh on Debian and Ubuntu). If it is inactive or failed, start it with sudo systemctl start sshd.
  2. Is it listening on the right interface? Check /etc/ssh/sshd_config to ensure ListenAddress is correct and it isn't bound strictly to a local interface if you need external access.
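  3. Firewalls and Security Groups: On AWS (aws port 22 connection refused, port 22 connection refused ec2), confirm that the Security Group allows inbound TCP on port 22 from your current IP address, and update the rules via the AWS Console if it does not. Keep in mind that a Security Group drops disallowed traffic silently, so a missing rule normally shows up as a timeout. A genuine connection refused from an EC2 instance more often points to sshd not running or to a host firewall (iptables, ufw) with a REJECT rule.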
  4. Cisco and Network Gear: If you see cisco connection refused or the remote system refused the connection cisco (often seen in putty connection refused scenarios), verify that the management plane hasn't locked out your IP due to failed login attempts (ACLs or SSH rate limiting).
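For the listening-interface check in step 2, the relevant directives can be pulled out of the configuration file. This is a sketch only: sshd_listen_settings is a hypothetical helper and the sample file is made up; on a real server, sudo sshd -T prints the effective configuration including defaults.

```shell
# Sketch: list the Port and ListenAddress directives set in an sshd_config file.
# sshd_listen_settings is an illustrative helper, and the sample file below
# is invented for demonstration.

# Usage: sshd_listen_settings [FILE]
sshd_listen_settings() {
    local file=${1:-/etc/ssh/sshd_config}
    # Keywords are case-insensitive; commented-out lines are skipped because
    # their first field starts with '#'.
    awk 'tolower($1) == "port" || tolower($1) == "listenaddress" { print $1, $2 }' "$file"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
#Port 22
Port 2222
ListenAddress 127.0.0.1
PermitRootLogin no
EOF

sshd_listen_settings "$sample"
```

In this made-up sample, a ListenAddress of 127.0.0.1 and a non-default Port would both explain why external connections to port 22 are refused.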
Public Key Authentication Denied

Errors like Permission denied (publickey) on Ubuntu, or git@github.com: Permission denied (publickey) when using Git on Linux, mean the server did not accept any SSH key your client offered.

Resolution for GitHub:

  1. Ensure your key is loaded: ssh-add -l.
  2. Ensure the key is added to your GitHub account.
  3. Test the connection: ssh -T git@github.com.

Resolution for Ubuntu/EC2: Ensure the private key permissions are secure (chmod 400 private_key.pem) and that the corresponding public key is correctly placed in ~/.ssh/authorized_keys on the server.
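The private key permission check can be automated. The sketch below assumes GNU stat (stat -c), as shipped on typical Linux distributions; key_perms_ok is a hypothetical helper and the temporary file merely stands in for a real key.

```shell
# Sketch: check that a private key file is not accessible by group or others.
# key_perms_ok is an illustrative helper; it uses GNU stat (stat -c).

# Usage: key_perms_ok FILE  -> exit status 0 if the mode ends in 00 (e.g. 400, 600)
key_perms_ok() {
    local mode
    mode=$(stat -c '%a' "$1") || return 2
    case "$mode" in
        *00) return 0 ;;
        *)   echo "$1 has mode $mode; ssh expects no group/other access" >&2
             return 1 ;;
    esac
}

key=$(mktemp)              # stand-in for a real private key file
chmod 644 "$key"
key_perms_ok "$key" || chmod 400 "$key"
key_perms_ok "$key" && echo "key permissions look fine"
```

On the server side, similar strictness applies to ~/.ssh (commonly mode 700) and ~/.ssh/authorized_keys (commonly mode 600).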

Web Traffic: Port 80 Connection Refused

If you encounter aws port 80 connection refused or nodejs connection refused, verify that your web server (Nginx/Apache) or Node.js application is actively running and bound to 0.0.0.0 (all interfaces) rather than 127.0.0.1 (localhost only).
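Whether a service is bound to loopback or to all interfaces can be read from the local address column of ss -tln. The following sketch classifies such address strings; bind_scope is a hypothetical helper that works on plain text and does not query the system itself.

```shell
# Sketch: classify the local address column of "ss -tln" output.
# bind_scope is an illustrative helper operating on plain strings.

# Usage: bind_scope ADDRESS:PORT
bind_scope() {
    case "$1" in
        127.*:*|"[::1]":*)        echo "loopback only" ;;
        0.0.0.0:*|"*":*|"[::]":*) echo "all interfaces" ;;
        *)                        echo "specific interface" ;;
    esac
}

bind_scope "127.0.0.1:3000"
bind_scope "0.0.0.0:80"

# Possible use against live data (column 4 of "ss -tln" is the local address):
#   ss -tln | awk 'NR > 1 { print $4 }' | while read -r addr; do
#       printf '%s -> %s\n' "$addr" "$(bind_scope "$addr")"
#   done
```

A service reported as "loopback only" is reachable from the machine itself but refuses connections arriving on any external address.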

3. Untangling '502 Bad Gateway' Errors

The 502 Bad Gateway error is the bane of modern web architectures. It means that an edge server or reverse proxy (Nginx, Apache, OpenResty, Tengine, AWS ALB) received an invalid response from the upstream application server.

The Anatomy of a 502

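When Nginx (or OpenResty, on Ubuntu or any other distribution) passes a request to a backend, it expects a timely, valid HTTP response. If the backend is down, refuses the connection, or closes it before sending a complete response (for example because a worker crashed or was killed after exceeding its own time limit), Nginx returns a 502. When the backend is merely slow and Nginx's own proxy timeout expires first, the result is a 504 Gateway Timeout instead.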
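The Nginx error log usually states which of these cases occurred. The sketch below maps a few common log fragments to a likely cause; classify_upstream_error is a hypothetical helper, the hints are rules of thumb, and the sample line is invented for demonstration.

```shell
# Sketch: map common Nginx error-log fragments to a likely cause.
# classify_upstream_error is an illustrative helper; the sample line
# below is made up and not taken from a real server.

# Usage: classify_upstream_error "LOG LINE"
classify_upstream_error() {
    case "$1" in
        *"(111: Connection refused)"*)
            echo "upstream not listening on that address/port" ;;
        *"(2: No such file or directory)"*)
            echo "upstream socket file missing or wrong path" ;;
        *"(13: Permission denied)"*)
            echo "proxy not allowed to open the upstream socket" ;;
        *"upstream prematurely closed connection"*)
            echo "upstream crashed or was killed mid-request" ;;
        *"upstream timed out"*)
            echo "upstream too slow (usually reported as 504)" ;;
        *)
            echo "unrecognized: read the full log line" ;;
    esac
}

sample='connect() failed (111: Connection refused) while connecting to upstream'
classify_upstream_error "$sample"

# Possible use against a live log:
#   sudo tail -n 50 /var/log/nginx/error.log | while IFS= read -r line; do
#       classify_upstream_error "$line"
#   done
```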

Framework-Specific 502 Troubleshooting
  • Node.js (node js 502 bad gateway, 502 bad gateway node js): The Node application (Express/NestJS) likely crashed due to an unhandled exception or ran out of memory. Check PM2 logs (pm2 logs) or systemd logs (journalctl -u my-node-app). Ensure your reverse proxy configuration points to the correct local port.
  • Django & Python (django 502 bad gateway, 502 bad gateway elastic beanstalk django): In WSGI setups (Gunicorn/uWSGI), a 502 usually means Gunicorn is not running, or it crashed while processing a heavy request. In AWS Elastic Beanstalk, check the /var/log/eb-engine.log and /var/log/web.stdout.log. Often, this is caused by a syntax error in the code causing the WSGI worker to fail on boot.
  • PHP & PHP-FPM (php bad gateway, phpmyadmin 502 bad gateway, valet 502 bad gateway): Nginx communicates with PHP via a Unix socket (e.g., /run/php/php8.1-fpm.sock) or a TCP port. A 502 means the PHP-FPM service is down, or the socket path in your Nginx config (fastcgi_pass) doesn't match the actual socket path. Restart PHP-FPM.
  • Magento 2 (502 bad gateway magento 2, magento 502 bad gateway): Magento is resource-intensive. A 502 often occurs during compilation (setup:di:compile), caching, or heavy Elasticsearch queries that cause the PHP process to time out. Increase max_execution_time and memory_limit in php.ini.
  • Monitoring Tools (librenms 502 bad gateway): LibreNMS relies on PHP-FPM and a database. If the poller consumes all resources, PHP-FPM might drop connections. Check the LibreNMS validate script (./validate.php).
  • Alternative Proxies (openresty 502, 502 bad gateway tengine): These Nginx forks operate similarly. Check their respective error logs. Often, Lua scripts in OpenResty might crash, leading to a gateway error.
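For the PHP-FPM case, the socket path on both sides can be compared mechanically. This is a sketch under assumptions: socket_mismatch is a hypothetical helper, both sample files are invented, and real configuration paths vary by distribution and PHP version.

```shell
# Sketch: compare the socket Nginx forwards to with the one PHP-FPM listens on.
# socket_mismatch and both sample files are illustrative.

# Usage: socket_mismatch NGINX_CONF FPM_POOL_CONF
# Exit status 0 means the two paths differ (a likely 502 cause).
socket_mismatch() {
    local nginx_sock fpm_sock
    # fastcgi_pass unix:/path/to.sock;  -> /path/to.sock
    nginx_sock=$(sed -n 's/^[[:space:]]*fastcgi_pass[[:space:]]*unix:\([^;]*\);.*/\1/p' "$1" | head -n 1)
    # listen = /path/to.sock            -> /path/to.sock
    fpm_sock=$(sed -n 's/^[[:space:]]*listen[[:space:]]*=[[:space:]]*\(.*\)$/\1/p' "$2" | head -n 1)
    echo "nginx: $nginx_sock"
    echo "fpm:   $fpm_sock"
    [ "$nginx_sock" != "$fpm_sock" ]
}

nginx_conf=$(mktemp)
fpm_conf=$(mktemp)
echo '    fastcgi_pass unix:/run/php/php8.1-fpm.sock;' > "$nginx_conf"
printf 'listen = /run/php/php8.2-fpm.sock\nlisten.owner = www-data\n' > "$fpm_conf"

if socket_mismatch "$nginx_conf" "$fpm_conf"; then
    echo "mismatch: fix fastcgi_pass or the pool's listen setting"
fi
```

A mismatch like the one in this sample commonly appears after a PHP version upgrade, when the pool's socket name changes but the Nginx configuration still points at the old one.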

4. Resource Exhaustion: OOM Killer and Connection Limits

When a server runs out of critical resources, the symptoms can be chaotic, ranging from dropped connections to suddenly terminated processes.

Out of Memory: linux oom and oom killer linux

The Linux kernel has an Out-Of-Memory (OOM) killer. When the system's RAM and swap are exhausted, the kernel steps in and terminates a process to save the system from a complete crash. It assigns each process a "badness" score (visible in /proc/<pid>/oom_score), based mainly on memory usage, and kills the highest-scoring one, so memory-hungry applications like Java, MySQL, or Node.js are frequent targets.

Diagnosis: If your service mysteriously restarts or you get a 502 Bad Gateway, always check if the OOM killer was invoked:

sudo dmesg -T | grep -i 'killed process'
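To see which process was killed, the PID and name can be extracted from the matching lines. The following sketch assumes the kernel's "Killed process <pid> (<name>)" wording; oom_victims is a hypothetical helper and the sample line is an imitation, not real output.

```shell
# Sketch: pull the PID and process name out of OOM-killer log lines.
# oom_victims is an illustrative helper; the sample line imitates the
# kernel's "Killed process <pid> (<name>)" wording and is not real output.

# Usage: sudo dmesg -T | oom_victims
oom_victims() {
    sed -n 's/.*[Kk]illed process \([0-9][0-9]*\) (\([^)]*\)).*/pid=\1 name=\2/p'
}

sample='[Mon Jan  1 00:00:00 2024] Out of memory: Killed process 4321 (mysqld) total-vm:2048000kB'
printf '%s\n' "$sample" | oom_victims
```

Lines that do not match the pattern are dropped, so the helper can be fed the entire kernel log.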

Resolution:

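  1. Add Swap Space: sudo fallocate -l 2G /swapfile && sudo chmod 600 /swapfile && sudo mkswap /swapfile && sudo swapon /swapfile. Add the file to /etc/fstab if the swap should persist across reboots.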
  2. Optimize application memory usage.
  3. Increase the server's physical RAM.
  4. Tune oom_score_adj to protect critical services (use with extreme caution).
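Before tuning oom_score_adj, it helps to look at the current values. This sketch reads the standard /proc files for a given PID; oom_report is a hypothetical helper, and the commented commands are examples only and are not executed.

```shell
# Sketch: show how exposed a process is to the OOM killer.
# oom_report is an illustrative helper reading the standard /proc files.

# Usage: oom_report PID
oom_report() {
    local pid=$1
    printf 'pid=%s oom_score=%s oom_score_adj=%s\n' \
        "$pid" "$(cat "/proc/$pid/oom_score")" "$(cat "/proc/$pid/oom_score_adj")"
}

oom_report "$$"    # inspect the current shell

# Lowering the adjustment protects a process (range -1000 to 1000; -1000
# exempts it entirely). Not executed here because it needs root:
#   echo -500 | sudo tee /proc/<pid>/oom_score_adj
# For systemd services, OOMScoreAdjust= in the unit file sets the same value.
```

Protecting one service shifts the risk onto everything else on the host, which is why the adjustment should be reserved for processes whose loss would be worse than losing any other.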
Database Exhaustion: 1040 too many connections

Databases have a hard limit on concurrent connections. If your application leaks connections or experiences a massive traffic spike, you will see 1040 too many connections (MySQL/MariaDB) or the PostgreSQL equivalent (FATAL: sorry, too many clients already). The same errors appear on managed databases such as Amazon RDS.

Diagnosis: Log into the database (if possible) and run SHOW PROCESSLIST; (MySQL) or check pg_stat_activity (PostgreSQL) to see where connections are coming from.

Resolution:

  1. Immediate: Restart the application servers to forcefully close dangling connections, or temporarily increase max_connections in your database parameter group/config.
  2. Long-term: Implement connection pooling. Use PgBouncer for PostgreSQL or ProxySQL for MySQL. Ensure your application framework is correctly configured to return connections to the pool after use.
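To find out which client is holding the connections, the process list can be tallied per host. The sketch below assumes tab-separated SHOW PROCESSLIST rows, for example from mysql -N -B -e 'SHOW PROCESSLIST'; connections_per_host is a hypothetical helper and the sample rows are made up.

```shell
# Sketch: count database connections per client host.
# connections_per_host is an illustrative helper. It expects tab-separated
# SHOW PROCESSLIST rows (Id, User, Host, db, ...). The sample rows are made up.

connections_per_host() {
    # The Host column looks like 10.0.0.5:41000; keep only the part before ':'.
    awk -F '\t' '{ split($3, h, ":"); n[h[1]]++ }
                 END { for (host in n) print n[host], host }' | sort -rn
}

sample=$(printf '1\tapp\t10.0.0.5:41000\tshop\tSleep\t120\t\tNULL\n2\tapp\t10.0.0.5:41002\tshop\tSleep\t95\t\tNULL\n3\tapp\t10.0.0.9:50212\tshop\tQuery\t0\tstarting\tSHOW PROCESSLIST')

printf '%s\n' "$sample" | connections_per_host
```

A single host with many connections in the Sleep state is a typical sign of an application that opens connections without returning them to its pool.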

By systematically checking logs, understanding the flow of network traffic, and monitoring system resources, DevOps teams can quickly identify the root cause of these common Linux system errors and restore service reliability.

Quick Diagnostic Script

#!/bin/bash
# Quick SRE diagnostic script for Linux servers.
# Assumes sudo rights: the kernel log, socket owners, and Nginx logs are often root-only.

echo "--- Checking for recent OOM Killer invocations ---"
sudo dmesg -T | grep -iE "out of memory|killed process" | tail -n 5

echo -e "\n--- Checking listening ports (investigate connection refused) ---"
# Matching the whitespace after the port keeps :80 from also matching :8000 or :8080.
sudo ss -tulpn | grep -E ":(22|80|443|3000|8080)[[:space:]]"

echo -e "\n--- Checking Nginx Error Logs (investigate 502 Bad Gateway) ---"
sudo tail -n 10 /var/log/nginx/error.log 2>/dev/null || echo "Nginx logs not found."

echo -e "\n--- Checking Memory Usage ---"
free -h

Error Medic Editorial

Written by our team of Senior DevOps and Site Reliability Engineers. We specialize in untangling complex Linux, cloud, and containerized infrastructure issues, providing actionable solutions for developers and sysadmins.
