Question 1

Why does my AWS Lambda function time out when connecting to RDS?

Accepted Answer

This almost always means your Lambda function can't reach the RDS instance over the network. If your RDS instance is in a VPC (which it should be), the Lambda function must also be configured with VPC access and placed in subnets that can route to the database's subnets, and the RDS security group must allow inbound traffic on the database port from the Lambda's security group. If the function also needs internet access for other calls, make sure its subnets have a route to a NAT gateway, because a VPC-attached Lambda function does not get a public IP address.
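As a rough illustration of the security group check described above, here is a minimal sketch in Python. It assumes the inbound rules are available as a list of dictionaries shaped like the `IpPermissions` field that the EC2 DescribeSecurityGroups API returns. The group ID and rule values are hypothetical, and the helper only looks at rules that reference a source security group, not CIDR-based rules.

```python
def allows_port_from_group(ip_permissions, port, source_group_id):
    """Return True if any inbound rule allows TCP `port` from `source_group_id`."""
    for rule in ip_permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols, which also implies all ports.
            port_ok = True
        elif protocol == "tcp":
            port_ok = rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)
        else:
            continue
        # Collect the security groups this rule accepts traffic from.
        sources = {pair.get("GroupId") for pair in rule.get("UserIdGroupPairs", [])}
        if port_ok and source_group_id in sources:
            return True
    return False


# Hypothetical inbound rules for the RDS security group (PostgreSQL port).
rds_inbound = [
    {
        "IpProtocol": "tcp",
        "FromPort": 5432,
        "ToPort": 5432,
        "UserIdGroupPairs": [{"GroupId": "sg-lambda-example"}],
    }
]

print(allows_port_from_group(rds_inbound, 5432, "sg-lambda-example"))  # True
print(allows_port_from_group(rds_inbound, 3306, "sg-lambda-example"))  # False
```

If the first call returns False for your real rules, the missing inbound rule is a likely cause of the timeout.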

Question 2

How do I fix S3 Access Denied errors?

Accepted Answer

S3 access is decided by evaluating IAM policies, bucket policies, and ACLs together, and an explicit deny in any of them overrides any allow. Check all three. Common causes: the IAM role lacks s3:GetObject or s3:PutObject permission on the specific bucket ARN, the bucket policy has an explicit deny, S3 Block Public Access settings are blocking the operation, or the object is encrypted with a KMS key your role can't use. Use the IAM Policy Simulator to test your permissions.
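To make the "permission on the specific bucket ARN" point concrete, here is a minimal sketch that builds an identity policy document in Python. The bucket name is a placeholder and the helper name is made up for this example. Note that object-level actions such as s3:GetObject apply to the object ARN form (`arn:aws:s3:::bucket/*`), while s3:ListBucket applies to the bucket ARN itself. The ListBucket statement is an optional addition, not something the answer above requires.

```python
import json


def build_s3_object_policy(bucket_name):
    """Build a minimal IAM policy allowing object reads and writes in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level actions target the objects, hence the trailing "/*".
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                # Optional: bucket-level listing targets the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
        ],
    }


policy = build_s3_object_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

A policy like this only covers the IAM side. A bucket policy deny, Block Public Access, or a KMS key policy can still reject the request.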

Question 3

What causes cold starts in serverless functions and how do I reduce them?

Accepted Answer

Cold starts occur when the cloud provider provisions a new execution environment for your function: it downloads your code, starts the runtime, and runs your initialization code. Reduce them by: keeping deployment packages small, initializing database connections outside the handler so warm invocations reuse them, using provisioned concurrency (AWS) or minimum instances (GCP/Azure), choosing lighter runtimes (Node.js/Python over Java/.NET), and avoiding VPC attachment unless necessary (on AWS, VPC attachment historically added significant cold-start time; the penalty has been much smaller since the 2019 Lambda networking changes).
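The "initialize outside the handler" advice can be sketched as follows. This is an illustrative pattern rather than a definitive implementation: `create_connection` is a hypothetical stand-in for an expensive database or SDK client setup, and the sketch uses lazy module-level caching, a variant of plain module-level initialization that also keeps the cost out of warm invocations.

```python
import time

# Module-level state survives between invocations in a warm execution environment.
_connection = None
_init_count = 0


def create_connection():
    """Hypothetical stand-in for an expensive client or database connection."""
    global _init_count
    _init_count += 1
    return {"created_at": time.time()}


def get_connection():
    """Create the connection on first use, then reuse it."""
    global _connection
    if _connection is None:
        _connection = create_connection()
    return _connection


def handler(event, context):
    conn = get_connection()
    return {"init_count": _init_count, "created_at": conn["created_at"]}


first = handler({}, None)
second = handler({}, None)
print(first["init_count"], second["init_count"])  # 1 1
```

Both invocations report a single initialization, which is the behavior you want from a warm environment.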

Question 4

How do I troubleshoot cloud networking issues between services?

Accepted Answer

Follow the OSI model from the bottom up: verify the services are in the same VPC (or have peering/PrivateLink), check route tables for a path between subnets, verify security groups allow traffic on the correct port and protocol, confirm NACLs aren't blocking (they're stateless, so check both inbound and outbound rules), and test DNS resolution. Use VPC Flow Logs (AWS), NSG Flow Logs (Azure), or VPC Flow Logs (GCP) to see if traffic is being accepted or rejected.
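A small probe run from the calling service can help narrow down which layer is failing. The sketch below uses only the Python standard library. The function name and result labels are made up for this example, and the interpretation in the comments is a rule of thumb rather than a guarantee.

```python
import socket


def diagnose(host, port, timeout=3.0):
    """Return a coarse label describing how far a TCP connection attempt gets."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"  # The name did not resolve: check DNS settings.
    family, socktype, proto, _, sockaddr = infos[0]
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        sock.connect(sockaddr)
    except socket.timeout:
        # Packets were likely dropped silently: security group, NACL, or route.
        return "timeout"
    except ConnectionRefusedError:
        # The host was reached, but nothing is listening on that port.
        return "refused"
    except OSError:
        return "unreachable"
    finally:
        sock.close()
    return "open"


if __name__ == "__main__":
    # Hypothetical target; replace with your service's host and port.
    print(diagnose("localhost", 5432))
```

A "timeout" result usually points at filtering or routing, while "refused" suggests the network path is fine and the problem is on the target host.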

Question 5

Why is my cloud bill higher than expected?

Accepted Answer

Unexpected costs usually come from: data transfer charges (especially cross-region or to the internet), resources running in regions you forgot about, unattached EBS volumes or static IPs, NAT gateway data processing fees, or over-provisioned instances. Use Cost Explorer (AWS), Cost Management (Azure), or Billing Reports (GCP) to identify the top cost drivers. Set up billing alerts to catch surprises early.

Question 6

How do I handle multi-region deployments and failover?

Accepted Answer

Start with DNS-based failover using Route 53 (AWS), Traffic Manager (Azure), or Cloud DNS (GCP). Replicate your data layer across regions (RDS read replicas, Cosmos DB multi-region, Cloud Spanner). Use infrastructure-as-code to ensure environments are identical across regions. Test failover regularly — an untested disaster recovery plan is not a plan. Keep in mind that cross-region data transfer has both latency and cost implications.

Symptom	Likely Cause	First Step
Access Denied / 403 Forbidden	IAM policy missing or denying the action	Check CloudTrail/Activity Log for the denied API call; review IAM policies
Function timeout	Cold start + processing exceeds timeout limit	Increase timeout setting; add provisioned concurrency; optimize init code
Cannot connect to database from Lambda/function	VPC/subnet/security group misconfiguration	Verify Lambda is in the same VPC; check security group inbound rules on the database port
Container fails health check	App not ready before health check deadline	Increase health check grace period; optimize startup time; check health endpoint
S3 Access Denied on upload/download	Bucket policy or object ACL blocking access	Review bucket policy + IAM policy; check bucket ownership and encryption settings
Rate exceeded / throttling (429/503)	Service quota or request rate limit hit	Request quota increase; implement backoff; distribute load across regions
EC2 instance unreachable	Security group or NACL blocking traffic	Check inbound rules; verify instance has public IP or is behind a load balancer
DNS resolution failure in VPC	DNS settings disabled on VPC or missing hosted zone	Enable DNS hostnames/resolution on VPC; associate private hosted zone
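
The "implement backoff" step in the throttling row above can be sketched as exponential backoff with full jitter. This is a minimal illustration: `ThrottledError` is a hypothetical exception standing in for whatever your SDK raises on a 429/503 response, and the default delays are arbitrary.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical exception representing a 429/503 throttling response."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep):
    """Retry `operation` on ThrottledError, waiting longer after each failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)


attempts = {"count": 0}


def flaky():
    """Fails twice with a throttling error, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError("rate exceeded")
    return "ok"


print(call_with_backoff(flaky, sleep=lambda seconds: None))  # ok
```

The random jitter spreads retries from many clients over time, so they do not all hit the service again at the same instant.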

Cloud Infrastructure Errors: AWS, Azure & GCP Troubleshooting Guide
