About ATS

ATS helps organizations transform the way they hire by delivering a modern, AI-driven recruitment platform. The solution streamlines the entire hiring lifecycle—from candidate sourcing and intelligent screening to interview management and onboarding—enabling faster, smarter, and more personalized hiring experiences.

Context: ATS was running a set of application services on EC2 instances. The platform was growing into microservices, but the infrastructure remained EC2-centric.

Problem Statement (Before)

ATS’s EC2 deployment had three core issues:

  • No High Availability
    • Services weren’t distributed across multiple Availability Zones.
    • A single instance/AZ failure could impact uptime.
  • Public Exposure
    • More workloads were reachable from the internet than necessary, increasing security risk.
  • Over-Provisioned Compute
    • High-spec EC2 instances were hosting small microservice workloads.
    • This left capacity idle and inflated cost.

Goals

  • Introduce multi-AZ high availability.
  • Reduce public attack surface by isolating workloads in private subnets.
  • Shift to right-sized microservices with independent scaling.
  • Achieve measurable cost optimisation (target ~25% savings).

Solution Architecture (After)

We proposed and designed a containerised microservices platform on ECS Fargate:

Core components

  • ECS Fargate Cluster to run microservices as tasks (careerpage, searchservice, bulkupload, cvparser, mailgunservice, sixsense modules, etc.).
  • Application Load Balancer (public) to handle inbound traffic routing.
  • AWS WAF in front of ALB for Layer-7 security.
  • Private subnets for ECS tasks; only ALB/WAF exposed publicly.
  • Cloud Map for service discovery between microservices.
  • CI/CD pipeline: GitHub → Build → Push to ECR → Deploy to ECS, with the pipeline authenticating to AWS via IAM OIDC.
  • Observability & Governance: CloudWatch logs/monitoring, CloudTrail auditing, IAM, Parameter Store for secrets.

Supporting services retained

  • MySQL 8 on Amazon RDS/EC2, Elasticsearch Service/EC2, S3, Route53, Amplify frontend.

Implementation Approach (What we did)

Service decomposition

  • Confirmed EC2 workloads and mapped each to an ECS task definition.

Containerization

  • Built Docker images per service and published to ECR.
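As a rough illustration of how a workload might map to a Fargate task definition that points at its ECR image, here is a minimal sketch. The account ID, region, tag, port, and CPU/memory sizes are hypothetical placeholders, not ATS values, and fields such as the execution role are omitted for brevity. The payload follows the general shape of the ECS RegisterTaskDefinition input.

```python
import json


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Build an image URI in the standard ECR registry format."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def fargate_task_definition(service: str, image: str, cpu: int, memory: int, port: int) -> dict:
    """Return a payload shaped like the ECS RegisterTaskDefinition input.

    Assumption: one container per task, named after the service.
    """
    return {
        "family": service,
        "networkMode": "awsvpc",  # Fargate tasks use awsvpc networking
        "requiresCompatibilities": ["FARGATE"],
        "cpu": str(cpu),        # task-level CPU units, passed as a string
        "memory": str(memory),  # task-level memory in MiB, passed as a string
        "containerDefinitions": [
            {
                "name": service,
                "image": image,
                "essential": True,
                "portMappings": [{"containerPort": port, "protocol": "tcp"}],
            }
        ],
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    image = ecr_image_uri("123456789012", "us-east-1", "cvparser", "1.0.0")
    print(json.dumps(fargate_task_definition("cvparser", image, 256, 512, 8080), indent=2))
```

Sizing each service separately in this way is what allows small workloads to run on small tasks rather than sharing a large instance.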

Networking hardening

  • ALB in public subnets, with AWS WAF attached to it.
  • ECS tasks in private subnets with controlled SG rules.
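One way to express "controlled SG rules" is to let the task security group accept traffic only from the ALB's security group, never from an open CIDR range. The sketch below builds a rule in the shape of an EC2 `IpPermissions` entry and adds a small check for open ranges. The security group ID and port are hypothetical, and the actual ATS rules are not documented here.

```python
def task_ingress_rule(alb_security_group_id: str, port: int) -> dict:
    """Ingress permission that admits traffic only from the ALB's security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Reference the ALB security group instead of an IP range.
        "UserIdGroupPairs": [{"GroupId": alb_security_group_id}],
    }


def is_publicly_open(permission: dict) -> bool:
    """Return True if the permission admits traffic from any IPv4 or IPv6 address."""
    open_v4 = any(r.get("CidrIp") == "0.0.0.0/0" for r in permission.get("IpRanges", []))
    open_v6 = any(r.get("CidrIpv6") == "::/0" for r in permission.get("Ipv6Ranges", []))
    return open_v4 or open_v6


if __name__ == "__main__":
    # Hypothetical security group ID and container port.
    rule = task_ingress_rule("sg-0123456789abcdef0", 8080)
    print(rule, is_publicly_open(rule))
```

A check like `is_publicly_open` could run in CI to flag any task-level rule that would re-expose a service to the internet.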

HA & scaling

  • Spread tasks across Availability Zones and set autoscaling policies.
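The source does not say which autoscaling policy type was used. As one plausible option, the sketch below builds a CPU target-tracking policy in the shape of the Application Auto Scaling PutScalingPolicy input. The cluster name, target value, and cooldowns are hypothetical.

```python
def cpu_target_tracking_policy(cluster: str, service: str, target_cpu_percent: float) -> dict:
    """Return a payload shaped like the Application Auto Scaling PutScalingPolicy input.

    Assumption: the service scales on average CPU utilization.
    """
    return {
        "PolicyName": f"{service}-cpu-target-tracking",
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu_percent,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
            "ScaleOutCooldown": 60,   # seconds; hypothetical value
            "ScaleInCooldown": 120,   # seconds; hypothetical value
        },
    }


if __name__ == "__main__":
    # "ats-cluster" is a hypothetical cluster name.
    print(cpu_target_tracking_policy("ats-cluster", "searchservice", 60.0))
```

Because each service gets its own policy, a busy service can scale out without forcing the others to grow with it.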

CI/CD rollout

  • Automated build and deploy using GitHub Actions + OIDC.
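With OIDC, the GitHub Actions workflow assumes an IAM role without long-lived access keys. The sketch below builds the kind of trust policy such a role typically carries. The account ID, repository, and branch are hypothetical, and the actual ATS policy may scope access differently.

```python
import json

# Hostname of GitHub's OIDC token issuer.
GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id: str, repo: str, branch: str) -> dict:
    """Trust policy letting one repository branch assume the role via OIDC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        # Restrict the role to a single repository and branch.
                        f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical account ID and repository name.
    print(json.dumps(github_oidc_trust_policy("123456789012", "example-org/example-repo", "main"), indent=2))
```

Pinning the `sub` claim to one repository and branch keeps other repositories and branches from assuming the deployment role.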

Monitoring baseline

  • Centralized logs and metrics in CloudWatch.
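Container logs usually reach CloudWatch through the `awslogs` log driver set on each container definition. The sketch below shows one such configuration. The `/ecs/<service>` log group naming and the region are assumptions, not the documented ATS setup.

```python
def awslogs_configuration(service: str, region: str) -> dict:
    """Return a logConfiguration block for an ECS container definition."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": f"/ecs/{service}",   # hypothetical naming convention
            "awslogs-region": region,
            "awslogs-stream-prefix": service,
        },
    }


if __name__ == "__main__":
    # Hypothetical region.
    print(awslogs_configuration("mailgunservice", "us-east-1"))
```

One log group per service makes it easier to search and set retention for each microservice separately.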

Results / Impact

Cost

  • ~25% infrastructure cost reduction
    • Achieved by retiring oversized EC2 instances and paying only for per-task usage on Fargate.
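To make the cost comparison concrete, the sketch below shows how a per-task Fargate estimate and a savings percentage could be computed. Every number in it, including the hourly rates, is a hypothetical placeholder rather than AWS pricing or ATS billing data.

```python
def fargate_monthly_cost(tasks: int, vcpu: float, memory_gb: float, hours: float,
                         vcpu_hour_rate: float, gb_hour_rate: float) -> float:
    """Estimate Fargate cost as tasks x hours x (vCPU charge + memory charge)."""
    return tasks * hours * (vcpu * vcpu_hour_rate + memory_gb * gb_hour_rate)


def savings_percent(before: float, after: float) -> float:
    """Percentage reduction from the old monthly cost to the new one."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


if __name__ == "__main__":
    # Hypothetical figures only: placeholder rates and an assumed old EC2 bill.
    new_cost = fargate_monthly_cost(tasks=4, vcpu=0.5, memory_gb=1.0, hours=730,
                                    vcpu_hour_rate=0.04, gb_hour_rate=0.004)
    print(round(new_cost, 2), round(savings_percent(100.0, new_cost), 1))
```

Real estimates would use current regional rates and measured task hours instead of these placeholders.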

Reliability

  • Multi-AZ task placement and ALB health checks improved availability.

Security

  • Public exposure reduced to ALB + WAF only.
  • Private microservices lowered the attack surface.

Operations

  • Faster, safer deployments through repeatable CI/CD.
  • Independent scaling per microservice.

Key Takeaways

  • Hosting microservices on large, monolith-style EC2 instances is expensive and fragile.
  • ECS Fargate enables right-sizing, elastic scaling, and HA without server management.
  • Security posture improves when workloads move to private subnets and only the edge is public.