Cloud Engineer
About Candidate
Introduction:
A DevOps/SRE Engineer with 9 years of experience designing, implementing, and managing large-scale infrastructure across on-premises and cloud platforms (AWS, Azure, GCP). Proficient in Kubernetes, Rancher, and OpenShift for container orchestration, with expertise in Infrastructure as Code using Terraform and GitHub Actions. Experienced in building CI/CD pipelines, automating deployments, and optimizing system performance. Skilled in database administration, including PostgreSQL, MySQL, Cassandra, and MongoDB. Developed and maintained monitoring solutions using Datadog, Prometheus, and Grafana. Led infrastructure migration projects, moving legacy applications to orchestrated cloud environments. Improved platform security by integrating a Secure SDLC (SSDLC) and applying best practices for AI and microservices deployments. Managed high-availability solutions for mission-critical applications, ensuring scalability and reliability. Strong experience in incident management, troubleshooting, and middleware optimization. Adept at collaborating with cross-functional teams to enhance service delivery and operational efficiency.
Responsibilities:
- Designed and deployed Kubernetes clusters on Azure, AWS, and GCP for AI, financial, and enterprise applications.
- Built and automated Infrastructure as Code (IaC) pipelines using Terraform, GitHub Actions, and Azure DevOps.
- Developed and maintained CI/CD pipelines for microservices and AI models, ensuring seamless deployments.
- Implemented monitoring and observability solutions using Datadog, Prometheus, Grafana, and the ELK Stack.
- Migrated applications from legacy VMs to containerized environments, improving scalability and performance.
- Managed PostgreSQL, MySQL, Cassandra, and MongoDB clusters, optimizing for large-scale workloads.
- Ensured security compliance by integrating a Secure SDLC (SSDLC), vulnerability assessments, and access controls.
- Automated cloud infrastructure provisioning, configuration management, and deployment workflows.
- Troubleshot and resolved production issues, minimizing downtime and improving system reliability.
- Worked closely with developers, security teams, and stakeholders to enhance DevOps and cloud strategies.