SRE Support Engineer: Job Description
Duties & Responsibilities:
• Support production-grade SaaS and PaaS systems through systems administration, configuration, troubleshooting, and monitoring.
• Evaluate Linux systems and make recommendations to improve security, scalability, performance, and availability.
• Manage the scalability and efficiency of the big data platform.
• Respond to platform problems and develop automated solutions to resolve them.
• Develop automated solutions for capacity planning and forecasting, system performance analysis, and system tuning.
• Review and influence the ongoing design and architecture of multiple production platforms and products.
• Must be willing to work primarily night shifts.
Qualifications & Experience:
• More than 5 years of experience handling SRE teams and Kubernetes-based systems
• Insatiable desire to learn and grow, curiosity about all things technology, development, operations, and cloud
• Hands-on experience handling large CD systems with various deployment architectures and strategies such as Blue/Green and Active/Active
• Experience with disaster recovery (DR) systems
• Hands-on experience deploying and managing infrastructure with Terraform and Helm
• Hands-on experience with configuration management tools like Ansible
• Hands-on experience with DevOps tools and practices such as Docker, Git, Jenkins, Argo CD, and GitOps
• Hands-on experience with monitoring and logging tools such as Dynatrace, Prometheus, Grafana, and ELK
• Experience with and knowledge of cloud-native architectures; ability to design highly available, resilient, multi-region systems in Azure/AWS
• Experience with and deep knowledge of Linux systems
• Strong bias for action and ownership
• Basic understanding of Istio is an added advantage
• Strong experience with observability (tracing, monitoring, and logging) using open-source tools is a must
• Basic understanding of Linux fundamentals and tools such as tcpdump, Wireshark, and kernel commands
• Certifications in Kubernetes and cloud environments are an added advantage