Hitachi Vantara, a wholly-owned subsidiary of Hitachi, Ltd., guides our customers from what's now to what's next by solving their digital challenges. Working alongside each customer, we apply our unmatched industrial and digital capabilities to their data and applications to benefit both business and society. More than 80% of the Fortune 100 trust Hitachi Vantara to help them develop new revenue streams, unlock competitive advantages, lower costs, enhance customer experiences, and deliver social and environmental value.
The Role
We are seeking a Principal SRE Engineer.
Experience: 8-12 years
Job Responsibilities:
- Design, implement, and support large-scale infrastructure with monitoring, logging, and alerting that meets promised uptime.
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity management, and launch reviews.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
- Practice sustainable incident response and blameless post-mortems.
- Be part of an on-call rotation to support production systems.
Qualifications:
- Any engineering graduate.
- Experience using Kubernetes to deploy, scale, load balance, and manage Docker containers with multiple namespaced versions.
- Must have experience with containers and/or container management systems, such as Kubernetes and Docker.
- Deep knowledge of build automation tools, such as Jenkins and Ansible, is required.
- Good knowledge of cloud deployment tools, such as Terraform.
- Strong technical background with the ability to troubleshoot issues impacting large-scale service architectures and application stacks.
- Experience leveraging cloud architecture, applying site reliability principles, and/or demonstrating sensitivity to operational concerns.
- Familiarity with large-scale cloud networking infrastructure, including network architectures, DNS, TCP/IP protocols, firewall management, routing, switching, ACLs, and SSL/TLS.
- Experience with cloud service monitoring and incident management.
We are an equal opportunity employer. All applicants will be considered for employment without attention to age, race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.