• Provide clear details of customer-facing outages to field/service delivery teams so they can communicate with customers.
• Lead post-mortem meetings so that improvements are built into the product to avoid future outages.
• Strong analytical skills to understand production system metrics, drive change, optimize system utilization and drive cost efficiency
• Production rollout of new releases
• Constantly improve the monitoring/alerting posture to proactively detect issues before customers report them.
• Build reports, dashboards, and KPIs, and lead the monthly operations review. Examples include, but are not limited to, platform/application/infrastructure KPIs, security reports, and audit reports.
• Support Application, OS and database patches/updates
• Execute security tools, analyze vulnerabilities/findings, and work with the Dev/DevOps team to remediate them
• Participate as a key stakeholder in IR (Incident Response) and DR (Disaster Recovery)
• Improve the process posture across change management, postmortems, support workflow, ticketing system, etc.
Requirements:
• Experience with distributed application monitoring and KPI building using Splunk, Prometheus, and PagerDuty is much desired
• Fundamental knowledge of cloud security is a MUST
• Possesses excellent verbal and written communication skills and the ability to interact professionally with a diverse group of developers, product owners, and subject matter experts.
• Strong cross-functional collaboration skills, relationship-building skills, and ability to achieve results without direct reporting relationships
• Has a track record as a coach, mentor, and developer of talent
• Ability to quickly identify and drive to the optimal solution when presented with a series of constraints
• Excellent judgment, analytical thinking, and problem-solving skills
• Self-motivated individual who possesses excellent time management and organizational skills
• Strong sense of personal responsibility and accountability for delivering high-quality work.
• Bachelor's or Master's degree in Computer Science, Computer or Electrical Engineering, Mathematics, or a related field.
GlobalLogic estimates the starting pay range for this role, performed remotely, to be $120,000/yr to $130,000/yr; this range reflects base salary only. This pay range is provided as a good faith estimate and the amount offered may be higher or lower. GlobalLogic takes many factors into consideration in making an offer, including candidate qualifications, work experience, operational needs, travel and onsite requirements, internal peer equity, prevailing wage, responsibilities, and other market and business considerations.
Preferences:
• In this role you will be responsible for operating and managing production and staging cloud platforms, covering Ops (executing runbooks/SOPs, maintaining uptime/SLA) as well as Site Reliability Engineering.
• You will constantly work to build and improve CI/CD pipelines, monitoring/alerting capabilities, reliability capabilities, etc.
• Maintain high uptime (99.99%) and make the improvements required to get there.
• Resolve alerts as per runbook and/or drive the resolution of the alert by collaborating with different teams and stakeholders.
• Update relevant runbooks post-resolution with relevant future resolution steps. Aim to automate runbooks as much as possible.
• You will work towards reducing the number of alert escalations to the next-level team (Dev/DevOps)
Job Responsibilities:
• Operate, manage and enhance the pre-production/staging environment. The purpose of pre-production/staging activities is to make continuous improvements that can be rolled out to production
• 4-6 years of software development experience, with the last 3-4 years in solid DevOps/SRE work building Infrastructure as Code (IaC) with a security-first approach
• Experience working in a Production environment with production change management, ticketing system escalation, emergency patching etc.
• Hands-on experience deploying applications using AWS services such as Elastic Beanstalk and CloudFormation, plus EC2/VPC/NAT/IG/Route 53, S3, Secrets Manager, IAM, ELB/NLB, Serverless/Lambda, SES, CloudFront, and Docker containers
• 2-3+ years of solid hands-on experience with Kubernetes and containers, including K8s operators/sidecar components, stateful workloads, and K8s security, is a MUST. K8s certification desired.
• Must be well versed in scripting with PowerShell and C#, and in configuring Windows environments.
• Must have solid 2-3 years Cloud experience; AWS/AWS Certification preferred.
• Must have worked across the breadth and depth of the DevOps cycle: orchestration and configuration management, CI/CD, monitoring, security
• Hands-on experience with a few of these: Oracle (working experience required); good to have: Kafka, Aurora RDS, Cassandra, DynamoDB, Neo4j, S3.
• Microsoft Windows experience is a MUST, Linux and security preferred
• Automation-first approach; must have a proven track record of automating large-scale, complex distributed systems.
What We Offer
Exciting Projects:
Come take your place at the forefront of digital transformation! With clients across all industries and sectors, we offer an opportunity to work on market-defining products using the latest technologies.
Collaborative Environment:
You can expand your skills by collaborating with a diverse team of highly talented people in an open, laid-back environment - or even abroad in one of our global centers or client facilities!
Work-Life Balance:
GlobalLogic prioritizes work-life balance, which is why we offer flexible work schedules and opportunities to work from home.
Professional Development:
We provide continuing education classes, professional certification and training (technical, soft skills, language, and communication skills) to help you realize your professional goals. Being part of a global organization, there are additional learning opportunities through international knowledge exchanges.
Excellent Benefits:
We provide our employees with competitive salaries, health and life insurance, short-term and long-term disability insurance, a matched contribution 401K plan, flexible spending accounts, and PTO and holidays.
About GlobalLogic
GlobalLogic is a leader in digital engineering. We help brands across the globe design and build innovative products, platforms, and digital experiences for the modern world.
By integrating experience design, complex engineering, and data expertise, we help our clients imagine what's possible and accelerate their transition into tomorrow's digital businesses.
Headquartered in Silicon Valley, GlobalLogic operates design studios and engineering centers around the world, extending our deep expertise to customers in the automotive, communications, financial services, healthcare and life sciences, manufacturing, media and entertainment, semiconductor, and technology industries.
GlobalLogic is a Hitachi Group Company operating under Hitachi, Ltd. (TSE: 6501) which contributes to a sustainable society with a higher quality of life by driving innovation through data and technology as the Social Innovation Business.