DevOps Engineer #609236 IND- MH- Pune

Job Description

About This Role:

We are hiring a hands-on DevOps Engineer to manage and support production-grade cloud infrastructure for Kibo’s commerce platform. This role focuses on Kubernetes (EKS), Terraform, and real-time production troubleshooting in a 24/7 on-call environment.

ABOUT KIBO

KIBO is a composable digital commerce platform for B2C, D2C, and B2B organizations who want to simplify the complexity in their businesses and deliver modern customer experiences. KIBO is the only modular, modern commerce platform that supports experiences spanning B2B and B2C Commerce, Order Management, and Subscriptions. Companies like Ace Hardware, Zwilling, Jelly Belly, Nivel, and Honey Birdette trust Kibo to bring simplicity and sophistication to commerce operations and deliver experiences that drive value.

KIBO's cutting-edge solution is MACH Alliance Certified and has been recognized by Forrester, Gartner, IDC, Internet Retailer, and TrustRadius. KIBO has been named a leader in The Forrester Wave™: Order Management Systems, Q1 2025 and in the IDC MarketScape report “Worldwide Enterprise Headless Digital Commerce Applications 2024 Vendor Assessment”.

By joining KIBO, you will be part of a team of Kibonauts all over the world in a remote-friendly environment. Whether your job is to build, sell, or support KIBO’s commerce solutions, we tackle challenges together with the approach of trust, growth mindset, and customer obsession. If you’re seeking a unique challenge with amazing growth potential, then come work with us!

WHAT YOU’LL DO

Manage and operate production-grade Kubernetes clusters (EKS preferred), ensuring high availability and scalability
Troubleshoot real-time production issues across distributed systems and microservices
Diagnose and resolve issues such as:
- Pod failures (CrashLoopBackOff, Pending, OOMKilled)
- Node failures, autoscaling, and resource constraints
- Networking, ingress, and service connectivity issues
Build, maintain, and debug infrastructure using Terraform (modules, remote state, locking, drift handling)
Implement and enhance monitoring & alerting systems using Prometheus, Grafana, and related tools
Perform root cause analysis (RCA) for incidents and drive permanent fixes to improve system reliability
Participate in a 24/7 on-call rotation, owning incidents and resolving them independently
Collaborate with engineering teams to improve system performance, resilience, and deployment processes
Automate deployments, infrastructure provisioning, and operational workflows to reduce manual effort
Ensure adherence to security best practices across infrastructure and deployments

Skills & Requirements

WHAT YOU’LL NEED

8+ years of experience as a DevOps Engineer, owning and operating production Kubernetes clusters (EKS preferred), including cluster health, scaling, and availability
Troubleshoot real-time production issues independently across microservices and distributed systems
Debug and resolve critical issues such as:
- Pods stuck in CrashLoopBackOff, Pending, or OOMKilled states
- Node failures, node pressure, autoscaling issues
- Service connectivity, ingress, and networking issues
Investigate and fix cluster-level issues including scheduling, resource constraints, and misconfigurations
Build and maintain infrastructure using Terraform, including:
- Writing and modifying modules
- Managing remote state and locking
- Handling drift and failed deployments
- Designing and implementing reusable Terraform modules for scalable infrastructure
- Troubleshooting and resolving Terraform apply failures and infrastructure inconsistencies in production
Monitor system health using Prometheus, Grafana, and logging tools, and proactively identify issues
Perform root cause analysis (RCA) for production incidents and implement long-term fixes
Handle on-call incidents (24/7 rotation) and take full ownership until resolution
Work closely with development teams to improve system reliability, performance, and scalability
Automate operational tasks and improve deployment and infrastructure processes
Ensure security best practices across infrastructure, networking, and access controls

KIBO PERKS

Flexible schedule and hybrid work setting
Paid company holidays and global volunteer holiday
Generous health, wellness, benefits, and time away programs
Commitment to individual growth and development and opportunity for internal mobility
Passionate, high-achieving teammates excited to help you succeed and learn
Company-sponsored events and other activities

At Kibo, we celebrate and support all differences. Kibo is proud to be an equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, or veteran status.
