JAIN UTKARSH

Site Reliability Engineer

Summary

Enthusiastic Site Reliability Engineer eager to contribute to team success through learning and implementation. With more than 5 years of experience on automation projects across different fields of engineering, I have developed a strong belief in community-based projects, continuous learning, and attention to detail. Motivated to learn, grow, and excel.

Overview

8 years of professional experience
5 years of post-secondary education

Work History

Site Reliability Engineer

Grasshopper Asia Pte Ltd
04.2024 - Current
  • Developed and deployed a Prometheus exporter for monitoring Puppet runs and Puppet control nodes (Puppet Server, PuppetDB).
  • Deployed monitoring and alerting for the health of the ELK cluster running in GKE, covering not only Elasticsearch nodes and indices but also the various Beats and Logstash instances.
  • Continuously reduced a large tech-debt backlog, including old Terraform modules, the CentOS to AlmaLinux migration, GCP IAM permissions, and Puppet modules.

Senior Site Reliability Engineer

Vortexa
08.2022 - 04.2024
  • Created and deployed a highly scalable solution to implement and enforce different rate limits for different sets of clients.
  • Created a scalable solution using StackGres and Traefik on Kubernetes so that applications can use Postgres on Kubernetes for every branch.
  • Completed a set of comprehensive PoCs with Crossplane, the Terraform operator for Kubernetes, and ACK to implement server-side tooling to deploy cloud infrastructure with simpler templates.
  • Upgraded all Terraform modules and states from 0.12 to 1.3.5.
  • Tech debt: Upgraded multiple tools and technologies, including EKS, Atlantis, Helm, Terragrunt, StackGres, and Terraform.
  • On call: Handled on-call support single-handedly for Asian time zones.

Site Reliability Engineer

Grab (GXS Bank)
07.2021 - 08.2022
  • Designed and deployed infrastructure in AWS (Production and Staging) to create an IKEv2 tunnel between GXS Bank and BCS (Banking Computer Systems) using Check Point firewalls in the cloud and Direct Connect links.
  • Designed and deployed infrastructure for Timescale and Aerospike databases for risk assessment data applications.
  • Created an automated process for EKS upgrades of control plane and worker nodes.
  • Created (with team) Staging, QA, and Production environments from scratch.
  • Successfully migrated (with team) CD from Jenkins to ArgoCD for much simpler and more developer-friendly infrastructure deployment.
  • Contributed to building a bank from scratch.

System Design Engineer

SingTel
08.2019 - 07.2021
  • Led design and implementation of automation (Terraform, Ansible, Python), log analysis (Elastic Stack), and monitoring tools (TICK).
  • Designed, developed, and deployed a health monitoring stack for all network devices, DNS, DHCP, RHEL servers, and appliances in PNE using Telegraf, InfluxDB, Grafana, and Ansible.
  • Developed a tool for CIS (Center for Internet Security) benchmark audits of Cisco, Juniper, and Red Hat nodes using Ansible.
  • Designed and deployed a DNS logging stack to support 1.2 million queries per second (QPS) across IPNE.

Assistant Manager

Bharti Airtel
06.2017 - 06.2018
  • Developed a Python-based tool for planning 13,000 LTE-TDD sites.
  • Designed a real-time KPI analysis system for mass events using real-time data acquisition through a RESTful API.
  • Reduced PCI/RSI collisions from more than 23% of cells to 0% through effective use of spatial analysis tools.

Senior Executive

Bharti Airtel
06.2016 - 06.2017
  • Developed an MS Excel and VBA-based tool for end-to-end analysis of "Worst Performing Cells in 3G Network".
  • Developed a Python-based tool for KPI analysis of radio networks for corporate customers and highways.
  • Analyzed and optimized VRRP timers for EPC edge routers.

Education

MSc - Communication Engineering

Nanyang Technological University
08.2018 - 05.2019

B. Tech - Electronics and Communication

Motilal Nehru National Institute Of Technology
07.2012 - 05.2016

Skills

Operating Systems: Linux

Version Control Systems: GitHub, GitLab

Cloud: AWS, GCP

Technologies, Tools and Protocols: Kubernetes, Elasticsearch, Terraform, Istio, Puppet, Crossplane, Kafka, ArgoCD, Ansible, DNS, Flask, Grafana, New Relic

Languages: Golang, Python

Projects

  • Design, implementation and hosting of LitTeach Android App: Designed and developed the backend of LitTeach, a REST-based school management and video classroom Android app. Designed and hosted the solution on GCP.
  • Reporting plugin for OSS Grafana: Developed a Go-based plugin to generate reports and alert reports from Grafana dashboards.
  • HA service for OSS Telegraf: Developed a Python script to ensure HA between distributed open-source Telegraf instances.
