
Parthiban Murugan

Senior Software Engineer
Singapore

Summary

I have 12 years of experience in the IT industry, working with backend technologies such as Scala, Python, Java, Spring Boot/Batch, SQL, PL/SQL, the Hadoop ecosystem, Apache Spark, Kafka, AWS, GCP and Azure.

For the past 7 years I have worked in data engineering, on both an on-premise data lake (Hortonworks) and Azure Data Lake. I worked on migrating the on-premise data lake to Azure, using Spring Batch and the Azure APIs to feed data into Azure Gen2 Storage and MinIO, and created Spark jobs for normalizing and transforming the data in the lake. I also created multiple Avro models for data modeling and transformation.

I have experience handling large volumes of data in a data lake, covering data collection, cleansing and storage in an optimized layout on-premise.

I built data pipelines to load data from external systems into the data lake, and batch processes that cleanse and aggregate data for the data marts, using Apache Spark, Hive, Impala, Snowflake and BigQuery.
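As an illustration of the cleanse-and-aggregate batch step described above, here is a minimal plain-Python sketch; the field names are hypothetical and the production jobs ran on Spark SQL, not stdlib Python:

```python
from collections import defaultdict

def cleanse(rows):
    """Drop records with a missing join key and normalise fields."""
    for row in rows:
        if not row.get("account_id"):
            continue  # reject records without a join key
        yield {
            "account_id": row["account_id"].strip(),
            "amount": float(row.get("amount", 0)),
        }

def aggregate(rows):
    """Sum amounts per account, a data-mart style rollup."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["account_id"]] += row["amount"]
    return dict(totals)

raw = [
    {"account_id": " A1 ", "amount": "10.5"},
    {"account_id": "A1", "amount": "4.5"},
    {"account_id": "", "amount": "99"},  # dropped: no key
    {"account_id": "B2", "amount": "7"},
]
print(aggregate(cleanse(raw)))  # {'A1': 15.0, 'B2': 7.0}
```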

I worked on performance tuning of SQL in batch processing.

I have experience building CI/CD pipelines with Jenkins and Kubernetes, and have used Kubernetes to deploy services and applications.


I have experience leading and mentoring a team of 5.

Overview

12
years of professional experience
7
years of post-secondary education
1
Certification

Work History

Senior Data Engineer

DBS Bank (Contract)
06.2023 - Current

Technologies Used:

  • Scala/Python
  • Apache Spark/PySpark
  • Java (Spring Boot microservices)
  • Apache Kafka
  • Hive & Impala
  • MariaDB
  • Airflow/TWS Scheduler
  • Shell Scripting (UNIX)
  • Azure Databricks

Cloud Stack:

  • AWS
  • Azure
  • EMR Cluster

On-Premise HDP:

  • Cloudera (HDP) Framework
  • 48 Node Cluster
  • Storage Formats (Parquet, Avro)


Project Details:

  • Develop and maintain data pipelines using Scala Spark, PySpark and Spark SQL, following a TDD approach
  • Build and maintain Spring Boot microservices that expose APIs for users to consume compute and config data
  • Build and enhance various regulatory reports, load the data into Hive external tables using Scala and PySpark jobs on an ad-hoc and scheduled basis, and expose the data to end users through Impala views built on top of the Hive external tables
  • Build microservices using Spring Boot that serve HDFS data to business users
  • Work on production issues, performance issues in existing modules, and tech-debt activities
  • Perform housekeeping activities such as purging, identifying and cleaning up small files in HDFS partitions, and data-download activities in PROD and UAT
  • Collaborate with cross-functional teams to gather requirements, design scalable solutions delivered with high quality, and document the requirements in JIRA
  • Write unit tests and maintain code coverage to improve code quality and system resilience: Mockito for Scala, pytest for Python
  • Build CI/CD pipelines using Jenkins and Kubernetes
  • Perform release activities adhering to the process, including JIRA documentation, user sign-off, and regression and performance testing
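One part of the housekeeping work above, identifying HDFS partitions dominated by small files, can be sketched in plain Python. The threshold and directory layout are illustrative; the real activity ran against HDFS, not a local filesystem:

```python
import os

# Files well below the HDFS block size fragment reads; 128 MB is a common block size.
SMALL_FILE_BYTES = 128 * 1024 * 1024

def small_file_partitions(root):
    """Return partition dirs where every data file is below the threshold,
    i.e. candidates for compaction into fewer, larger files."""
    candidates = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden/metadata files such as _SUCCESS markers.
        data_files = [f for f in filenames if not f.startswith((".", "_"))]
        if not data_files:
            continue
        sizes = [os.path.getsize(os.path.join(dirpath, f)) for f in data_files]
        if max(sizes) < SMALL_FILE_BYTES:
            candidates.append((dirpath, len(data_files), sum(sizes)))
    return candidates
```

A compaction job would then rewrite each flagged partition into fewer files (for example via a Spark `coalesce` and overwrite).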

Technical Lead

CirclesLife
12.2022 - 06.2023

Technologies Used:

  • Python/Scala/Java
  • PySpark/Spark
  • Kafka
  • MongoDB/MariaDB/Oracle
  • Kubernetes
  • AWS & GCP

Project Details:

  • Developed and maintained data pipelines using Apache Airflow and batch jobs across multi-cloud storage (AWS, GCP and MinIO), since each region ran on a different platform
  • Developed data pipelines from sources such as MySQL, MongoDB, Oracle and MariaDB into the data lake using Sqoop, Python and Apache Spark
  • Experience with cloud data-lake platforms such as Snowflake and BigQuery
  • Built Spark jobs for aggregation and data enrichment using PySpark and Scala
  • Performance-tuned long-running Spark jobs and improved the existing Spark pipelines
  • Worked with the BI team to help create reports and dashboards using Metabase
  • Created and scheduled Spark jobs on Dataproc and EMR clusters, and set up Spark on AKS against the MinIO storage
  • Exposure to cloud SDKs such as boto3 and the GCP client libraries
  • Hands-on experience with Spark on Kubernetes: created the Kubernetes resources to submit Spark applications, and migrated Spark applications running on EMR and Dataproc
  • Hands-on development in Python and Scala: Python for automation and ad-hoc reports, Scala for Spark batch processing
  • Hands-on experience creating and managing data in Delta Lake with Spark and Trino
  • Experience with streaming applications such as Apache Kafka and clickstream processing
  • Technologies used: Python, Scala, Apache Spark, Trino, Snowflake, BigQuery, MongoDB, MySQL, Apache Kafka
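The per-region multi-cloud storage routing mentioned above can be sketched as a simple lookup. The region codes and bucket names are hypothetical; the real pipelines used boto3 and the GCP client libraries against the resolved backend:

```python
# Hypothetical region -> object-store mapping; each market ran on a different cloud.
STORAGE_BACKENDS = {
    "sg": {"provider": "aws",   "bucket": "s3://datalake-sg"},
    "au": {"provider": "gcp",   "bucket": "gs://datalake-au"},
    "id": {"provider": "minio", "bucket": "s3://datalake-id"},
}

def resolve_backend(region):
    """Pick the object store a pipeline should write to for a given region."""
    try:
        return STORAGE_BACKENDS[region]
    except KeyError:
        raise ValueError(f"no storage backend configured for region {region!r}")

print(resolve_backend("au")["provider"])  # gcp
```

Keeping this mapping in one place lets a pipeline stay cloud-agnostic: only the thin client layer behind `resolve_backend` differs per provider.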

Senior BigData Engineer

Société Générale
06.2019 - 06.2022

Technologies Used:

  • Java/Scala/Python
  • Apache Spark
  • HBase, Hive, Oracle, Sqoop
  • Apache Kafka
  • Azure

Project Details:

  • Developed and maintained the Hadoop & ETL data warehouse on the on-premise Hortonworks platform, using Sqoop, the HDFS API, Hive, HBase, Flume, Kafka, Oozie and Spark to build the pipelines that migrate data into the data lake
  • Worked on migrating the on-premise data lake to the Azure cloud platform, using Spring Batch and the Azure APIs to feed data into the RAW layer, with Azure Gen2 Storage, Azure HDInsight and AKS resources
  • Built pipelines in Azure for feeding and consuming data from the data lake
  • Built pipelines to ingest data into the RAW layer using Sqoop, Flume and Kafka, and created external tables in Hive for data visualization
  • Modeled the Avro schemas used to store and normalize data in RAW and LAKE, and worked with Hadoop file formats such as Avro, Parquet, ORC and CSV
  • Worked on the mapping engine that transforms the RAW model into the LAKE model via Spark transformations driven by the Avro schemas
  • Created Oozie workflows for scheduling the pipelines
  • Created Spark jobs for report generation, data normalization and transformation, built with complex Hive queries and Spark UDFs
  • Used Kafka and Flume to integrate real-time trade data into the data lake
  • Worked on Apache NiFi for scheduling pipelines in the data warehouse
  • Worked on Airflow to schedule jobs in Azure
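The RAW-to-LAKE mapping engine above can be sketched in plain Python, with dicts standing in for Avro records; the field names and transforms are illustrative, and the real engine ran as schema-driven Spark transformations:

```python
# Mapping spec: lake_field -> (raw_field, transform). Names are illustrative.
MAPPING = {
    "trade_id": ("TRD_ID",   str.strip),
    "currency": ("CCY",      str.upper),
    "notional": ("NOTIONAL", float),
}

def raw_to_lake(raw_record):
    """Apply the mapping spec to one RAW record, producing a LAKE record."""
    lake = {}
    for lake_field, (raw_field, transform) in MAPPING.items():
        lake[lake_field] = transform(raw_record[raw_field])
    return lake

raw = {"TRD_ID": " T-001 ", "CCY": "usd", "NOTIONAL": "1000000"}
print(raw_to_lake(raw))
# {'trade_id': 'T-001', 'currency': 'USD', 'notional': 1000000.0}
```

Driving the transformation from a declarative spec rather than hand-written code is what lets new Avro models be onboarded without changing the engine itself.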

Software Developer

Great West Global
04.2016 - 12.2018
  • Worked on a multi-technology project spanning SQL, PL/SQL, shell scripting, Perl, Pro*C, Java, Spring Boot, Spring Batch, Apache Spark and the Hadoop ecosystem
  • Migrated Pro*C modules to a Spring Boot, JPA and Spring Batch project for batch and transactional processing
  • Created microservices using Spring Boot
  • Created packages, procedures and functions in PL/SQL, using collections and advanced PL/SQL features to build an in-house transactional processing engine
  • Worked on the Hadoop ecosystem to push data into the data lake
  • Used Sqoop to migrate structured data into the data lake and normalized the data using Apache Spark
  • Created Spark UDFs and Oozie workflows for the data pipelines
  • Used shell scripting and Perl for automating deployments and generating business reports
  • Worked on pipelined functions, triggers, analytical queries and complex SQL for generating business reports on an ad-hoc basis
  • Performance-tuned complex, long-running queries
  • Performed unit testing with JUnit and Mockito for the Java-based projects

Senior Software Engineer

Midtree Ltd
12.2014 - 04.2016
  • Migrated database applications from Oracle 11g to 12c
  • Created stored procedures, functions and packages in PL/SQL, using advanced features such as dynamic SQL, collections and exception handling
  • Wrote complex queries for report generation and data validation during the migration
  • Used SQL*Loader to load flat files and created external tables to read data from flat files
  • Created views and materialized views to expose data to downstream applications
  • Used shell scripting to automate processes
  • Took part in data modeling and normalization of data

Project Engineer

Wipro Technologies
03.2012 - 03.2014
  • Worked on the Cost Management, Purchasing & Inventory modules
  • Worked on quoting and rolling up approved vendor costs for invoicing and costing material transactions
  • Closed and opened the inventory periods and opened the purchasing period every month
  • Wrote complex analytical SQL queries for data analysis and business reporting
  • Involved in project release activities
  • Monitored alert logs and scheduled jobs
  • Engaged the appropriate teams on major-priority issues and handled client-escalated issues
  • Interacted with customers and resolved incidents within the agreed SLAs
  • Took responsibility for guiding teams on major issues
  • Performed RCA for business cases during the production window

Education

BE - Mechatronics

Kongu Engineering College
Erode,India
06.2008 - 04.2012

HSC -

Sri Vinayaga Hr Sec School
01.2006 - 03.2008

SSLC -

Concord Matric Hr Secondary School
01.2005 - 03.2006

Skills

Core Java

Apache Spark

Hive

Spring Boot/Batch

Apache Oozie

Scala

Hadoop EcoSystem

Azure & GCP & AWS

Apache Nifi

Kafka

SQL

Shell Scripting

Python

PySpark

Apache Airflow

MongoDB

Risk analysis

Languages

Tamil, English, Telugu

Certification

Databricks Certified Associate Developer for Apache Spark 3.0

Timeline

Databricks Certified Associate Developer for Apache Spark 3.0

01.2023

Technical Lead

CirclesLife
12.2022 - 06.2023

Senior BigData Engineer

Société Générale
06.2019 - 06.2022

Software Developer

Great West Global
04.2016 - 12.2018

Senior Software Engineer

Midtree Ltd
12.2014 - 04.2016

Project Engineer

Wipro Technologies
03.2012 - 03.2014

BE - Mechatronics

Kongu Engineering College
06.2008 - 04.2012

HSC -

Sri Vinayaga Hr Sec School
01.2006 - 03.2008

SSLC -

Concord Matric Hr Secondary School
01.2005 - 03.2006

Senior Data Engineer

DBS Bank (Contract)
06.2023 - Current