I have 12 years of experience in the IT industry, working with backend technologies including Scala, Python, Java, Spring Boot/Batch, SQL, PL/SQL, the Hadoop ecosystem, Apache Spark, Kafka, AWS, GCP, and Azure.
For the past 7 years I have worked in data engineering, on both an on-premise data lake (Hortonworks) and Azure Data Lake. I worked on migrating the on-premise data lake to Azure Data Lake, where we used Spring Batch and the Azure API to feed data into Azure Gen2 Storage and MinIO, and created Spark jobs to normalize and transform the data in the data lake. I also created multiple Avro models for data modeling and transformation.
I have experience handling large volumes of data in the on-premise data lake, including collecting, cleansing, and storing the data in an optimized way.
Built data pipelines to load data from external systems into the data lake, and worked on batch processes that cleanse and aggregate data for use in data marts, using Apache Spark, Hive, Impala, Snowflake, and BigQuery.
Worked on performance tuning of SQL in batch processing.
I have experience building CI/CD pipelines using Jenkins and Kubernetes, and have used Kubernetes to deploy services and applications.
I have experience leading and mentoring a team of 5.
Technologies Used:
Apache Spark
Hive
Spring Boot/Batch
Apache Oozie
Scala
Hadoop Ecosystem
Azure & GCP & AWS
Apache NiFi
Kafka
SQL
Shell Scripting
Python
PySpark
Apache Airflow
MongoDB
Risk analysis
Databricks Certified Associate Developer for Apache Spark 3.0