RAJARAM SAHU

Summary

Highly skilled Senior Data Engineer with 9 years of experience in designing and implementing data solutions. Proficient in AWS and Azure cloud platforms, specializing in building scalable and efficient data pipelines. Strong expertise in data processing frameworks like Spark and data integration tools like Apache Kafka. Proven track record of delivering end-to-end data solutions that drive business insights.

Overview

11 years of professional experience

Work History

Senior Data Engineer

HMI Group
06.2023 - Current

Led the design and implementation of end-to-end data engineering solutions on Azure, utilizing services such as Azure Data Factory, Azure Synapse Analytics, and Azure Logic Apps.

Implemented data integration solutions, enabling seamless connectivity between on-premises and cloud-based systems, ensuring data consistency and integrity.

Designed and deployed complex data integration workflows using Azure Logic Apps and Azure Functions, automating critical business processes and ensuring data consistency across platforms.

Spearheaded the development of interactive and visually compelling Power BI dashboards, providing actionable insights to business users and stakeholders.

DATA ENGINEER

U3 Infotech
04.2022 - 06.2023
  • Designed and implemented end-to-end data solutions on AWS and Azure platforms, enabling efficient data processing and analysis for various business units.
  • Processed and analyzed 10 terabytes of data using AWS EC2 instances and EMR clusters, providing timely insights to stakeholders.
  • Developed and maintained data pipelines using Spark, handling the ingestion, transformation, and loading of petabytes of data into data lakes and data warehouses.
  • Implemented data governance and data quality frameworks, ensuring high-quality and reliable data for decision-making.
  • Orchestrated complex workflows using Apache Airflow, automating data pipelines and improving operational efficiency.
  • Leveraged AWS Lambda functions for serverless data processing, reducing costs and improving scalability.
  • Maintained data pipeline uptime of 99.8% while ingesting streaming and transactional data across 8 primary data sources using Spark, Kafka, S3, and Python.
  • Led the successful migration of a legacy data processing system from SAS to PySpark, significantly improving data processing efficiency and scalability.

DATA ENGINEER

Mindteck
05.2021 - 02.2022
  • Contributed to the design and implementation of a real-time data ingestion and streaming analytics platform using Apache Kafka.
  • Processed and analyzed 10 gigabytes of streaming data daily, enabling real-time insights and decision-making.
  • Integrated the platform with Azure Data Lake Storage (ADLS) Gen2, facilitating efficient storage and retrieval of terabytes of data.
  • Automated ETL processes across billions of rows of data, which reduced manual workload by 29% monthly.

AZURE DATA ENGINEER

Mindtree
10.2018 - 04.2021
  • Coordinated with business customers to gather business requirements and worked with technical peers to derive technical requirements.
  • Used Spark to create structured data from large amounts of unstructured data from various sources.
  • Automated ETL processes across billions of rows of data, which reduced manual workload by 29% monthly.
  • Ingested data from disparate data sources using a combination of SQL, the Google Analytics API, and Azure.
  • Used Spark SQL for data extraction, transformation, and aggregation across multiple file formats to uncover insights into customer usage patterns.
  • Sourced, processed, validated, transformed, aggregated, and distributed data from 50+ sources.

SENIOR SOFTWARE ENGINEER

HCL Technologies
08.2013 - 11.2017
  • Built a big data solution on the Azure cloud to ingest and write data from and to Azure Blob Storage and Delta, keeping the cost of data storage to a minimum.
  • Designed the Hadoop platform for high performance and low cost compared with existing in-house data warehousing systems.
  • Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse.
  • Utilized Spark in Python to distribute data processing on large streaming datasets, improving ingestion and processing speed of that data by 67%.
  • Built a basic ETL pipeline that ingested transactional and event data from a web app with 12,000 daily active users, saving over $85,000 annually in external vendor costs.

Education

MBA

SIKKIM MANIPAL UNIVERSITY

B.Sc - Mathematics

UTKAL UNIVERSITY D.R. NAYAPALLI COLLEGE
India

Skills

AWS (EMR, S3, EC2)

Azure

Azure Databricks

Azure Data Factory

SQL

Postgres

Spark, Kafka

Airflow

Hive

Snowflake
