Passionate and dedicated developer with 6+ years' experience in the data-driven industry. Expertise in optimizing ingestion and computation frameworks for enhanced operational efficiency and actionable insights. Strong analytical skills, excellent problem-solving abilities, and a deep understanding of database technologies and systems. Demonstrated success in AI/ML implementation for data center prediction and pattern matching, identifying relationships, and building solutions to business problems.
Overview
9 years of professional experience
Work History
Lead Data Engineer
Keppel DC&N Singapore
11.2023 - Current
Led the implementation of AI/ML solutions across multiple Keppel data centers, enhancing operational efficiency and predictive maintenance capabilities.
Automated routine tasks such as monitoring, alerting, and reporting, reducing manual intervention.
Deployed IoT edge devices equipped with diverse sensor modules, connecting to a central server IP for data transmission and monitoring. Established communication protocols for real-time data reads, enabling seamless integration with the server infrastructure. Implemented robust monitoring mechanisms to track sensor data streams, ensuring reliability and responsiveness in IoT ecosystem operations.
Managed Spark cluster to process massive data files using Scala, implementing efficient algorithms for data processing and analysis. Leveraged Spark DataFrame for reading and transforming data, optimizing performance through caching and partitioning strategies.
Implemented sharding and partitioning strategies in PostgreSQL Citus DB to optimize query processing and divide large files for batch and streaming data processing. Leveraged Citus's distributed architecture to shard data across multiple nodes, enabling parallel processing and efficient query execution for improved scalability and performance.
Developed and implemented a Grafana dashboard to visualize assets, ensuring real-time monitoring with data updates every 5 seconds.
Utilized machine learning libraries to build a prediction model for deriving insights and detecting failures.
File Formats: Avro, Parquet, blob
IoT Hub, Event Hubs, IoT Edge modules
Data Engineer
NCS - Client: DBS Singapore
10.2020 - 10.2023
Created, built, and maintained the data infrastructure required for ingestion, computation, extraction, transformation, and loading of data from a wide variety of sources.
Handled batch processing and real-time data management, data governance, and end-to-end analysis for control tower and dashboards (solution design).
Development experience using Apache Kafka, Python, Spark, Hadoop, and HDFS.
Proficient in tools such as Jenkins pipeline, sparkola, Theia, Jira, Bitbucket, Postman API, Superset, Inbuilt AI Platforms for streaming data monitoring
Strong understanding of SQL and MariaDB; integrated different source systems using ELK and Kafka topics to achieve fast search responses.
Performed technical analysis, solution design, development, unit testing, code review, architecture, and documentation; engaged with QA to prepare infrastructure for integration and load testing activities.
Developed the solution for SIT and UAT testing activities and demonstration sessions for business stakeholders.
Utilized Python libraries such as Pandas, NumPy, and Requests for data manipulation and API integration.
Leveraged PySpark's distributed processing capabilities to handle large volumes of data efficiently.
Employed SQL databases (e.g., MySQL) for data storage and retrieval.
Implemented scheduling tools like Apache Airflow to automate data pulling processes.
Designed and developed scalable Python scripts and PySpark jobs for automated data extraction from various sources.
Implemented data cleaning and transformation techniques to ensure data quality and consistency.
Collaborated with cross-functional teams to understand data requirements and optimize data pulling processes.
Created and maintained data pipelines, incorporating error handling, logging, and monitoring mechanisms.
Worked closely with stakeholders to identify and address performance bottlenecks, resulting in a significant reduction in data pulling time.
Data Analyst
NIF
01.2017 - 04.2019
Data modeling, data cleaning, data wrangling and data enrichment skills: establishment of new data processing procedures.
Quality assurance, validation and data linkage: ensured that data and models were managed and documented according to quality standards and procedures.
Designed and executed 52 completely bespoke tests of models and software using various libraries for unique traditional medicinal plants.
Continuously improved data preparation process through scripting and automation.
Bioinformatics Analyst
BioInnovations
09.2015 - 11.2016
Researched and adopted new technologies to add value to existing offerings.
Assessed data modeling and statistics to integrate high-level business processes with data rules.
Devised and implemented processes and procedures to streamline operations.
Leveraged big data technologies to manage large datasets efficiently while maintaining high levels of performance.
Developed new analytical models that improved forecasting accuracy and reduced risk exposure.
Increased efficiency by streamlining data analysis processes and implementing automation tools.
Statistical Methods: Regression/Classification analysis, Time Series, RNN, LSTM
Real-time Analytics, Data Engineering & Visualisation
Worked on: User Acceptance Testing (UAT), SIT, Staging, and PROD environments
MLOps & Big Data
Projects
Project Title: AI/ML-Based Time Series Analysis for Data Center Prediction and Pattern Matching
Keppel DC&N division
Developed and deployed Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) models to perform time series analysis for predicting data center performance and identifying operational patterns.
Implemented batch and streaming data processing pipelines to handle data from both staging and production environments, ensuring real-time and accurate model predictions.
Built and maintained Apache Beam pipelines to process and transform data post-AI/ML analysis, enabling efficient data flow and integration into downstream systems.
Utilized Python and key ML libraries (TensorFlow, Keras) for model development and training on extensive datasets.
Integrated ML models into data center operations for proactive maintenance, leading to a 25% reduction in downtime and a 20% increase in efficiency.
Conducted thorough testing and validation of models, achieving high accuracy and robustness in predictions.
Presented findings and insights to stakeholders, demonstrating the value of ML-driven predictions in improving data center operations.
Education
Master of Technology, Banasthali Vidyapith