Feb’2022 – Present – as Senior Data/Data Warehouse Engineer
- Built real-time event and batch ingestion for different missions across Telstra.
- Used ETL and ELT processes to transform and cleanse data, improving data quality.
- Ingested data into Azure services (Azure Data Lake, Azure Synapse) and processed it in Azure Databricks.
- Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from sources such as Azure SQL, Blob Storage, and flat files.
- Flattened JSON events and pivoted key-value pairs into columns and rows using a configuration-driven framework built on Azure Databricks and Spark (see the sketch after this role).
Technologies: Azure Data Lake, Azure Data Factory, Azure Databricks, SQL, PySpark, Git.
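A minimal PySpark sketch of the flatten-and-pivot step above; the event shape and all names here are illustrative assumptions, not the production framework:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, explode

    spark = SparkSession.builder.appName("event-flatten-sketch").getOrCreate()

    # Hypothetical events: one row per event, key-value pairs in a map column.
    events = spark.createDataFrame(
        [("e1", {"status": "ok", "latency_ms": "42"}),
         ("e2", {"status": "fail", "latency_ms": "97"})],
        ["event_id", "attributes"],
    )

    # Explode the map into (key, value) rows, then pivot keys into columns.
    flattened = (
        events.select("event_id", explode(col("attributes")))
              .groupBy("event_id")
              .pivot("key")
              .agg({"value": "first"})
    )
    flattened.show()

In a configuration-driven framework the pivot keys would typically be read from config rather than inferred from the data, which keeps the output schema stable across runs.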
Jul’2021 – Jan’2022 – as Software Engineer (Data Engineer)
- Designed and developed data transformations and loads per business requirements using Teradata utilities (BTEQ, TPT, MultiLoad, FastLoad) and shell scripts.
- Created ETL workflows in Informatica PowerCenter to load data from source to target.
- Designed source-to-target mappings from data sources to ETL-layer views.
- Used JIRA to track assigned work and user queries.
- Built Tableau dashboards comparing source and target row counts to verify data quality (reconciliation sketched below).
Technologies: SQL, Informatica PowerCenter, Teradata, Shell Scripting, Spark, Qubole, S3, Git, JIRA, Airflow.
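A minimal sketch of the source/target count reconciliation that fed the Tableau data-quality dashboards; table and schema names are placeholder assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("count-reconciliation-sketch").getOrCreate()

    # Hypothetical staging (source) and warehouse (target) tables.
    source_count = spark.table("staging.orders_src").count()
    target_count = spark.table("warehouse.orders_tgt").count()

    # Persist the comparison so a BI tool such as Tableau can chart mismatches.
    spark.createDataFrame(
        [("orders", source_count, target_count, source_count == target_count)],
        ["entity", "source_rows", "target_rows", "counts_match"],
    ).write.mode("append").saveAsTable("audit.count_reconciliation")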
Jul’2020 – Jun’2021 – as Associate Consultant (Data Engineer)
- Worked on the ETL side of the project, creating Azure Data Factory pipelines to move and process data in and out of Azure Data Lake.
- Transformed data from source to target in Azure Databricks notebooks using Spark, PySpark, and SQL (a typical notebook step is sketched below).
- Extracted data from sources such as databases and flat files and stored it in ADLS for further processing.
- Designed source-to-target mappings from data sources to MDM and ETL-layer views.
Technologies: SQL, Azure Databricks, Azure Data Factory, Azure Data Lake, Azure Synapse, Spark SQL, PySpark, Airflow, SQL Server, Informatica MDM.
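A minimal sketch of a typical notebook step from this role: read raw files from ADLS, apply a transformation, and write curated output back. Storage paths and the dedupe/date-cast logic are illustrative assumptions:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, to_date

    spark = SparkSession.builder.getOrCreate()

    # Read raw CSVs from a hypothetical ADLS container.
    raw = spark.read.option("header", "true").csv(
        "abfss://raw@examplelake.dfs.core.windows.net/sales/"
    )

    # Example transformation: drop duplicate orders and normalise the date column.
    curated = (
        raw.dropDuplicates(["order_id"])
           .withColumn("order_date", to_date(col("order_date"), "yyyy-MM-dd"))
    )

    # Write curated output back to ADLS for downstream consumers.
    curated.write.mode("overwrite").parquet(
        "abfss://curated@examplelake.dfs.core.windows.net/sales/"
    )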
Oct’2016 – Jun’2020 – as DWH/BI Developer
- Designed and developed data transformations and loads per business requirements using Teradata utilities (BTEQ, TPT, MultiLoad, FastLoad) and shell scripts.
- Generated data extracts for different products and exported them to clients.
- Performed data analysis, working closely with data owners to formulate the design (data-selection criteria, load frequency), data masking, and Tivoli scheduling of jobs for the data pipelines.
- Built the application with Automated and Ad-hoc run modes to recover from failure scenarios.
- Designed source-to-target mappings in Informatica from data sources to data-mart and presentation-layer views.
- Developed reports using Microsoft Power BI.
- Designed and developed batch scheduling using IBM Tivoli Workload Scheduler.
- Designed facts and dimensions and performed data modelling per business requirements.
- Delivered a POC migrating a subset of the data mart to Azure Synapse using Azure Databricks and Azure Data Lake (sketched below).
Technologies: Teradata, Shell Scripting, SQL, TWS, Informatica, Erwin, Git, Jenkins, Power BI, JIRA.
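A minimal sketch of the Synapse POC path noted above, pushing a data-mart subset from Databricks to Azure Synapse with the Databricks "sqldw" connector; all connection values and table names are placeholders:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical data-mart subset selected for the POC.
    subset = spark.table("mart.sales_summary").filter("load_date >= '2020-01-01'")

    # The connector stages data in ADLS (tempDir) before loading it into Synapse.
    (subset.write.format("com.databricks.spark.sqldw")
           .option("url", "jdbc:sqlserver://example.sql.azuresynapse.net:1433;database=dw")
           .option("tempDir", "abfss://tmp@examplelake.dfs.core.windows.net/synapse-staging/")
           .option("forwardSparkAzureStorageCredentials", "true")
           .option("dbTable", "poc.sales_summary")
           .mode("append")
           .save())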