OnBenchMark

Thirunagari (RID : 210ulp801n7u)

Designation : Data Engineer

Location : Jaipur

Experience : 5 Years

Rate : $15 / Hour

Availability : Immediate

Work From : Offsite

Category : Information Technology & Services

Shortlisted : 1
Total Views : 60
Key Skills
Data Engineer, Python, SQL Server, BigQuery, CI/CD, Jenkins, Terraform, PySpark
Description

Project Profile:

 

Client

Confidential

Duration

Jan 2021 to Apr 2023

Roles and Responsibilities

  • Analyzed and defined the research strategy and determined the system architecture and requirements needed to achieve project goals.
  • Developed multiple Kafka connectors, producers, and consumers as per the software requirement specifications.
  • Configured Spark Streaming to consume ongoing data from Kafka and store the streamed data in HDFS.
  • Configured Spark Streaming jobs in Airflow, automating script deployment to production systems and simplifying job scheduling.
  • Used Spark with the Scala Play MVC framework to integrate Java and Scala and call RESTful services.
  • Wrote data processing, data transformation, and data cleaning operations in Spark with Scala.
  • Designed jobs using PySpark and Python with streaming, SQL Server SSIS migrations, etc.
  • Created pipelines, datasets, dataflows, and integration runtimes; monitored pipelines and trigger runs.
  • Experienced in big data batch processing, interactive processing, and real-time processing solutions.
  • Designed jobs in Azure Databricks, Azure Data Storage, Azure Synapse ETL, Azure Cosmos DB, Event Hub, Azure Data Catalog, Azure Functions, Azure Purview, MDM, etc.
  • Used various Spark transformations and actions to cleanse the input data.
  • Designed the different processors in Apache NiFi workflows.
  • Developed shell scripts to generate Hive CREATE statements from the data and load the data into tables.
  • Wrote MapReduce jobs using the Java API and Pig Latin.
  • Optimized HiveQL and Pig scripts using execution engines such as Tez and Spark.
  • Extensive experience in Microsoft cloud solutions, i.e., designing, developing, and testing technologies.
  • Created SQL scripts to perform complex queries.
  • Created Synapse pipelines to migrate data from Gen2 storage to Azure SQL.
  • Built a data migration pipeline to the Azure cloud (Azure SQL).
  • Designed search operations using index creation in Elasticsearch, Kibana, and Logstash, and integrated them with a Spring Boot microservice application.
  • Designed dashboards in ELK/Kibana according to client specifications.
  • Involved in writing custom MapReduce programs using the Java API for data processing.
  • Applied end-to-end machine learning concepts such as classification, linear regression, and clustering.
  • Developed an end-to-end ML application deployed in Azure.
  • Involved in developing a linear regression model, built with Spark ML and the Scala API, to predict a continuous measurement and improve observation of wind turbine data.
  • Extracted, transformed, and loaded data from source systems and processed it in Azure Databricks.
  • Created Hive tables as internal or external per requirement, with appropriate static and dynamic partitions and bucketing for efficiency.
  • Designed Talend ETL workflows for repeated bulk data loading and performed transformations on the workflows.
  • Wrote complex SQL queries and stored procedures in Snowflake, and used Snowpipe and dbt for integration with Apache Spark ETL pipelines.
  • Designed jobs using Azure Databricks and Data Factory.
  • Designed workflows to copy datasets from on-premises systems to the Azure cloud environment using Azure Data Factory, Azure Dataflow, Azure Synapse, Talend, Informatica, etc.
  • Designed Talend workflows for heavy bulk-loading ETL operations from on-premises to cloud environments.
  • Designed Google GCP BigQuery and Dataproc SQL scripts to improve query performance and make search operations much faster.
  • Loaded and transformed large sets of structured and semi-structured data using Hive.
  • Maintained all code repositories and CI/CD with Jenkins and Terraform.
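The bullet on generating Hive CREATE statements from the data can be sketched in plain Python. This is an illustrative assumption of how such a script might work (the real work used shell scripts); column names, sample values, and the type-mapping rules are made up for the example.

```python
# Sketch: infer a Hive CREATE TABLE statement from sample rows.
# Type mapping (BIGINT / DOUBLE / STRING) is a simplified assumption.

def infer_hive_type(values):
    """Pick the narrowest Hive type that fits every sample value."""
    def fits(v, cast):
        try:
            cast(v)
            return True
        except ValueError:
            return False
    if all(fits(v, int) for v in values):
        return "BIGINT"
    if all(fits(v, float) for v in values):
        return "DOUBLE"
    return "STRING"

def build_create_statement(table, rows):
    """rows: list of dicts mapping column name -> raw string value."""
    columns = rows[0].keys()
    cols = ",\n  ".join(
        f"{name} {infer_hive_type([r[name] for r in rows])}" for name in columns
    )
    return f"CREATE TABLE IF NOT EXISTS {table} (\n  {cols}\n);"

# Hypothetical wind-turbine sample data.
sample = [
    {"turbine_id": "7", "wind_speed": "12.4", "site": "north"},
    {"turbine_id": "9", "wind_speed": "8.1", "site": "south"},
]
print(build_create_statement("turbine_readings", sample))
```

The generated DDL would then be piped to `hive -e` (or `beeline`) by the surrounding shell script before loading the data.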
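The partitioning-and-bucketing bullet rests on one idea: a bucketed Hive table routes each row to a file by hashing the clustering key modulo the bucket count. A minimal sketch, assuming integer keys (Hive's actual hash function differs, and Python's `hash()` on strings is salted per process, so integers keep this deterministic):

```python
# Sketch of Hive-style bucketing: bucket = hash(key) % num_buckets.
# Keys and sites below are illustrative, not from the project.

def assign_bucket(key: int, num_buckets: int) -> int:
    """Route a row to a bucket by hashing its clustering key."""
    return hash(key) % num_buckets

rows = [(17, "north"), (18, "south"), (21, "east"), (22, "west")]
buckets = {}
for turbine_id, site in rows:
    buckets.setdefault(assign_bucket(turbine_id, 4), []).append(site)
# Rows whose keys are congruent mod 4 land in the same bucket,
# which is what makes bucketed map-side joins and sampling cheap.
```

Partitioning, by contrast, routes rows by the *value* of a column (one directory per value), which is why the bullet pairs static/dynamic partitions with bucketing rather than treating them as interchangeable.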
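The wind-turbine bullet used Spark ML with the Scala API; as a dependency-free illustration of the same idea, here is single-feature ordinary least squares in plain Python. The wind-speed and power values are invented for the example.

```python
# Sketch: fit y = slope * x + intercept by ordinary least squares,
# the closed form behind a single-feature linear regression.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    return slope, my - slope * mx

# Hypothetical turbine readings: power scales linearly with wind speed here.
wind_speed = [4.0, 6.0, 8.0, 10.0]
power_kw = [120.0, 180.0, 240.0, 300.0]
slope, intercept = fit_line(wind_speed, power_kw)
```

Spark ML's `LinearRegression` solves the same objective, just distributed over partitions (and typically via iterative solvers rather than the closed form).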
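The "Spark transformations and actions for cleansing" bullet can be mirrored with plain Python iterators: lazy maps and filters play the role of transformations, and materializing the result plays the role of an action. Record layout and field names are illustrative.

```python
# Sketch of a cleansing chain in the Spark style:
# map -> filter -> map (lazy transformations), then count (an action).

raw = [
    "7,12.4,north",
    "bad record",        # malformed: wrong field count, gets filtered out
    "9,8.1,SOUTH",
]

def parse(line):
    """Split a CSV line; return None for malformed records."""
    parts = line.split(",")
    return parts if len(parts) == 3 else None

parsed = (parse(line) for line in raw)                        # transformation: map
valid = (p for p in parsed if p is not None)                  # transformation: filter
clean = [(int(t), float(w), s.lower()) for t, w, s in valid]  # transformation: map
print(len(clean))  # action: count -> 2
```

In PySpark the same shape would be `sc.textFile(...).map(parse).filter(...)` with `count()` as the action; the generator pipeline above is just the single-machine analogue.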
 
Copyright© Cosette Network Private Limited All Rights Reserved
