OnBenchMark

Thirunagari (RID : 210ulp801n7u)

Designation : Data Engineer

Location : Jaipur

Experience : 5 Years

Rate : $15 / Hour

Availability : Immediate

Work From : Offsite

Category : Information Technology & Services

Shortlisted : 1
Total Views : 60
Key Skills
Data Engineer, Python, SQL Server, BigQuery, CI/CD, Jenkins, Terraform, PySpark
Description

Project Profile:

 

Client

Confidential

Duration

Jan 2021 to Apr 2023

Roles and Responsibilities

  • Analyzed and defined the research strategy and determined the system architecture and requirements needed to achieve project goals.
  • Developed multiple Kafka connectors, producers, and consumers as per the software requirement specifications.
  • Configured Spark Streaming to consume ongoing data from Kafka and store the streamed data in HDFS.
  • Configured Spark Streaming jobs in Airflow, automating script deployment to production systems and simplifying job scheduling.
  • Used Spark with the Scala Play MVC framework to integrate Java and Scala and call RESTful services.
  • Wrote data processing, data transformation, and data cleaning operations in Spark with Scala.
  • Designed jobs using PySpark and Python with streaming, SQL Server SSIS migrations, etc.
  • Created pipelines, datasets, dataflows, and integration runtimes; monitored pipelines and trigger runs.
  • Experienced in big data batch processing, interactive processing, and real-time processing solutions.
  • Designed jobs in Azure Databricks, Azure Data Storage, Azure Synapse ETL, Azure Cosmos DB, Event Hub, Azure Data Catalog, Azure Functions, Azure Purview, MDM, etc.
  • Used various Spark transformations and actions to cleanse the input data.
  • Designed the different processors in Apache NiFi workflows.
  • Developed shell scripts to generate Hive CREATE statements from the data and load the data into tables.
  • Wrote MapReduce jobs using the Java API and Pig Latin.
  • Optimized HiveQL and Pig scripts using execution engines such as Tez and Spark.
  • Extensive experience in Microsoft cloud solutions, i.e., designing, developing, and testing technologies.
  • Created SQL scripts to perform complex queries.
  • Created Synapse pipelines to migrate data from Gen2 storage to Azure SQL.
  • Built a data migration pipeline to the Azure cloud (Azure SQL).
  • Designed search operations using index creation in Elasticsearch, Kibana, and Logstash, and integrated them with a Spring Boot microservice application.
  • Designed dashboards in ELK/Kibana according to client specifications.
  • Involved in writing custom MapReduce programs using the Java API for data processing.
  • Applied end-to-end machine learning concepts such as classification, linear regression, and clustering.
  • Developed an end-to-end ML application deployed in Azure.
  • Involved in developing a linear regression model, built with Spark ML and the Scala API, to predict a continuous measurement and improve observation of wind turbine data.
  • Extracted, transformed, and loaded data from source systems and processed it in Azure Databricks.
  • Created Hive tables as internal or external per requirement, with appropriate static and dynamic partitions and bucketing for efficiency.
  • Designed Talend ETL workflows for repeated bulk data loading and performed transformations on the workflows.
  • Wrote complex SQL queries and stored procedures in Snowflake, and used Snowpipe and dbt for integration with Apache Spark ETL pipelines.
  • Designed jobs using Azure Databricks and Data Factory.
  • Designed workflows to copy datasets from on-premises systems to the Azure cloud environment using Azure Data Factory, Azure Dataflow, Azure Synapse, Talend, Informatica, etc.
  • Designed Talend workflows for heavy bulk-loading ETL operations from on-premises to cloud environments.
  • Designed Google GCP BigQuery and Dataproc SQL scripts to improve query performance and make search operations much faster.
  • Loaded and transformed large sets of structured and semi-structured data using Hive.
  • Maintained all code repositories and CI/CD with Jenkins and Terraform.
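The bullet on generating Hive CREATE statements from the data can be sketched in plain Python. This is an illustrative assumption of how such a script might work (the real work used shell scripts); column names, sample values, and the type-mapping rules are made up for the example.

```python
# Sketch: infer a Hive CREATE TABLE statement from sample rows.
# Type mapping (BIGINT / DOUBLE / STRING) is a simplified assumption.

def infer_hive_type(values):
    """Pick the narrowest Hive type that fits every sample value."""
    def fits(v, cast):
        try:
            cast(v)
            return True
        except ValueError:
            return False
    if all(fits(v, int) for v in values):
        return "BIGINT"
    if all(fits(v, float) for v in values):
        return "DOUBLE"
    return "STRING"

def build_create_statement(table, rows):
    """rows: list of dicts mapping column name -> raw string value."""
    columns = rows[0].keys()
    cols = ",\n  ".join(
        f"{name} {infer_hive_type([r[name] for r in rows])}" for name in columns
    )
    return f"CREATE TABLE IF NOT EXISTS {table} (\n  {cols}\n);"

# Hypothetical wind-turbine sample data.
sample = [
    {"turbine_id": "7", "wind_speed": "12.4", "site": "north"},
    {"turbine_id": "9", "wind_speed": "8.1", "site": "south"},
]
print(build_create_statement("turbine_readings", sample))
```

The generated DDL would then be piped to `hive -e` (or `beeline`) by the surrounding shell script before loading the data.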
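The partitioning-and-bucketing bullet rests on one idea: a bucketed Hive table routes each row to a file by hashing the clustering key modulo the bucket count. A minimal sketch, assuming integer keys (Hive's actual hash function differs, and Python's `hash()` on strings is salted per process, so integers keep this deterministic):

```python
# Sketch of Hive-style bucketing: bucket = hash(key) % num_buckets.
# Keys and sites below are illustrative, not from the project.

def assign_bucket(key: int, num_buckets: int) -> int:
    """Route a row to a bucket by hashing its clustering key."""
    return hash(key) % num_buckets

rows = [(17, "north"), (18, "south"), (21, "east"), (22, "west")]
buckets = {}
for turbine_id, site in rows:
    buckets.setdefault(assign_bucket(turbine_id, 4), []).append(site)
# Rows whose keys are congruent mod 4 land in the same bucket,
# which is what makes bucketed map-side joins and sampling cheap.
```

Partitioning, by contrast, routes rows by the *value* of a column (one directory per value), which is why the bullet pairs static/dynamic partitions with bucketing rather than treating them as interchangeable.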
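The wind-turbine bullet used Spark ML with the Scala API; as a dependency-free illustration of the same idea, here is single-feature ordinary least squares in plain Python. The wind-speed and power values are invented for the example.

```python
# Sketch: fit y = slope * x + intercept by ordinary least squares,
# the closed form behind a single-feature linear regression.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    return slope, my - slope * mx

# Hypothetical turbine readings: power scales linearly with wind speed here.
wind_speed = [4.0, 6.0, 8.0, 10.0]
power_kw = [120.0, 180.0, 240.0, 300.0]
slope, intercept = fit_line(wind_speed, power_kw)
```

Spark ML's `LinearRegression` solves the same objective, just distributed over partitions (and typically via iterative solvers rather than the closed form).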
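The "Spark transformations and actions for cleansing" bullet can be mirrored with plain Python iterators: lazy maps and filters play the role of transformations, and materializing the result plays the role of an action. Record layout and field names are illustrative.

```python
# Sketch of a cleansing chain in the Spark style:
# map -> filter -> map (lazy transformations), then count (an action).

raw = [
    "7,12.4,north",
    "bad record",        # malformed: wrong field count, gets filtered out
    "9,8.1,SOUTH",
]

def parse(line):
    """Split a CSV line; return None for malformed records."""
    parts = line.split(",")
    return parts if len(parts) == 3 else None

parsed = (parse(line) for line in raw)                        # transformation: map
valid = (p for p in parsed if p is not None)                  # transformation: filter
clean = [(int(t), float(w), s.lower()) for t, w, s in valid]  # transformation: map
print(len(clean))  # action: count -> 2
```

In PySpark the same shape would be `sc.textFile(...).map(parse).filter(...)` with `count()` as the action; the generator pipeline above is just the single-machine analogue.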
 
Copyright© Cosette Network Private Limited All Rights Reserved
