
Taj (RID : 15xwtl60o42y4)

Designation: Senior Data Engineer

Location: Hyderabad, India

Experience: 10 Years

Rate: $0 / Hourly

Availability: Immediate

Work From: Offsite

Category: Information Technology & Services

Shortlisted : 93
Total Views : 185
Key Skills

As a big data enthusiast looking for a challenging opportunity in the field, I intend to be part of an organization where I can constantly learn and apply my knowledge, integrate myself into the organizational culture, and work honestly and diligently to add value to it.


●      I have 8.10 years of experience in software development.

●      Hands-on experience in Hadoop environment using Spark, Scala, and AWS.

●      Thorough knowledge of Spark architecture and how RDDs work internally, with exposure to Spark Core and Spark SQL.

●      Good knowledge of Kafka, Spark Streaming, and machine learning.

●      Strong knowledge of the Hadoop framework and its ecosystem (MapReduce, HDFS, Hive, Pig, Sqoop, and Spark).

●      Hands-on experience with the AWS cloud using EMR, S3, Lambda, Glue, RDS, DynamoDB, etc.

●      Proficient in working with the Spring MVC framework and REST APIs.

●      Comprehensive knowledge of the Software Development Life Cycle (SDLC), with a thorough understanding of its phases, such as Requirements Analysis, Design, Development, and Testing.

●      Working experience with the Sonar tool (code quality analysis), SVN, Bitbucket, and the Google Sheets API.

●      Able to translate complex business requirements into scalable technical solutions.

●      Experienced in leading diverse remote teams and ensuring timely delivery.

●      Focused, Hardworking, Quick learner, and an effective communicator at all levels.

●      Good people management skills and a technical bent of mind.

TECHNICAL SKILLS                                                                                                                                          

Operating Systems: Linux 6.5, UNIX, Windows 10, 8, 7

Database Server: Postgres, HBase, DynamoDB

Big Data & Ecosystem: Spark, Scala, Java, Hive, Presto, AWS EMR, Lambda, Glue

Java & J2EE Technologies: Java, Spring, Hibernate, REST, MVC architecture

Tools: Hue (Cloudera specific), PuTTY, WinSCP, SVN, Sonar tool, ALM


PROJECT SUMMARY                                                                                                                                        

Project #1

Name: Health QX

Duration: Jan 2021 to present

Environment: Apache Spark, Scala, AWS EMR, RDS, S3, AWS Lambda, Presto

Role: Big Data Engineer

Domain: Healthcare


This project migrates workloads from a typical SQL server to big data technology. All transformations use the Spark DataFrame API, and the output is stored in S3. The AWS cloud is used to manage the entire lifecycle.

Spark jobs are deployed on EMR and optimized where required.

Presto runs on top of S3 for analyzing the results.



●      Analyze datasets and process them through Spark Scala.

●      Write Spark jobs with transformations using the DataFrame API.

●      Set up a Presto cluster and write Lambda functions for deploying new connectors.

●      Write jobs to clean up S3, RDS, and Hive schemas.

●      Write a utility to support update statements in Presto.

●      Design a metrics platform for JVM and non-JVM applications.

●      Optimize Spark jobs.


Project #2 

Name: Media and Advertising campaign (DSP and SSP)

Duration: March 2018 to present

Environment: Apache Spark, HDFS, YARN, Spark Core, Spark SQL, HBase, AWS (S3), Oozie, Scala

Role: Data Analyst and Big Data Developer

Domain: Media and Advertising


This project covers the DSP and SSP sides of campaign management. It reads telecommunication data in Parquet format from an S3 bucket and copies it to an HDFS location. The data then goes through an enrichment process and is stored in HBase, where it can be used for campaign purposes.



●      Analyze datasets (Parquet, JSON, and CSV) and process them with Spark and Scala.

●      Process and enrich data using specified ad indicators that are useful for media campaigns.

●      Build a recommender system for users using collaborative filtering.

●      Write Scala code to read data from Google spreadsheets and create JSON from it.

●      Write Scala jobs for the enrichment of CRM and URL data.

●      Write UDFs whenever required.

●      Write unit test cases using the ScalaTest API under TDD (Test-Driven Development).

●      Strictly follow Agile methodology.

●      Review code on Bitbucket.


Copyright© Cosette Network Private Limited All Rights Reserved