No. of Positions: 1
Location: Pune
Tentative Start Date: June 04, 2023
Work From: Any Location
Rate: $8 - 12 (Hourly)
Experience: 2 to 4 Years
Total years of Exp: 2 to 4 Years
Main focus is on Python and SQL, along with good coding skills.
Primary Responsibilities:
Experience in database programming using multiple flavors of SQL and Python
Understand and translate data and analytic requirements and functional needs into technical requirements
Build and maintain data pipelines to support large-scale data management projects
Ensure alignment with the data strategy and data processing standards
Deploy scalable data pipelines for analytical needs
Experience in the Big Data ecosystem - on-prem (Hortonworks/MapR) or cloud (Dataproc/EMR/HDInsight)
Experience in Hadoop, Pig, SQL, Hive, Sqoop and SparkSQL
Experience in any orchestration/workflow tool such as Airflow/Oozie for scheduling pipelines
Exposure to latest cloud ETL tools such as Glue/ADF/Dataflow
Understand and execute in-memory distributed computing frameworks like Spark (and/or Databricks), including parameter tuning and writing optimized queries in Spark
Hands-on experience in using Spark Streaming, Kafka and HBase
BE/BS/MTech/MS in computer science or equivalent work experience.
4 to 6 years of experience in building data processing applications using Hadoop, Spark, NoSQL DBs and Hadoop streaming
Responsibilities:
Exposure to the latest cloud ETL tools such as Glue/ADF/Dataflow is a plus
Expertise in data structures, distributed computing, and manipulating and analyzing complex high-volume data from a variety of internal and external sources
Experience in building structured and unstructured data pipelines
Proficient in a programming language such as Python/Scala
Good understanding of data analysis techniques
Solid hands-on working knowledge of SQL and scripting
Good understanding of relational/dimensional modelling and ETL concepts
Understanding of any reporting tool such as Looker, Tableau, QlikView or Power BI