
Amanpreet (RID: g8qolht2o3s1)

Designation: Data Engineer

Location: Delhi, India

Experience: 8 Years

Rate: $16 / Hourly

Availability: Immediate

Work From: Any

Category: Information Technology & Services

Key Skills
Spark, Azure, Hive, Hadoop, PySpark, Jira, Databricks, MySQL, Jenkins
Description

PROFESSIONAL SUMMARY

  • Total 7 years of IT experience.
  • 4+ years of experience with big data technologies such as Hadoop, Spark, Scala, Hive, and HBase.
  • 2 years of experience with Azure.
  • Worked with Spark, Hive, and HBase to process millions of records each day.
  • Experience handling the complex Avro data format.
  • Exposure to Hadoop 2.x on the Cloudera distribution.
  • Sound knowledge of HDFS architecture and distributed data processing.
  • Capable of processing large sets of structured, semi-structured, and unstructured data.
  • Capable of writing Spark jobs in Scala.
  • Improved the performance of Spark-HBase jobs.
  • Developed Spark/Scala code for validating semi-structured data and loading it into the system.
  • Created Hive tables on complex HBase data for analysis.
  • Developed and orchestrated data pipelines using ADF.
  • Created a CI/CD pipeline for ARM templates.
  • Created CI/CD for Databricks notebooks in Azure DevOps.
  • Used multiple Azure nodes to run Spark code.
  • Hands-on knowledge of Azure, ADLS, ADF, File Share, and Blob Storage.
  • Hands-on knowledge of PySpark.
  • Hands-on knowledge of Scala.
  • Hands-on knowledge of Databricks.
  • Improved job performance with iterative broadcast joins (see the sketch after this list).
  • Knowledge of Kafka, Docker, and Kubernetes.
  • Knowledge of AWS: EMR, CloudWatch, SageMaker, Lambda.
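
The iterative broadcast point above refers to joining a large table against a mid-sized one in chunks, so that each chunk fits under the broadcast size limit. A minimal Spark/Scala sketch of the pattern (the data, chunk count, and names are illustrative, not taken from the actual jobs):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object IterativeBroadcastJoin {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("IterativeBroadcastJoin")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // Illustrative data: a large fact table and a mid-sized dimension
        // table that is too big to broadcast in one piece.
        val large  = (1 to 1000000).toDF("key").withColumn("value", rand())
        val medium = (1 to 100000).toDF("key").withColumn("attr", lit("x"))

        // Tag each dimension row with a chunk id, then broadcast-join one
        // chunk at a time and union the partial results.
        val numChunks = 4
        val chunked = medium.withColumn("chunk", pmod(col("key"), lit(numChunks)))

        val joined = (0 until numChunks)
          .map { i =>
            large.join(broadcast(chunked.filter(col("chunk") === i).drop("chunk")), Seq("key"))
          }
          .reduce(_ union _)

        println(joined.count())
        spark.stop()
      }
    }

Because the chunks partition the dimension table, the union of the chunk-wise inner joins equals the full join, while each broadcast stays small.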

TECHNICAL SKILLS

Languages

Scala

Tools

Spark, Databricks, Hadoop, MapReduce, Jenkins, Jira, Hive, Maven, Dremio, NiFi, Azure ADF, Azure ADLS

Scripting

Python

Operating Systems

Linux, Unix and Windows

IDEs

Eclipse, IntelliJ, PyCharm

Database

MySQL, HBase

Projects Undertaken at Deltacubes Technology Pvt. Ltd.

Project Name: Depletions

 

Duration: 3 Years (April 2019 – Present)

 

Description: Diageo is a worldwide liquor manufacturer and distributor. The data lake team helps the Diageo business get data from all customers and distributors across the world.

The data lake team collects data from different sources, then cleanses, standardizes, and harmonizes it into meaningful data for the business analytics team.

Responsibilities:

1. Cleansing and standardizing raw data from multiple file formats (XLSX, CSV, JSON) using Spark/Scala (see the sketch after this list).
2. Generating Parquet/CSV files after harmonizing data for the business analytics team.
3. Generating CSV files with EU calculations in Blob Storage for the Anaplan team.
4. Generating Parquet files for the Sellout team.
5. Orchestrating data pipelines using ADF.
6. Creating views in Dremio on top of ADLS.
7. Parsing JSON to get the rules for standardizing raw data.
8. Triggering pipelines from ADF.
9. Creating a CI/CD pipeline for ARM templates.
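
A minimal Spark/Scala sketch of the cleanse-and-standardize step in item 1 (the ADLS paths, column names, and date format below are hypothetical, not the real Depletions schema):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object DepletionsCleansing {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("DepletionsCleansing")
          .getOrCreate()

        // Hypothetical ADLS Gen2 paths; the real container layout is not
        // part of this profile.
        val rawPath     = "abfss://raw@datalake.dfs.core.windows.net/depletions/*.csv"
        val curatedPath = "abfss://curated@datalake.dfs.core.windows.net/depletions/"

        val raw = spark.read
          .option("header", "true")
          .csv(rawPath)

        // Cleanse: drop fully empty rows, trim string fields, and
        // standardize a date column before harmonization.
        val cleansed = raw
          .na.drop("all")
          .withColumn("distributor", trim(col("distributor")))
          .withColumn("sale_date", to_date(col("sale_date"), "dd-MM-yyyy"))

        // Harmonized output as Parquet for the business analytics team.
        cleansed.write.mode("overwrite").parquet(curatedPath)
        spark.stop()
      }
    }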

Technology: Spark, Scala, Dremio, Azure (ADF, ADLS, Blob Storage), Databricks

Projects Undertaken at Tavant Technologies

Project Name: Experian BIS SALT

 

Duration: 2 Years (Jan 2017 – Apr 2019)

 

Description:

Experian is a consumer credit reporting agency. It collects and aggregates information on over one billion people and businesses, and is one of the ‘Big Three’ credit reporting agencies.

This project imports the SBFE data that Experian recently acquired into the big data system and makes it available to internal and external customers for analysis and credit-score modelling for any business.

Responsibilities:

1. Developing Spark/Scala code for validating semi-structured data and loading it into the system.
2. Generating Avro files from CSV files for integration with an external system (One Search) using Spark/Scala.
3. Creating Hive tables for the Data Management team.
4. Making data available to the Commercial Data Sciences team for analysis in SAS.
5. Running validation jobs on historical data and making the data ready for use.
6. Supporting functional testing and bug fixing in Spark code.
7. Writing Spark DataFrame code to implement product-view rules on processed data.
8. Writing Spark DataFrame code to read and analyze nested, complex Avro data (see the sketch after this list).
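
A minimal Spark/Scala sketch of reading and flattening nested Avro as in item 8 (requires the spark-avro package; the path, field names, and validation rule are hypothetical, since the real SBFE schema is not shown here):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object AvroValidation {
      def main(args: Array[String]): Unit = {
        // Run with e.g. --packages org.apache.spark:spark-avro_2.12:3.3.0
        val spark = SparkSession.builder()
          .appName("AvroValidation")
          .getOrCreate()

        val records = spark.read
          .format("avro")
          .load("/data/sbfe/accounts.avro")

        // Flatten one level of nesting: explode a hypothetical array of
        // contact structs and pull out a scalar field.
        val flattened = records
          .withColumn("contact", explode(col("contacts")))
          .select(col("business_id"), col("contact.phone").as("phone"))

        // Simple validation rule: records missing a business id are
        // routed to a reject output for later inspection.
        val valid    = flattened.filter(col("business_id").isNotNull)
        val rejected = flattened.filter(col("business_id").isNull)

        valid.write.mode("overwrite").parquet("/data/sbfe/valid")
        rejected.write.mode("overwrite").parquet("/data/sbfe/rejects")
        spark.stop()
      }
    }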

 

Technology: Spark, Scala, HBase, Hive

Projects Undertaken at AMD India Pvt. Ltd. (Contingent Worker through Magna Infotech)

Project Name: Scan-view

 

Duration: 11 Months (Feb 20

 