
FAIZAN (RID: 210ulopujxz3)

Designation: Data Engineer

Location: Pune, India

Experience: 8 Years

Rate: $18 / Hourly

Availability: Immediate

Work From: Offsite

Category: Information Technology & Services

Key Skills
Data Engineer, AWS, EMR, Redshift, Snowflake, PySpark, Python, Hadoop, ETL, MySQL, PostgreSQL
Description

Shaikh

AWS Data Engineer | Snowflake | Terraform

Summary

Data Engineer with around 8 years of experience in the IT industry. Started career as a Hadoop Administrator responsible for handling a petabyte-scale Hadoop cluster, and gained experience with a range of Big Data technologies. Currently working as an AWS Data Engineer with hands-on knowledge of services such as Lambda, EMR, Redshift, Glue, and RDS. Developed and implemented multiple ETL data pipelines using Lambda, and performed data transformation and cleansing using PySpark. Designed and implemented ER models and data-modelling charts to build the database structure from scratch, and created views and tables for the business team. Worked with different file formats to ingest data into SQL using Python Pandas. Deployed AWS components with automation tools like Terraform, and worked with the Snowflake data warehouse.
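The kind of PySpark transformation and cleansing work described above can be illustrated with a minimal sketch; the bucket, paths, and column names below are placeholders rather than details from any actual project:

```python
# Minimal PySpark cleansing sketch; paths and column names are
# illustrative assumptions, not taken from any specific engagement.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cleanse-orders").getOrCreate()

# Read raw CSV landed in S3 (hypothetical bucket/prefix).
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/orders/")

# Typical cleansing: trim strings, cast types, drop duplicates,
# and filter out rows missing a primary key.
clean = (
    raw.withColumn("customer_id", F.trim(F.col("customer_id")))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropDuplicates(["order_id"])
       .filter(F.col("order_id").isNotNull())
)

# Write back as Parquet for downstream consumption.
clean.write.mode("overwrite").parquet("s3://example-bucket/clean/orders/")
```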

Skills

  • Cloud: Amazon Web Services
  • Cloud Services: Lambda, EMR, Redshift, Glue, RDS, S3, EC2, VPC, IAM, CloudWatch
  • IaC: Terraform
  • Data Warehouse: Redshift, Snowflake
  • Programming: PySpark, Python (Pandas), SQL, YAML
  • Hadoop Distributions: Cloudera, Hortonworks
  • Hadoop Ecosystem: HDFS, YARN, Hive, Spark, Knox
  • Security: Kerberos, Active Directory, Ranger, Encryption
  • Ticketing Tools: ServiceNow, Issuetrak

Experience

 

Organization | Designation | Duration
Confidential | AWS Data Engineer | Sept 2021 – Present
Confidential | AWS Administrator | May 2021 – Sept 2021
Confidential | Senior Technical Associate | Nov 2018 – April 2021
Confidential | Hadoop Administrator | Jan 2015 – Nov 2018

Nov 2018 – Present
Data Engineer / Confidential (AWS)

  • Implemented ETL data pipelines for ingestion of data from APIs into the AWS environment (a minimal Lambda sketch follows this list).
  • Created data pipelines and performed file conversion using Lambda.
  • Automated ingestion of data into the Redshift cluster using Lambda.
  • Performed file-format conversion using Glue jobs and Lambda.
  • Extracted valuable data from RDS and exported it to S3.
  • Provisioned new EMR clusters in transient and long-running modes with different configurations.
  • Debugged Hive and Spark applications in case of application failure.
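A hedged sketch of the ingestion automation above, assuming an S3-event-triggered Lambda issuing a COPY through the Redshift Data API; the cluster, database, table, and role names are placeholders:

```python
# Hypothetical Lambda that automates ingestion into Redshift: it fires
# on an S3 put event and submits a COPY via the Redshift Data API.
# Cluster, database, role ARN, and table names are placeholders.
import boto3

redshift_data = boto3.client("redshift-data")

def handler(event, context):
    # Resolve the object that triggered the event.
    record = event["Records"][0]["s3"]
    bucket, key = record["bucket"]["name"], record["object"]["key"]

    copy_sql = (
        f"COPY analytics.orders FROM 's3://{bucket}/{key}' "
        "IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role' "
        "FORMAT AS PARQUET;"
    )

    # Asynchronous submit; Redshift runs the COPY server-side.
    return redshift_data.execute_statement(
        ClusterIdentifier="example-cluster",
        Database="analytics",
        DbUser="loader",
        Sql=copy_sql,
    )
```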

 

  • Performed Spark optimization in case of any slowness.
  • Performed troubleshooting on various EMR issues faced by users.
  • Created Jupyter notebooks as per users' requirements.
  • Configured edge nodes for users from scratch to run spark-submit applications.
  • Provisioned Redshift clusters as per requirements and created snapshots for them.
  • Well versed with the architecture of the Redshift cluster.
  • Performed optimization of the Redshift cluster at the table level.
  • Imported data into the Redshift data lake from various sources.
  • Worked on Redshift Workload Management (WLM) queues.
  • Used Glue Crawlers to populate the Glue Data Catalog.
  • Developed Glue ETL scripts for data transformation (a short sketch follows this list).
  • Enabled use of the Glue Data Catalog in EMR in place of the Hive Metastore.
  • Well versed with other AWS services such as EC2, VPC, S3, IAM, Load Balancers, CloudWatch, and SQS.
  • Performed deployment of EC2 instances from scratch through to project end, with EBS volumes.
  • Configured user scripts to be executed at EC2 instance deployment.
  • Restricted incoming traffic to EC2 instances by specifying security group rules.
  • Created custom VPCs as per client requirements.
  • Created routing policies to redirect traffic.
  • Configured NAT gateways for instances in private subnets.
  • Created NACL inbound and outbound rules to block unwanted traffic.
  • Enabled VPC peering between VPCs to establish communication between them.
  • Used S3 lifecycle rules to transition data between storage classes.
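The Glue ETL and Data Catalog work above could look roughly like the following job script; the database, table, and output path are illustrative assumptions:

```python
# Hedged sketch of a Glue ETL script for file-format conversion
# (a CSV table catalogued by a Glue Crawler -> Parquet on S3).
# Database, table, and path names are placeholders.
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the source table that a Glue Crawler populated in the Data Catalog.
source = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders_csv"
)

# Write it back out as Parquet; Glue handles schema projection.
glue_context.write_dynamic_frame.from_options(
    frame=source,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/parquet/orders/"},
    format="parquet",
)
job.commit()
```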

 

SQL

  • Designed and implemented ER models and data-modelling charts to build the database structure from scratch.
  • Worked with different file formats to ingest data into SQL using Python Pandas (see the sketch after this list).
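A minimal sketch of the Pandas-based ingestion described above, assuming a PostgreSQL target via SQLAlchemy; the connection string, file paths, and table name are placeholders:

```python
# Hypothetical example of loading mixed file formats into SQL with Pandas.
# Connection string, paths, and table name are placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/analytics")

# Each format gets its own reader; the load step is uniform.
readers = {
    "data/orders.csv": pd.read_csv,
    "data/orders.json": pd.read_json,
    "data/orders.parquet": pd.read_parquet,
}

for path, reader in readers.items():
    frame = reader(path)
    # Append into one table; if_exists="append" keeps prior loads.
    frame.to_sql("orders", engine, if_exists="append", index=False)
```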
 