AWS Data Engineer
Location : Jaipur, India
Experience : 8 Years
Rate : $18 / Hourly
Availability : Immediate
Work From : Offsite
Category : Information Technology & Services
8.6 years of experience implementing complete Big Data solutions, including data acquisition, storage, transformation, and analytics, using Big Data technologies including Hadoop, Hive, Spark, Python, Sqoop, PL/SQL, and Informatica.
Built complete data ingestion (ETL) pipelines from traditional databases and file systems into Hadoop using Hive, Spark, Python, PySpark, Sqoop, and SFTP/SCP (a minimal PySpark sketch follows this list).
Experienced with different relational databases such as Oracle and SQL Server.
Good experience in data modeling.
Involved in Spark query tuning and performance optimization.
Experienced in writing UNIX shell scripts for batch jobs.
Experienced in developing business reports by writing complex SQL queries using views, volatile tables, and global temporary tables.
Identified long-running queries, scripts, and spool space issues, and implemented appropriate tuning methods.
Reported errors captured in error tables to clients, rectified known errors, and reran the scripts.
Followed the given standard approaches for job restarts and error handling.
Worked with the EXPLAIN command to identify join strategies, issues, and bottlenecks.
Wrote unit test cases and submitted unit test results as per the quality process.
Strong problem-solving and communication skills, with the ability to handle multiple projects and work in a team or individually.
Cloud exposure to AWS, Azure, and GCP.
Experienced in creating dashboards using Tableau.
Experienced with AWS Glue.
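As a minimal illustration of the ingestion pipelines mentioned above, the sketch below reads a table from a relational source over JDBC with PySpark and lands it in a partitioned Hive table. The connection URL, credentials, table names, and partition column are hypothetical placeholders, and the appropriate JDBC driver is assumed to be on the Spark classpath.

# Minimal PySpark ingestion sketch: relational database -> Hive.
# Connection details, table names, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("rdbms_to_hive_ingest")
    .enableHiveSupport()
    .getOrCreate()
)

# Read the source table over JDBC (Oracle URL shown; SQL Server works the
# same way with its own driver and URL format).
src_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCLPDB")  # placeholder
    .option("dbtable", "sales.orders")                          # placeholder
    .option("user", "etl_user")                                 # placeholder
    .option("password", "etl_password")                         # placeholder
    .option("fetchsize", "10000")
    .load()
)

# Light cleanup before landing in the warehouse.
clean_df = src_df.dropDuplicates(["order_id"]).filter("order_date IS NOT NULL")

# Append into a date-partitioned Hive table.
(
    clean_df.write
    .mode("append")
    .partitionBy("order_date")
    .saveAsTable("edw.orders")  # placeholder target table
)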
Requirement:
ACOE Analytics is responsible for maintaining the end-to-end (E2E) data flow from various sources to AWS EDGE Redshift.
This centralized data is used for business use cases, building dashboards and analytics.
The StitcherX project mainly ingests data from BigQuery into Redshift using Talend jobs. Once the data is available in Redshift, it is curated per the business requirements. All services run on AWS, and Airflow is used for scheduling the jobs (a minimal DAG sketch follows).
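The sketch below shows the kind of Airflow scheduling described above; the DAG id, owner, schedule, and the shell launcher path for the Talend job are hypothetical placeholders (Talend jobs are commonly exported as shell launchers).

# Minimal Airflow 2.x DAG sketch for scheduling a Talend ingestion job
# (BigQuery -> Redshift). All names and paths are hypothetical placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "acoe_analytics",  # placeholder
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="stitcherx_bq_to_redshift",  # placeholder
    default_args=default_args,
    schedule_interval="0 2 * * *",      # daily at 02:00, placeholder
    start_date=datetime(2023, 1, 1),
    catchup=False,
) as dag:
    # The trailing space stops Airflow from treating the .sh path as a
    # Jinja template file. The launcher path is a placeholder.
    run_talend_ingest = BashOperator(
        task_id="run_talend_ingest",
        bash_command="/opt/talend/jobs/stitcherx_ingest/run.sh ",
    )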
The Nitro project ingests various types of source data into the data lake. The data in the warehouse can be used to build data marts, feed downstream systems, and develop reports and analytical models (a sketch of the raw-zone landing step follows).
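As an illustration of the data-lake ingestion step, the sketch below lands a raw extract under a date-partitioned S3 prefix with boto3; the bucket name, prefix layout, and source name are hypothetical placeholders.

# Minimal sketch: land a raw source file in the data lake's raw zone
# under a date-partitioned S3 prefix. Bucket/prefix names are placeholders.
from datetime import date

import boto3

def land_raw_file(local_path: str, source: str) -> str:
    """Upload a raw extract into the raw zone and return its S3 key."""
    s3 = boto3.client("s3")
    today = date.today()
    key = (
        f"raw/{source}/year={today.year}/month={today.month:02d}/"
        f"day={today.day:02d}/{local_path.rsplit('/', 1)[-1]}"
    )
    s3.upload_file(local_path, "nitro-data-lake", key)  # bucket is a placeholder
    return key

# Example usage:
# land_raw_file("/data/exports/customers.csv", "crm")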
The Astrid project provides asset servicing to multiple banking clients. We process various MT messages and provide data to downstream systems; the data is used for developing reports and analytics. The complete process runs in Oracle, and the curated data is consumed in the front end (a minimal parsing sketch follows).
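The production processing above was done in Oracle; purely as an illustration of the MT message structure, the Python sketch below splits block 4 of a sample SWIFT MT message into tag/value pairs. The sample message and field tags are illustrative only.

# Minimal sketch: extract tag/value pairs from block 4 of a SWIFT MT message.
# Sample message and tags are illustrative; production logic lived in Oracle.
import re

SAMPLE_MT = (
    "{1:F01BANKBEBBAXXX0000000000}{2:I103BANKDEFFXXXXN}"
    "{4:\n:20:REF12345\n:32A:240131EUR1000,00\n:50K:ORDERING CUSTOMER\n-}"
)

def parse_block4(message: str) -> dict:
    """Return {tag: value} for the fields in block 4 of an MT message."""
    block4 = re.search(r"\{4:\n(.*?)\n-\}", message, re.DOTALL)
    if not block4:
        return {}
    fields = {}
    # Each field starts on a line beginning with ':TAG:'.
    for m in re.finditer(r"^:([0-9A-Z]+):(.*?)(?=\n:|\Z)",
                         block4.group(1), re.DOTALL | re.MULTILINE):
        fields[m.group(1)] = m.group(2).strip()
    return fields

print(parse_block4(SAMPLE_MT))
# -> {'20': 'REF12345', '32A': '240131EUR1000,00', '50K': 'ORDERING CUSTOMER'}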
The purpose of this project is to build various applications for different clients and ensure the code remains reusable for future clients as well.