Mridushi

  • Software Engineer
  • Gurgaon

Rate

$ 9.00 (Hourly)

Experience

7 Years

Availability

Immediate

Work From

Offsite

Skills

Python, Unix Shell Scripting, Jenkins, Git, Hadoop, SQL, AWS, Maven, Data Analysis, Data Extraction, Data Warehousing, Data Modeling, DB2, MySQL, Oracle, MongoDB

Description

Project Experience

Bank of America / Data Engineer, May 2022 - Present, Gurgaon

• Working with the Spark SQL engine to populate Hive tables.

• Analyzed and optimized cluster resource usage for Spark applications to improve performance.

• Upgrading existing Spark applications to newer Spark versions.

Capgemini / Big Data Developer, September 2020 - May 2022, Mumbai

• Wrote PySpark scripts to import data from a third-party system into the landing zone.

• Worked on the Enrich Layer's data cleaning process using PySpark.

• Deployed Spark jobs to an AWS EMR cluster and stored the results in Amazon S3.

• Extracted data from MySQL into HDFS using Sqoop.

• Created and worked on Sqoop jobs with incremental loads to populate Hive external tables.

• Designed queries using partitioning and bucketing, which improved performance by 70%.

• Used Spark to load data, create schema RDDs, and write the data into Hive tables.

Gemini Solutions Pvt. Ltd. / Data Engineer / Data Analyst, February 2020 - September 2020, Gurgaon

• Used Spark for interactive queries and for processing streaming data.

• Cleaned and processed unstructured data using Spark and Scala.

• Implemented Spark jobs using Scala and Spark SQL for faster testing and data processing, improving performance by 65%.

• Extracted data from different sources, created an ETL pipeline in Python, and visualized the data in reports for management.

Wipro Technologies / Trainee (Database Administrator / Python Developer), July 2015 - February 2020, Pune

• Worked as a Python developer and database administrator.
