OnBenchMark

Srilatha

Data Engineer
Location: Pune
Total Views: 77
Shortlist: 0
Member since: 30+ days ago
Contact Details
Phone: {{contact.cdata.phone}}
Email: {{contact.cdata.email}}
Candidate Information
  • Experience: 6 years
  • Hourly Rate: $4
  • Availability: Immediate
  • Work From: Offsite
  • Category: Engineering & Design
  • Last Active On: June 17, 2024
Key Skills
Python, AWS
Education
2007–2010: (Computer Science), JNTU

Summary
PROJECT DETAILS

ADI (Aboitiz Data Innovation)

Project: SAS Script Migration to PySpark (CITI – Union Bank)
  I. CVI
  II. Speed Cash
  III. Cross Sell to CITI Gold
  IV. Portfolio Segmentation

Client: Aboitiz Data Innovation, Singapore
Environment: PySpark, Spark SQL, Python, AWS EMR, S3, Glue, Athena, Airflow, Oracle, CML
Role: Sr. Data Engineer

 

Aboitiz Data Innovation provides transformative AI consulting and data-driven IoT and sustainability solutions to businesses across diverse sectors.

The objective of this project was to migrate CITI Bank's entire data platform from SAS to Union Bank's PySpark platform, applying updated transformations and creating AWS data pipelines to optimize data utilization and support informed decision-making.

Roles & Responsibilities:

·    Performed detailed code analysis to begin the migration of SAS scripts to PySpark, and actively participated in client calls to gather business requirements and flag technical dependencies in advance.

·    Collected data from various sources into S3.

·    Created Spark scripts in Python based on optimized logic and requirements.

·    Created and modified data pipelines and deployed them as required in CML.

·    Created documentation for processes, coding best practices, and code-review guidelines.

·    Performed code reviews before merging pull requests (PRs).

·    Created DAGs and executed scripts in Airflow.

·    Created and performed data quality checks using PySpark, SQL, and Hive queries.

·    Handled large datasets during the ingestion process itself using partitioning, Spark's in-memory capabilities, broadcast joins, and efficient joins and transformations.

·    Collaborated with the infrastructure, network, database, application, and data governance teams to ensure data quality and availability.

Copyright © Cosette Network Private Limited. All Rights Reserved.