Key Skills
Python
MongoDB
Hadoop Admin
Apache
AWS
TensorFlow
OpenCV
Jupyter
Tableau
Snowflake
Redshift
SQL
S3 Buckets
Databricks
Power BI
Description
Projects:
Data Pipeline and ETL (December 2015 – May 2018)
Designation: Data Engineer
Client: Technology Company, Portland, ME
Role and Accomplishments:
Designed and implemented a data pipeline using GCP services like Google Cloud Storage and BigQuery for processing semi-structured data from 100 million raw records across 14 data sources.
Integrated data with GCP's Pub/Sub and Dataflow for real-time processing, enhancing the paid conversion rate by 6%.
Led the migration of data storage and processing from Oracle to Google BigQuery, resulting in a 14% performance increase and significant cost savings.
Developed data pipeline architectures using GCP's Compute Engine and Data Fusion, enabling rapid scaling to handle increased user traffic.
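A pipeline of this kind typically normalizes semi-structured records into newline-delimited JSON before a batch load from Cloud Storage into BigQuery. The sketch below illustrates that preparation step with the standard library only; the field names and record shape are illustrative assumptions, not details from the original project.

```python
import json

def to_ndjson(records, required=("id", "source")):
    """Normalize semi-structured dicts into newline-delimited JSON,
    the format BigQuery accepts for batch loads from Cloud Storage.
    Records missing a required field are dropped. Field names here
    ("id", "source", etc.) are hypothetical examples."""
    lines = []
    for rec in records:
        if not all(key in rec for key in required):
            continue
        # Flatten one level of nesting: {"meta": {"region": "us"}} -> {"meta_region": "us"}
        flat = {}
        for key, value in rec.items():
            if isinstance(value, dict):
                for sub_key, sub_val in value.items():
                    flat[f"{key}_{sub_key}"] = sub_val
            else:
                flat[key] = value
        lines.append(json.dumps(flat, sort_keys=True))
    return "\n".join(lines)

raw = [
    {"id": 1, "source": "crm", "meta": {"region": "us"}},
    {"id": 2},  # missing "source" -> dropped
]
print(to_ndjson(raw))
```

The resulting NDJSON string can be written to a Cloud Storage object and loaded with a BigQuery load job configured for the `NEWLINE_DELIMITED_JSON` source format.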
Data Modeling and ETL Pipeline Creation (June 2018 – August 2019)
Designation: Data Engineer
Client: Health Care Company, NY
Role and Accomplishments:
Enhanced a web-based EHR system by integrating data using GCP tools like Cloud SQL and Cloud Functions.
Employed PySpark within the GCP environment, leveraging Dataproc for parallel data processing.
Used GCP’s Cloud Composer for workflow orchestration, streamlining deployment for visualization and analytics purposes.
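The core pattern in the PySpark work above is a per-record transform applied in parallel across a Dataproc cluster. As a local stand-in (PySpark itself needs a cluster or a local Spark install), the sketch below shows the same map-style parallelism with Python's `concurrent.futures`; the record fields are hypothetical, not taken from the client's EHR schema.

```python
from concurrent.futures import ThreadPoolExecutor

def clean_record(rec):
    """Per-record transform of the kind a PySpark map() on Dataproc
    would apply in parallel across executors. The fields here
    (patient_id, weight_kg, height_m) are illustrative assumptions."""
    return {
        "patient_id": rec["patient_id"],
        "bmi": round(rec["weight_kg"] / rec["height_m"] ** 2, 1),
    }

def parallel_clean(records, workers=4):
    # ThreadPoolExecutor.map preserves input order, like Spark's map
    # over an RDD/DataFrame partition.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(clean_record, records))

records = [{"patient_id": i, "weight_kg": 70 + i, "height_m": 1.75} for i in range(8)]
print(parallel_clean(records))
```

In the real pipeline the same transform would be expressed as a PySpark function and submitted as a Dataproc job, with Cloud Composer triggering it on a schedule.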
ETL Process (August 2019 – February 2020)
Designation: Data Engineer
Client: Payment Processing Company, CA
Role and Accomplishments:
Managed the ingestion of streaming and transactional data using GCP services like Dataflow, Pub/Sub, and BigQuery.
Created a custom Python library to parse and format data, integrating it with GCP’s Cloud Functions for efficient data handling.
Automated ETL processes with GCP's Data Fusion, significantly reducing manual effort and enhancing data pipeline reliability.
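A parse-and-format library like the one described above typically turns raw transaction records into validated dicts, then renders them in the flat shape a Cloud Function can forward to BigQuery. This is a minimal sketch; the pipe-delimited layout (timestamp|merchant|amount_cents|currency) is a hypothetical example, not the client's actual format.

```python
from datetime import datetime

def parse_transaction(line):
    """Parse one pipe-delimited transaction record into a typed dict.
    The field layout is a hypothetical example."""
    ts, merchant, amount_cents, currency = line.strip().split("|")
    return {
        "timestamp": datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S"),
        "merchant": merchant,
        "amount": int(amount_cents) / 100,  # store cents, expose dollars
        "currency": currency,
    }

def format_row(txn):
    """Render a parsed transaction as the flat string-valued dict a
    Cloud Function handler could pass to a BigQuery streaming insert."""
    return {
        "ts": txn["timestamp"].isoformat(),
        "merchant": txn["merchant"],
        "amount": f'{txn["amount"]:.2f}',
        "currency": txn["currency"],
    }

row = parse_transaction("2019-09-01T12:30:00|ACME|1999|USD")
print(format_row(row))
```

Keeping parsing and formatting in one small library lets the same code run inside Dataflow transforms and Cloud Functions without duplication.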