OnBenchMark

Shubham (RID : 4kv2lq3feb5z)

Designation: Data Engineer

Location: Ahmedabad, India

Experience: 8 Years

Rate: $14 per hour

Availability: Immediate

Work From: Offsite

Category: Information Technology & Services

Key Skills
Python, MongoDB, Hadoop Admin, Apache, AWS, TensorFlow, OpenCV, Jupyter, Tableau, Snowflake, Redshift, SQL, S3 Buckets, Databricks, Power BI


  • Data Pipeline and ETL (December 2015 – May 2018)
  • Designation: Data Engineer
  • Client: Technology Company, Portland, ME
  • Role and Accomplishments:
    • Designed and implemented a data pipeline using GCP services like Google Cloud Storage and BigQuery for processing semi-structured data from 100 million raw records across 14 data sources.
    • Integrated data with GCP's Pub/Sub and Dataflow for real-time processing, increasing the paid conversion rate by 6%.
    • Led the migration of data storage and processing from Oracle to Google BigQuery, resulting in a 14% performance increase and significant cost savings.
    • Developed data pipeline architectures using GCP's Compute Engine and Data Fusion, enabling rapid scaling to handle increased user traffic.


  • Data Modeling and ETL Pipeline Creation (June 2018 – August 2019)
  • Designation: Data Engineer
  • Client: Health Care Company, NY
  • Role and Accomplishments:
    • Enhanced a web-based electronic health record (EHR) system by integrating data using GCP tools like Cloud SQL and Cloud Functions.
    • Employed PySpark within the GCP environment, leveraging Dataproc for parallel data processing.
    • Used GCP’s Cloud Composer for workflow orchestration, streamlining deployment for visualization and analytics purposes.


  • ETL Process (August 2019 – February 2020)
  • Designation: Data Engineer
  • Client: Payment Processing Company, CA
  • Role and Accomplishments:
    • Managed the ingestion of streaming and transactional data using GCP services like Dataflow, Pub/Sub, and BigQuery.
    • Created a custom Python library to parse and format data, integrating it with GCP’s Cloud Functions for efficient data handling.
    • Automated ETL processes with GCP's Data Fusion, significantly reducing manual effort and enhancing data pipeline reliability.
Copyright© Cosette Network Private Limited All Rights Reserved