Cloud Data Engineer
Location: New Jersey
Total Views: 165
Member since: 30+ days ago
Candidate Information
  • Experience: 8 years
  • Hourly Rate: $10
  • Availability: 1 week
  • Work From: Any
  • Category: Information Technology & Services
  • Last Active On: June 12, 2024
Key Skills
Cloud Data Engineer, Hive, Spark SQL, PySpark, Scala, Tableau


Cloud Data Engineer

Client: Conduent

Location: Florham Park, NJ

May 2022 to Present


·        Implemented a core framework on Spark that can handle the whole pipeline from a single config.

·        Understanding of structured data sets, data pipelines, ETL tools, and data reduction, transformation, and aggregation techniques; knowledge of tools such as dbt and DataStage.

·        Worked with relational SQL and NoSQL data stores, including PostgreSQL and Hadoop.

·        Developed data ingestion pipelines using the ETL tool Talend and Bash scripting, with big data technologies including but not limited to Hive, Impala, Spark, and Kafka.

·        Designed, installed, and configured the Collibra operating model and helped the business model enterprise metamodels.

·        Automated routine AWS tasks such as snapshot creation using Python scripts.

·        Created an architecture stack blueprint for data access with the NoSQL database Cassandra.

·        Involved in implementing and integrating NoSQL databases such as HBase and Cassandra.

·        Designed data models in QlikView/Qlik Sense to accommodate application requirements.

·        Worked with business users on RACI activities and incorporated them into Collibra.

·        Trained and mentored development colleagues in writing NoSQL queries and translating legacy RDBMS queries into them.

·        Managed end-to-end complex data migration, conversion, and data modeling (using Alteryx and SQL), and created visualizations in Tableau to develop high-quality dashboards.

·        Developed executive dashboards in QlikView, delivered through the web, enabling the business to measure its performance with analytical capabilities.

·        Using AWS Redshift, extracted, transformed, and loaded data across various heterogeneous data sources and destinations.

·        Set up Collibra communities, domains, types, attributes, statuses, articulation, and workflows, and customized attribution and solutions, including custom dashboards with metrics, status, workflow initiation, and issue management, to meet each domain's specific requirements.

·        Worked with a star schema data warehouse structure to design a complete QlikView-optimized data model.

·        Used AWS Glue for data transformation, validation, and cleansing.

·        Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 and Parquet/text files into AWS Redshift.

·        Installed and configured a multi-node cluster in the cloud using Amazon Web Services (AWS) on EC2.

·        Developed scripts in Python (Pandas, NumPy) for data ingestion, analysis, and cleaning; extracted, transformed, and loaded data sources to generate CSV data files using Python and SQL queries; analyzed the SQL scripts and designed the solution for implementation in PySpark.

·        Implemented data streaming capability using Kafka and Talend for multiple data sources.

·        Implemented a Continuous Integration and Continuous Delivery framework using Bitbucket, Maven, Jenkins, Bamboo, Nexus, ControlTier, and Make in a Linux environment.

·        Integrated AWS DynamoDB with AWS Lambda to store item values and back up the DynamoDB streams.

·        Managed Collibra DGC across the enterprise, driving governance activities for all participating business units and ensuring all work is completed on time and to standards, while mitigating risks as needed.

·        Used Git for source code version control, integrated with Jenkins for the CI/CD pipeline, code quality tracking, and user management; used Maven as the build tool and wrote the Maven pom.xml build script.

·        Developed logical and physical data models using data warehouse methodologies, including star schema (star-joined schemas), conformed dimensions, data architecture, and early/late binding techniques; designed and developed ETL applications using Informatica PowerCenter.

·        Used Alteryx for data preparation and Tableau for visualization and reporting.

·        Data-driven and highly analytical, with working knowledge of statistical modeling approaches and methodologies (clustering, regression analysis, hypothesis testing, decision trees, machine learning), rules, and the ever-evolving regulatory environment.

·        Analyzed and resolved conflicts arising from merging source code in Git.

·        Hands-on experience with programming languages such as Python and SAS.

·        Experienced in building ADF pipelines and in loading data into and unloading data from Snowflake.

·        Developed highly complex Python and Scala code that is maintainable, easy to use, and satisfies application requirements for data processing and analytics using built-in libraries.

·        Downloaded BigQuery data into pandas or Spark DataFrames for advanced ETL capabilities.

·        Worked with Google Data Catalog and other Google Cloud APIs for monitoring, query, and billing analysis of BigQuery usage.

·        Migrated an entire Oracle database to BigQuery and used Power BI for reporting.

·        Created Informatica mappings to populate staging tables and data warehouse tables from various sources such as flat files, DB2, Netezza, and Oracle.

·        Worked across the full life cycle of data lakes and data warehouses with big data technologies such as Spark, Hadoop, and Cassandra; developed enhancements to the MongoDB architecture to improve performance and scalability; and worked with MapReduce frameworks such as Hadoop and associated tools (Pig, Sqoop, etc.).


Copyright© Cosette Network Private Limited All Rights Reserved