As a Big Data Engineer, you will work with multiple teams to deliver solutions on the Azure Cloud using core cloud data warehouse tools (Azure Data Factory, Azure Databricks, Azure Synapse Analytics and other Big Data related technologies). You will also build the next generation of application data platforms (not infrastructure) and/or improve recent implementations. Experience in Databricks and Spark on other cloud platforms like AWS and GCP is also relevant.
- Define, design, develop and test software components/applications using Microsoft Azure: Databricks, ADF, ADL, Hive, Python, Spark SQL, PySpark.
- Expertise in Azure Databricks, ADF, ADL, Hive, Python, Spark, PySpark
- Strong T-SQL skills with experience in Azure SQL DW, Redshift, BigQuery
- Experience handling structured and unstructured datasets
- Experience in Data Modeling and Advanced SQL techniques
- Experience implementing Azure Data Factory, AWS Glue or any other data orchestration tool using the latest technologies and techniques.
- Good exposure to application development.
- Able to work independently with minimal supervision.
- Hands-on experience with distributed computing frameworks such as Databricks, Hadoop, Hive and the Spark ecosystem (Spark Core, PySpark, Spark Streaming, Spark SQL)
- Willing to work with product teams to optimize product features/functions.
- Experience with batch workloads and real-time streaming of high-volume, high-frequency data
- Performance optimization on Spark workloads
- Environment setup, user management, authentication and cluster management on Databricks
- Professional curiosity and the ability to get up to speed quickly on new technologies and tasks.
- Good understanding of SQL and a good grasp of relational and analytical database management theory and practice.
Good To Have:
- Hands-on experience with distributed computing frameworks such as Databricks, Hadoop, Hive
- Experience with Databricks migration from on-premises to cloud or cloud to cloud
- Migration of ETL workloads from vanilla Spark implementations to Databricks
- Experience with Databricks ML is a plus
- Migration from Spark 2.0 to Spark 3.0
- Databricks Solution Architect Essentials badge
- Databricks Developer Essentials
- Apache Spark Programming with Databricks
- Data Engineering with Databricks
- Lakehouse with Delta Lake Deep Dive
- Fundamentals of Unified Data Analytics with Databricks
Python, SQL, PySpark, Hive, Sqoop, YARN, Spark Ecosystem, Databricks and Azure Data Factory, Data Modeling, ETL Methodology, Azure Synapse Analytics, HDInsight, Hadoop.