- Worked as a Backend Engineer on an in-house ETL tool; wrote dynamic Python code that generates Airflow DAGs.
- Extensive experience with data formats including JSON, Parquet, Excel, and CSV, ensuring data compatibility and integrity throughout the processing lifecycle.
- Proactively monitored, managed, and scaled Spark clusters on Databricks, optimizing resource utilization and overall system performance.
- Used Spark 3.0 features such as Adaptive Query Execution (AQE); created several Spark UDFs.
- Built Scala notebooks for data-cleansing and transformation features such as profiling, replace, filter, and join.
- Created data pipelines with Airflow, storing all pipeline metadata in MongoDB.
- Worked with Parquet files and Delta tables in Databricks.
- Scheduled user-created ETL workflows from the tool using Airflow.
- Deployed and maintained Airflow on Docker as part of the development process, covering setup, management, and continuous operation of the platform within the Docker environment.
- Created Azure containers and used Blob Storage (managed via Microsoft Azure Storage Explorer).
- Completed proofs of concept on Azure Managed Airflow (Azure Data Factory) and Amazon MWAA.
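The metadata-driven DAG generation mentioned above can be sketched as follows. This is a minimal illustration under assumed names, not the actual tool's code: the metadata schema (`dag_id`, `steps`, `depends_on`) is hypothetical, and in the real system each step would be instantiated as an Airflow operator inside a `DAG` context, with the metadata document fetched from MongoDB.

```python
# Sketch of metadata-driven DAG generation (hypothetical schema).
# In production, workflow_meta would be loaded from MongoDB and each
# step would become an Airflow operator (e.g. PythonOperator) wired
# together with >> inside a DAG(dag_id=..., schedule=...) context.

workflow_meta = {
    "dag_id": "sales_etl",
    "schedule": "@daily",
    "steps": [
        {"name": "extract", "depends_on": []},
        {"name": "cleanse", "depends_on": ["extract"]},
        {"name": "load",    "depends_on": ["cleanse"]},
    ],
}

def build_task_order(meta):
    """Topologically order steps so each task runs after its upstreams."""
    ordered, seen = [], set()

    def visit(step_name):
        if step_name in seen:
            return
        step = next(s for s in meta["steps"] if s["name"] == step_name)
        for dep in step["depends_on"]:
            visit(dep)          # recurse into upstream tasks first
        seen.add(step_name)
        ordered.append(step_name)

    for step in meta["steps"]:
        visit(step["name"])
    return ordered

print(build_task_order(workflow_meta))  # → ['extract', 'cleanse', 'load']
```

Keeping the workflow definition as data rather than code is what lets end users create pipelines in the ETL tool without touching the Airflow deployment.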
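The cleansing and replace features could look like the pure functions below; the names and rules are illustrative assumptions, not the notebooks' actual code (the source used Scala, Python is shown here for consistency). In Spark, such functions can be registered as UDFs, though built-in column functions are generally preferred for performance.

```python
# Illustrative cleansing helpers (hypothetical rules, not the actual
# notebook code). In PySpark these could be wrapped as UDFs, e.g.
#   spark.udf.register("clean_str", clean_str)
# but note that built-in Column functions avoid UDF serialization cost.

def clean_str(value):
    """Trim surrounding whitespace and normalize empty strings to None."""
    if value is None:
        return None
    value = value.strip()
    return value or None

def replace_values(value, mapping):
    """Replace known bad tokens with canonical values, pass others through."""
    return mapping.get(value, value)

print(clean_str("  hello "))                 # → hello
print(replace_values("N/A", {"N/A": None}))  # → None
```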