Total experience of around 7 years and 2 months in IT, including 4 years and 2 months in Big Data technologies and around 3 years in test automation using Selenium with Java.
• Good knowledge of the Hadoop ecosystem, HDFS, Big Data, and PySpark.
• Worked on PySpark with a solid understanding of Spark architecture.
• Worked with PySpark RDDs, DataFrames, and Spark SQL.
• Worked on AWS Cloud: S3 buckets and EC2.
• Experience with performance optimization in Spark.
• Working knowledge of Python.
• Hands-on experience with ecosystem tools such as Hive, Sqoop, and MapReduce.
• Understanding of partitioning and bucketing in Hive.
• Working knowledge of Hadoop cluster architecture and Hive architecture.
• Worked on basic SAS.
• Efficient in building Hive, Pig, and MapReduce scripts.
• Implemented proofs of concept on Hadoop and various big data analytics tools, including migration from different databases to Hadoop.
• Loaded datasets into Hive for ETL operations.
• Experience in using DbVisualizer.
• Good analytical and technical skills with strong interpersonal, written, and verbal communication skills.
Current Company : UNITED HEALTH GROUP
Duration : 10-Sep-2018 till date
Job Designation : Big Data Engineer
Project : NewMember Project
Description
It is a dashboard providing full analysis, trends, information, and a comparison of new members who joined the UHC insurance plan in the current year versus the previous year.
Role
Big Data Engineer
Roles & Responsibility
1. Extracted data from source systems using Sqoop and stored it in the Hive warehouse.
2. Developed PySpark applications for the requirements.
3. Processed data using PySpark with Spark Core and Spark SQL.
4. Worked with Python and Python libraries.
5. Worked on the Hadoop cluster.
6. Stored files in HDFS.
7. Performed analytics in PySpark.
8. Automated and scheduled jobs using Airflow.
9. Monitored storage activities.
10. Attended scrum meetings.
11. Maintained the project.
12. Followed Agile methodology.