Professional Summary
9.5+ years of experience as a Pentaho/Hadoop Data Engineer and Hadoop Administrator in the investment, financial, and logistics domains.
- Experience with Azure HDInsight as a Data Engineer, along with administration of on-premises Hortonworks Hadoop services.
- Experience in Pentaho/Hadoop Business Intelligence and Analytics, with proficiency in ETL design and development and Data Warehouse implementation/development.
- Excellent knowledge of Data Warehousing, ETL, and Business Intelligence using Pentaho and Informatica.
- Experience in SQL and PL/SQL backend programming, including creating database objects such as stored procedures, views, and functions.
- Experienced in overall Data Warehouse, Database, ETL and performance tuning.
- Experience working with Core Java to run ETL applications.
- Experience with Slowly Changing Dimensions, error logging, error recovery, and tuning of complex jobs using Pentaho.
- Expert in Relational and Dimensional Modeling techniques such as Star Schema, Snowflake Schema, and Fact and Dimension Tables.
- Follow and implement Agile Best Practices for Data Warehousing, including solution versioning through GitHub and SVN.
- Experience working in Agile methodology and ability to manage change effectively.
- Additional experience in manual QA testing of data warehousing ETL jobs, including creating Excel spreadsheets for test reports.
- Experience working as a Production Support / Performance Engineer on Hadoop and Pentaho.
- Understand business requirements with clients based on high-level specification documents and implement the required data transformations.
- Able to meet deadlines and handle multiple tasks; decisive, with strong leadership qualities, a flexible work schedule, and good communication skills.
Primary Skills
PDI (Kettle), BI/ETL, DWH, Azure – HDInsight
Hadoop Components – HDFS, Hive, Spark, Sqoop, Tez, Ranger, Zeppelin, Ambari, Knox, Kerberos, MapReduce, Hortonworks, Cloudera, Python
Technology: Tools
Cloud: Azure HDInsight
BI/ETL/DW: Pentaho 6/7 (Kettle), Informatica 9.5
Hadoop Services: HDFS, Sqoop, Hive, Spark, Tez, Zeppelin, Ambari, Kerberos, Knox, Ranger
Hadoop Distributions: Hortonworks, Cloudera
RDBMS: PostgreSQL, MS SQL Server 2012, Oracle 11g, MySQL
Programming Skills: Python
Development Environment: Jupyter, Databricks
Data Modeling: Star, Snowflake, Fact Tables, Dimension Tables, Physical and Logical Modeling, Normalization and De-Normalization
Database Development Environments: Oracle, SQL Developer
Operating System: Unix/Linux/Windows
Version Control & Deployment Systems: Git, SVN, Jenkins, Stash
SDLC Model: Agile and Waterfall
Scheduler Tools: CronTab, Control-M
Domain Knowledge: Investment Banking, Finance & Logistics
Software / Applications: PuTTY, WinSCP, MS Office
Other Technologies: Jira, Confluence