PROFESSIONAL EXPERIENCE

Data Scientist
Citicorp Services India Private Limited, Mumbai
Feb 2022 - Present

Project-1: Risk and Control Assessment
• Objective: Develop an NLP model to address discrepancies between the ratings and the reviews that managers give to their reporting employees.
• Dataset: Consisted of two columns (Rating and Review) with no null values.
• Featurization: Extracted features such as word match counts and percentages of sentiment words. Used a Hugging Face BERT model to identify negation, and finalized 19 features.
• Model: Tested multiple algorithms, including Logistic Regression, Random Forest, and Naïve Bayes. Random Forest achieved the highest accuracy, 0.81, after hyperparameter tuning.
• Results: Evaluated the model with accuracy, F1, precision, and recall. It achieved an accuracy of 0.81 on the test dataset.

Project-2: Yearly Attrition Prediction Model
• Objective: Identify the factors contributing to attrition and understand why employees are leaving the organization. Develop strategies to address these factors and reduce attrition.
• Dataset: Includes Compensation, Ratings, Promotion, VOE survey, and GDP data.
• Featurization: Engineered features from factors such as recent promotions, rating changes, and appraisal differences.
• Model: Explored multiple algorithms, including Logistic Regression, Random Forest, XGBoost, and Naïve Bayes. After hyperparameter tuning, Random Forest achieved the highest recall, 0.78.
• Results: Evaluated the model with accuracy, F1, precision, and recall. It achieved a recall of 0.78 on the test dataset, indicating that it captures most attrition cases.

Senior Data Engineer
L&T, Pune
Mar 2021 - Mar 2022

Project: Recommendation System | Client - Vesako, USA
• Objective: Accurately predict each user's first travel destination from the provided training dataset, and provide five additional probable choices based on the available data.
• Analysis: Selected relevant columns, conducted univariate and bivariate analysis, imputed missing values, and engineered useful features.
• Models Built: Explored multiple algorithms, including Logistic Regression, Random Forest, XGBoost, and CatBoost. After hyperparameter tuning, CatBoost achieved the highest NDCG score, 0.85.
• Results: Achieved an NDCG score of 0.88, indicating effective prediction of first travel destinations.
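
For readers unfamiliar with the NDCG metric cited above, the following is a minimal sketch of how it can be computed when each user has exactly one true first destination and the model returns a ranked list of candidates. It is an illustration under assumptions, not the project's actual evaluation code: the function names, the cutoff k=5, and the destination codes are all hypothetical.

```python
import math


def ndcg_at_k(ranked_predictions, true_label, k=5):
    """NDCG@k for one user with exactly one relevant item.

    With a single relevant item the ideal DCG is 1, so the score
    reduces to 1 / log2(rank + 1) when the true label appears in
    the top k, and 0 otherwise. The cutoff k=5 is an assumption.
    """
    for position, label in enumerate(ranked_predictions[:k], start=1):
        if label == true_label:
            return 1.0 / math.log2(position + 1)
    return 0.0


def mean_ndcg(all_predictions, all_true_labels, k=5):
    """Average NDCG@k over all users."""
    scores = [
        ndcg_at_k(preds, truth, k)
        for preds, truth in zip(all_predictions, all_true_labels)
    ]
    return sum(scores) / len(scores)


# Illustrative data only: two users, hypothetical destination codes.
predictions = [["US", "FR", "IT"], ["FR", "US", "IT"]]
truths = ["US", "US"]
score = mean_ndcg(predictions, truths)
```

A correct top-ranked prediction scores 1.0, and the score decays logarithmically as the true destination appears lower in the list, which is why the metric rewards both the first guess and the quality of the additional choices.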