
Swadesh

Data scientist
Location
Total Views: 179
Shortlist: 0
Member since: 30+ days ago
Candidate Information
  • Experience: 7 years
  • Hourly Rate: $10
  • Availability: Immediate
  • Work From: Offsite
  • Category: Information Technology & Services
  • Last Active On: June 12, 2024
Key Skills
Python, SQL, ETL, machine learning, SSRS, data science, Informatica PowerCenter
Summary

PROFESSIONAL EXPERIENCE

Data Scientist
Citicorp Services India Private Limited, Mumbai
Feb 2022 - Present

Project-1: Risk and Control Assessment

● Objective: Develop an NLP model to address discrepancies between ratings and reviews given by managers to reporting employees.

● Dataset: Consisted of two columns (Rating and Review) with no null values.

● Featurization: Extracted relevant features using several techniques, including word match counts and percentages of sentiment words. Used a Hugging Face BERT model to identify negation, and finalized 19 features.

● Model: Tested multiple algorithms, including Logistic Regression, Random Forest, and Naïve Bayes. Random Forest achieved the highest accuracy, 0.81, after hyperparameter tuning.

● Results: The model achieved an accuracy of 0.81 on the test dataset. Accuracy, F1, precision, and recall were used as evaluation metrics.

Project-2: Yearly Attrition Prediction Model

● Objective: Identify factors contributing to attrition and understand why employees are leaving the organization. Develop strategies to address these factors and reduce attrition.

● Dataset: Includes Compensation, Ratings, Promotion, VOE survey, and GDP data.

● Featurization: Utilized various techniques for feature engineering, incorporating factors like recent promotions, rating changes, and appraisal differences.

● Model: Explored multiple algorithms, including Logistic Regression, Random Forest, XGBoost, and Naïve Bayes. After hyperparameter tuning, Random Forest achieved the highest recall, 0.78.

● Results: Evaluated the model using accuracy, F1 score, precision, and recall. Achieved the highest recall of 0.78 on the test dataset, indicating effective identification of attrition factors.

Senior Data Engineer
L&T, Pune
Mar 2021 - Mar 2022

Project: Recommendation System | Client - Vesako, USA

● Objective: Accurately predict each user's first travel destination using the provided training dataset, and provide five additional probable choices based on the available data.

● Analysis: Selected relevant columns, performed feature engineering, conducted univariate and bivariate analysis, imputed missing values, and extracted useful features.

● Models Built: Explored multiple algorithms, including Logistic Regression, Random Forest, XGBoost, and CatBoost. After hyperparameter tuning, CatBoost achieved the highest NDCG score of 0.85.

● Results: Achieved an impressive NDCG score of 0.88, indicating effective prediction of first travel destinations.
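
For reference, the NDCG metric cited above can be sketched as follows. This is a minimal illustration, assuming each user has exactly one true first destination and the model submits a ranked list of up to five guesses. The function names and sample country codes are hypothetical and are not taken from the project.

```python
import math
from typing import Sequence


def ndcg_at_k(ranked: Sequence[str], actual: str, k: int = 5) -> float:
    """NDCG@k for one user when exactly one destination is correct.

    With a single relevant item the ideal DCG is 1, so the score reduces
    to 1 / log2(position + 1), or 0 if the item is not in the top k.
    """
    for position, destination in enumerate(ranked[:k], start=1):
        if destination == actual:
            return 1.0 / math.log2(position + 1)
    return 0.0


def mean_ndcg(
    predictions: Sequence[Sequence[str]],
    actuals: Sequence[str],
    k: int = 5,
) -> float:
    """Average NDCG@k over all users."""
    if len(predictions) != len(actuals) or not actuals:
        raise ValueError("predictions and actuals must be non-empty and aligned")
    scores = [ndcg_at_k(p, a, k) for p, a in zip(predictions, actuals)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical ranked guesses for two users and their true destinations.
    guesses = [["US", "FR", "IT"], ["FR", "US", "IT"]]
    truth = ["US", "US"]
    print(round(mean_ndcg(guesses, truth), 4))
```

A correct guess in first position scores 1.0, and the score decays logarithmically the further down the list the true destination appears.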
