SHAMINE MACWAN
PROFILE
ENTHUSIASTIC PRACTITIONER WITH AN ACADEMIC BACKGROUND IN STATISTICS AND BIG DATA ANALYTICS AND EXPERTISE IN PROGRAMMING, MACHINE LEARNING, DATA SCIENCE, AND NLP, READY TO CONTRIBUTE TO ANY DOMAIN FOR THE ADVANCEMENT OF MY CAREER AND THE BETTERMENT OF THE ORGANIZATION.
SKILLS
- Python
- Machine Learning
- Deep Learning
- Tableau
- Predictive Analysis
- NLP
CERTIFICATIONS
- Practical Time Series Analysis by Coursera (15/07/2020)
- Intro to Time Series Analysis in R by Coursera (10/09/2020)
- Visualizing Citibike Trips with Tableau by Coursera (11/09/2020)
EDUCATION
- MSc. Big Data Analytics, St. Xavier’s College, Mumbai (2019 - 2021), CGPA – 9.65/10
- Bachelor's in Statistics, Bhavans College, Mumbai (2016 - 2019), CGPA – 8.56/10
- HSC, Vivek Vidyalaya Jr. College (2015 - 2016), 70%
- SSC, St. Thomas Academy, Mumbai (2013 - 2014), 85.04%
WORK EXPERIENCE
Nimap Infotech LLP. July’21 – Present
Jr. Machine Learning/NLP Engineer
- Created a network graph of skills and related skills, applied community detection to identify groups of related skills, and found the shortest path for a given skill.
- Applied TF-IDF, SVD, hashing, and binning to attributes to predict salary (regression) and salary range (classification).
- Predicted the next three roles for a current role using MultiOutputClassifier, ClassifierChain, and a neural network, and selected the best model.
- Trained a model with the FastText library to find semantically similar skills and roles.
- Predicted up to 4 roles from technical skills (a multilabel problem) using several classifiers.
- Trained a T5 model to generate a job description for a given role, industry, and set of technical skills.
INTERNSHIP
NAVCARA CONSULTING Feb’21 – June’21
Associate Consultant, Data Science Intern
PROJECT: 30-DAY HOSPITAL RE-ADMISSION
- Applied machine learning techniques to clean, transform, and reduce the data.
- The analysis was done based on three criteria:
- Including the outliers and all the features
- Removing outliers and selecting features using PCA
- Balancing the data using SMOTE (up-sampling)
- Balancing the data using RandomUnderSampler (down-sampling)
PROJECT: CREATING A CHATBOT
- Applied NLP techniques to clean the questions and answers.
- Built the chatbot using an LSTM neural network.
PROJECTS
BUILD A CAT BREED IMAGE CLASSIFICATION MODEL WITH THE INCEPTION CNN ARCHITECTURE (1/07/2021)
- Trained the Inception CNN architecture on a dataset of 10 cat breeds.
- Split the dataset into training and testing sets using “image_dataset_from_directory” and trained the model using the RMSprop optimizer.
DETECTION OF FAKE NEWS VIA NLP (26/08/2020)
- Cleaned the dataset using Python tools and applied tokenization, stopword removal, lemmatization, and TF-IDF vectorization.
- Used Logistic Regression and Random Forest supervised techniques to detect fake news.
- Used LIME to show the contribution of each feature to the prediction for a data sample.
A STUDY OF MOBILE PRICE DATASET USING DIFFERENT CLASSIFICATION TECHNIQUES
- Imputed missing values and used an Extra Trees Classifier to select the top 10 features.
- Predicted the price of a mobile using a neural network and other classification techniques such as Random Forest, Naïve Bayes, K-Nearest Neighbor, and Decision Tree, and selected the model with the highest accuracy and lowest bias.