Profile
Summary: Data scientist with 5 years of experience applying predictive modelling, data processing, and data mining algorithms to solve challenging business problems. Strong background in programming, with working knowledge of a range of machine learning and natural language processing techniques.
- Over 6 years of overall experience in the IT industry.
- Experience in Machine Learning and NLP.
- Presently associated with Innova Solutions as a Data Scientist.
- Programming Languages and Libraries: Python, with Pandas and scikit-learn for machine learning
- Cloud Services: AWS
- Databases: SQL, MongoDB
- Platforms and Misc: PyCharm, Visual Studio Code, Anaconda, Jupyter Notebook
- Predictive analytics using Linear Regression, Logistic Regression, Multiple Linear Regression
- Worked with machine learning algorithms such as Decision Tree, Random Forest, and K-Means Clustering.
- Experience with NLP tools and models such as spaCy, RNN, LSTM, and Bi-LSTM.
- Knowledge of Spark for data processing and of databases such as MongoDB and MySQL.
- Good interpersonal skills, put to use in coordinating with project teams and providing customized software solutions.
- Team player with effective communication skills and proven abilities in resolving complex issues.
Employment Scan
22-April-2020 to Current Data Scientist
17-October-2016 to 16-April-2020 Software Developer
Projects:
Project1: Payment Processing
Technical Environment: Python, NumPy, pandas, Matplotlib, seaborn, SciPy, scikit-learn, and TensorFlow.
Description:
The payment solution helps users analyze and create data stories from large volumes of data drawn from various sources and channels. It aims to simplify and connect the entire payments world by creating a dynamic, evolving solution and providing tools for proactive collaboration among all parties.
Roles & Responsibilities
- Understanding business objectives and developing models that help to achieve them, along with metrics to track their progress
- Analyzing the machine learning algorithms and predictive modeling approaches that could be used to solve a given problem, and ranking them by their probability of success
- Using Python libraries (pandas, NumPy, Matplotlib) to load the data into the working environment
- Performed exploratory data analysis to extract insights from the data.
- Extracted numerical features from the Description variable using NLP techniques (tokenization, bag of words, lemmatization, TF-IDF).
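
The feature extraction named in the last bullet can be illustrated with a minimal sketch. It uses only the Python standard library. The sample descriptions, the function names, and the smoothed IDF formula are assumptions made for illustration and are not the project's actual code. Lemmatization is left out because it needs a language resource such as NLTK or spaCy.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(docs):
    """Return one token-count Counter per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Return one {token: weight} dict per document.

    TF is the token count divided by the document length. IDF uses the
    smoothed form log((1 + N) / (1 + df)) + 1, an assumed variant.
    """
    bags = bag_of_words(docs)
    n_docs = len(bags)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(token for bag in bags for token in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            token: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[token])) + 1)
            for token, count in bag.items()
        })
    return weights


# Hypothetical transaction descriptions, invented for this example.
descriptions = [
    "card payment declined at merchant",
    "card payment approved",
    "refund issued to customer",
]
features = tf_idf(descriptions)
```

In this sketch, tokens that appear in fewer descriptions receive a higher weight than tokens shared across many, which is what makes TF-IDF useful as a numerical feature.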
Project2: Text Modeling
Technical Environment: Python, Pandas, NLTK, Pytesseract, Bi-LSTM, LSTM, LayoutLM
Description:
This project shows how to find and label key fields of receipts and other similar documents such as invoices or request forms. Documents like these have a visual arrangement of tokens in addition to the token content itself. The arrangement presents an opportunity to use the position of tokens on a document as an additional input.
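
As a rough illustration of using token position as an input, the sketch below scales OCR bounding boxes to a 0-1000 grid, the convention the Hugging Face LayoutLM documentation describes for box inputs. The function name, the page size, and the sample tokens are hypothetical and are not taken from the project.

```python
def normalize_box(box, page_width, page_height):
    """Scale a pixel box (x0, y0, x1, y1) to a 0-1000 coordinate grid."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]


# Hypothetical OCR output for a 500 x 1000 pixel receipt image.
ocr_tokens = [
    ("TOTAL", (50, 100, 150, 200)),
    ("42.00", (250, 100, 400, 200)),
]
model_inputs = [
    (text, normalize_box(box, page_width=500, page_height=1000))
    for text, box in ocr_tokens
]
```

Each token is then paired with its normalized box, so the model sees both what the token says and where it sits on the page.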
Roles & Responsibilities
- Built an information extraction model using text modelling.
- Extracted text from images using different OCR Techniques.
- Manually labeled all observations for model training.
- Applied data preprocessing techniques to make the data usable for the model.
- Processed hundreds of documents for creating custom training data.
- Fine-tuned the pretrained LayoutLM base uncased model on the training data.
- Translated product requirements into analytical requirements and specifications, then designed and developed the required functionality.
Project3: PDD Model
Technical Environment: Python, Pandas, Logistic Regression, Random Forest, and SQL Server
Description:
PDD involves calling or sending SMS in advance to the non-delinquent pool of customers who are at High or Very High risk of bounce. This activity helps the business identify High and Very High risk customers and apply business strategies accordingly to avoid delinquency.
Roles & Responsibilities
- Understanding business objectives and developing models that help to achieve them.
- Analyzing the machine learning algorithms that could be used to solve a given problem and ranking them by their probability of success.
- Worked on Exploratory Data Analysis to bring insights from data.
- Worked on feature selection, training, and hyperparameter tuning.
- Evaluated model metrics and improved the model based on that analysis.
- Fetching the customer base according to their cycle date from SSMS.
- Loading the data into a Jupyter notebook using the pandas package.
- Using PySpark to perform actions on the dataset.
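
The risk banding described for this project can be sketched as follows, assuming a trained classifier has already produced a bounce probability per customer. The band edges, the Low and Medium band names, and the customer IDs are hypothetical. Only the High and Very High bands come from the project description.

```python
from bisect import bisect_right

# Hypothetical band edges on the predicted bounce probability.
BAND_EDGES = [0.25, 0.50, 0.75]
BAND_NAMES = ["Low", "Medium", "High", "Very High"]


def risk_band(probability):
    """Map a bounce probability in [0, 1] to a risk band name."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return BAND_NAMES[bisect_right(BAND_EDGES, probability)]


def customers_to_contact(scores):
    """Return the sorted IDs of customers in the High or Very High band."""
    return sorted(
        customer_id
        for customer_id, probability in scores.items()
        if risk_band(probability) in {"High", "Very High"}
    )


# Hypothetical model scores keyed by customer ID.
scores = {"C001": 0.10, "C002": 0.62, "C003": 0.91, "C004": 0.40}
contact_list = customers_to_contact(scores)
```

Keeping the band edges in one list makes it easy to adjust the thresholds as the business strategy changes, without touching the selection logic.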