Description
Sumit-Data Scientist-4+Years
Resume Objective
Data Scientist with 4+ years of experience delivering data-driven solutions that increase the
efficiency, accuracy and utility of internal data processing. Experienced in building regression
models with predictive modelling techniques and in applying data mining algorithms to deliver
insights and implement action-oriented solutions to complex business problems.
Skills
Python, Pandas, NumPy, Matplotlib, Seaborn, Exploratory Data Analysis, Machine Learning,
Linear Regression, Logistic Regression, Decision Tree, Random Forest, SQL, Data Management,
Data Mining, Handling Pressure, Collaboration, Problem Solving, Leadership.
Professional Expertise
Role: Data Scientist
Period: Aug 2021 – July 2022.
Technology Used: Python, Pandas, NumPy, Matplotlib, Seaborn, SQL
The Project Profile:
The client, a leading telecom player, understands that customizing its offerings is essential for
its business to stay competitive. It is currently seeking to leverage behavioural data from more
than 60% of the 50 million mobile devices active daily in India, in order to help its clients
better understand and interact with their audiences.
Contribution:
Loaded the data into a Jupyter notebook from a CSV file and a DB2 database.
Conducted data preparation:
Identified the challenges in the dataset, i.e., outliers and missing values.
Checked the ‘ID’ column for null values.
Used the folium package to plot latitudes and longitudes and check for discrepancies in
the position of points.
Conducted Data Visualization using Matplotlib/Seaborn.
Conducted Data Analysis using Pandas, Sklearn
Univariate Analysis
Bivariate Analysis using correlation and the chi-square test.
Identified the user behaviours that directly impact the company’s offerings
for more than 60% of the 50 million mobile devices active daily in India.
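The null check and bivariate analysis steps above can be sketched as follows. This is a minimal illustration on made-up data, not the project's actual code: the column names (device_type, plan, usage_min, data_mb) and all values are hypothetical, and the use of scipy.stats.chi2_contingency for the chi-square test is an assumption, since the resume does not name the function used.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "device_type": ["A", "A", "B", "B", "A", "B", "A", "B"],
    "plan": ["pre", "pre", "post", "post", "pre", "post", "post", "pre"],
    "usage_min": [10.0, 12.0, 30.0, 28.0, 11.0, None, 25.0, 14.0],
    "data_mb": [100.0, 130.0, 310.0, 290.0, 120.0, 300.0, 260.0, 150.0],
})

# Data preparation: null check on the ID column and missing values per column.
id_nulls = int(df["ID"].isna().sum())
missing_per_column = df.isna().sum()

# Bivariate analysis, numeric vs numeric: Pearson correlation
# (rows with a missing value in either column are skipped).
corr = df["usage_min"].corr(df["data_mb"])

# Bivariate analysis, categorical vs categorical: chi-square test of
# independence on the contingency table of the two columns.
table = pd.crosstab(df["device_type"], df["plan"])
chi2_stat, p_value, dof, expected = chi2_contingency(table)

print(f"Nulls in ID: {id_nulls}")
print(f"Correlation usage_min vs data_mb: {corr:.3f}")
print(f"Chi-square: {chi2_stat:.3f}, p-value: {p_value:.3f}, dof: {dof}")
```

A small p-value would suggest the two categorical columns are not independent; with a sample this small the result is only illustrative.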
Professional Expertise
Client: Westpac Services
Project: Westpac Services
Domain: BFS
Role: Data Scientist
Period: Sept 2018 – June 2021.
Technology Used: Python, Pandas, NumPy, Matplotlib, Seaborn, SQL, Sklearn, Machine learning
The Project Profile
The Hogan suite of products is used by a number of the top US banks, and it supports most of
the core banking functions with a highly integrated suite of systems. It was the first integrated
mainframe banking system developed using middleware and object-oriented development
methodology. DXC's technologically advanced Hogan Systems is designed to meet the core
banking needs of global financial services institutions. HOGAN is an integrated system built on a
single, solid architecture. Application Systems (CIS, IDS, ODS, PAS and CAMS), Umbrella and
Financial Support System (FSS) are the main subsystems of the HOGAN product.
Contribution:
Automated the loan eligibility process, based on the customer details provided in the
application form, using the Decision Tree and Random Forest machine learning
algorithms.
Conducted Data Transformation and Data Preparation using Pandas, NumPy and Sklearn,
including missing value imputation, fixing inconsistencies, and encoding discrete
variables using one-hot encoding or Label Encoder.
Conducted Data Visualization using Matplotlib/Seaborn.
Conducted Data Analysis using Pandas, Sklearn
Univariate Analysis
Bivariate Analysis using correlation and the chi-square test.
Checked the data for multicollinearity among the independent variables.
Checked the independent variables for outliers.
Trained Decision Tree, Logistic Regression and Random Forest models.
Evaluated the models and selected the best fit using the classification report (F1 score)
and accuracy from Sklearn.
Applied the selected model to the test data set to evaluate its accuracy and
classification report.
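The preparation, training and evaluation steps above can be sketched as follows. This is a minimal illustration on synthetic data, not the project's actual code: the feature names (income, loan_amount, employment), the rule that generates the eligibility label, the median imputation strategy and all hyperparameters are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, hypothetical applicant data (not real customer data).
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "loan_amount": rng.normal(20_000, 5_000, n),
    "employment": rng.choice(["salaried", "self_employed"], n),
})
# Made-up eligibility rule, used only to produce a label for the sketch.
y = (X["income"] > 2.2 * X["loan_amount"]).astype(int)
# Blank out some incomes so the imputation step has something to do.
X.loc[rng.choice(n, 20, replace=False), "income"] = np.nan

# Impute and scale numeric columns; one-hot encode the discrete column.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", numeric_steps, ["income", "loan_amount"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["employment"]),
])

models = {
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit each model on the training split and score it on the held-out test split.
scores = {}
for name, model in models.items():
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

# Pick the model with the highest F1 score on the test split.
best = max(scores, key=lambda k: scores[k]["f1"])
print(scores)
print("Best model by F1:", best)
```

Wrapping the preprocessing and the model in one Pipeline keeps the imputation and encoding fitted on the training split only, which avoids leaking test data into the preparation step.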
*********THANKS********