Snehal Ilawe


Data Scientist | Data Analyst | Data Engineer | Software Developer

View My LinkedIn Profile

View My GitHub Profile

About me

My name is Snehal Ilawe. I am a final-year Master's student in the Big Data Analytics program at San Diego State University with a GPA of 3.92. I currently work as a Graduate Assistant with Dr. Martina Musteen at the San Diego State University Research Foundation, analyzing data (using Python and Tableau) on cross-border acquisitions in 180 countries. The goal is to gain deeper insight into the factors affecting the completion of such acquisition deals and to discover unusual patterns and outliers that highlight areas needing investigation. I also work as a Graduate Research Assistant in the Center for Climate and Sustainability Studies under Dr. Samuel Shen, designing and developing climatology plots to visualize trends and gain insights into historical climate data (https://4dvd.sdsu.edu/).

I also have 4 years of experience as a Software Developer at Amdocs, India. I was part of numerous cross-functional teams and was involved in designing, developing, and testing various functionalities for Amdocs Billing Ensemble using C, C++, UNIX, Linux, SQL, and shell scripting.

I am interested in developing and utilizing computational tools and machine learning algorithms to study and understand complex human, biological, and technical systems and processes.

Projects

Skill Recommendation System

Built a content-based recommendation engine (TF-IDF) that helps students identify the in-demand skills needed to excel in their career of choice. Created ETL pipelines to aggregate job postings data for California by web scraping (Beautiful Soup) various job portals. Extracted skills from job postings using word embeddings plus a CNN and Named Entity Recognition (spaCy). Imputed missing values such as salaries using KNN. Developed the front-end visualization as an R Shiny app.



4DVD (4-Dimensional Visual Delivery of Big Climate Data)

4-Dimensional Visual Delivery of Big Climate Data, or 4DVD, is a digital technology that can quickly and easily visualize and deliver any kind of space-time data, from air temperature and precipitation to wind speed and humidity.

Publication:

4DVD visualization and delivery of the 20th century reanalysis data: methods and examples



IPL Match Prediction

Analyzed IPL match data from the last 10 years. Performed feature engineering to generate team performance indicators. Applied Random Forests, Logistic Regression, and Neural Networks to predict match winners. Containerized the project using Docker.



Abandonment analysis of cross-border acquisitions and mergers

Created ETL pipelines in Python to optimize the extraction of worldwide company acquisition data along with other financial data. Performed analysis in Tableau to highlight outliers and unusual patterns affecting the completion of company acquisitions.


Additional Projects


Page template forked from evanca