Project 1: Expense Tracker
- Expense tracker app built with the React.js library
- Built using React hooks and Redux
- Deployed to Azure using Azure App Service
Project 2: Shiny App: ODI Cricketers Info
- Shows batting average, strike rate, and number of hundreds for cricket players in One-Day Internationals, in comparison with Sachin Tendulkar
- Filters: minimum number of innings, player country, player name
Project 3: COVID-19 Dashboard
- An R Shiny app built to show interactive visualisations of cumulative COVID-19 cases and the output of three machine learning models.
- The graphs show how COVID-19 cases have increased across the globe over time.
Project 4: House Price Prediction using Advanced Regression
The aim of the project is to find out the following about the prospective properties:
1. Which variables are significant in predicting the price of a house, and
2. How well those variables describe the price of a house
- Feature selection using Recursive Feature Elimination
- Building the Ridge Regression and Lasso Regression models.
- Hyperparameter tuning and selecting the best model based on RMSE and R-squared value.
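The selection step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the model names in the dictionary, and all the numbers are hypothetical, and only the two metrics (RMSE and R-squared) come from the description above.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: the typical size of a prediction error."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(actual, predicted):
    """Share of the variance in `actual` that `predicted` explains."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def best_model(actual, predictions_by_model):
    """Return the name of the model with the lowest RMSE."""
    return min(
        predictions_by_model,
        key=lambda name: rmse(actual, predictions_by_model[name]),
    )

# Hypothetical held-out prices and predictions (illustrative values only).
actual = [200.0, 250.0, 300.0, 350.0]
predictions = {
    "ridge": [210.0, 240.0, 310.0, 340.0],
    "lasso": [220.0, 230.0, 320.0, 330.0],
}

for name, predicted in predictions.items():
    print(name, rmse(actual, predicted), r_squared(actual, predicted))
print("best:", best_model(actual, predictions))
```

A lower RMSE and a higher R-squared both indicate a better fit; when the two disagree, the choice depends on which metric matters more for the use case.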
Project 5: Forecast the number of passengers for the next twelve months
The data covers the number of passengers that have travelled with the airline on a particular route over the past few years. The goal is to use this data to forecast the number of passengers for the next twelve months.
- Capturing the level, trend and seasonality in the time series with different smoothing techniques.
- Building different autoregressive models (ARIMA, ARIMAX, SARIMA, SARIMAX).
- Selecting the best model based on RMSE and MAPE.
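As a rough sketch of the simplest of these smoothing techniques, the snippet below implements simple exponential smoothing and scores its forecast with MAPE. Simple exponential smoothing captures only the level; trend and seasonality need extensions such as Holt's or Holt-Winters' methods. The function names, the `alpha` value, and the passenger counts are all assumptions made for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1},
    seeded with the first observation.
    """
    levels = [series[0]]
    for value in series[1:]:
        levels.append(alpha * value + (1 - alpha) * levels[-1])
    return levels

def mape(actual, forecast):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100 * total / len(actual)

# Hypothetical monthly passenger counts (illustrative values only).
train = [100.0, 110.0, 120.0, 130.0]
test = [125.0, 125.0]

levels = simple_exponential_smoothing(train, alpha=0.5)
# Simple exponential smoothing forecasts a flat line at the last level.
forecast = [levels[-1]] * len(test)

print("levels:", levels)
print("forecast:", forecast)
print("MAPE (%):", mape(test, forecast))
```

The same `mape` function can score any candidate model's forecast on the held-out months, which makes the candidates directly comparable.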
Project 6: Choosing the countries that are in the direst need of aid
The aim of the project is to categorise the countries using socio-economic and health factors and to suggest the countries to focus on the most.
- Using the Hopkins statistic to check whether the data points have a high tendency to cluster.
- Tuning the number of clusters using K-Means clustering.
- Finding the optimal number of clusters using hierarchical clustering.
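The K-Means step can be illustrated with a small pure-Python sketch. It is an assumption-laden toy, not the project's code: the centroids are seeded with the first k points for reproducibility (real implementations usually use random or k-means++ seeding), and the two-dimensional points are hypothetical. Comparing the inertia (within-cluster sum of squares) across values of k is the idea behind the elbow method.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia

# Hypothetical two-feature data with two visible groups (illustrative values only).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

for k in (1, 2, 3):
    labels, centroids, inertia = k_means(points, k)
    print(k, labels, inertia)
```

A sharp drop in inertia followed by a flat stretch suggests the number of clusters at the bend.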
Project 7: Mobile Price Prediction
This study aims to accurately predict which price range a particular mobile falls into by fitting the data to five classifiers (K-Nearest Neighbours, Decision Tree, Random Forest, Naive Bayes, and Support Vector Machine) and identifying the classifier with the highest accuracy.
- A new mobile company has to compete hard against big companies, so it is essential to estimate the price range of mobiles accurately. Hence accuracy was used as the performance metric on the training and test data.
- The confusion matrix shows that SVM predicted the price ranges most accurately, with a misclassification rate of 0.027.
- The average class accuracy stood at 0.973. SVM thus outperformed the other classifiers, in line with our objective.
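The relationship between the two reported figures (0.027 + 0.973 = 1) follows from how accuracy is read off a confusion matrix. The sketch below shows the calculation on a toy example; the four price-range classes and all labels are hypothetical, not the study's data.

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

def accuracy(matrix):
    """Correct predictions sit on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical labels for four price ranges (0 = cheapest, 3 = most expensive).
classes = [0, 1, 2, 3]
actual = [0, 0, 1, 1, 2, 2, 3, 3]
predicted = [0, 0, 1, 2, 2, 2, 3, 3]

matrix = confusion_matrix(actual, predicted, classes)
acc = accuracy(matrix)

print(matrix)
print("accuracy:", acc)
print("misclassification rate:", 1 - acc)
```

Off-diagonal cells show which price ranges get confused with each other, which a single accuracy figure hides.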
Project 8: Lead Scoring Case Study
A company named X Education gets a lot of leads, but its lead conversion rate is very poor. For example, if it acquires 100 leads in a day, only about 30 of them are converted. To make this process more efficient, the company wishes to identify the leads with the most potential, also known as ‘Hot Leads’. If it successfully identifies this set of leads, the lead conversion rate should go up, as the sales team will focus on communicating with the potential leads rather than making calls to everyone.
- Finding the correlation between independent variables and dropping one variable from each highly correlated pair.
- Selecting the important variables for predicting the target using Recursive Feature Elimination.
- Finding the optimal cut-off point between sensitivity and specificity.
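One common way to pick such a cut-off, sketched below, is to scan candidate thresholds and keep the one where sensitivity and specificity are closest. Whether the case study used exactly this criterion is an assumption; the function names, labels, and predicted probabilities are hypothetical.

```python
def sensitivity_specificity(actual, probabilities, cutoff):
    """Classify as converted when probability >= cutoff, then score the result."""
    predicted = [1 if p >= cutoff else 0 for p in probabilities]
    tp = sum(1 for a, y in zip(actual, predicted) if a == 1 and y == 1)
    fn = sum(1 for a, y in zip(actual, predicted) if a == 1 and y == 0)
    tn = sum(1 for a, y in zip(actual, predicted) if a == 0 and y == 0)
    fp = sum(1 for a, y in zip(actual, predicted) if a == 0 and y == 1)
    return tp / (tp + fn), tn / (tn + fp)

def optimal_cutoff(actual, probabilities, cutoffs):
    """Pick the cutoff where sensitivity and specificity are closest."""
    def gap(cutoff):
        sens, spec = sensitivity_specificity(actual, probabilities, cutoff)
        return abs(sens - spec)
    return min(cutoffs, key=gap)

# Hypothetical conversion labels (1 = converted) and model probabilities.
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.7, 0.55, 0.3, 0.6, 0.45, 0.35, 0.2, 0.1]
cutoffs = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

best = optimal_cutoff(actual, probabilities, cutoffs)
print("cutoff:", best)
print("sensitivity, specificity:", sensitivity_specificity(actual, probabilities, best))
```

Lowering the cut-off catches more true conversions (higher sensitivity) at the cost of more wasted calls (lower specificity), so the balance point is a business decision as much as a statistical one.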
Project 9: Supermarket Price Wars
The objective of the investigation is to determine which supermarket, Coles or Woolworths, is cheaper. The sample is gathered from the website https://grocerycop.com.au/products and includes 9 products from each of 10 categories. A large sample of 90 (n > 30) is chosen so that, by the Central Limit Theorem (CLT), normality is not a concern and the standard error is limited.
- A paired-samples t-test is used to check for a statistically significant mean difference between Coles and Woolworths prices.
- The result of the paired-samples t-test shows a statistically significant mean difference between Coles prices and Woolworths prices.
- In conclusion, Woolworths prices are found to be significantly cheaper when compared to Coles prices.
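The test statistic behind a paired-samples t-test can be sketched in a few lines. The store names and prices below are hypothetical stand-ins, not the study's sample; the statistic is the mean of the per-product price differences divided by its standard error, with n - 1 degrees of freedom.

```python
import math
import statistics

def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a paired-samples t-test."""
    differences = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    std_error = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff / std_error, n - 1

# Hypothetical prices for the same five products at two stores.
store_a = [4.5, 3.0, 6.0, 2.5, 5.0]
store_b = [4.0, 3.0, 5.0, 2.0, 4.5]

t_stat, dof = paired_t_statistic(store_a, store_b)
print("t:", t_stat, "degrees of freedom:", dof)
```

A positive t here means the first store is more expensive on average. The p-value comes from comparing t against a t-distribution with the given degrees of freedom; in Python, `scipy.stats.ttest_rel` performs the whole test in one call.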
Project 10: Time Series analysis of Airline passengers using R
A seasonal time series is read and fitted with deterministic and stochastic trend models. A residual approach is followed to fit the stochastic models and find a set of candidate models. Each candidate is checked for significant coefficients, and the models that pass are taken forward to diagnostic checking. Residual analysis is then conducted on each model, and the model with stationary residuals is selected for the forecast.
- Capturing the level, trend and seasonality in the time series with different smoothing techniques.
- Building the Seasonal Autoregressive Integrated Moving Average (SARIMA) model.
- Selecting the best model based on residual analysis.
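One common residual check, sketched below, is whether any autocorrelation is left in a model's residuals: a well-fitted model should leave residuals that look like noise. This is an illustration under assumptions, not the project's R code; the function names and the residual series are hypothetical, and the ±1.96/sqrt(n) band is the usual approximate 95% bound for white noise.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance_sum / variance_sum

def significant_lags(residuals, max_lag):
    """Lags whose autocorrelation falls outside the approximate 95% band."""
    bound = 1.96 / len(residuals) ** 0.5
    return [
        lag
        for lag in range(1, max_lag + 1)
        if abs(autocorrelation(residuals, lag)) > bound
    ]

# Hypothetical residuals with an obvious leftover pattern (illustrative only).
residuals = [1.0, -1.0] * 12

print("lag-1 autocorrelation:", autocorrelation(residuals, 1))
print("significant lags:", significant_lags(residuals, 2))
```

A model whose residuals show no significant lags has captured the structure in the series; significant lags at the seasonal period suggest the seasonal terms need revisiting.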