Restaurant Rating Prediction Capstone Project

Introduction

This project aims to develop a machine learning model that predicts a restaurant's rating from variables such as location, price, and number of reviews. Treating the rating as a proxy for success, such a model could help identify the locations and factors that contribute to a thriving business in the highly competitive restaurant industry.

Data Acquisition

The data for this project was obtained using the Yelp API, specifically the business search endpoint. The following variables were selected for each restaurant:

City
Price
Number of Reviews
Rating
Latitude
Longitude

Approximately 500 restaurants were collected from each of four states (California, Nevada, Utah, and Arizona), for a total of nearly 2,000 restaurants.
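A minimal sketch of how one page of results might be requested and flattened into the six variables listed above. It assumes the Yelp Fusion v3 business search endpoint with a bearer API key; the helper names `fetch_page` and `to_row`, the `categories` filter, and the integer encoding of price are illustrative choices, not taken from the project notebook.

```python
import json
import urllib.parse
import urllib.request

YELP_SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def fetch_page(api_key, location, offset=0, limit=50):
    """Request one page of restaurants for a location (hypothetical helper)."""
    query = urllib.parse.urlencode(
        {
            "location": location,
            "categories": "restaurants",  # assumed filter
            "limit": limit,
            "offset": offset,
        }
    )
    request = urllib.request.Request(
        f"{YELP_SEARCH_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("businesses", [])


def to_row(business):
    """Flatten one business record into the six selected variables."""
    coords = business.get("coordinates") or {}
    location = business.get("location") or {}
    price = business.get("price")  # a string of "$" signs; absent for some businesses
    return {
        "city": location.get("city"),
        "price": len(price) if price else None,  # assumed encoding: "$$" -> 2
        "review_count": business.get("review_count"),
        "rating": business.get("rating"),
        "latitude": coords.get("latitude"),
        "longitude": coords.get("longitude"),
    }
```

Collecting roughly 500 restaurants per state would then mean calling `fetch_page` repeatedly with an increasing `offset` and passing each returned record through `to_row`.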

Exploratory Data Analysis

The initial analysis showed that the distribution of ratings and the correlations between variables were similar across all four states. However, those correlations did not appear strong enough to support a reliable prediction of restaurant ratings.
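One way to check how strongly each variable relates to the rating is the Pearson correlation coefficient. The sketch below is an illustration under assumptions: the row dictionaries and their keys (`price`, `review_count`, `latitude`, `longitude`, `rating`) are hypothetical, and the source does not state which correlation measure was used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when either variable is constant
    return cov / math.sqrt(var_x * var_y)


def correlations_with_rating(
    rows, features=("price", "review_count", "latitude", "longitude")
):
    """Correlate each feature with the rating, skipping rows with missing values."""
    result = {}
    for feature in features:
        pairs = [
            (row[feature], row["rating"])
            for row in rows
            if row.get(feature) is not None and row.get("rating") is not None
        ]
        if len(pairs) < 2:
            result[feature] = float("nan")
            continue
        xs, ys = zip(*pairs)
        result[feature] = pearson(xs, ys)
    return result
```

Running `correlations_with_rating` once per state gives a compact way to compare the states side by side; values close to zero would be consistent with the weak relationships described above.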

Clustering

Two clustering approaches were employed to visualize the distribution of restaurants:

Statewide Clustering: The first attempt focused on displaying the distribution of restaurants across the four states.
City-level Clustering: The second approach narrowed the analysis to restaurants in California and was repeated for each city. The resulting clusters were distinguished primarily by the number of reviews and were color-coded as follows:
- Red: Low number of reviews
- Blue: Medium number of reviews
- Green: High number of reviews
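The color scheme above can be sketched with a small one-dimensional k-means over review counts. This is an assumption: the source does not name the clustering algorithm or the features passed to it, and the function names `kmeans_1d` and `color_by_reviews` are hypothetical.

```python
COLORS = ("red", "blue", "green")  # low, medium, high number of reviews


def kmeans_1d(values, k=3, iterations=100):
    """Lloyd's algorithm in one dimension; returns (centers, labels).

    Assumes k >= 2 and at least k values.
    """
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[(i * (len(ordered) - 1)) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def color_by_reviews(review_counts):
    """Map each restaurant to red, blue, or green by review-count cluster."""
    centers, labels = kmeans_1d(review_counts, k=3)
    # Rank clusters by center so the lowest center is always red.
    order = sorted(range(3), key=lambda c: centers[c])
    rank = {cluster: position for position, cluster in enumerate(order)}
    return [COLORS[rank[label]] for label in labels]
```

The returned colors line up with the input order, so they can be attached directly to map markers for each restaurant in a city.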

Predictive Modeling

Two machine learning algorithms were used to predict the rating of each restaurant:

Multiple Linear Regression: This model was employed to capture the linear relationships between the variables and the restaurant ratings.
Support Vector Machine (SVM): SVM was used as an alternative approach to predict the ratings based on the available features.

However, neither model produced sufficiently accurate predictions, suggesting that the selected variables alone may not be enough to reliably predict restaurant ratings.
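The source does not report which accuracy metric was used. One way to make "not sufficiently accurate" concrete is to compare a model's error on held-out ratings against a baseline that always predicts the mean training rating. The sketch below assumes RMSE and R² as the metrics; the function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination; values near or below 0 mean no better than the mean."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        return float("nan")  # undefined when all true ratings are identical
    return 1.0 - ss_res / ss_tot


def compare_to_baseline(y_train, y_test, y_pred):
    """Report model error next to a predict-the-training-mean baseline."""
    baseline = [sum(y_train) / len(y_train)] * len(y_test)
    return {
        "model_rmse": rmse(y_test, y_pred),
        "baseline_rmse": rmse(y_test, baseline),
        "r2": r_squared(y_test, y_pred),
    }
```

If `model_rmse` is not clearly below `baseline_rmse` for either the regression or the SVM predictions, the features are adding little beyond the average rating.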

Conclusions and Next Steps

The clustering analysis provided useful insight into how restaurants are distributed by price and number of reviews across different states and cities. However, the predictive models did not predict restaurant ratings accurately. This could be attributed to the weak correlations observed during the exploratory analysis and to the limited dataset size.

To improve the predictive performance, the following steps can be considered:

Collect a larger dataset with more restaurants and potentially additional relevant variables.
Partition the data into more equal-sized subsets to test the variables and their impact on the prediction.
Explore and incorporate variables that are more directly related to the problem at hand.
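One reading of the partitioning step above is k-fold cross-validation, where the data is split into folds whose sizes differ by at most one row. The sketch below is an illustration of that reading only; the function names, the default of five folds, and the fixed seed are assumptions.

```python
import random


def k_fold_indices(n_rows, k=5, seed=0):
    """Shuffle row indices and deal them into k near-equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Striding by k deals the shuffled indices out like cards,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_rows, k=5, seed=0):
    """Yield (train_indices, test_indices), using each fold as the test set once."""
    folds = k_fold_indices(n_rows, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Fitting the model on each `train` set and scoring it on the matching `test` set, once with and once without a given variable, gives a rough estimate of that variable's impact on the prediction.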

By refining the data collection process, selecting more relevant variables, and expanding the dataset, the predictive model's accuracy can potentially be enhanced, providing more reliable insights into the factors that contribute to a restaurant's success.
