Site icon R-bloggers

winedApp: Wine Recommendation and Data Analysis Web App

[This article was first published on R – NYC Data Science Academy Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

About 

Don’t wind up with the wrong wine at the wrong time: unwind with the world’s best!

winedApp provides insights into prices, ratings, descriptions, and geographic distribution of the world’s most esteemed wines. Novice or connoisseur, consumer or seller, this app will meet your oenophile needs.

In the Wine Explorer, you can enter your location, varietal, aroma, taste, and price range preferences, and retrieve information on compatible wines. 

The Global Insights feature offers map visualizations of international wine trends.

Graphs and Charts provides additional lenses into relationships amongst countries of origin, varietals, prices per bottle, and ratings.

The data was sourced from ca. 36,000 wine reviews on the WineEnthusiast site. In this dataset, 145 varietals from 36 countries and 23 US states are represented. 

Further information was extracted from the Wikipedia list of grape varieties

General Trends 

  1. Wine prices (per bottle) and points awarded do not show a strong positive correlation (for certain countries, there is even a negative correlation). 
  2. The US, Italy, and France are by far the most represented in the dataset. 
  3.  The most represented varietals are Chardonnay (10, 996 entries) and Cabernet Sauvignon (9,058 entries). 
  4. Most wines fall within the range of $4-50 per bottle, but the distribution is right-skewed. The full price range is $4-2,013. 
  5. Varietals vary considerably with respect to point and price ranges, as well as country distribution. 
  6. There is a statistically significant difference between average prices per bottle for red vs. white wines ($42.47 and $30.51, respectively, with p << 0.05). 
  7. There is also a statistically significant difference regarding average point values for reds vs. whites (88.43 and 88.29, respectively, with p << 0.05, on an 80-100 point scale). 

Future Work

Features will be added and refined on a continuous basis. Any suggestions are welcome.

Current objectives include: 

  1. Testing the app on a larger dataset; 
  2. Enabling the user to determine if a given wine is available in their local area; 
  3. Building a price predictor. 

Technical Details 

Web scraping was completed using the CRAN rvest package.  The list of descriptive keywords featured in the Wine Explorer menu item was generated using the nltk (Natural Language Toolkit) in Python. All other content was produced via the R Shiny Dashboard library and associated data visualization packages. To view the app code, please visit this Github repository

External Links

App || Project Github || Author’s LinkedIn 

To leave a comment for the author, please follow the link and comment on their blog: R – NYC Data Science Academy Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.