Blog Archives

Survey: What Degree is Best for Data Science?

February 21, 2020

TL;DR: Just answer 4 questions about the best degree for Data Science here: https://www.surveymonkey.com/r/7FGGWS7 No doubt, when asking the question "What's the best degree for Data Science?" one shouldn't expect a unified opinion, or even just a few (unless everything I know about people practicing data science is all wrong). Stephanie Glen analyzed various sources on the topic to show just that: Source: Best Degree...

H2O.ai Academic Program for Professors and Students: Quick Start with Driverless AI and Paperspace

February 11, 2020

If you are a professor teaching, or a student enrolled in, a machine learning program or a non-technical program with a hands-on machine learning lab, becoming a member of the H2O.ai Academic Program will get you a free software license for non-commercial education and research use. Since November 2018 H2O.ai (my employer) has made its ground-breaking...

How H2O propels data scientists ahead of itself: enhancing Driverless AI with advanced options, recipes and visualizations

December 14, 2019

H2O engineers continually innovate and implement the latest techniques by following and adopting the latest research, working on cutting-edge use cases, and participating in and winning machine learning competitions like Kaggle. But thanks to the explosion of AI research and applications, even the most advanced automated machine learning platforms like H2O.ai Driverless AI cannot come with all the bells and whistles to...

Finally, You Can Plot H2O Decision Trees in R

December 25, 2018

Creating and plotting decision trees (like the one below) for models created in H2O is the main objective of this post. Figure 1. Decision Tree Visualization in R. Decision Trees with H2O: With release 3.22.0.1, H2O-3 (a.k.a. open source H2O or simply H2O) added to its family of tree-based algorithms (which already included DRF, GBM, and XGBoost) support for one more: Isolation...

Surviving Shelter: Analysis of Time Spent and Outcome in Dallas Animal Shelters

April 1, 2018

In the previous post we discovered Dallas Animal Services data sources (available on Dallas Open Data) and analyzed how animals get admitted to and discharged from the city shelters. We loaded actual shelter records and looked at the types of admittance, the different outcomes, and their relationships. In this post we continue this analysis by focusing on the time animals spend...

Dallas Animal Services: Shelter Intake Types vs. Outcomes Analysis

August 4, 2017

Thanks to Dallas OpenData, anyone has access to the city animal shelter records. If you lost or found a pet, it could be that he or she spent some time in a shelter - I personally took lost dogs there. It's unfortunate, but every year tens of thousands of animals find their way to shelters, with a significant fraction never finding a way out. City...

The Role of Small Data and Vacation Recap Example

July 5, 2017

Wikipedia defines small data as 'small' enough for human comprehension, but then it goes further by qualifying it as data in a volume and format that makes it accessible, informative and actionable. I am not certain the latter is always true: a smaller footprint doesn't automatically qualify data as informative and actionable without more work. In my book small data usually scales to...

Logarithmic Scale Explained with U.S. Trade Balance

June 23, 2017

Skewed data prevail in real life. Unless you observe trivial or near-constant processes, data are skewed one way or another due to outliers, long tails, errors or something else. Such effects create problems in visualizations when a few data elements are much larger than the rest. Consider the U.S. 2016 merchandise trade partner balances data set, where each point is a country...

MapReduce in Two Modern Paintings

May 25, 2017

Two years ago we had a rare family outing to the Dallas Museum of Art (my son is a teenager and he's into sports, after all). It had an excellent exhibition of modern art, and the DMA allowed taking pictures. Two hours and dozens of pictures later my weekend was over, but thanks to Google Photos I just...

Correlation Primer with Aster and R

December 20, 2016

Calculating correlations is often the starting point before more advanced analytical steps take place. Big data (long data) always presents computational challenges of both scale and distributed nature. In turn, these may be aggravated by the presence of a large number of features (wide data). But the challenges do not stop there, as complex relationships require analysis of correlations across subsets and groups....
