Blog Archives

Repetitive Q: Reading Multiple Files in the Zip Folder

October 17, 2019

Dear Readers, I keep seeing the same question, both sent to me directly and across various forums: how do you read multiple files from a zip folder, whether they share one separator or use several? Again, let's not compromise on speed. The solution is to use the easycsv package in R, which in turn uses the data.table function "fread". Find below a quick example: library(easycsv) ## Loading required package:...

Read more »

Forecast Stability Guidance for Model Selection

October 2, 2019

In real-world forecasting tasks, we don’t have the luxury of actuals in hand to guide model selection; in such situations, forecast stability can guide us to some extent. Forecast stability, in simple terms, is about how forecasts behave versus other forecasts, and we can measure it with a simple coefficient of variation. This measure also helps us to understand non-randomness...
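
As a rough illustration of the coefficient of variation mentioned above, here is a minimal sketch in Python (not the post's own code). It assumes stability is judged by comparing successive forecasts issued for the same target period; the function name and the numbers are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(forecasts):
    """Return the population standard deviation divided by the absolute mean."""
    avg = mean(forecasts)
    if avg == 0:
        raise ValueError("mean forecast is zero; coefficient of variation is undefined")
    return pstdev(forecasts) / abs(avg)


# Hypothetical forecasts for the same target month,
# issued at four successive forecast origins by two candidate models.
model_a = [100.0, 102.0, 98.0, 100.0]
model_b = [100.0, 130.0, 70.0, 100.0]

cv_a = coefficient_of_variation(model_a)
cv_b = coefficient_of_variation(model_b)

# Under the stated assumption, the lower value marks the more stable model.
more_stable = "model_a" if cv_a < cv_b else "model_b"
print(more_stable, round(cv_a, 4), round(cv_b, 4))
```

A lower value means the forecasts moved less from one revision to the next, so under this assumption model_a would be the more stable candidate.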

Read more »

Tips for R to Python and Vice-Versa seamlessly

March 31, 2019

When we at TATVA AI visit our clients, both data scientists and higher management often ask how we handle Python and R simultaneously for client requests, since there is no universal preference among clients. Though the solution is not straightforward, I suggest exploiting common libraries for quick deployments, such as dfply (Python) and dplyr (R). Below is...

Read more »

Nimble tweak to use specific python version or virtual environment in RStudio

January 1, 2019

Reticulate has made switching between R and Python easy, and is doing its best to serve both worlds of data science. Meanwhile, I noticed that most of my followers or students raised the issue of awkward switching between different Python versions or virtual e...

Read more »

Now "fread" from data.table can read "gz" and "bz2" files directly

October 31, 2018

Dear R Programmers, for those of you who use data.table to read data, the good news is that fread now supports direct reading of compressed formats like "gz" and "bz2". To all my followers and readers: as mentioned several times before, a good way to both save space and read fast is to first save raw files in "gz" format and thereafter read the...

Read more »

Data Summary in One Go

July 3, 2018

Data Description R Code: This function and package have long been pending publication from my side. I expect to release it as a package soon for quick usage; before that, I thought of releasing it for feedback. The function below provides R code for getting data des...

Read more »

Read and write using fst & feather for large data files.

December 19, 2017

For the past few years, I used feather as my favorite option for writing and reading data in R (one reason was its cross-platform compatibility across Julia, Python and R). Recently, however, I observed that its read and write times were not at all effective with large files of size __ 5 GB, and found the fst format to be good...

Read more »

Clean or shorten Column names while importing the data itself

August 29, 2017

When it comes to clumsy column headers, namely wide ones with spaces and special characters, I see many people panic and change the headers in the source file, which is an awkward option given the variety of alternatives that exist in R for handling them. One easy way to handle such scenarios is library(janitor), which, as the name suggests, can...

Read more »

Hard-nosed Indian Data Scientist Gospel Series – Part 1 : Incertitude around Tools and Technologies

August 23, 2017

Before the recession, a commercial tool was popular in the country, so there was not much uncertainty around tools and technology; after the recession, however, incertitude (i.e. uncertainty) around tools and technology has preoccupied, and keeps occupying, data sc...

Read more »

Big Data Insights: Tale of IT Investments and Returns

July 11, 2016

Once again, this post brings the audience a predictive analytical insight from huge volumes of information technology security data belonging to two Fortune 500 companies (with more or less similar characteristics). Going to a quick backgro...

Read more »
