Posts Tagged ‘database’

Genome annotation with NCBI2R

November 18, 2012

It's very convenient to manage data with R: you can import your dataset, find packages that meet your needs, and then plot your results. However, retrieving the data from online databases can be very bothersome. …

Association Rule Learning and the Apriori Algorithm

September 26, 2012

Association Rule Learning (also called Association Rule Mining) is a common technique used to find associations between many variables. It is often used by grocery stores, retailers, and anyone with a large transactional database. It’s the same way that Target knows you’re pregnant or, when you’re buying an item on Amazon.com, they know what else you want
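To make the idea concrete, here is a minimal sketch of Apriori-style frequent itemset mining. It is written in plain Python rather than the R code the full post presumably uses, and the basket data, function names, and the 0.6 support threshold are all assumptions made up for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions (not taken from the post).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def apriori(transactions, min_support):
    """Return {frozenset: support} for all itemsets meeting min_support."""
    items = {item for t in transactions for item in t}
    current = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that are frequent enough.
        level = {}
        for c in current:
            s = support(c, transactions)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset must itself be frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from the transactions."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


if __name__ == "__main__":
    for itemset, s in sorted(apriori(TRANSACTIONS, 0.6).items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), s)
```

In this toy data every basket containing beer also contains diapers, so `confidence(frozenset({"beer"}), frozenset({"diapers"}), TRANSACTIONS)` comes out as 1.0, which is the kind of rule a retailer would act on.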

Data Frames and Transactions

September 24, 2012

Transactions are a very useful tool when dealing with data mining. They provide a way to mine itemsets or rules on datasets. In R the data must be in transactions form. If the data is only available in a data.frame, then to coerce the data frame to transactions the researcher may use the

Using R to connect to a SQL Server and MySQL Database using MS Windows

September 8, 2012

Connecting to MySQL and Microsoft SQL Server: Connecting to a MySQL database or MS SQL Server from the R environment can be extremely useful. It allows a researcher direct access to the data without having to first export it from a database and then import it from a CSV file or entering it directly into

Databases (SQL, noSQL); Interfacing R with Excel

December 15, 2010

Los Angeles R users group Dec. 14, 2010 meeting (see meetup info here):
1. A SQL primer for R users – Neal Fultz
   Video and slides will be available soon
2. R Database Access – Shrikrishna Bhogaonker
3. NoSQL data …

GEO database: curation lagging behind submission?

August 30, 2010

I was reading an old post that describes GEOmetadb, a downloadable database containing metadata from the GEO database. We had a brief discussion in the comments about the growth in GSE records (user-submitted) versus GDS records (curated datasets) over time. Below, some quick and dirty R code to examine the issue, using the Bioconductor GEOmetadb

Samples per series/dataset in the NCBI GEO database

January 7, 2010

Andrew asks: I want to get an NCBI GEO report showing the number of samples per series or data set. Short of downloading all of GEO, anyone know how to do this? Is there a table of just metadata hidden somewhere? At work, we joke that GEO is the only database where data goes in,

Choosing an SQL Engine for Analytics

March 9, 2009

I’ve been struggling for a while over which database to use for my working data. I used to use MS Access quite a lot. The problems with MS Access include, but are not limited to: a 2 GB file size limit, at least historically; versions change with each edition of MS Office; sort of tough to write SQL scripts; very
