data types are integer, numeric (real numbers), logical (TRUE or FALSE), and character (alphanumeric strings)
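A quick illustration of the four basic types, using made-up values (the variable names are only for this sketch); class reports how R stores each one:

```r
# One value of each basic type; the names are hypothetical.
n_int <- 5L        # integer (the L suffix requests integer storage)
n_num <- 5.2       # numeric (a real number)
flag  <- TRUE      # logical
label <- "site A"  # character

types <- c(class(n_int), class(n_num), class(flag), class(label))
print(types)
```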
data frame is a table of data that combines vectors (columns) of different types (e.g. character, factor, and numeric data). It is a hybrid of two simpler data structures: lists, which can mix arbitrary types of data but have no other structure, and matrices, which have rows and columns but usually contain only one data type (typically numeric).
Organization or shape
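As a small sketch with invented data (the column names site, habitat, and count are assumptions for illustration), a data frame can hold one column of each type while still behaving like both a list and a table:

```r
# Hypothetical example data: three columns of different types.
d <- data.frame(
  site    = c("A", "B", "C"),                # character
  habitat = factor(c("wet", "dry", "wet")),  # factor
  count   = c(12, 7, 30),                    # numeric
  stringsAsFactors = FALSE                   # keep site as character
)

col_classes <- sapply(d, class)  # one class per column
print(col_classes)

is_list <- is.list(d)  # list-like: columns may differ in type
shape   <- dim(d)      # matrix-like: rows and columns
```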
stack and unstack are simple but basic functions: stack converts from wide to long format and unstack from long to wide; they aren't very flexible.
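A minimal sketch of the round trip, using a small invented wide table (the column names A, B, and C are hypothetical): stack puts all the values in one column with a second column recording which original column each came from, and unstack reverses that.

```r
# Hypothetical wide data: one column per group.
wide <- data.frame(A = c(1, 2), B = c(3, 4), C = c(5, 6))

# Wide to long: columns 'values' and 'ind' (the source column, as a factor).
long <- stack(wide)
print(long)

# Long to wide: by default unstack splits 'values' by 'ind'.
wide2 <- unstack(long)
print(wide2)
```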
reshape is very flexible and preserves more information than stack/unstack, but its syntax is tricky: if long and wide are variables holding the data in the examples above, then
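A hedged sketch of reshape in both directions, using a small invented data set rather than the original examples (the columns id, time, and x are assumptions for illustration):

```r
# Hypothetical long data: one row per (id, time) combination.
long <- data.frame(
  id   = c(1, 1, 2, 2),
  time = c(1, 2, 1, 2),
  x    = c(5, 7, 6, 8)
)

# Long to wide: one row per id, one x column per time (x.1, x.2).
wide <- reshape(long, direction = "wide",
                idvar = "id", timevar = "time", v.names = "x")
print(wide)

# Wide to long: 'varying' names the columns to stack; with the default
# sep = "." reshape splits "x.1" into variable "x" and time 1.
long2 <- reshape(wide, direction = "long",
                 varying = c("x.1", "x.2"), idvar = "id")
print(long2)
```

Unlike stack, the id column survives both conversions, which is the sense in which reshape preserves more information.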
library(reshape): melt, cast, and recast functions, which are similar to reshape but sometimes easier to use
Checking
Is there the right number of observations overall? Is there the right number of observations in each level for factors?
Do the summaries of the numeric variables — mean, median, etc. — look reasonable? Are the minimum and maximum values about what you expected?
Are there reasonable numbers of NAs in each column? If not (especially if you have extra mostly-NA columns), you may want to go back a few steps and look at using count.fields or fill=FALSE to identify rows with extra fields.
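The checks above can be sketched as follows. The data frame and the temporary file are invented for illustration, and the comma separator is an assumption about the input format:

```r
# Hypothetical data with a factor, a numeric column, and a mostly-NA column.
d <- data.frame(
  habitat = factor(c("wet", "dry", "wet", "dry")),
  count   = c(12, NA, 30, 7),
  extra   = c(NA, NA, NA, 1)
)

n_obs     <- nrow(d)             # observations overall
per_level <- table(d$habitat)    # observations in each factor level
print(summary(d$count))          # min, quartiles, mean, max, NA count
na_counts <- colSums(is.na(d))   # number of NAs in each column

# count.fields() gives the number of fields on each line of a text file,
# so lines whose count differs from the header have extra (or missing) fields.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5,6,7"), tmp)
n_fields <- count.fields(tmp, sep = ",")
bad_rows <- which(n_fields != n_fields[1])
unlink(tmp)
print(bad_rows)
```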
str: tells you about the structure of an R variable
class: prints out the class (numeric, factor, Date, logical, etc.) of a variable.
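A short sketch of both functions on an invented data frame (the column names when, habitat, and count are hypothetical):

```r
# Hypothetical data with a Date, a factor, and a numeric column.
d <- data.frame(
  when    = as.Date(c("2020-01-01", "2020-01-02")),
  habitat = factor(c("wet", "dry")),
  count   = c(12, 7)
)

str(d)                       # compact overview: dimensions, column types, first values
obj_class <- class(d)        # class of the whole object
classes   <- sapply(d, class)  # class of each column
print(classes)
```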