Some of the functions used in this section are described below.
read.table(). If the data set for a project is not small, it is most convenient to
enter the data into R from a tabular data file in which each row corresponds to an individual and
columns contain various measurements associated with each individual. These files must be plain text
(not created by a document processor such as Word). If the data comes from a database or
spreadsheet, the simplest way to have R read the data is to have the database or spreadsheet
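export the data into a comma-separated values (csv) file. An example is given by the file crabs.csv. The first line of this file holds the column names, so header=TRUE is used, and because the values on each line are separated by commas, the argument
sep="," is needed for the crabs data file. The following R code reads the file into a data frame named Crabs.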
Crabs = read.table("http://www.utdallas.edu/~ammann/stat3355scripts/crabs.csv",header=TRUE,sep=",")
Note that the first two columns, named Species and Sex, respectively, contain strings, not numeric values. In R versions before 4.0.0, read.table() treats such columns as categorical variables and automatically converts each of them to a factor. Starting with R 4.0.0, strings are kept as character vectors by default, so the argument stringsAsFactors=TRUE must be added to the call to get the same conversion. The unique values of a factor are referred to as its levels. The levels of Species are B and O (for blue and orange), and the levels of Sex are M and F.
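The conversion of string columns to factors can be checked on a small example. The sketch below is only an illustration: the text in csv_text is a made-up miniature file with the same kind of layout as the crabs data, not the real data, and the name toy is hypothetical. It uses the text argument of read.table() so that no file or network access is needed, and it passes stringsAsFactors=TRUE explicitly so the result does not depend on the R version.

```r
# Hypothetical miniature csv content (invented values, not the real crabs data).
csv_text = "Species,Sex,FL
B,M,8.1
B,F,7.2
O,M,9.1
O,F,10.7"

# header=TRUE: the first line holds the column names.
# sep=",": the values on each line are separated by commas.
# stringsAsFactors=TRUE: convert the string columns to factors.
toy = read.table(text = csv_text, header = TRUE, sep = ",",
                 stringsAsFactors = TRUE)

str(toy)              # shows the type of each column
levels(toy$Species)   # "B" "O"
levels(toy$Sex)       # "F" "M" (levels are listed in sorted order)
```

Note that R reports the levels in sorted order, regardless of the order in which the values first appear in the file.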
A particular column of a data frame can be accessed by the name of the data frame, followed by a dollar sign, followed by the name of the column. For example, Crabs$FL refers to the column named FL within the Crabs data frame. We can obtain a histogram of that column by
hist(Crabs$FL)
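Column access with the dollar sign, and a histogram of the extracted column, can be tried on a small data frame built by hand. This is a minimal sketch under stated assumptions: the data frame toy and its values are invented for illustration and are not taken from the crabs file, and the title and axis label are arbitrary choices.

```r
# Hypothetical toy data frame; the values are invented for illustration only.
toy = data.frame(Species = factor(c("B", "B", "O", "O", "O")),
                 FL = c(8.1, 9.4, 10.2, 11.7, 12.3))

# The dollar sign extracts one column as an ordinary vector.
fl = toy$FL
length(fl)    # one value per row of the data frame
mean(fl)

# hist() draws the histogram and invisibly returns the bin information,
# which can be kept for later inspection.
h = hist(toy$FL, main = "Histogram of FL", xlab = "FL")
h$counts      # number of observations in each bin
```

When this is run non-interactively with Rscript, the plot is written to a file named Rplots.pdf instead of being shown on screen.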
contains data related to smoking and cancer rates by state for 2010.
Some of the graphical tools available in R are illustrated in the script file