Importing Data, Basic
The easiest way to get data from a spreadsheet program like Excel into R is to save it as a .csv file. CSV stands for comma-separated values: each row of the data set is a line of text, and the values in that row are separated by commas, which programs like Excel know how to turn into rows and columns.
Note: the first row of the .csv file should contain the variable names.
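To make the note concrete, here is a minimal sketch of how read.csv (from utils, loaded by default) treats the first row. The file contents and the object names withHeader and noHeader are made up for illustration; the file is written to a temporary location so the example runs anywhere.

```r
# Write a tiny CSV to a temporary file; the first line holds the variable names.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,4.5", "2,3.0"), tmp)

# header = TRUE is the default for read.csv, so the first row becomes the column names.
withHeader <- read.csv(tmp)
names(withHeader)  # "id" "score"

# With header = FALSE the names are read as data and R supplies V1, V2 instead.
noHeader <- read.csv(tmp, header = FALSE)
names(noHeader)    # "V1" "V2"
```

If a file has no header row, passing header = FALSE avoids losing the first observation to the column names.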
If your data is saved in a .csv file on your computer, you can use the read.csv command from the utils package (loaded by default) to open it in R as a data frame. For example, to load a file named myFile.csv:
# Create myData data frame from myFile.csv
myData <- read.csv("myFile.csv")
The foreign package also allows you to import data stored in formats that are 'foreign' to R, such as Stata .dta files.
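As a rough sketch of the foreign package in use, the example below reads a Stata file with read.dta. The data frame, file, and object names are hypothetical; the .dta file is created with write.dta only so that there is something to read. Note that read.dta handles older Stata formats, so files saved by recent Stata versions may need a different package.

```r
library(foreign)

# Hypothetical example data, written out only so there is a .dta file to read.
example <- data.frame(id = c(1, 2, 3), score = c(4.5, 3.0, 5.0))
dtaFile <- tempfile(fileext = ".dta")
write.dta(example, dtaFile)

# read.dta returns a data frame, much as read.csv does for .csv files.
stataData <- read.dta(dtaFile)
str(stataData)
```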
If your .csv file is on the web at a non-secure (http) address, such as your Dropbox Public folder, you can use getURL from the RCurl package to download it as text. Then use read.csv to turn that text into a data frame. For example:
# Load required packages
library(RCurl)
# Create an object for the URL where your data is stored.
url <- "http://myFile.csv"
# Use getURL from RCurl to download the file.
myData <- getURL(url)
# Finally, parse the downloaded text as .csv to create a data frame.
# getURL returns the file contents as a string, so pass it via the text
# argument; read.csv(myData) would treat the string as a file name.
myData <- read.csv(text = myData)
Data stored in .csv files on secure sites like GitHub (these have URLs that start with https) can also be downloaded and turned into R data frames. The process is much the same as for http sites. Here the downloaded text is wrapped in textConnection (from base R), so that read.csv reads it as a connection rather than as a file name. For example:
# Load required packages
library(RCurl)
# Create an object for the URL where your data is stored.
url <- "https://myFile.csv"
# Use getURL from RCurl to download the file.
myData <- getURL(url)
# Finally let R know that the file is in .csv format so that it can create a data frame.
myData <- read.csv(textConnection(myData))
When you use getURL with an https URL you might get an error message like:
Error in function (type, msg, asError = TRUE) :
SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
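There is a simple workaround: add ssl.verifypeer = FALSE to the getURL call. This turns off verification of the server's certificate, so use it only for sources you trust. In this example you would type: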
myData <- getURL(url, ssl.verifypeer = FALSE)
For more details see this blog post.
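The parsing step can be tried without a network connection. In this minimal sketch, csvText is a made-up stand-in for the string that getURL would return; it is parsed once through textConnection and once through the text argument of read.csv.

```r
# Stand-in for the character string that getURL would return (assumed content).
csvText <- "id,score\n1,4.5\n2,3.0\n"

# Option 1: wrap the string in a text connection.
fromConnection <- read.csv(textConnection(csvText))

# Option 2: pass the string through the text argument of read.csv.
fromText <- read.csv(text = csvText)

# Compare the two results; both should hold the same two rows.
all.equal(fromConnection, fromText)
```

Either route keeps read.csv from interpreting the downloaded string as a file name.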