Skip to content

Files

Latest commit

author
chris.kelly
Oct 21, 2015
11b9a57 · Oct 21, 2015

History

History
40 lines (29 loc) · 1.68 KB

README.md

File metadata and controls

40 lines (29 loc) · 1.68 KB
title author date output
README
Chris Kelly
Thursday, October 20, 2015
html_document

Course project

The script run_analysis.R creates a tidy dataset from the dataset described at

http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones,

which dataset is available at:

https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip

The run_analysis.R script is commented, so a detailed description of all the steps isn't needed here. A summary is warranted, however.

Note: the dataset must be unzipped in the "UCI HAR Dataset/" subdirectory of the current working directory in R or RStudio.

The script reads the dataset and combines the test and training data files (6 in all) into one dataset, containing only the 66 mean measurements, discarding the meanFreq variables (since they are derived from other data). The activity codes were replaced with the activity names for clarity. The script writes this as the file: tidyData.txt using write.table().

Also a tidy summary of the dataset is created in tidySummary.txt, also written when write.table(). The summary is in the "wide" format, with one variable in each column. The first column is the subject ID, followed by the activity, followed by 66 columns for the mean variables chosen for the dataset, averaged across all observations of each activity for each subject.

This summary can be read into R or RStudio with:

MyURL <- "http://s3.amazonaws.com/coursera-uploads/user-7f4773206024329704d09eda/975117/asst-3/b091792076da11e5a5e959f85260abc1.txt"
read.table(MyURL, header = TRUE)

The codebook for the dataset is Codebook.html or Codebook.md.