R Programming Course – Assignment 1 : Air Pollution Part 3

Part 1 : pollutantmean()
Part 2 : complete()

Part 3 : corr()

Write a function that takes a directory of data files and a threshold for complete cases and calculates the correlation between sulfate and nitrate for monitor locations where the number of completely observed cases (on all variables) is greater than the threshold. The function should return a vector of correlations for the monitors that meet the threshold requirement. If no monitors meet the threshold requirement, then the function should return a numeric vector of length 0.
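Before looking at the full solution, here is a minimal sketch of the two building blocks the task relies on, complete.cases() and cor(), applied to a single toy data frame. The values are made up for illustration and are not taken from the course dataset.

```r
# Toy monitor data (hypothetical values, not from the course dataset)
monitor <- data.frame(
  sulfate = c(1.0, 2.0, NA, 4.0, 5.0),
  nitrate = c(2.0, 4.0, 6.0, NA, 10.0)
)

# TRUE for rows that have no missing value in any column
good <- complete.cases(monitor)

# Number of completely observed cases; this is what gets compared
# against the threshold
n_complete <- sum(good)

# Correlation between sulfate and nitrate over the complete rows only
complete_rows <- monitor[good, ]
r <- cor(complete_rows$sulfate, complete_rows$nitrate)

# The "numeric vector of length 0" required when no monitor qualifies
empty <- numeric(0)
```

Here rows 3 and 4 are dropped because each contains an NA, so the correlation is computed from the three remaining rows.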

This is @muntasir2165’s code; thanks to him.

corr <- function(directory, threshold = 0) {
  # Full paths of every monitor file in the directory
  filename <- list.files(directory, full.names = TRUE)

  # Start with an empty numeric vector; it is returned unchanged
  # when no monitor meets the threshold
  result <- vector(mode = "numeric", length = 0)

  # seq_along() is safe even when the directory contains no files
  for (i in seq_along(filename)) {
    airquality <- read.csv(filename[i])
    # Keep only the rows with no missing values in any column
    good <- complete.cases(airquality)
    airquality <- airquality[good, ]
    if (nrow(airquality) > threshold) {
      # [[ ]] extracts the column as a vector, whereas airquality["sulfate"]
      # is a one-column data.frame. cor() accepts both, and either form
      # gives the same results on the test cases.
      correlation <- cor(airquality[["sulfate"]], airquality[["nitrate"]])
      result <- append(result, correlation)
    }
  }
  result
}
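
To try the idea without the course data, the following self-contained sketch builds a throwaway directory with two tiny CSV files and runs a compact variant of the function on it. The name corr_sketch, the file names, and all values are hypothetical and only for illustration; it uses lapply() instead of a loop, which is a stylistic alternative and not the code from the post.

```r
# Compact variant of corr() using lapply(); hypothetical name, same logic
corr_sketch <- function(directory, threshold = 0) {
  files <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  cors <- lapply(files, function(f) {
    d <- read.csv(f)
    d <- d[complete.cases(d), ]
    # Monitors at or below the threshold contribute nothing
    if (nrow(d) > threshold) cor(d$sulfate, d$nitrate) else numeric(0)
  })
  # as.numeric() turns the NULL from an empty list into numeric(0)
  as.numeric(unlist(cors))
}

# Throwaway directory with two made-up monitor files
demo_dir <- tempfile("corr_demo")
dir.create(demo_dir)
write.csv(
  data.frame(sulfate = c(1, 2, 3, NA), nitrate = c(3, 2, 1, 5), ID = 1),
  file.path(demo_dir, "001.csv"), row.names = FALSE
)
write.csv(
  data.frame(sulfate = c(1, NA), nitrate = c(NA, 2), ID = 2),
  file.path(demo_dir, "002.csv"), row.names = FALSE
)

# Monitor 1 has three complete rows, monitor 2 has none
res_default <- corr_sketch(demo_dir)      # only monitor 1 qualifies
res_strict  <- corr_sketch(demo_dir, 5)   # no monitor qualifies
```

With the default threshold only the first file contributes a correlation; with a threshold of 5 the result is a numeric vector of length 0, as the assignment requires.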
