University of North Florida R Programming Assignment

In this activity, you will analyze a large real-world data set of flight records and use the advanced features of ggplot to graph it.

In previous modules, the data used in your projects came from small built-in R data sets or was generated from probability distributions. We are now ready to use real-world, not manufactured, data. The nycflights13 package contains data collected on domestic flights out of New York City in 2013. Its flights table holds 336,776 records, each with 19 variables. This is a big data set.
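
As a quick sanity check, the size quoted above can be confirmed once the package is installed. This is only a minimal sketch; it assumes the records live in the flights table that nycflights13 provides.

```r
# Load the data package; the flight records are in the `flights` table.
library(nycflights13)

# Number of rows (records) and columns (variables).
flight_dims <- dim(flights)
print(flight_dims)

# List the variable names to see what is available for analysis.
print(names(flights))
```

The first value printed should match the record count given above and the second the number of variables.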

Analyze the nycflights13 data to determine which day of the week has the longest average delay. Do delay times vary by airport? Do they vary by weekday?
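
One possible way to approach these questions is sketched below. It is not the method from the readings. It assumes that "delay" means departure delay (the dep_delay column; arr_delay could be used instead), that "airport" means the origin airport, and that the weekday is derived from the year, month, and day columns with lubridate. The name delay_summary is made up for this illustration.

```r
library(dplyr)
library(lubridate)
library(nycflights13)

delay_summary <- flights %>%
  # Build a Date from the year/month/day columns, then take its weekday.
  mutate(weekday = wday(make_date(year, month, day), label = TRUE)) %>%
  group_by(origin, weekday) %>%
  # Assumption: average departure delay in minutes, ignoring missing values
  # (cancelled flights have no recorded delay).
  summarise(mean_dep_delay = mean(dep_delay, na.rm = TRUE), .groups = "drop")

# Sort so the airport/weekday pairs with the longest average delay come first.
delay_summary %>% arrange(desc(mean_dep_delay)) %>% print(n = 21)
```

The result has one row per origin airport and weekday, so differences across both can be read directly from the table.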

If you have not done so already, use the directions given in Chapter 13 of this module’s readings to install the nycflights13 data set and the R tidyverse package.
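
The chapter's own directions take precedence. As a rough sketch of what installation typically looks like, assuming both packages come from CRAN, the following installs only what is missing and then loads both. The variable names needed and missing_pkgs are illustrative.

```r
# Packages this assignment relies on.
needed <- c("tidyverse", "nycflights13")

# Keep only the ones not yet installed on this machine.
missing_pkgs <- needed[!vapply(needed, requireNamespace, logical(1), quietly = TRUE)]

# Install from CRAN if anything is missing (requires an internet connection).
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}

library(tidyverse)
library(nycflights13)
```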

Execute the R code necessary to recreate the delay, weekday, and airport graph in Chapter 13, Figure 13-4, page 285 of your readings. You will need to start at the beginning of the chapter to prepare your data for producing the plot. Document your code and the histogram in an HTML R Markdown document and submit the HTML file for this assignment.
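
The sketch below shows one way a delay-by-weekday-and-airport plot could be built with ggplot. It is not a reproduction of Figure 13-4, whose exact code is in the readings. It assumes departure delay as the delay measure, a bar of mean delay per weekday, and one panel per origin airport. The names flights_wd and p are illustrative.

```r
library(dplyr)
library(ggplot2)
library(lubridate)
library(nycflights13)

# Add a weekday column derived from the flight date.
flights_wd <- flights %>%
  mutate(weekday = wday(make_date(year, month, day), label = TRUE))

# Mean departure delay per weekday, one panel per origin airport.
p <- ggplot(flights_wd, aes(x = weekday, y = dep_delay)) +
  stat_summary(fun = mean, geom = "bar", na.rm = TRUE) +
  facet_wrap(~ origin) +
  labs(x = "Weekday", y = "Mean departure delay (minutes)",
       title = "Average departure delay by weekday and airport")

print(p)
```

Placed in a code chunk of an R Markdown file with html_document output, code like this and the resulting plot both appear in the knitted HTML file.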