--- title: "Introduction to gsplot." author: "Jordan Read, Laura DeCicco, Jordan Walker, Phethala Thongsavanh, Lindsay Carr" date: "`r format(Sys.time(), '%d %B, %Y')`" output: rmarkdown::html_vignette: toc: true number_sections: true fig_caption: yes vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{gsplot Intro} \usepackage[utf8]{inputenc} --- ```{r message=FALSE, echo=TRUE, fig.cap="Demo workflow", fig.width=6, fig.height=6} library(gsplot) MaumeeDV <- MaumeeDV demoPlot <- gsplot() %>% points(y=c(3,1,2), x=1:3, xlim=c(0,NA),ylim=c(0,NA), col="blue", pch=18, legend.name="Points", xlab="Index") %>% lines(c(3,4,3), c(2,4,6), legend.name="Lines", ylab="Data") %>% abline(b=1, a=0, legend.name="1:1") %>% legend(location="topleft",title="Awesome!") %>% grid() %>% error_bar(x=1:3, y=c(3,1,2), y.high=c(0.5,0.25,1), y.low=0.1) %>% error_bar(x=1:3, y=c(3,1,2), x.low=.2, x.high=.2, col="red",lwd=3) %>% callouts(x=1, y=2.8, lwd=2, angle=250, labels="Weird data") %>% title("Graphing Fun") demoPlot ``` ## Overview `gsplot` uses similar plotting graphics to R base graphics, but allows users to execute them in a more intuitive manner. Additionally, as the complexity of the plot features increase, `gpslot` code is simplistic compared to that of base graphics. `gsplot` also includes features not present in base graphics that are useful when working with USGS data, such as `callouts` (combines `segments` and `text` into a single call), `error_bar` (allows an error to be given as `y.high`, `y.low`, `x.high`, and `x.low` and automatically builds an error bar), and the argument `legend.name` (an argument within `points`, `lines`, etc. which does not require colors, linetypes, and other par information to be redefined within the `legend` call). ## Data manipulation Data from Maumee River will be used to showcase the workflow and features that `gsplot` offers. First, the data is manipulated to extract the timeseries as four separate variables - dates (formatted as yyyy-mm-dd), flow (discharge in cubic feet per second), pH, and Wtemp (water temperature in degrees Celcius). Additionally, the USGS site IDs for the sampling stations are identified. ```{r echo=TRUE, message=FALSE} sites <- unique(MaumeeDV$site_no) dates <- sapply(sites, function(x) MaumeeDV$Date[which(MaumeeDV$site_no==x)], USE.NAMES=TRUE) flow <- sapply(sites, function(x) MaumeeDV$Flow[which(MaumeeDV$site_no==x)], USE.NAMES=TRUE) pH <- sapply(sites, function(x) MaumeeDV$pH_Median[which(MaumeeDV$site_no==x)], USE.NAMES=TRUE) Wtemp <- sapply(sites, function(x) MaumeeDV$Wtemp[which(MaumeeDV$site_no==x)], USE.NAMES=TRUE) ``` ## Simple timeseries First, `gsplot` is used to create a simple timeseries graph for discharge, and a grid is added to help with data readability. ```{r echo=TRUE, fig.cap="Fig. 1 Simple flow timeseries using `gsplot`.", fig.width=6, fig.height=6} site <- '04193500' demoPlot <- gsplot(mgp=c(2.75, 0.3, 0.0)) %>% lines(dates[[site]], flow[[site]], col="royalblue") %>% title(main=paste("Site", site), ylab="Flow, ft3/s") %>% grid() demoPlot ``` ## Simple timeseries using a log scale This data may be better represented using a log scale due to the range of flow values. Thus, the yaxis is easily turned into a logged scale by inserting the code `log='y'`. To make sure that the grid lines correspond to the logged axis, the code `equilogs=FALSE` is used to let gridlines be drawn at unequal distances from each other. ```{r echo=TRUE, fig.cap="Fig. 2 Simple flow timeseries with a logged y-axis using `gsplot`.", fig.width=6, fig.height=6} site <- '04193500' options(scipen=5) demoPlot <- gsplot(mgp=c(3, 0.3, 0.0)) %>% lines(dates[[site]], flow[[site]], col="royalblue", log='y', ylab= expression(paste("Discharge in ",ft^3/s))) %>% title(main=paste("Site", site)) %>% grid(equilogs=FALSE) demoPlot ``` ## Multiple plots in one figure What if you wanted to see if there was any relationship between the pH and water temperature? Consider the following three graphs: pH vs water temperature, pH timeseries, water temperature timeseries. To view these three plots at one time, use `layout` to "append" the three different plots. ```{r echo=TRUE, fig.cap="Fig. 3 (a) pH vs water temperature, (b) pH timeseries, (c) water temperature timeseries.", fig.width=6, fig.height=6} site <- '04193490' plot1 <- gsplot() %>% points(Wtemp[[site]], pH[[site]], col="black")%>% title(main=paste("Site", site), xlab="Water Temperature (deg C)", ylab="pH") plot2 <- gsplot() %>% lines(dates[[site]], pH[[site]], col="seagreen")%>% title(main="", xlab="time", ylab="pH") plot3 <- gsplot() %>% lines(dates[[site]], Wtemp[[site]], col="orangered")%>% title(main="", xlab="time", ylab="Water Temperature (deg C)") layout(matrix(c(1,2,3), byrow=TRUE, nrow=3)) plot1 plot2 plot3 ``` ## Compare timeseries of different units For timeseries, it is sometimes helpful to plot data on the same graph to make comparisons; however, it becomes difficult when the data differ in units. Thus, a second y-axis can easily be added to plot the second timeseries. In this example, we can compare the water temperature and pH timeseries to identify relationships over time. pH is plot using the secondary y-axis by specifying `side=4`. ```{r echo=TRUE, fig.cap="Fig. 4 Water temperature timeseries on primary y-axis with pH timeseries on secondary y-axis.", fig.width=6, fig.height=6} site <- '04193490' demoPlot <- gsplot(mar=c(7.1, 4.1, 4.1, 4.1)) %>% lines(dates[[site]], Wtemp[[site]], col="orangered", legend.name="Water Temperature", ylab='Water Temperature (deg C)') %>% lines(dates[[site]], pH[[site]], col="seagreen", side=4, legend.name="pH", ylab='pH (pH Units)') %>% title(main=paste("Site", site), xlab='time') %>% legend(location="below") demoPlot ``` ## Adding to the plot retroactively Oftentimes, data is plot and observations regarding missing or abnormal data are made afterwards. `gsplot` makes it easy to add to any plot retroactively by using the plot object (`demoPlot` in this example) in the call for the plot feature. ```{r echo=TRUE, fig.cap="Fig. 5 Initial plot of water temperature timeseries.", fig.width=6, fig.height=6} # initially plot the data site <- '04193490' demoPlot <- gsplot() %>% lines(dates[[site]], Wtemp[[site]], col="orangered") %>% title(main=paste("Site", site), xlab='time', ylab='Water Temperature (deg C)') demoPlot ``` ```{r echo=TRUE, fig.cap="Fig. 6 Plot of water temperature timeseries with 'Missing Data' callout retroactively added.", fig.width=6, fig.height=6} # notice the missing data from ~ 1991 through ~2011 and add a callout demoPlot <- callouts(demoPlot, x=as.Date("2000-01-01"), y=10,labels="Missing Data") demoPlot ```