Creating Heatmap Tiles in R

This post demonstrates a basic ggplot2 visualisation technique that I only use occasionally, but which always gets a good response in business-facing presentations. I think it’s the intuitive calendar feel and the use of great palettes like viridis. It’s flexible and often reads better than a line graph or similar, but the steps can be tricky to remember.

The Data

For this example we have Daily Maximum Temperature data from the Australian Bureau of Meteorology.

The data are 2016 Daily Maximum Temperatures from Albion Park (Wollongong Airport) downloaded from: (http://www.bom.gov.au/tmp/cdio/IDCJAC0010_068241_2016.zip).

Here are some notes from the file metadata:

Station Details

Bureau of Meteorology station number: 68241
Station name: ALBION PARK (WOLLONGONG AIRPORT)
Year site opened: 1999
Year site closed:
Latitude (decimal degrees, south negative): -34.56
Longitude (decimal degrees, east positive): 150.79
Height of station above mean sea level (metres): 8
State: NSW

Data File Format

The data file format is shown below.

Column  Explanation
1       Product code
2       Bureau of Meteorology station number
3       Year
4       Month
5       Day
6       Daily maximum temperature (degrees Celsius)
7       Period over which daily maximum temperature was measured (days)
8       Quality of daily maximum temperature

The Code

# setup  
library(tidyverse)
library(lubridate)
library(viridis)
library(ggthemes)
library(forcats)
# read in temperature data
temp <- read_csv('IDCJAC0010_068241_2016_Data.csv')
## Parsed with column specification:
## cols(
##   `Product code` = col_character(),
##   `Bureau of Meteorology station number` = col_character(),
##   Year = col_integer(),
##   Month = col_character(),
##   Day = col_character(),
##   `Maximum temperature (Degree C)` = col_double(),
##   `Days of accumulation of maximum temperature` = col_integer(),
##   Quality = col_character()
## )
temp
## # A tibble: 366 x 8
##    Product code Bureau of Meteorology station number  Year Month   Day
##           <chr>                                <chr> <int> <chr> <chr>
## 1    IDCJAC0010                               068241  2016    01    01
## 2    IDCJAC0010                               068241  2016    01    02
## 3    IDCJAC0010                               068241  2016    01    03
## 4    IDCJAC0010                               068241  2016    01    04
## 5    IDCJAC0010                               068241  2016    01    05
## 6    IDCJAC0010                               068241  2016    01    06
## 7    IDCJAC0010                               068241  2016    01    07
## 8    IDCJAC0010                               068241  2016    01    08
## 9    IDCJAC0010                               068241  2016    01    09
## 10   IDCJAC0010                               068241  2016    01    10
## # ... with 356 more rows, and 3 more variables: Maximum temperature
## #   (Degree C) <dbl>, Days of accumulation of maximum temperature <int>,
## #   Quality <chr>
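
The guessed specification above leaves Month and Day as character columns, most likely because of their leading zeros. As a minimal sketch, assuming you would rather fix the types at read time, the column types can be passed to read_csv explicitly. The two data rows are a stand-in for the real file: the temperatures are the values for 1 and 2 January from this dataset, while the accumulation and quality values (1 and Y) and the names csv_lines, tmp_path and temp_typed are assumptions for illustration.

```r
library(readr)

# Stand-in for the BOM file, written to a temporary path so the sketch runs
# without the download. The last two values in each row are assumed.
csv_lines <- c(
  "Product code,Bureau of Meteorology station number,Year,Month,Day,Maximum temperature (Degree C),Days of accumulation of maximum temperature,Quality",
  "IDCJAC0010,068241,2016,01,01,25.6,1,Y",
  "IDCJAC0010,068241,2016,01,02,24.6,1,Y"
)
tmp_path <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp_path)

# Declare every column type instead of relying on readr's guesses
temp_typed <- read_csv(
  tmp_path,
  col_types = cols(
    `Product code` = col_character(),
    `Bureau of Meteorology station number` = col_character(),  # keeps the leading zero
    Year = col_integer(),
    Month = col_integer(),
    Day = col_integer(),
    `Maximum temperature (Degree C)` = col_double(),
    `Days of accumulation of maximum temperature` = col_integer(),
    Quality = col_character()
  )
)
```

With Month and Day read as integers, the later as.numeric() conversions would no longer be needed, though the rest of this post keeps the guessed types.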

We can now manipulate the data to get it in the best format for our visualisation.

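# Convert the zero-padded character Day and Month columns: Day becomes the
# numeric day of the month, Month becomes an ordered factor of month labels

temp <- temp %>% 
  mutate(Day = as.numeric(Day),  # day of the month (1-31); wday() is meant for days of the week
         Month = month(as.numeric(Month), label = TRUE, abbr = TRUE)) %>% 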
  select(Month,
         Day, 
         MaxTemp = `Maximum temperature (Degree C)`)

head(temp)
## # A tibble: 6 x 3
##   Month   Day MaxTemp
##   <ord> <dbl>   <dbl>
## 1   Jan     1    25.6
## 2   Jan     2    24.6
## 3   Jan     3    24.1
## 4   Jan     4    22.2
## 5   Jan     5    22.3
## 6   Jan     6    19.6
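
The Day column here holds the day of the month, which is a different quantity from the day of the week. As a hedged sketch of an alternative approach, one could first build a real Date column with lubridate's make_date() and then derive both quantities from it with mday() and wday(). The six rows reuse the temperatures printed above; the names sample_days, Date, DayOfMonth, DayOfWeek and MonthLabel are made up for this illustration.

```r
library(dplyr)
library(lubridate)

# First six days of 2016, temperatures copied from the head() output above
sample_days <- tibble(
  Year = 2016L,
  Month = c("01", "01", "01", "01", "01", "01"),
  Day = c("01", "02", "03", "04", "05", "06"),
  MaxTemp = c(25.6, 24.6, 24.1, 22.2, 22.3, 19.6)
)

sample_days <- sample_days %>%
  mutate(
    Date = make_date(Year, as.integer(Month), as.integer(Day)),
    DayOfMonth = mday(Date),                 # 1-31, what the heatmap's x axis needs
    DayOfWeek = wday(Date, label = TRUE),    # ordered factor of weekday labels
    MonthLabel = month(Date, label = TRUE, abbr = TRUE)
  )
```

Keeping a Date column also makes it easy to switch to a week-by-weekday calendar layout later, since both pieces can be derived from the same value.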
# Creating the plot using ggplot2's geom_tile and the minimalist tufte theme.

ggplot(temp, aes(x = Day, 
                 y = fct_rev(Month), 
                 fill = MaxTemp)) +
  scale_fill_viridis(name = "Degrees (C)", option = "C") +
  geom_tile(colour = "white", linewidth = 0.4) +  # on ggplot2 older than 3.4.0, use size = 0.4
  labs(title = "How hot was Wollongong in 2016?", 
       subtitle = "Daily max temperatures from Wollongong Airport in 2016", 
       x = "Day of the Month",
       y = "Month",
       caption = "Source: www.bom.gov.au") +
  theme_tufte()

This gives a nice view of the seasonality and makes a few outliers stand out, such as the really hot days in December and January.
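
To back up what the eye picks out, the hottest days can also be listed directly. This is a small sketch, assuming a table with the same Month, Day and MaxTemp columns as temp; it uses only the six January values shown earlier, and the names sample_temps and hottest are hypothetical.

```r
library(dplyr)

# Six January values copied from the head() output earlier in the post
sample_temps <- tibble(
  Month = "Jan",
  Day = 1:6,
  MaxTemp = c(25.6, 24.6, 24.1, 22.2, 22.3, 19.6)
)

# Sort from hottest to coolest and keep the top three days
hottest <- sample_temps %>%
  arrange(desc(MaxTemp)) %>%
  head(3)
```

Running the same two pipeline steps on the full temp table should surface the days that show up as the brightest tiles.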