Thinking about data-driven visualizations

When you’re creating a visualization based on data, it often seems as if the possibilities are endless. Realistically, however, your best option is to think carefully about each of the variables with which you’re working—typically represented as the columns in a spreadsheet—and the limited number of aesthetic dimensions of your visualization—for each data point: the x position, the y position, possibly the z position, color, transparency, shape, and size.

Your goal is to map each aesthetic to one variable. If you’re using an aesthetic dimension in your graphic that isn’t tied to your variables, then why do you have that dimension? After all, it’s not communicating any information.

Leland Wilkinson’s seminal book, A Grammar of Graphics.

Hadley Wickham’s classic: A Layered Grammar of Graphics.

and a discussion that applies Hadley’s ideas.