Guiding principles for effective data visualization
Slide deck available at https://bit.ly/2MFU9kn
William R. Buchanan, Ph.D.
Director, Office of Grants, Research, Accountability, & Data
Fayette County Public Schools
Background
Developed Interactive Data Visualization tools for Accountability Reporting
Developed a program to automate exploratory data analysis in Stata
Developed the brewscheme toolkit in Stata for aesthetics and accessibility of data visualizations
What is the purpose of data visualization?
Explore
Expose
Explain
What is communication?
How does the visualization process get us there?
Pop Quiz
Go to PollEv.com/billybuchana011 OR text BILLYBUCHANA011 to 22333 to connect
Q1. Which slice in this pie chart is the SECOND largest?
Q2. Which slice in this pie chart is the SECOND largest?
Q3. Which bar is the SECOND largest?
What advice do I have for visualizing data?
Mapping between visual and measurement scales should be one and the same.
Do not alienate potential end-users.
Avoid obfuscating the message.
Bind helpful information in the UI without blocking the user's view.
Establish clear and consistent standards for visual encodings.
Mapping between visual and measurement scales should be one and the same
Nominal Scales
Used for categories that have no natural order.
Can only test equality and inequality.
Ordinal Scales
Used for categories that have a natural order.
Can test for equality/inequality.
Can test order (greater than/less than).
No other mathematical operators are defined on this scale.
Interval Scales
Used in cases where there is no true zero or where the location of zero is arbitrary.
Can test for equality/inequality.
Can test order (greater than/less than).
Addition and subtraction are defined on this scale, so differences between values are meaningful.
Ratio Scales
Used when there is a true zero or when the location of zero is fixed.
Has all of the same properties as interval scales; in addition, ratios between values are meaningful.
Visual Scales
Visual Scale type   1    2    3    4    5
Qualitative         5    1    4    2    3
Sequential          1    2    3    4    5
Divergent          -2   -1    0    1    2
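The pairing of measurement scales with visual scale types can be sketched as a small lookup. This is a rule of thumb rather than anything prescribed in the slides: the function name `suggest_visual_scale` and the `has_midpoint` flag are hypothetical, and the mapping assumes the common convention that unordered categories get qualitative colors while ordered values get sequential colors, or divergent colors when a meaningful midpoint exists.

```python
def suggest_visual_scale(measurement_scale: str, has_midpoint: bool = False) -> str:
    """Suggest a visual scale type for a measurement scale (rule of thumb).

    has_midpoint is an illustrative flag: set it when the data have a
    meaningful center (for example zero, or a target value).
    """
    scale = measurement_scale.lower()
    if scale == "nominal":
        # No natural order, so the colors should not imply one.
        return "qualitative"
    if scale in {"ordinal", "interval", "ratio"}:
        # Ordered values: one direction, or two directions around a midpoint.
        return "divergent" if has_midpoint else "sequential"
    raise ValueError(f"unknown measurement scale: {measurement_scale!r}")


print(suggest_visual_scale("nominal"))
print(suggest_visual_scale("ordinal"))
print(suggest_visual_scale("interval", has_midpoint=True))
```

A lookup like this is only a starting point; the quiz questions that follow show cases where a familiar palette does not match the scale of the data.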
Quiz Time
Q4. What type of measurement scale is used for the letter grades below?
F
D
C
B
A
Q5. What type of visual scale is used for the colors for the letter grades below?
F
D
C
B
A
Q6. The visual scale mapping from the previous example is aligned with the measurement scale of the values.
Q7. What type of visual scale is used for the risk groups below?
Low Risk
Moderate Risk
High Risk
Q8. The visual scale mapping from the previous example is aligned with the measurement scale of the values.
Q9. What type of visual scale is used for the colors for the risk groups below?
Low Risk
Moderate Risk
High Risk
Q10. The visual scale mapping from the previous example is aligned with the measurement scale of the values.
Takeaway
Visual scales should reinforce the underlying measurement scale.
Stoplight colors (red, yellow, green) form a nominal scale; the colors themselves carry no inherent order.
Divergent scales are useful when you want to highlight distances in multiple directions.
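One way to make a divergent scale show distance in both directions is to pin the midpoint to the center of the color ramp and scale both sides by the same maximum distance. The sketch below is illustrative only: `diverging_position` and its parameters are hypothetical names, and the returned number is a position between 0 and 1 that a plotting tool would then translate into a color.

```python
def diverging_position(value: float, midpoint: float, max_distance: float) -> float:
    """Map a value to a position in [0, 1] with the midpoint fixed at 0.5.

    Both directions share max_distance, so equal distances from the
    midpoint land equally far from the center of the color ramp.
    """
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    offset = (value - midpoint) / max_distance
    offset = max(-1.0, min(1.0, offset))  # clamp values beyond the range
    return 0.5 + offset / 2


# Positions for the divergent example values -2, -1, 0, 1, 2.
for v in (-2, -1, 0, 1, 2):
    print(v, diverging_position(v, midpoint=0, max_distance=2))
```

Using a single `max_distance` for both sides is a design choice: scaling each side separately would stretch the shorter side and make unequal distances look equal.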
Do not alienate potential end-users
Color contrast issues can also alienate end users.
Takeaway
Be careful about the selection of colors to denote groups/categories/values.
Be mindful of color contrast issues for any text in your visualizations.
Use available tools/technologies to proof your visualizations prior to publication.
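As one example of proofing text contrast before publication, the sketch below computes a contrast ratio between two colors. It assumes the WCAG 2.x definitions of relative luminance and contrast ratio, which the slides do not name; the function names are hypothetical, and the 4.5:1 threshold is the WCAG AA level for normal-size text.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1 (identical) up to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, threshold: float = 4.5) -> bool:
    """True when the pair meets the assumed AA threshold for normal text."""
    return contrast_ratio(foreground, background) >= threshold


# Example check for gray label text on a white background.
print(round(contrast_ratio("#777777", "#FFFFFF"), 2))
print(passes_aa("#777777", "#FFFFFF"))
```

A check like this covers text contrast only; it does not simulate color vision deficiencies, which need a separate proofing tool.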
Avoid obfuscating the message
What types of comparisons are easy for humans to perceive?
What element do you want/need end users to focus on?
How does your choice of graph/chart support/hinder that comparison?
Takeaways
Humans + Visual Comparison of Areas = Incorrect Interpretation
To see change over time, you need to display the slopes that represent the change.
Stacked bar/area charts can be challenging to interpret due to their construction.
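The trouble with area comparisons can be illustrated with a little arithmetic. If a circle's radius is scaled directly by the data value, its area grows with the square of the value, so differences are exaggerated before the viewer even tries to judge them. The sketch below uses hypothetical helper names and assumes circular marks.

```python
import math


def radius_for_area(value: float) -> float:
    """Radius of a circle whose AREA equals the value (area-proportional mark)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return math.sqrt(value / math.pi)


def radius_scaling_area_ratio(value_a: float, value_b: float) -> float:
    """Area ratio that results when the RADIUS, not the area, is scaled by value."""
    return (value_a / value_b) ** 2


# A value twice as large drawn with twice the radius covers four times the area.
print(radius_scaling_area_ratio(2.0, 1.0))

# Scaling by area instead: quadrupling the value only doubles the radius.
print(radius_for_area(4.0) / radius_for_area(1.0))
```

Even with area-proportional marks, readers still have to compare areas, so a position- or length-based chart is usually the safer choice for precise comparisons.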
Bind helpful information in the UI without blocking the user's view
If an end user isn't familiar with your graph/chart type, how do you know if they will interpret the visualization correctly/consistently?
How can end users actively engage with your visualization if they need additional contextual information?
What can you do to support multiple stakeholder groups interacting with the same visualizations?
How can you help end users avoid unsupported inferences/conclusions from your visualizations?
Takeaways
Guides to help users interpret the visualizations can drive consistent use and interpretation.
Make the UI elements obvious so end users can find the information easily.
Place the information as close to the visualization as possible without obscuring the view.
Bind external resources that are regularly sought to create additional value.
Wrap Up
Make your visualizations accessible to all potential end users whenever possible.
Reinforce the underlying meaning of the data with the visual mapping.
Humans and visual comparison of areas don't mix.
Empower end users to make appropriate inferences from your visualizations with supporting materials.
Create/implement visualization standards to provide end users with consistency.
Additional Resources
People
John Wilder Tukey - Inventor of the Box & Whisker Plot, coiner of the term bit, Statistician, & Author of Exploratory Data Analysis
Edward Tufte - Statistician & Data Visualization Author/Expert
Mike Bostock - created D3.js, led NY Times Graphics Team, & Co-Founded Observable
Jeffrey Heer - Director of the Interactive Data Visualization Lab at UWashington & Co-Founder of Trifacta
Leland Wilkinson - Currently Chief Scientist at H2O.ai & Author of the Grammar of Graphics
William Cleveland - Led Data Visualization Research at Bell Labs & Currently Professor of Statistics & Computer Science at Purdue University
Stephanie Evergreen - Data Visualization Consultant, Author, & Expert.
Andy Kirk - Data visualization author/researcher & owner/maintainer of https://www.visualisingdata.com/
Giorgia Lupi - Data Artist
People (continued)
Robert Kosara - Senior Research Scientist at Tableau & owner/maintainer of https://eagereyes.org/
Alberto Cairo - Knight Chair in Visual Journalism & Director of the Visualization Program at University of Miami's Center for Computational Science
Tiffany Farrant-Gonzalez - Data Designer
Jen Christiansen - Senior Graphics Editor at Scientific American
Jane Pong - Data Visualization Designer
Krisztina Szűcs - Data Visualization Designer
Anita Graser - GIS Specialist & Contributor to QGIS
Lam Thuy Vo - Senior Reporter at BuzzFeed
Joanna S. Kao - Journalist/Developer at The Financial Times
Mico Yuk - Co-Founder of BI Brainz
Cynthia A. Brewer - GIS Visualization Expert, Author, Penn State Prof, & Creator of ColorBrewer
Dianne Cook - Interactive Data Visualization Pioneer
People (continued)
Amy Cesal - UX Contractor at US Department of Veterans Affairs, former Sunlight Foundation staffer, & Play-Doh Data Visualization Expert
Nathan Yau - Data Visualization Expert, Author, & owner/maintainer of https://flowingdata.com/
Elijah Meeks - Visualization Software Developer & Author
Hadley Wickham - Chief Scientist at RStudio & Author of the ggplot2 package in R
Stefanie Posavec - Data Artist & Co-Author of Dear Data
Jake VanderPlas - Developer of the Altair library in Python
Scott Murray - Developer of Processing language for interactive visualizations
Nadieh Bremer - Astronomer turned Data Visualization Expert
Robert Grant - Statistician, Developer, & Author on Data Visualization
Michael Mitchell - Statistician at US Department of Veterans Affairs & Data Visualization Author
Books
Tukey, J. W. (1977). Exploratory data analysis. New York City, NY: Addison-Wesley Publishing Company.
Wilkinson, L. (2005). The grammar of graphics. Second Edition. New York City, NY: Springer Science+Business Media, Inc.
Cleveland, W. S. (1993). Visualizing data. Summit, NJ: Hobart Press.
Cleveland, W. S. (1994). The elements of graphing data. Summit, NJ: Hobart Press.
Kirk, A. (2019). Data visualisation: A handbook for data driven design. Second Edition. Thousand Oaks, CA: SAGE Publications.
Yau, N. (2011). Visualize this: The FlowingData guide to design, visualization, and statistics. Indianapolis, IN: Wiley Publishing.
Yau, N. (2013). Data points: Visualization that means something. Indianapolis, IN: Wiley Publishing.
Meeks, E. (2017). D3.js in action: Data visualization with JavaScript. Second Edition. Shelter Island, NY: Manning Publications Co.
Tufte, E. R. (2001). The visual display of quantitative information. Cheshire, CT: Graphics Press.
Books (continued)
Mitchell, M. N. (2012). Interpreting and visualizing regression models with Stata. College Station, TX: Stata Press.
Mitchell, M. N. (2012). A visual guide to Stata graphics. Third Edition. College Station, TX: Stata Press.
Cox, N. J. (2014). Speaking Stata graphics. College Station, TX: Stata Press.
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Second Edition. New York City, NY: Springer Science+Business Media, Inc.
Grant, R. (2018). Data visualization: Charts, maps, and interactive graphics. New York City, NY: CRC Press.
Evergreen, S. D. H. (2020). Effective data visualization: The right chart for the right data. Second Edition. Thousand Oaks, CA: SAGE Publishing.
Evergreen, S. D. H. (2020). The data visualization sketchbook. Thousand Oaks, CA: SAGE Publishing.
Evergreen, S. D. H. (2018). Presenting data effectively: Communicating your findings for maximum impact. Thousand Oaks, CA: SAGE Publishing.
Other
Data Visualization Society
Open Visualization Conference
Bocoup
Stamen
Flourish
BIBrainz
DeltaRho
Bokeh
Panel
Center for Data and Visualization Sciences
University of Washington Interactive Data Lab
Effective Data Visualization (slides from Jeff Heer talk)
Visualization is Not Enough (EuroVis 2019 talk by Jeff Heer)
Visualising Data's Resource List
Fayette County Public Schools' Data Visualization Standards - This is a work in progress
Papers/Other References
Kelly, K. L. (1965). Twenty-two colors of maximum contrast. Color Engineering, 3(26), pp. 26-27.