Interactive Data Visualization for the Web

Using jsonio, libd3, and libhtml to create reusable D3.js visualizations

Billy Buchanan
Currently Director of Data, Research, and Accountability
Fayette County Public Schools
Formerly Data Scientist @ Minneapolis Public Schools

https://wbuchanan.github.io/stataConference2016
  • A bit about interactive visualization in Stata
  • jsonio
  • libd3
  • libhtml
  • Putting it all together

ffmpeg for animated graphs

Stata as graphics engine

Higher level abstractions

  • Robert Grant's 2014 Stata Conference Presentation
  • Provides a higher level abstraction around D3.js, but the flexibility of Stata's twoway commands makes parsing and translating them challenging
  • The solution now uses Stata to generate a single static HTML file with the data and JavaScript embedded in it
  • Much more interaction, but still limited by what was exposed to users via the APIs

Higher level abstractions

  • Chavez & Matsuoka (2016) take things a step further towards integration with web technologies
  • With these advances there are still challenges to overcome:
    • Licensing requirements/restrictions
    • Decoupling of metadata and data
    • Extensibility

A Different Approach to Integration

  • Use a modular approach for the different components
  • jsonio - Serialize/Deserialize JSON data
  • libd3 - Mirror D3.js in Mata so migration is less painful
  • libhtml - Library of objects that represent HTML elements
  • jsonio packages all of your data and metadata in a single JSON object so you don't lose variable labels or value labels
  • Want more flexibility from D3.js examples? Use libd3 to create Mata functions that generate the D3 code to meet your needs
  • Need or want more control over the formatting? Create the DOM elements directly with libhtml

jsonio

  • Current capabilities in Stata focus only on consuming JSON, which has led to many questions from users
  • I/O tool for using JSON data in Stata
  • Java plugin built on the Jackson JSON API
  • Handles arbitrarily complex JSON consistently
  • Almost lossless export

Deserialization

key-value mode

							
. // Load the same data into Stata in a key-value pair structure
. jsonio kv, file("~/Desktop/waypointsResponse.json") ///
  nourl elem("(legs_[0-9]/((start)|(end))_location/((lat)|(lng)))")
							
key                                             value
/routes_1/legs_1/end_location/lat               42.378175
/routes_1/legs_1/end_location/lng               -71.060226
/routes_1/legs_1/start_location/lat             42.359824
/routes_1/legs_1/start_location/lng             -71.059812
/routes_1/legs_2/end_location/lat               42.442609
/routes_1/legs_2/end_location/lng               -71.229336
/routes_1/legs_2/start_location/lat             42.378175
/routes_1/legs_2/start_location/lng             -71.060226
...
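
The keys above suggest how the nested JSON is flattened: object names are joined with `/`, and array items appear to receive a 1-based `_N` suffix. The Python sketch below mimics that behaviour outside Stata to make the idea concrete. It is an illustration under those assumptions, not jsonio's implementation: `flatten`, the `item` fallback name, the `note` field, and the sample document are hypothetical. The regular expression is the one passed to `elem()` above, and the coordinates are taken from the output shown.

```python
import re

def flatten(node, path=""):
    """Yield (path, value) pairs for every scalar in a nested JSON structure."""
    if isinstance(node, dict):
        for key, child in node.items():
            if isinstance(child, list):
                # Assumed convention: array items become key_1, key_2, ...
                for i, item in enumerate(child, start=1):
                    yield from flatten(item, f"{path}/{key}_{i}")
            else:
                yield from flatten(child, f"{path}/{key}")
    elif isinstance(node, list):
        # Arrays without a parent key get a hypothetical "item" name.
        for i, item in enumerate(node, start=1):
            yield from flatten(item, f"{path}/item_{i}")
    else:
        yield path, node

# Hypothetical, heavily trimmed stand-in for the waypoints response.
doc = {"routes": [{"legs": [{
    "start_location": {"lat": 42.359824, "lng": -71.059812},
    "end_location": {"lat": 42.378175, "lng": -71.060226},
    "note": "dropped by the filter",
}]}]}

# Same pattern as the elem() option in the command above.
pattern = re.compile(r"(legs_[0-9]/((start)|(end))_location/((lat)|(lng)))")

# Keep only matching paths; sorted here purely for stable output.
pairs = sorted((k, v) for k, v in flatten(doc) if pattern.search(k))

for key, value in pairs:
    print(key, value)
```

Filtering on the flattened path, rather than walking the tree by hand, is what lets a single regular expression pick coordinates out of every leg at once.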

Deserialization

row-value mode

							
. // Load the same data into Stata in a row-value structure
. jsonio rv, file("~/Desktop/waypointsResponse.json") ob(1) ///
  nourl elem("(legs_[0-9]/((start)|(end))_location/((lat)|(lng)))")
							
Variable Name   storage type   display format   variable label
jsonvar1        double         %10.0g           /routes_1/legs_1/end_location/lat
jsonvar2        double         %10.0g           /routes_1/legs_1/end_location/lng
jsonvar3        double         %10.0g           /routes_1/legs_1/start_location/lat
jsonvar4        double         %10.0g           /routes_1/legs_1/start_location/lng
jsonvar5        double         %10.0g           /routes_1/legs_2/end_location/lat
jsonvar6        double         %10.0g           /routes_1/legs_2/end_location/lng
jsonvar7        double         %10.0g           /routes_1/legs_2/start_location/lat
jsonvar8        double         %10.0g           /routes_1/legs_2/start_location/lng
...
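
Row-value mode can be thought of as a pivot of the key-value pairs: each path becomes one variable in a single observation, and the path itself is kept as the variable label. A minimal Python sketch of that pivot follows, assuming the `jsonvarN` naming seen in the variable list above. The function `to_row` is hypothetical and is not part of jsonio.

```python
# Key-value pairs as they appear in key-value mode (first four shown).
pairs = [
    ("/routes_1/legs_1/end_location/lat", 42.378175),
    ("/routes_1/legs_1/end_location/lng", -71.060226),
    ("/routes_1/legs_1/start_location/lat", 42.359824),
    ("/routes_1/legs_1/start_location/lng", -71.059812),
]

def to_row(pairs, stub="jsonvar"):
    """Pivot (path, value) pairs into one row plus a name -> label mapping."""
    row, labels = {}, {}
    for i, (path, value) in enumerate(pairs, start=1):
        name = f"{stub}{i}"   # jsonvar1, jsonvar2, ...
        row[name] = value     # the value becomes the single observation
        labels[name] = path   # the JSON path is preserved as the label
    return row, labels

row, labels = to_row(pairs)
for name in row:
    print(name, row[name], labels[name])
```

Keeping the path as a label means the short generated names never cost you the information about where each value came from.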

Serialization

Close to lossless export of data and metadata is possible with jsonio

	
"data" : [{
		...,
		{
    "mpg" : 17.0,
    "price" : 11995.0,
    "headroom" : 2.5,
    "rep78" : 5.0,
    "length" : 193.0,
    "weight" : 3170.0,
    "displacement" : 163.0,
    "turn" : 37.0,
    "trunk" : 14.0,
    "make" : "Volvo 260",
    "gear_ratio" : 2.9800000190734863,
    "foreign" : 1.0
  }, ... }],
"variableTypeIsString" : {
    "mpg" : false,
    "price" : false,
    "headroom" : false,
    "rep78" : false,
    "length" : false,
    "weight" : false,
    "displacement" : false,
    "turn" : false,
    "trunk" : false,
    "make" : true,
    "gear_ratio" : false,
    "foreign" : false
  },
  "variableNames" : [ "make", "price", "mpg", "rep78", "headroom", "trunk", "weight", "length", "turn", "displacement", "gear_ratio", "foreign" ],
  "variableLabels" : {
    "mpg" : "Mileage (mpg)",
    "price" : "Price",
    "headroom" : "Headroom (in.)",
    "rep78" : "Repair Record 1978",
    "length" : "Length (in.)",
    "weight" : "Weight (lbs.)",
    "displacement" : "Displacement (cu. in.)",
    "turn" : "Turn Circle (ft.) ",
    "trunk" : "Trunk space (cu. ft.)",
    "make" : "Make and Model",
    "gear_ratio" : "Gear Ratio",
    "foreign" : "Car type"
  },
  "valueLabelNames" : {
    "foreign" : "origin"
  },
  "valueLabels" : {
    "foreign" : {
      "0" : "Domestic",
      "1" : "Foreign"
    }
  }, ...
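
Because the labels travel with the data, any consumer of the file can rebuild a labelled view without going back to Stata. The Python sketch below shows one way to do that. It assumes the metadata blocks sit at the top level next to `"data"`, and it uses a trimmed copy of the export above (four of the twelve variables). The function `labelled` is hypothetical.

```python
import json

# Trimmed copy of the export shown above.
exported = json.loads("""
{
  "data": [
    {"make": "Volvo 260", "price": 11995.0, "mpg": 17.0, "foreign": 1.0}
  ],
  "variableLabels": {
    "make": "Make and Model",
    "price": "Price",
    "mpg": "Mileage (mpg)",
    "foreign": "Car type"
  },
  "valueLabelNames": {"foreign": "origin"},
  "valueLabels": {"foreign": {"0": "Domestic", "1": "Foreign"}}
}
""")

def labelled(record, meta):
    """Return the record keyed by variable label, with value labels applied."""
    out = {}
    for var, value in record.items():
        label = meta["variableLabels"].get(var, var)
        mapping = meta["valueLabels"].get(var)
        is_whole = isinstance(value, (int, float)) and float(value).is_integer()
        if mapping is not None and is_whole:
            # Values are exported as floats (1.0); label keys are strings ("1").
            value = mapping.get(str(int(value)), value)
        out[label] = value
    return out

print(labelled(exported["data"][0], exported))
```

The same lookup is what a D3.js script would perform to show "Foreign" rather than 1 in a tooltip or legend.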
	

libd3

  • libd3 has not been updated for the D3.js version 4 release, but this shouldn't be too much of an issue
  • Uses a fluent API design (i.e., you can chain method/function calls) to mirror the D3.js semantics as closely as possible
  • Purposefully tried to mirror the JavaScript API as closely as possible to make it easier to reuse existing code
  • There are a few notable exceptions to make sure things get processed correctly

libd3 (differences)

  • To reference JavaScript objects from Mata, prefix the name of the object with obj_
  • All functions/methods must be followed with parentheses
  • Although more difficult to reuse the same object multiple times, the d3 class does have a built-in undo button
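
To make the fluent design and the undo idea concrete, here is a small sketch of a builder that records chained calls and renders them as a D3.js statement. It is written in Python purely as an illustration. The class `D3Chain` and its methods are hypothetical and do not reproduce libd3's Mata API; only the emitted `select`, `append`, and `attr` calls are real D3.js methods.

```python
import json

class D3Chain:
    """Collects chained calls and renders them as one D3.js statement."""

    def __init__(self, root="d3"):
        self.root = root
        self.calls = []

    def _call(self, method, *args):
        # json.dumps quotes strings and leaves numbers bare, as JavaScript expects.
        rendered = ", ".join(json.dumps(a) for a in args)
        self.calls.append(f".{method}({rendered})")
        return self  # returning self is what makes the API fluent

    def select(self, selector):
        return self._call("select", selector)

    def append(self, tag):
        return self._call("append", tag)

    def attr(self, name, value):
        return self._call("attr", name, value)

    def undo(self):
        # Drop the most recent call, if any.
        if self.calls:
            self.calls.pop()
        return self

    def render(self):
        return self.root + "".join(self.calls) + ";"

js = (D3Chain()
      .select("body")
      .append("svg")
      .attr("width", 500)
      .attr("height", 9999)
      .undo()                # removes the mistaken height
      .attr("height", 300)
      .render())

print(js)
# d3.select("body").append("svg").attr("width", 500).attr("height", 300);
```

Because every method returns the builder itself, the generating code reads almost like the JavaScript it produces, which is the point of mirroring the D3.js API.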

libhtml

  • Create HTML elements as Mata objects
  • Uses object inheritance to handle global HTML properties
  • Slightly limited by single inheritance model
  • Also uses a fluent API design to make it more convenient to generate HTML content tags

  • There are lots of HTML elements now, so the installed package is just a shell that grabs the files from a server and compiles the Mata library for you locally
  • Creating a library of low level objects gives everyone more flexibility to create and contribute to the community
  • Consistent method for adding content to HTML tags
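
The design described above can be sketched as a base class that owns the global attributes and a single method for adding content, with one small subclass per tag. The Python version below is only an illustration of that pattern. The names `Element`, `Div`, `Paragraph`, `Anchor`, `set_id`, `set_class`, `set_href`, and `add` are hypothetical and are not libhtml's actual class or method names.

```python
from html import escape

class Element:
    """Base class: global attributes and content handling live here."""
    tag = "div"

    def __init__(self):
        self.attrs = {}
        self.children = []

    def set_id(self, value):
        self.attrs["id"] = value
        return self  # fluent: every setter returns the element

    def set_class(self, value):
        self.attrs["class"] = value
        return self

    def add(self, content):
        # One consistent way to add content: text or another Element.
        self.children.append(content)
        return self

    def render(self):
        attrs = "".join(f' {k}="{escape(str(v))}"' for k, v in self.attrs.items())
        body = "".join(
            c.render() if isinstance(c, Element) else escape(str(c))
            for c in self.children
        )
        return f"<{self.tag}{attrs}>{body}</{self.tag}>"

class Div(Element):
    tag = "div"

class Paragraph(Element):
    tag = "p"

class Anchor(Element):
    tag = "a"

    def set_href(self, url):
        # Tag-specific attribute defined only on the subclass.
        self.attrs["href"] = url
        return self

page = (Div().set_id("chart")
        .add(Paragraph().set_class("caption").add("Price & mileage")))

print(page.render())
# <div id="chart"><p class="caption">Price &amp; mileage</p></div>
```

With single inheritance, attributes shared by only some tags cannot be factored into a second parent class, which is one way to read the limitation noted above.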

Putting it all together

Use jsonio to serialize data and write your HTML and JavaScript to create static graphs for the web

or to create slightly interactive graphs

or to emphasize the exploratory in EDA

You can create Mata functions or Stata programs that generate the HTML and JavaScript for users to create simpler graphs directly in Stata

And can also make graphs that are a bit more complex
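
As a rough end-to-end illustration of the workflow, the Python sketch below embeds serialized records in a single static HTML page together with a short D3.js (version 3) scatterplot script. It stands in for what a Mata function or Stata program built on jsonio, libd3, and libhtml would generate. The function `build_page`, the page layout, and the pixel sizes are hypothetical, and the single record is the one shown in the serialization example.

```python
import json
from html import escape

def build_page(records, x, y, title):
    """Return a self-contained HTML page with data and D3.js code embedded."""
    # Escape "</" so the data cannot close the <script> element early.
    data_js = json.dumps(records).replace("</", "<\\/")
    return f"""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>{escape(title)}</title>
<script src="https://d3js.org/d3.v3.min.js"></script>
</head>
<body>
<script>
var data = {data_js};
var svg = d3.select("body").append("svg")
    .attr("width", 400).attr("height", 300);
var xs = d3.scale.linear()
    .domain(d3.extent(data, function(d) {{ return d["{x}"]; }}))
    .range([20, 380]);
var ys = d3.scale.linear()
    .domain(d3.extent(data, function(d) {{ return d["{y}"]; }}))
    .range([280, 20]);
svg.selectAll("circle").data(data).enter().append("circle")
    .attr("cx", function(d) {{ return xs(d["{x}"]); }})
    .attr("cy", function(d) {{ return ys(d["{y}"]); }})
    .attr("r", 4);
</script>
</body>
</html>"""

# One record from the serialized data; pass the full list in practice.
records = [{"make": "Volvo 260", "price": 11995.0, "mpg": 17.0}]
page = build_page(records, x="mpg", y="price", title="Price vs. mileage")

with open("scatter.html", "w", encoding="utf-8") as fh:
    fh.write(page)
```

Opening `scatter.html` in a browser needs network access only to fetch the D3.js library; the data itself is already inside the page.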

Oh...and by the way...


FCPS is Hiring!!!

If you or anyone you know is interested in doing research work in education

:::cough::: economics of education/psychometrics/program evaluation types :::cough:::
feel free to grab me at some point during the conference to tell me or email me:
Billy.Buchanan@fayette.kyschools.us

You can also check the Fayette County Public Schools job postings to see whether the positions have been posted.