AALL Hackathon, July 2014

Step 1: One dataset recommended for the hackathon was AALL's 2011-2013 Preliminary Analysis of AALL's State Legal Inventories (here). For each state, this report links to (1) the original 2007 status of authentication for online material in that state, (2) the 2009 update on authentication for the state, and (3) the raw data for that state used to compile the 2012 state legal materials inventory. We wanted to use the raw data. That data was kept in Google Docs spreadsheets, which were marked public when the report was released in November 2012. Many of these became unavailable during the conversion to Google Drive in April 2013.

Step 2: We checked each state to see whether its data was available. Unavailable spreadsheets produced one of two error messages: a file not found error or an access denied error. The lead researcher for the Pennsylvania data was present at the hackathon and provided access. Clicking through to grant that access took her more than 30 minutes, which is not something that could easily be done on demand if members of the public were to request access.

Step 3: We compared spreadsheets for the states for which data was available. Columns were not in a consistent order from state to state, and fields were not labeled consistently. Law librarians identified which fields were the same across spreadsheets. Nineteen fields appeared in most state spreadsheets (listed in the Appendix at the bottom of this page). Hawaii's and California's data were downloaded and cleaned as a proof of concept. For those two states, the 19 shared fields were ordered and labeled consistently so that both spreadsheets were organized the same way. Additional fields that appeared for one state but not the other were kept after those 19 basic fields.
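The relabeling and reordering described in Step 3 could be sketched roughly as follows. This is a minimal illustration, not the process actually used at the hackathon, which was done by hand in the spreadsheets. The canonical labels, the alias map, and the sample row are all hypothetical, and only four of the 19 fields are shown.

```python
import csv
import io

# Hypothetical short labels for four of the 19 shared fields (see Appendix).
CANONICAL_FIELDS = ["Title", "Branch", "Government Entity", "URL"]

# Hypothetical map from headers as a state sheet might word them to the
# canonical label. The real sheets used their own, varying wording.
ALIASES = {
    "title": "Title",
    "document title": "Title",
    "branch": "Branch",
    "branch of government": "Branch",
    "entity": "Government Entity",
    "department": "Government Entity",
    "url": "URL",
    "web address": "URL",
}


def normalize_rows(rows):
    """Relabel known headers, put canonical fields first, extras after."""
    if not rows:
        return list(CANONICAL_FIELDS), []
    # Map every original header to its canonical label, or keep it as-is.
    renamed = {h: ALIASES.get(h.strip().lower(), h.strip()) for h in rows[0]}
    # State-specific fields keep their original relative order.
    extras = []
    for label in renamed.values():
        if label not in CANONICAL_FIELDS and label not in extras:
            extras.append(label)
    columns = CANONICAL_FIELDS + extras
    cleaned = []
    for row in rows:
        out = {col: "" for col in columns}  # missing fields stay blank
        for header, value in row.items():
            out[renamed[header]] = value
        cleaned.append(out)
    return columns, cleaned


# Illustrative sheet with columns in a different order and one extra field.
sample = (
    "Web Address,Document Title,Branch of Government,Notes\n"
    "http://example.org/statutes,Sample Statutes,Legislative,placeholder row\n"
)
columns, cleaned = normalize_rows(list(csv.DictReader(io.StringIO(sample))))
print(columns)
```

Keeping the extra fields after the shared ones mirrors the choice made for Hawaii and California: nothing from a state's sheet is discarded, but the first columns line up across states.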

Step 4: The goal was to make an interface for comparing aspects of government document authentication and preservation between Hawaii and California. Then, other states could be loaded, and comparisons between any two states or statistical data across states could be pulled. Because accessing and cleaning the data took almost all available time, an interface was not developed.
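Since no interface was built, the following is only a sketch of the kind of comparison Step 4 had in mind, assuming each state's sheet has already been cleaned into rows with consistent field labels. The function names, the short field label "Authenticated", and the placeholder rows are hypothetical; they are not drawn from the actual Hawaii or California data.

```python
from collections import Counter


def yes_no_summary(rows, field):
    """Count yes / no / other answers in one field of a cleaned sheet."""
    counts = Counter()
    for row in rows:
        answer = (row.get(field) or "").strip().lower()
        if answer in ("yes", "y"):
            counts["yes"] += 1
        elif answer in ("no", "n"):
            counts["no"] += 1
        else:
            counts["other"] += 1  # blank or free-text answers
    return counts


def compare_states(sheets, field):
    """Return a per-state summary for one field across any number of states."""
    return {state: yes_no_summary(rows, field) for state, rows in sheets.items()}


# Placeholder rows only; real sheets would be loaded from the cleaned data.
sheets = {
    "State A": [
        {"Title": "Statutes", "Authenticated": "Yes"},
        {"Title": "Session laws", "Authenticated": "yes"},
        {"Title": "Court opinions", "Authenticated": "No"},
    ],
    "State B": [
        {"Title": "Statutes", "Authenticated": "No"},
        {"Title": "Regulations", "Authenticated": ""},
    ],
}
result = compare_states(sheets, "Authenticated")
for state, counts in result.items():
    print(state, dict(counts))
```

Because `compare_states` takes a dictionary of any size, the same function would cover both the two-state comparison and the cross-state statistics mentioned above once more states were loaded.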

Raw and cleaned data are presented below.

| State | Spreadsheet from Report | Spreadsheet with Fields Relabeled |
| --- | --- | --- |
| Alabama | No data available (permissions setting) | |
| Alaska | Alaska raw data | |
| Arizona | No data available (believed deleted as of July 2014) | |
| Arkansas | Arkansas raw data | |
| California | California raw data | California cleaned data |
| Colorado | No data available (believed deleted as of July 2014) | |
| Connecticut | No data available (permissions setting) | |
| District of Columbia | District of Columbia raw data | |
| Delaware | No data available (permissions setting) | |
| Florida | No data available (permissions setting) | |
| Georgia | Georgia raw data | |
| Hawaii | Hawaii raw data | Hawaii cleaned data |
| Idaho | Idaho raw data | |
| Illinois | Illinois raw data | |
| Indiana | Indiana raw data | |
| Iowa | Iowa raw data | |
| Kansas | No data available (permissions setting) | |
| Kentucky | No data available (permissions setting) | |
| Louisiana | No data available (permissions setting) | |
| Maine | No data available (permissions setting) | |
| Maryland | Maryland raw data | |
| Massachusetts | No data available (permissions setting) | |
| Michigan | No data available (believed deleted as of July 2014) | |
| Minnesota | No data available (permissions setting) | |
| Mississippi | No data available (permissions setting) | |
| Missouri | Missouri raw data | |
| Montana | Montana raw data | |
| Nebraska | Nebraska raw data | |
| Nevada | No data available (permissions setting) | |
| New Hampshire | New Hampshire raw data | |
| New Jersey | No data available (permissions setting) | |
| New Mexico | New Mexico raw data | |
| New York | New York raw data | |
| North Carolina | North Carolina raw data | |
| North Dakota | North Dakota raw data | |
| Ohio | Ohio raw data | |
| Oklahoma | Oklahoma raw data | |
| Oregon | No data available (link leads to North Dakota data) | |
| Pennsylvania | Pennsylvania raw data (data provided because researcher attended the hackathon) | |
| Rhode Island | Rhode Island raw data | |
| South Carolina | South Carolina raw data (when Google Doc is viewed, message pops up saying file is in owner's trash and will be deleted soon) | |
| South Dakota | South Dakota raw data | |
| Tennessee | Tennessee raw data | |
| Texas | No data available (permissions setting) | |
| Utah | Utah raw data | |
| Vermont | Vermont raw data | |
| Virginia | Virginia raw data | |
| Washington | Washington raw data | |
| West Virginia | West Virginia raw data | |
| Wisconsin | No data available (permissions setting) | |
| Wyoming | Wyoming raw data | |


Step 5: Recommendations to AALL moving forward are as follows:


Appendix: Here are the 19 fields that appear in most states' data sets:

  1. Title of the set of government documents
  2. Branch of government that makes that set of documents (i.e., legislative, judicial, executive)
  3. Government Entity / Department that makes the documents
  4. Current internet publishers
  5. URL for government documents
  6. Scope / years covered for online (e.g., 1995 to present)
  7. Current print publishers
  8. Past publishers for this document series
  9. Type of information
  10. What format is the online version in
  11. Primary law (yes or no)
  12. Is the online version official (yes or no)
  13. Is a disclaimer provided for the online version (yes or no)
  14. If yes, provide the disclaimer language
  15. Is the online version authenticated
  16. Is there permanent public access for the online version
  17. Is there a statutory authority / directive to provide permanent public access
  18. Cost to access the official version?
  19. Is there copyright in the official version?