AALL Hackathon, July 2014
Step 1: A recommended dataset for the hackathon was the AALL's 2011-2013 Preliminary Analysis of AALL's State Legal Inventories (here). For each state, this report links to (1) the original 2007 status of authentication for online material in that state, (2) the 2009 update on authentication for the state, and (3) the raw data for that state used to compile the 2012 state legal materials inventory. We wanted to use the raw data. That data was kept in Google Docs spreadsheets that were marked public when the report was released in November 2012. Many of these became unavailable in the conversion to Google Drive in April 2013.
Step 2: Each state's link was checked to see whether its data was still available. Unavailable spreadsheets produced one of two error messages: a file-not-found error or an access-denied error. The lead researcher for the Pennsylvania data was present at the hackathon and provided access. It took her more than 30 minutes to click through the steps to grant access, so this is not something that could easily be done on demand if members of the public were to request access.
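The availability check in Step 2 was done by hand, but it could be sketched in code. The following is a minimal illustration only. It assumes each link check has already been reduced to an HTTP-style status code, and the mapping from codes to labels is an assumption of this sketch, not something taken from the report; Google's actual responses may differ (for example, a redirect to a sign-in page instead of an error code). The function names are hypothetical.

```python
from collections import Counter
from typing import Dict


def classify_status(status_code: int) -> str:
    """Map an HTTP-style status code to an availability label.

    The code-to-label mapping is an assumption made for illustration.
    """
    if status_code == 200:
        return "available"
    if status_code in (401, 403):
        # Corresponds to the "access denied" error seen in Step 2.
        return "permissions setting"
    if status_code in (404, 410):
        # Corresponds to the "file not found" error seen in Step 2.
        return "believed deleted"
    return "unknown"


def summarize(results: Dict[str, int]) -> Counter:
    """Count how many states fall under each availability label."""
    return Counter(classify_status(code) for code in results.values())
```

Given a dictionary that maps state names to status codes, `summarize` returns a count per label, which could be used to regenerate the middle column of the table on this page.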
Step 3: We compared spreadsheets for the states for which data was available. Columns were not in a consistent order from state to state, and fields were not labeled consistently. Law librarians identified which fields were the same across spreadsheets; 19 fields were shared across states (provided in the Appendix at the bottom of this page). Hawaii's and California's data were downloaded and cleaned as a proof of concept. For these two states, the 19 fields that appeared in most state spreadsheets were ordered and labeled consistently, so that both spreadsheets were organized the same way. Additional fields that appeared for one state but not the other were kept after those 19 basic fields.
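The cleaning in Step 3 was done manually in the spreadsheets. As a rough sketch of the same idea in code, the function below renames each state's column labels to a shared set of names, puts the shared fields first in a fixed order, and keeps any state-specific fields after them. The function name, the label map, and the field names used in the usage note are hypothetical; the actual 19 fields are those listed in the Appendix.

```python
def clean_rows(rows, label_map, canonical_order):
    """Relabel and reorder columns so every state shares one layout.

    rows            -- list of dicts, one per spreadsheet row
    label_map       -- maps a state's own column label to the shared label
    canonical_order -- the shared field names, in the desired order
    """
    renamed_rows = []
    extras = []  # state-specific fields, in order of first appearance
    for row in rows:
        renamed = {}
        for label, value in row.items():
            label = label.strip()
            renamed[label_map.get(label, label)] = value
        renamed_rows.append(renamed)
        for label in renamed:
            if label not in canonical_order and label not in extras:
                extras.append(label)

    # Shared fields first, then whatever else this state recorded.
    header = list(canonical_order) + extras
    cleaned = [{col: r.get(col, "") for col in header} for r in renamed_rows]
    return header, cleaned
```

For example, rows read with `csv.DictReader` from an exported spreadsheet could be passed in with a label map such as `{"Name of Resource": "title", "Link": "url"}` (hypothetical labels). Shared fields that a state did not record come back as empty strings, so every cleaned file has the same leading columns.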
Step 4: The goal was to make an interface for comparing aspects of government document authentication and preservation between Hawaii and California. Other states could then be loaded, so that any two states could be compared or statistics could be pulled across states. Because accessing and cleaning the data took almost all of the available time, an interface was not developed.
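No interface was built, so the following is only a sketch of the kind of comparison Step 4 had in mind, not something produced at the hackathon. It assumes two states' rows have already been cleaned to share field names, and it tallies how often each value of one field appears in each state. The function name and the field name in the usage note are hypothetical.

```python
from collections import Counter


def compare_field(rows_a, rows_b, field):
    """Tally the values of one shared field for two states.

    Returns a dict mapping each value to a (count_in_a, count_in_b) pair.
    """
    count_a = Counter(row.get(field, "") for row in rows_a)
    count_b = Counter(row.get(field, "") for row in rows_b)
    values = sorted(set(count_a) | set(count_b))
    # A Counter returns 0 for values that one state never used.
    return {value: (count_a[value], count_b[value]) for value in values}
```

Called as `compare_field(hawaii_rows, california_rows, "authenticated")`, with a hypothetical field name, this would give side-by-side counts for the two states.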
Raw and cleaned data are presented below.
State | Spreadsheet from Report | Spreadsheet with Fields relabeled |
Alabama | No data available (permissions setting) | |
Alaska | Alaska raw data | |
Arizona | No data available (believed deleted as of July 2014) | |
Arkansas | Arkansas raw data | |
California | California raw data | California cleaned data |
Colorado | No data available (believed deleted as of July 2014) | |
Connecticut | No data available (permissions setting) | |
District of Columbia | District of Columbia raw data | |
Delaware | No data available (permissions setting) | |
Florida | No data available (permissions setting) | |
Georgia | Georgia raw data | |
Hawaii | Hawaii raw data | Hawaii cleaned data |
Idaho | Idaho raw data | |
Illinois | Illinois raw data | |
Indiana | Indiana raw data | |
Iowa | Iowa raw data | |
Kansas | No data available (permissions setting) | |
Kentucky | No data available (permissions setting) | |
Louisiana | No data available (permissions setting) | |
Maine | No data available (permissions setting) | |
Maryland | Maryland raw data | |
Massachusetts | No data available (permissions setting) | |
Michigan | No data available (believed deleted as of July 2014) | |
Minnesota | No data available (permissions setting) | |
Mississippi | No data available (permissions setting) | |
Missouri | Missouri raw data | |
Montana | Montana raw data | |
Nebraska | Nebraska raw data | |
Nevada | No data available (permissions setting) | |
New Hampshire | New Hampshire raw data | |
New Jersey | No data available (permissions setting) | |
New Mexico | New Mexico raw data | |
New York | New York raw data | |
North Carolina | North Carolina raw data | |
North Dakota | North Dakota raw data | |
Ohio | Ohio raw data | |
Oklahoma | Oklahoma raw data | |
Oregon | No data available (link leads to North Dakota data) | |
Pennsylvania | Pennsylvania raw data (data provided because researcher attended the hackathon) | |
Rhode Island | Rhode Island raw data | |
South Carolina | South Carolina raw data (when Google Doc is viewed, message pops up saying file is in owner's trash and will be deleted soon) | |
South Dakota | South Dakota raw data | |
Tennessee | Tennessee raw data | |
Texas | No data available (permissions setting) | |
Utah | Utah raw data | |
Vermont | Vermont raw data | |
Virginia | Virginia raw data | |
Washington | Washington raw data | |
West Virginia | West Virginia raw data | |
Wisconsin | No data available (permissions setting) | |
Wyoming | Wyoming raw data | |
Step 5: Recommendations to AALL moving forward are as follows:
Appendix: Here are the 19 fields that appear on most states' data sets: