Launch: Demolition Hold List

Today I put a new thing on the Internet: Demolition Hold List, Chicago, IL: A place for info about architecturally significant buildings in danger of being demolished (or are already gone). Please take a look at all of the great buildings and, if you can, contribute photos, text, and memories of the buildings listed there.

I wanted to write down how I made this site because I think it could be useful to others in creating Web communities from spreadsheets and other structured text.

Tools

Steps

  • All of the information that originally populated this site was taken from the“Demolition Delay” page of the City of Chicago Department of Community Development Web site on Friday, March 6, 2009. I say this with specificity because the status information for each permit may have changed between that day and Monday, March 9, 2009, which is the day I set up automated tracking of those pages. This info was simply copy/pasted from City Web site
  • I placed all of the records in chronological order in a text file usingTextWrangler, “a very capable text editor”. There are many such tools in the world, but I like this one
  • I cleaned up the text in a number of ways, but mainly I did the “Text > Remove Line Breaks” thing. When you do this– and then remove the stray lines between each record– you get a nice clean set of lines with consistent text strings
  • These “text strings”– or word patterns– are the key to making the spreadsheet, which is the key to then doing all sorts of other stuff like working with developers and publishing the info in bulk
  • For instance, in every line, we’ve got this string– ” Date received: “, or “space-Date received-space”. All you have to do is do a search and replace– search for every instance of ” Date received: ” and replace it with “tab-Date received-tab”. Then you can do this all the way down the line, every time you want to have a new piece of info in a new field
  • The reason to chose “tab” is because you’re going to end up with a “tab-delimited file”, or one whose fields are separated by tabs
  • Once I have all of the natural fields separated, I save the document out of TextWrangler and open it as a spreadsheet in OpenOffice
  • If I did it right (and it will probably take a few times of re-importing), all you have to do is add column headers and you get something like this— a sharable, downloadable, publicly updatable, syndicatable list of items
  • The next thing I have to do is modify this sheet so that you can use it to import the contents as posts in WordPress. The guy who made this nifty utility has a sample file that requires a very specific format for the upload file
  • The main thing you need to do to conform with this format is to pull a lot of the data from separate into columns in a way that makes sense. The first step is to create new columns right next to each content type so that each post can have a set of text that makes sense. In other words, you have to put the phrase, “Permit:” next to the permit number, “Applicant:” next to the applicant name, and so on
  • Once you’ve got that, export the file to a CSV in pipes-delimited format. Open that file up in TextWrangler again and do some more copy/ paste replacing to get the post content into one column. The main thing is to go back to all of those columns you made and smoosh them together with the text that it relates to. For example, replace “|Permit:|” with “|Permit: ” (keeping the initial pipe, so that the import script knows where the column starts, but removing the second pipe and replacing it with a space, so that it flows nicely with the content of the next column). Then replace “|Applicant:|” with ” Applicant: ” (replacing the pipes with spaces)
  • When you’re done, the file will look something like this— formatted perfectly for uploading into WordPress
  • Now you’ve got to set up a Web host and install WordPress. Sign up for a cheap Web host that offers one-click WordPress installs– Hostmonster w/Fantastico costs something like $110 for two years with unlimited domains. Crazy. There’s a ton of WordPress support as well. Get a mitt and get in the game
  • In order to interact with your server, you’ve got to have an FTP client. I likeTransmit
  • Use Transmit to upload the CSV utility to the proper directory, and then you’re ready to import the posts. In about 5 seconds I had a Web site with 231 unique pages
  • Here’s the big question– how the hell do I keep up with the changes to the source data? One solution is Versionista. They automatically watch flat HTML pages and let you know when the page changes. You can evensubscribe to the pages in RSS. This is where it gets a little squirrely. I want to be able to automatically pump these changes into my spreadsheet and the Web site, but since the City of Chicago site often goes down and throws a 404 error, the system thinks that the page has changed (and it has) and that the change is relevant (but it isn’t)
  • There is a Yahoo Pipe to help manage the changes as well (must enter this Versionista address: http://versionista.com/pub/16576/1/1/). I learned this from the people over at ProPublica. More to come on that. Anyway, for now there is some human action that has to occur. God bless us all.

Posted

in

, ,

by

Tags: