Monthly Archives: September 2013

Geocoding Historic Homes with Google Fusion Tables

Using data available from Wikipedia on historic homes constructed around the turn of the 20th century, I have created a map of structures in Knoxville, Tennessee designed by the architect George A. Barber.

I pulled the data from <> with a simple copy-and-paste operation into an Apache OpenOffice Calc spreadsheet.

I saved the spreadsheet as a comma-delimited .csv file.

I added a new column and duplicated the street addresses into it, then deleted the parentheses surrounding each street address, along with the name of the property.

In column 1, I deleted the street address and its parentheses, retaining only the name of the property.
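The manual split described above can also be scripted. Here is a minimal sketch, assuming the Wikipedia entries follow a "Property Name (Street Address)" pattern; the example entry is hypothetical:

```python
import re

# Assumed entry format from the Wikipedia list, e.g.
# "Hope Brothers Building (123 S. Gay St.)" -- a hypothetical example.
PATTERN = re.compile(r"^(?P<name>.*?)\s*\((?P<street>[^)]+)\)\s*$")

def split_entry(entry):
    """Split 'Name (Street)' into (name, street); street is '' if absent."""
    m = PATTERN.match(entry)
    if m:
        return m.group("name"), m.group("street")
    return entry.strip(), ""
```

Running each spreadsheet row through a function like this would populate the name and street columns in one pass instead of two rounds of manual deletion.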

After saving the .csv file again, I opened up my personal Google Drive account.

I added the “Google Fusion Tables” application from Google, and then selected “create new fusion table” as instructed in Google’s tutorial.

After importing the data, I ran into some problems with the division of street, city, and state. Under "File > Geocode," my "street" column was not immediately recognized as a location address. After changing the "street" drop-down in the "Rows 1" view to "location," I was able to direct the application to geocode based on the street address.

At present, this is a very basic map.

I do like the ease with which it obtained the lat/long coordinates, and how it transformed the table data into "cards" that pop up on the map with the pertinent information.

I’m also happy that it can export the resultant geocoded map as KML.

For future work, I think it would be interesting to link a Flickr or other photo management system to the Geocoder.

I also understand it is possible to add a Google Street view image of the particular property.

However, it is necessary to obtain the location information in the form of lat/long for this to work.

It is unfortunate that Fusion Tables does not append the lat/long information to the table.

There is software available which can provide this information.

From my course in the Geography Department, I’m aware of this software:

The application of interest is listed under “Google Geocoder.”

Geocoding with Google Earth is accomplished through two programs: KMLGeocode and KMLReport. The first program reads Excel worksheets or an XML export of a table from a relational database system and creates a KML file that can be loaded into Google Earth. Once the KML file is loaded, Google Earth will attempt to geocode each entry in it. After the file is geocoded, it can be saved to a new KML file. This file will contain the coordinates of each address found. The second program, KMLReport, reads that file and generates two files: one for geocoded addresses and one for addresses that were not found. The file for geocoded addresses is written as a comma-delimited text file that can be loaded into ArcGIS.
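Extracting the coordinates from a geocoded KML file is also straightforward to do by hand. Here is a minimal sketch using Python's standard library, assuming the standard KML 2.2 layout of Placemark, Point, and coordinates elements (longitude first):

```python
import xml.etree.ElementTree as ET

KML_NS = "{http://www.opengis.net/kml/2.2}"

def placemark_coords(kml_text):
    """Yield (name, lat, lon) for each Placemark in a KML string.

    Assumes the standard layout: <Placemark><name>...</name>
    <Point><coordinates>lon,lat[,alt]</coordinates></Point></Placemark>.
    """
    root = ET.fromstring(kml_text)
    for pm in root.iter(KML_NS + "Placemark"):
        name = pm.findtext(KML_NS + "name", default="")
        coords = pm.findtext(f"{KML_NS}Point/{KML_NS}coordinates")
        if coords:
            lon, lat = coords.strip().split(",")[:2]
            yield name, float(lat), float(lon)
```

The resulting (name, lat, lon) rows could then be written back out as a CSV for re-import into a Fusion Table.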

At the moment, it seems like obtaining a street view would require me to obtain the lat/long coordinates for the data, then append them to the Fusion Table.

Fusion has some advantages, including automatic publishing to the web, the ability to easily update table data, and support for collaborative data entry. I can see some potential applications for my neighborhood organization, or any other collaborative group with limited access to mapping technology (especially a library system or other local municipality that does not have thousands of dollars to spend on ESRI software).

“Racial Dot Map” Visualization Discussion

My assignment in Geographic Information Librarianship is to find, read, and be ready to discuss a peer-reviewed GIS related article for class.

A GIS topic of interest initially came to my attention via my daily browsing for nuggets of information on social media – in this case, Facebook.

Someone had shared a “dot map” showing population data for the United States based on census data. Each dot on the map represents one person, and all of the dots are color coded to represent race. Keep in mind these are approximations of race – if you zoom in on my house, you won’t see me exactly, but you’ll see a “representative” of my census block.
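The core idea behind such a dot map can be sketched in a few lines: for each census block, place one randomly jittered dot per person inside the block's bounds, tagged with a race category. This is a hypothetical illustration of the concept, not the mapmaker's actual code; the bounds and counts below are made-up example data:

```python
import random

def block_dots(bounds, counts_by_race, rng=random):
    """Generate one dot per person inside a census block.

    bounds = (min_lon, min_lat, max_lon, max_lat);
    counts_by_race = {"White": 12, "Black": 30, ...}.
    Returns a list of (lon, lat, race) tuples.
    """
    min_lon, min_lat, max_lon, max_lat = bounds
    dots = []
    for race, count in counts_by_race.items():
        for _ in range(count):
            dots.append((rng.uniform(min_lon, max_lon),
                         rng.uniform(min_lat, max_lat),
                         race))
    return dots
```

Repeated over every block in the country, this is how a dot ends up "near" a person's house without pinpointing any individual.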

This had been done previously (See the “Census Dotmap” at ), but by integrating additional datasets to “guesstimate” the population density by census block, an enhanced visualization was made possible.

The original article was linked on “” which had the inset text proclaiming “This is the most comprehensive map of race in America ever created.”

Here’s the original article:

What’s fascinating to me is that the representation visualizes 7 gigabytes of data.

Now the blogosphere and media are abuzz with this, but I need a peer-reviewed article.

So, while the dot map has its own Web page, I am turning to an earlier study that is cited as the "inspiration" for the more recent one.

The study was peer reviewed by the Advisory Board for the US2010 project. The report is entitled “The Persistence of Segregation in the Metropolis: New Findings from the 2010 Census” and can be downloaded online: .

In this report, 2010 census data suggest that desegregation is a slow process, and growing Hispanic and Asian populations are "as segregated today as thirty years ago."

Because I live in a “typical” black neighborhood with 40% whites, this item of analysis caught my attention:

“Yet another factor is the difference in the quality of collective resources in neighborhoods with
predominantly minority populations. It is especially true for African Americans and Hispanics
that their neighborhoods are often served by the worst performing schools, suffer the highest
crime rates, and have the least valuable housing stock in the metropolis.”

A spatial analysis with census data showing demographics, income, and community resource can be useful for city administrators when making decisions about how to allocate funds. Perhaps I am being naive in hoping for a political world governed by data-driven decisions, but the technology nonetheless exists to do so.

This kind of decision making is a way to ensure that resources are distributed equitably.

However, the value of the "dot map" is clear in reviewing this paper, as much of the data is presented in tabular form without any spatial visualization. Spatial visualization can enhance the experience of absorbing the data and intuitively understanding what it means. The example of 8 Mile Road in Detroit, with a clear dividing border between black and white communities, is a very clear representation of the difference between the two data-reporting options.

Stats 537 Homework 2

Homework two was challenging as I am struggling with picking up SAS.

I have had some prior experience with R and feel it is more straightforward.

This homework prompted me to e-mail the instructor concerning learning R versus SAS.

Homework 2 is archived within this post; see attached file.


DataONE / DataUP Tool User Experience Study

Hi Rachel,

Just wanted to confirm that I'm planning on meeting you in Hoskins after my class ends Friday morning.

I have spent a bit of time assessing the status of DataUP.

One thing to clarify in the morning is what DataUP tool Dr. Tenopir and Prof. Frame want us to look at.

Per this recent (August 4) tweet, the Excel add-in is being phased out in favor of the OS-agnostic Web app.

Therefore it seems likely any usability testing should center on the Web app.

Upon confirming a focus on the DataUP Web app, I think it would be worthwhile to reach out to the research team that devised the DataUP Excel Add-in.

We may find they have some ideas in mind for usability questions and perhaps did not have the time to address the questions or run usability studies on the tool.

The research team’s contact information is listed below:

kristin.tolle [at]
carly.strasser [at]
John.Kunze [at]
patricia.cruse [at]
Stephen.Abrams [at]
Perry.Willett [at]

In looking around today, I found this resource that seemed helpful: "Simplifying data repository adherence and metadata management for environmental science." This tutorial shows you how to use the DataUp web application, a tool designed to help environmental scientists easily upload files, create and manage metadata, create a unique data citation, and post files to a repository.

Thanks, and see you in the morning.


Role of Dataset in Scholarly Publishing – Final Paper Feedback

From: Tenopir, Carol
Date: Fri, Aug 9, 2013 at 6:53 PM
Subject: RE: Jessel IS590 Research Paper
To: Tanner Jessel

Tanner: Attached is your final paper with my comments and your grade on the paper. Below is a summary of your grades for each assignment and final grade. I had a great time with you all in London and enjoyed reading your journals and papers. Have a good fall semester!

· Attendance and participation – (25%) Grade: A. We appreciated your insights and active questions throughout the two week experience. You added a lot to the course.

· Daily Journal –(25%) I loved your blog entries! Hope you follow up on some of the things you’ve made notes on, for example looking into Scielo some more and following up with some of the folks you met.

· Course Paper – (50%) Grade: A. See comments on paper.

Final Grade in Course: A. Great work!

Carol Tenopir
Chancellor’s Professor, School of Information Sciences
Director of Research and Director of the
Center for Information and Communication Studies
College of Communication and Information
University of Tennessee
1340 Circle Park Drive, 423 Communications Bldg
Knoxville, TN 37996-0341
Office: 865 9747911


Aerial photography of Great Smoky Mountains, NC-TN


The week before last I attended the graduate student open house at Hodges library.

I spoke with a librarian staffing a table with information about the Smokies Collection about the University potentially acquiring some historical aerial photography of the Park. Would you happen to know who that librarian was? She was handing out bookmarks with images from the Smokies Collection.

I took a business card to follow up but unfortunately have misplaced the card, but I do want to follow up on this so I thought I’d try here.

I am currently volunteering with the National Park on a cartography project documenting historical land use in the Smokies.

We’re attempting to classify land use using early aerial photography surveys conducted by the TVA and USGS in the 1930s, 1940s, and 1950s.

Through a partnership with Clemson University, Park Service archivist John McDade already has digitized aerial photography from the 1950s, the earliest comprehensive set of aerial survey photos of the park.

However, because evidence of land use can rapidly change, I hope to obtain earlier photos documenting land use prior to the Park’s formation.

Toward that end, I have been in communication with both the National Archives Cartographic Section in College Park, Maryland, and the Tennessee Valley Authority’s Maps Division based in Chattanooga.

My correspondence indicates that the National Archives has aerial survey film negatives for these years:

Haywood Co., NC – 1924, 1939, 1940
Swain Co., NC – 1939
Blount Co., TN – 1938, 1939
Cocke Co., TN – 1937, 1938, 1939
Sevier Co., TN – 1937, 1938, 1939

I should point out that these are index sheets, and the information I have from the Archives suggests that there are at least four 20 x 24-inch sheets per index per survey. Haywood County has six sheets because it covers a larger area.

I should also point out that I do not know to what extent the Park within each county is covered by the survey missions. While TVA/USGS maps uniformly detail structures, roads, and other features for the "lowland" areas outside the park, it was not until publication of the 1943 map series that details of areas within modern-day park boundaries were added.

Essentially the earliest topographic maps (based on the aerial photo survey images) have a giant "blank spot" on the map where the Park is. This seems to indicate that possibly: a) USGS/TVA did not find mapping the interior of the park to be a priority or b) USGS/TVA did not have aerial photographs of the interior of the park.

My expectation is it’s actually a mix of the two – mapping the interior of the park was probably not a priority; therefore flight time over the interior was not allocated for missions photographing the interior of the future Park.

The reason I mention this is it creates a problem in determining how many film negatives would need to be digitized, which was a question the librarian I spoke with at the graduate student open house asked me.

For an estimate: for the 1953 series, there are 30 index sheets, each 20 by 24 inches. It seems likely to me that, based on the size of each county, one index sheet probably indexes many smaller film negatives.

Again, the information above was provided to me by the National Archives.

The original film canisters are available from TVA Map and Photo Records office in Chattanooga. The original film negatives would need to be digitized, and this work has a fee associated with it. However, I believe this fee would be less than that of a vendor working out of the National Archives, which is the only option available for obtaining digital copies from NARA Cartographic Section.

From prior conversations with Peggy Cooper of TVA Maps, I was quoted these prices in April 2013:

$80 per hour for research to find the materials
$28.50 per 1000 dpi scan
$15.00 to write to a disc
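To put the quote in perspective, a quick back-of-envelope estimate can be sketched from the three fees above; the research hours and scan counts plugged in are hypothetical, since the number of negatives is unknown:

```python
# Rates from TVA's April 2013 quote.
RESEARCH_PER_HOUR = 80.00   # research to find the materials
SCAN_EACH = 28.50           # per 1000 dpi scan
DISC_FEE = 15.00            # per disc written

def tva_estimate(research_hours, n_scans, n_discs=1):
    """Total cost for a digitization order under the quoted rates."""
    return (research_hours * RESEARCH_PER_HOUR
            + n_scans * SCAN_EACH
            + n_discs * DISC_FEE)
```

Even a modest hypothetical job of two research hours and ten scans would run $460, which illustrates why the cost scales quickly with an unknown negative count.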

Contact information for Peggy Cooper is below:

Peggy A. Cooper
pacooper @
TVA Maps
2837 Hickory Valley Road
Chattanooga, TN 37421
Toll Free: 800-627-7882

As a graduate student working on this as a volunteer, obviously $80 per hour and $28.50 per scan for an unknown number of film negatives is a bit steep for me to pursue on my own, so I was excited to hear that University Libraries might be interested in looking at the material for possible acquisition.

Please let me know if there is any additional information on these photographs that you might need to evaluate this resource for possible acquisition.


