London Experience, Day 5: “Really Clever Stuff” in Science Publishing

Today the 19 Pratt and UTK students travelled to our base-of-operations instructional facility, the Weston Room at the Maughan Library. This is an impressive gothic-styled building. The room we were in dates to the 1600s and, as a deconsecrated chapel, holds (in a morbid sort of way) the remains of former “Masters of the Rolls,” entombed in the same hall where we took instruction. Memento mori, indeed.

The first speaker for our instruction was Dan Pollock, with experience in digital publishing at Nature.

I greatly enjoyed his talk, particularly his comments on “mining the world’s knowledge.”

Essentially, the “circa 2010” publishing model is “really inconvenient.” And what is that model?  Well, it’s a collection of articles, chapters, metadata links, and structured datasets.

The better model incorporates structured, interactive, and queryable figures and text. Incorporates semantic search.  Features linked data.

Already, semantic search is available from organizations like Google Scholar, PureDiscovery, SureChem, and WolframAlpha, the last of which is described as a “computational knowledge engine.” (Side note: what’s up with combining names? Is that a side effect of the no-blank-space URL or just something marketers love to do?)

This is what Pollock is talking about when he says “really clever stuff here” in the information life cycle – and ultimately in the research data life cycle.

This is the Web 2.0 and Web 3.0 technology – semantic Web, linked open data, analysis and visualization tools built into the scholarly journal – which scholarly e-publishers must pursue to continue the revolution that the printing press started in 1450.

Publishers can start to make their data available via Application Programming Interfaces, and adopt other standards like HTML5 to advance interoperability. has already done some of this with the linked data platform described at <>. It was great to hear a science publisher like Pollock speak. It was a very engaging and enjoyable presentation, and I certainly appreciated the message about the role of information technology and information science in continuing the publishing revolution.







London Experience, Day 4: The Data Paper

Today the class enjoyed a series of presentations at the first annual “Strand Symposium on Digital Scholarship and ePublishing” at King’s College London.

I’m providing the schedule here, along with an uploaded version 2013-Strand-Epub-symposium:

09:30-09:35 INTRODUCTION by Anthony Watkinson (CIBER Research), organiser and chair

09:35-10:15 Introductory Presentation by Professor Carol Tenopir (UTK) – How Scholars decide to Trust Resources

10:15-11:30 FIRST SESSION Building and evaluating cultural resources

1. Professor Tula Giannini (Pratt-SILS): How Brooklyn’s Libraries and Museums Collaborate to Create a new Digital Cultural Heritage Resource: The Brooklyn Visual Heritage Website

2. Professor David Nicholas (CIBER Research): Evaluating the Usage of Europeana

11:15-12:00 Refreshment Break

12:00-13:15 SECOND SESSION Are we publishing and, if so, for whom?

1. Dr. Stuart Dunn (KCL) The distinction between exposing data and publishing: a case study from archaeology

2. Dr. Susan Whitfield (BL) The challenge of creating a resource and interface that is accessible across linguistic, disciplinary and cultural boundaries to the everyman of the Internet

13:15-14:15 Opportunity for lunch: lunch is not provided but there are many appropriate places to eat in the Waterloo surroundings.

14:15-15:30 SESSION THREE Managing online resources

1. Dr. Richard Gartner (KCL) Digital Asset Management- the pleasures and pitfalls of metadata

2. Matt Kibble (Bloomsbury) The product management role in planning and building digital resources

15:30-16:00 Refreshment break

16:00-17:15 SESSION FOUR Investment and sustainability

1. Dr. Paola Marchionni (Jisc) The end is the beginning: the challenges of digital resources post digitisation

2. Chris Cotton (Proquest) The benefits of public private partnerships in large-scale cultural and heritage digitisation

17:15-17:30 CONCLUDING REMARKS Anthony Watkinson

I enjoyed taking in a lecture on “Trust and Authority of Scholarly Resources” from Dr. Tenopir – the first time this data has been presented.  It is useful for understanding not only the information behavior of scientists, but the general public as well.  The question “who do you trust” will always loom large in public discourse.

Our course leader Anthony Watkinson, himself a former publisher and University College London lecturer, did a superb job putting together speakers. Even with the full slate of speakers and 19 graduate students, admittedly there might have been more in attendance. Anthony felt he had promoted the event a bit late. In all it was not bad for a first year on a weekday with no free lunch!

Anthony did a big favor for the science data students in recruiting Dr. Fiona Murphy from Wiley Publishing to speak with us for an informal lunchtime talk on science data.  We also had one of the Pratt MLIS students come with us as she is interested in data science.

In preparation for the lunchtime meeting, Anthony sent a few key resources. What I found most interesting was that Dr. Murphy had delivered a talk at the “Open Access Infrastructure for Research in Europe” OpenAIRE/LIBER workshop on “Dealing with Data. What’s the Role for the library?”

Dr. Murphy’s talk was entitled ‘Data Publication: a Publisher’s perspective’ and there are two ways of accessing it – see the presentation online <> or watch a video online <>.

From the description:

Fiona from Wiley spoke about what publishing data is all about: why it is important in terms of being cited and credited. The growing pressure funder mandates also plays a role.

Some things I found particularly interesting:

Charting the growth of open access – the number of papers published between 2000 and 2012 was under 5,000 for most publishers, with the exception of BMC, which broke 5,000 in 2005, while other publishers lingered below that mark well past the first half of the first decade of the new millennium.

Dr. Murphy also commented on what exactly a data article is:

A data article describes a dataset, giving details of its collection, processing, software, file formats etc, without the requirement of novel analyses or ground breaking conclusions. It allows the reader to understand the when, how and why data was collected and what the data-product is.

Example: Geoscience Data Journal

With an example data paper:

Some other links:

“Peer REview for Publication & Accreditation of Research Data in the Earth sciences (PREPARDE)” –

“The Research Data Alliance aims to accelerate and facilitate research data sharing and exchange”

It is probably worth subscribing to this mailing list <>

And a quote worth sharing:

Publishing an article without at the same time making the data/evidence available is scientific malpractice

Geoffrey Boulton

Dr. Murphy is on twitter:


She is also a member of the “International Organization of Scientific, Technical & Medical Publishers” research data group <> focused on “1) exchanging information on new initiatives about the integration of research publications and research data and 2) to discuss evolving best practices in this new area.”


London Experience, Day 3: High Class to High Tech

On the second programmed day of the London portion of my scholarly publishing course, we visited The British Library.

The building itself evokes the “stack of bricks” style of architecture, echoing an earlier brickwork Victorian structure nearby, but still mildly clashing in form if not in hue.

Inside, cool marble ascends, centering the eye on a five-story collection of books – “the King’s Library” – that terminates in a pool of black marble that subtly mirrors the collection and gives the appearance that the collection extends to infinity. Seeing the personal collection of King George III was impressive. This is the reading collection of the man vilified by our founding fathers – even Thomas Jefferson’s library at Monticello was not so large!

While the library’s primary mission is to archive a copy of every single publication originating in the United Kingdom, translating to 5,000 publications per day, there is also some high tech wizardry going on in the digital forensics lab.  This was of particular interest to me because of nascent problems in the field of ecology – a relatively new field – where prominent early leaders in the field are now nearing retirement age.

As the DataONE Data life cycle points out, retirement of the primary researcher can be a key moment in the longevity of a dataset.

Enter Jeremy Leighton John. He has something of a Crime Scene Investigator’s capabilities, but his forensic investigations center on retrieving information from archaic computer systems. With piles of floppy disks, hard disks, and computer programs no longer used, Jeremy has honed forensic computing to an art. He can emulate ancient operating systems and programs, and doesn’t even need to turn the original computer on to access its files.

Interestingly, he’s an ecologist in a “library” world, much like myself and many other ecologists I’ve met who are interested in data science and data preservation. He also has some fascinating research interests spanning from complex systems to bioinformatics.

He’s also on twitter:

While Jeremy’s talk was most pertinent to my interests, some other topics worth looking into include:

“Making maps accessible through crowd-sourced geo-referencing;”

Digital Curator Nora McGregor’s talk on digital scholarship (@ndalyrose on twitter and there is a Digital Scholarship blog) where she presented some continuing education opportunities for librarians at the British Library.

Some courses I pulled from the list that I wish were taught at my own institution:

  • Data Visualization for Analysis in Scholarly Research
  • Georeferencing and Digital Mapping
  • Information Integration: Mash-Ups, APIs, and the Semantic Web
  • Managing Digital Research Information
  • Working Collaboratively: Using the BL Wiki and Beyond
  • Metadata for Electronic Resources: Dublin Core, METS, MODS, RDF, XML

I may be able to track down some of these course offerings online – for instance the first course has additional readings online.

Finally I have in my notes the “Big Data project” from Oxford Internet Institute at <>, which is quite similar to the U.S. project “the Internet Archive” that takes a snapshot of the Web.  This is unique in that it is looking at .uk domain only.

Also of note was an exhibition on propaganda – from Roman coins to Twitter. Twitter was perhaps the most interesting – as it employed a type of sentiment analysis to determine if a given tweet was positive, negative, or neutral. There was a visualization wall as well, allowing a color-coded view of each tweet in real time or linked with a timestamp to a recorded event unfolding on TV, such as the 2012 Olympic Ceremony.
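I don’t know what method the exhibition actually used, but the basic idea of sorting tweets into positive, negative, and neutral can be sketched in a few lines of Python. This is a minimal lexicon-based sketch: the word lists, the `classify_tweet` function, and the color mapping are all hypothetical and chosen for illustration only. A real system would use a far larger lexicon or a trained model.

```python
import re

# Hypothetical word lists for illustration only.
POSITIVE = {"great", "love", "amazing", "wonderful", "brilliant"}
NEGATIVE = {"awful", "hate", "boring", "terrible", "dull"}

# Hypothetical color coding, in the spirit of a visualization wall.
COLORS = {"positive": "green", "negative": "red", "neutral": "grey"}


def classify_tweet(text: str) -> str:
    """Label a tweet by counting positive and negative lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in [
        "What an amazing opening ceremony, love it!",
        "That was a boring, dull speech",
        "The ceremony starts at nine",
    ]:
        label = classify_tweet(tweet)
        print(f"{COLORS[label]:>5}  {label:<8}  {tweet}")
```

Simple word counting like this misses sarcasm and negation (“not great”), which is part of why sentiment analysis on tweets is harder than it looks.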

Finally, I got to see the Magna Carta, and many other rare books.  The abundance of rare books made it clear why security was incredibly tight for the entire facility – even employees of over 30 years were subject to intense scrutiny.

One last item of note – and as you can see there were many – was the Qatar digitization project. Funded by oil wealth, a floor of 14 information professionals, two with library science training, were busily digitizing the archives of the East India Trading Company, concentrating on its activities in the Middle East.


London Experience, Day 2: Cultural Impact on E-Publishing?

Since settling in and prior to arrival, a key question on my mind is this: why is London, England such a center of publishing activity for scholarly journals?  Why might it be more advanced than in the United States?

Are there cultural differences, or has Europe simply been at the forefront of publishing for the longest?

Certainly there is a long tradition of academic scholarship in Europe – and Europe is the birthplace of publishing in the modern sense, of mass production and dissemination of printed material.

There is certainly an abundance of scholarly institutions – libraries, museums, universities – in and around London. Great scholars from Newton to Darwin hail from England – Darwin is even on the 10 pound bank note here, and is buried in Westminster Abbey. Egyptology, “natural philosophy” and medical arts were all in vogue in Europe – particularly London.

The Dutch scientist Antonie van Leeuwenhoek in 1683 published a series of letters and pamphlets on his observations of small “animalcules” via primitive lenses, precursors to today’s microscopes. Who received his correspondence? The Royal Society.

Not of the Netherlands. That’d be the Royal Society of London. So why was a Dutch scientist corresponding with the Royal Society? Was it the only venue for scientific communication in Europe? Perhaps. Founded in 1660, it was certainly the oldest, and in 1665 began publishing “Philosophical Transactions,” arguably the first scientific journal in the world.

So London clearly had an early start in the scientific publishing game. Two Protestant states, Germany and England, seem to have a key role in the science communication story. Perhaps because anywhere else, scientists risked excommunication? Or, was English as the language of commerce the central focus of early science communication efforts?

This might be the first sign of the cultural impact upon scholarly publication in Europe.

I also have considered the attitude of the people of England themselves. Citizens of the U.K., especially following the Victorian age of empire and the epic struggle against fascism and totalitarian rule in the 20th century, are perhaps among the most ardent supporters of a free and open democracy. England indeed was among the first countries to reject foreign rule by the Vatican, and first to set the course for modern representative democracy with the Magna Carta.

What might this cultural love affair with open access and democratic control and self determination mean for scholarly publishing?  In spite of being the center of commerce and a world financial center, clearly embracing the capitalist model, might the cultural love of community spirit lead to a greater emphasis on open access?  These are interesting questions to consider in contemplating access to scholarly materials as a product of England.

A few cultural differences I noticed right away: on my trip from Heathrow to Central London via the Tube, I noticed that the concept of “personal space” was non-existent. In the U.S., the social norm is to give each other a wide berth of at least 3 feet. Even on the elevator to our dorms, and among friends who’ve known each other for nearly a year, American students would forgo a packed elevator to wait for the next “lift.”

Not so in the U.K. Complete strangers shared not only personal space but physical contact – completely without hesitation. No accident, I sat elbow-to-elbow on the tube, with no effort made to keep at least an airspace.

I wondered if this was something inherent to a large city – I’ve visited New York and Washington underground transit systems and don’t remember such familiarity – or if it’s just part of British culture. They fought tooth and nail through a war – many families cramming into the underground system. Propaganda posters like “All together now” may have galvanized society to see itself as one cohesive unit – casting aside the “cowboy” or “desperado” mythos of the U.S.

Further prompting my thinking along these lines was a message in both the common kitchen and bathroom – concerning “fairness” of cleaning up.  While in the U.S. such snarky, slightly passive aggressive notes are more concerned with “we are not your mother or father” and encouraging personal responsibility, the bathroom/kitchen messages in my University College dorm (Astor College on Charlotte Street) extolled the virtues of being “fair.” It is “not appropriate or fair to others” to leave dirty dishes in the sink; “it is not fair to the cleaning staff or others” to leave a mess in the bathroom.

The U.K. does seem concerned with fairness, as evidenced by public transit, universal healthcare, and the anecdotal evidence provided by these signs.

Might this cultural reverence for “fairness” also permeate into the publishing realm, specifically regarding open access to scholarly and scientific information?

Finally, England was the birthplace of the term “scientist.” There was some discussion today in class regarding the start of the first true “scientific journals.” An interesting way of looking at it could be via the Google nGram project. As already described by a variety of bloggers, the popularity of the word “scientist” can be traced against that of “natural philosopher” based on both terms’ appearances in digitized literature. A chart of the two from 1800 to 1900 shows “scientist” is the clear victor.

From Cambridge University, William Whewell coined the term “scientist” around 1830 as a play on “artist.” It was not until the latter half of the 19th century that the term flourished in the popular literature – from the Google nGram for the word, we can see that it passes “natural philosopher” only in the 1870s.

The scitext website from Cambridge puts the earliest French semi-scientific publication, Journal des Sçavans at 1665, the same year that “Philosophical Transactions of the Royal Society of London” started up.

Finally I see something of an arc from the “natural philosopher” – the tinkerer, ponderer, renaissance man – to the true specialist – and finally to the “generalist,” as one blogger suggests prominent, interdisciplinary scholars like Jared Diamond can be described.

Even Charles Darwin himself was a generalist – with wide ranging interests.

Yet to stay current, scientists seem to increasingly be forced into specialization.  Perhaps a rise in collaboration is the only way to retain the “natural philosopher,” holistic perspective.



IS 590 – Problems in Information Sciences – Scholarly E-Publishing

My summer semester of 2013 includes a 3-credit hour course in Scholarly E-Publishing.  This course provides exposure to an international electronic publishing industry, particularly focused on journal and book publishing, from a world center of electronic scholarly publishing: London, United Kingdom. It offers an intensive series of talks, site visits, and instruction designed to explore how e-publishing is changing both the way scholarly research is conducted and communicated.  Information professionals from Oxford, Cambridge, the British Library, Elsevier, Wiley,  Proquest and more share their unique perspective on scholarly publishing.

Because scientific effort must be clearly communicated and disseminated via scholarly publishing, the course content is of particular interest to the University of Tennessee “SciData” program and is highly relevant to my professional and scholarly goals.  I am particularly interested in understanding how publishers intend to work with open access data repositories such as DataONE, Dryad, or spatial data repositories such as ShareGeo in the UK or EDAC in the U.S.  I am interested in the concept of the data paper, and how a dataset and a data paper might be linked to a publication and shared across platforms with the scholarly community.

The course is a joint venture of University College London Department of Information Studies, the Pratt Institute School of Information and Library Studies in New York City, and the School of Information Sciences at the University of Tennessee.

Given my background in natural sciences (B.S., Ecology and Evolutionary Biology) and entry into the UT School of Information Sciences 2013 cohort concurrently with the 8 SciData Scholars, I was allowed the opportunity to participate in the course.

Along with a blog of reflections on daily course material and the London experience, the course culminates in an individualized research paper.  I intend to focus on the role of data and datasets in scholarly publishing.  The role of datasets in scholarly publishing is most pertinent to my work with the DataONE project concerned with accessibility and preservation of environmental data.

For more on the course, follow the tags “INSC 590 – E-Publishing” or view the syllabus online (SU13-IS590-E-publishing, PDF). The course syllabus is also available in .doc format <>.



London Experience, Day 1: E-Publishing as an Information Problem

I arrived in London on Monday to overcast skies and blustery temperatures – blustery at least for a Tennessean, especially one who’d just spent 3 weeks in New Mexico’s high desert at the University of New Mexico Environmental Information Management Institute.

I’d been chasing the sunrise eastward since 10:25 Sunday morning, leaving the arid desert for a 2 hour hop to George Bush International Airport in humid Dallas, Texas, then taking a direct 9 hour international flight to Heathrow International in London.



Cloud bank on the approach to London.

The sun rose high over the northern latitudes above a sea of clouds – I pondered how generations of Europeans had made the reverse trip across the frigid North Atlantic towards a better life in the Americas – and how some never made it to that promised land. It is odd to be an American making the reverse journey, especially so nonchalantly as on a jet aircraft. Perhaps the former fleet of Air France supersonic jets gave an even more injurious insult to the hard-won sea voyages of early seafarers.

As the plane made a descent to London, I caught a few glimpses of the rocky coastline – only a few weeks before, I’d been discussing the fractal nature of the British coast. It was nice to see in person. Nearing London, I took interest in the orderly structure of the city’s layout, especially in comparison to U.S. cities like Atlanta and Charlotte I’ve flown into – metropolitan areas with much surrounding land where sprawl freely creeps. The countryside was lush and verdant. Rolling hills in the distance surprised me for their resemblance to the rolling green hills outside Knoxville, Tennessee. The city grid seemed efficient and bustling – my eye focused on the steadily snaking rail lines and linear beads of mercury running along iron oxidized tracks, carrying morning commuters to their destinations.

Note the open space and clearly defined edges.  Culturally, is England less prone to Sprawl?  This is the land of the "commons" after all.


Inasmuch as communication is a transportation problem, essentially how to transfer information from one location to another, the idea of my transatlantic flight and these inner city commuters captured my imagination. As much as my flight makes earlier transatlantic travel seem hopelessly primitive, I wonder how many years will pass before our fossil-fuel based jet setting is also hopelessly primitive, perhaps as our atoms are bounced across continents, transmitted and re-assembled as a data stream a la Star Trek transporter fame.

As I embark to study scholarly e-publishing, the same question arises: what technology has yet to exist that will revolutionize publishing as much as the printing press did in 1450?  Certainly the Internet has significantly enhanced publication and dissemination options, but it is still largely based on an electronic version of the print medium paradigm.  Even with tablets, many newspapers simply offer a PDF version of their daily magazine – embracing the full potential of digital publishing may require not only technical expertise but also a strong imagination to envision and embrace the possibilities presented by fully digital scholarly content.

I’m looking forward to exploring this topic in the next two weeks.