Digital Libraries & Archives: Imperfect but Invaluable

The digitization of print texts has transformed the way academics and the public engage with historical work and data. Like many digital innovations that make historical research more efficient, the digitization of primary and secondary sources has allowed people to cut the time they spend searching for historical information. Yet, like other technological advances of our time, the increased efficiency of reading history digitally also threatens to alter our relationship with books and records of the past.

Digitized Secondary Sources

Beginning with digitized secondary sources, I spent some time searching for historical texts that I have previously read in their physical form. Reading familiar sources in a new way underscored that reading historical books on websites such as Google Books both adds features that can make reading more efficient and takes away some of the conveniences of a physical book.

In one of them, Mark Krasovic’s The Newark Frontier: Community Action in the Great Society, nearly every page was available to read, which is incredible for a book that was new as of 2016. I last consulted this text while writing a historiography on Newark, when I recall struggling to get a hard copy of the recently published book. Had I known it was available online at the time, I could have saved myself much trouble. However, I remember heavily annotating this book with Post-it notes, which is impossible on Google Books. Ironically, the part of the book I needed most for a historiography assignment, the notes, is the most incomplete part of the digitized text, so the digital copy would have been of little help.

Next, in Carmen Teresa Whalen’s From Puerto Rico to Philadelphia: Puerto Rican Workers and Postwar Economies, I sought to test the “search in this book” feature. The greatest strength of these digitized books is the ability to scroll quickly through texts and jump directly to keywords. When I last read this book, I was using it to understand the regional context around Puerto Rican workers and activists in nearby 1950s-1960s Trenton, a subject that can be found in no other book. Remembering the time I spent sifting through the book for mentions of these workers and residents across the Pennsylvania-New Jersey line, I tried a simple “New Jersey” search on Google Books and immediately came up with a dozen references. One drawback was immediately clear: each mention of the state appeared on a page with “no preview”. However, the ability to at least know where to look in a physical book for a term that does not appear in the book’s index is a valuable time-saver.

A more serious issue results from the site’s failure to link all chapters of both books in the navigation bar. Although advertised as a way to skip between chapters, the navigation drop-down menu only links to some of the chapters. Similarly, in the index and endnotes of both texts, only certain page references link to the content pages, providing only partial convenience. This can frustrate a user and cause a reader to overlook parts of a text.

Digitized Primary Sources

Online archives similarly make the research process easier for historians while not offering the complete benefits of visiting an archive and browsing through physical documents. The two documents I searched for on the Open Content Alliance are also ones that I have looked at in the past in physical format. The first, The Grants, Concessions, and Original Constitutions of the Province of New-Jersey, is an edited compilation, originally published in the late 18th century, of early American laws in East and West Jersey. I have used it on several different research projects to track changing laws in the period. The ability to search for words within this volume is enormously useful, as the book is not organized in any logical geographical or chronological order. Additionally, while I own a hard copy of this book, the 763-page Concessions can be a pain to carry around with me. Having it in a digital format allows me to reference this important source wherever I am. One drawback, however, results (similarly to Google Books) from the inability to save pages on the site. With the physical book I can color-code flags to pair related laws and policies, which is nearly impossible on the Internet. One way around this is to download the file as an editable PDF, but this often overloads my computer, so it is not a reasonable option.

Another source I located on the site that I have used frequently is the Town Records of Newark, New Jersey 1666-1836. I relied heavily on these records for my undergraduate thesis on 17th century Newark, so having a legible and complete copy was important. Two different copies of this edited source have been digitized, each offering a significantly different view despite being of the same edition. The first, contributed by University of California Libraries, looks most like the physical book housed in many libraries, with faded pages that can be hard to read. It features a long map (several pages wide) of 1806 Newark, which provides much important information about the town layout and population size. The other, digitized by Google, was scanned in a way that increased the contrast of the image, with words appearing darker and the background whiter. This makes it much easier to read than the first, but it also allows readers to forget the age of this typed copy. The second copy was also scanned without folding out the map page, causing readers of this version to miss a critical part of the document. The clear difference between these two versions on the same digital platform demonstrates the extent to which different factors can affect the digitized appearance of a primary source.

Both of the primary sources discussed here are printed in book form, so when I say I own copies of them, I am of course talking about edited, typed versions. Rarely do primary sources from this period appear on sites like this one or other large databases, although they are sometimes digitized by state archives or local historical societies. Even then, they appear in smaller numbers, meaning that most of the primary sources available on the Open Content Alliance are ones that have already been printed in mass quantities.


Without a doubt, having historical sources available solely in digital form would heavily detract from their value as a resource. However, when digital copies are available in addition to physical ones, one is not required to choose between the two and can benefit from the strengths of both.

3 thoughts on “Digital Libraries & Archives: Imperfect but Invaluable”

  1. That was a great idea to compare physical books with their digital counterparts. Annotating books is extremely important while doing research, and it’s frustrating that not all digital sources allow you to do so. I wonder who digitizes the books on Google Books and elsewhere. If they are not familiar with historical research, that could account for why the notes are not digitized well.

  2. I wholeheartedly agree on digital vs physical annotation. My own (messy) system of highlighters, underlining, annotating, bookmarking, etc. never translates to digital books, because it’s tactile as well as visual. I need to be able to quickly look back and forth between different pages of different books in front of me; I can flip to the page with the yellow post-it bookmark a lot faster than I can scroll back and forth with a document reader, and I don’t lose my place if my computer crashes or eduroam suddenly decides to log me out. I know plenty of e-reader formats allow you to highlight and type annotations, but every digital interface is slow and clunky compared to just jotting your notes down in the sidebar with a pen.

  3. I also love the ability to search through texts not yet in the public domain but already digitized. I usually use HathiTrust to do this, because I’ve found their search tool to be much more comprehensive than Google Books: HathiTrust tells you how many times the word appears on the page. I agree that, like you said, it’s not really a choice between digitized or physical. They work together, and each offers strengths.
