Some people eat, sleep and chew gum, I do genealogy and write...

Tuesday, February 18, 2014

The power of automatic source searching -- Part One Past and Present

One of the newest trends in online genealogical databases is the ability of a very few online family tree websites to search for source records for each person in an online family tree or connected desktop program. Additionally, the programs attach the sources they find directly to the individuals in the family tree who are listed in those sources. This integration of an online family tree with the website's ability to search for sources is a very powerful tool for genealogical researchers in some limited circumstances. There are presently three major online programs with this capability. Each of the three programs differs in its methodology and effectiveness, but all three advance the researcher's ability to find documents among the source records maintained in the individual database programs.

The three websites are well known and used by millions of genealogical researchers. They are FamilySearch.org, MyHeritage.com and Ancestry.com.


Of course, MyHeritage.com and Ancestry.com are subscription services with monthly or annual fees. FamilySearch.org is free to all users but some features require registration.

There are other programs that aid in the search for online sources, but they lack the ability to integrate the found sources automatically into a family tree. Let me illustrate this process. First, I will show a diagram of the traditional research model:


Here the researcher goes to a repository of a record source, such as a library, archive or courthouse, and searches through the individual sources for information about his or her ancestors. After finding a source, the researcher records the information from the source and, following the research cycle, takes all of the steps to start the process over again. For reference, here is a diagram of the Research Cycle:


The steps in the Research Cycle vary in interpretation, but generally all of the variations are similar.

Once the researcher has completed the cycle with one ancestor, the process is repeated ad infinitum. In the process, the researcher accumulates a great deal of paper. Up until recently, the advances in computer technology had simply substituted an online database for the paper repository and moved some of the researcher's paper records either to genealogical database programs or digitized copies. I call this "computer aided research." Here is the modified research model showing the benefits derived from a partial computerization of the model:


As you can see, the benefits are limited. Even though the work flow can be accelerated, with this model we are still dependent on paper sources.

This model evolved into a slightly different one once we had access to online sources. Here is a modified diagram showing the addition of the online sources:


Some researchers find this model to be defective and do not see the need to have a computer, mainly because the end product seems to be the same. It is relatively easy to reject the online sources model because there is still the need to retain paper copies and search in repositories to find paper records.

But now there is a new model. This involves the online sources doing the searching for records. For this to be a benefit, three things had to happen:

  1. The online sources had to reach a critical mass of information
  2. The individual user had to accommodate using records online in a family tree
  3. The search engines in the online websites had to be effective enough to make the process work

All three conditions had to be satisfied. You can probably see that some genealogists currently reject all or part of these conditions.

Now, what do I mean by "critical mass"? In this case, the amount of information online had to reach the point where a significant number of the types of sources used by most genealogists were readily available. For example, all of the U.S. Census records, many vital records, and other types of records had to be searchable. The reason for this is that, absent this critical mass of records, researchers would be going back to paper (or microfilm) records so frequently as to diminish the value of the online experience. Once that point was reached, almost all genealogical researchers would benefit directly from the availability of online searches.

Putting this into practical terms, as genealogists began to find source records online, the utility of searching online increased. At some point in the not too distant past, the availability of online sources reached the point where most of the "bread and butter" types of sources, and many not so easily found sources, were beginning to appear in large online database programs. Once that point was reached, genealogy companies began to see the utility of hosting online family trees and then, as an added value, doing automated or semi-automated searches for sources that applied to the individuals in the online family trees. At this point, there needed to be a major advance in searching technology for all this to add any significant value. This is where we are today. In Part Two of this analysis, I will be outlining how the most recent changes, especially those that were codified at RootsTech 2014, will affect the future of online automatic source searching.

Stay tuned. 
