RootsTech 2014

Some people eat, sleep and chew gum, I do genealogy and write...

Monday, June 30, 2014

More Sources than you can imagine -- Five Steps to their Discovery

Most of the process of becoming an experienced genealogical researcher involves the realization that there are more sources than you will ever have time to research. From diaries and journals locked in trunks in attics and basements, to new digitized databases on the large websites, there seems to be an unlimited supply of resource material. I have been online now since the Internet had a total of six websites and have had a front row seat to the explosion of information available online. But that is only the tip of the iceberg of information waiting to be put online. I think one of the major steps in the transition from being a passively interested genealogist to becoming an experienced one is taking that first physical trip to find records.

That first trip may be as simple as a visit to a cemetery to find a grave marker or as complicated as a visit to a major research library, but it is this step that begins the transformation of a novice into an expert. My journey began with visits to the Family History Library in Salt Lake City, Utah. Going to the Family History Library can be an overwhelming experience because of the huge number of microfilms and other resources for searching. But even a visit to the Family History Library is not the same as searching for individual records in cemeteries, local historical societies and other places where records are kept. If you feel either economic or time constraints in planning a visit to a remote location, try the alternatives: phone calls, email, physical letters (yes, you can still send a letter), contacting friends, relatives or other genealogists in the area, and many other options.

Here we go with the five steps.

Step One -- Make sure you are looking in the right place. 
I write about this point and teach about it continually. Most of the problems I encounter with researchers' inability to find an ancestor originate from looking in the wrong place for records. I might add that knowing how to follow where the records may have moved over the years also helps. You absolutely must make sure you have positively identified a location where your ancestor lived. This is necessary for each ancestor. I don't mean the general area; I mean exactly where the ancestor lived, as closely as possible. For example, finding a grave, confirmed to be the right ancestor, is finding an exact location. You may have to use maps, gazetteers, online place lists and other resources, but it is absolutely necessary to identify an exact location. Without an exact location, you are guessing that you have found the correct ancestor.

Why is this necessary? Because genealogically relevant sources are created at or near the place where the event occurred. This rule implies that you can also identify the appropriate jurisdiction where the event occurred. This means you have the right county or country in existence at the time of the event and that you also have the name of the place as it was at that time.

Step Two -- Begin your search for records, not for people.
Too many inexperienced genealogical researchers start immediately looking for names and dates rather than understanding what kinds of records might be the most useful given the times and places where events may have occurred in an ancestor's life. For example, the most common method of approach is to search for the person, by name, in an online website. One of the first experiences of a newly minted researcher is the realization that a whole lot of people had the same name as the target ancestor. Because of the huge amount of information online and because most beginners are looking for recently living ancestors, it is common to immediately begin finding records for an ancestor. Unfortunately, these early successes lead many researchers to think that searching in Ancestry.com or FamilySearch.org is all there is to doing genealogical research. Nothing could be further from the truth. To continue being successful in finding records about your ancestors, at some point you need to begin looking for records and put searching for names into the background.

Step Three -- Find out about the records.
It is interesting how many genealogical researchers never use probate or land and property records because they don't understand them and don't take the time to learn how they work. Every class of records has its own unique jargon. In addition, all records are limited to specific time periods and places. Without knowledge of the records, what they do and do not contain, you can waste a lot of time looking at records that cannot possibly contain information about your ancestor. Take the time to understand how any particular type of record was created, what information was and was not included and how your ancestor may have been recorded.

Step Four -- Milk the records you do find for all they are worth.
Records pile up like pancakes in different jurisdictions. For example, there are local records, county records, state records and national records. You need to be aware of all the records that exist at every level of jurisdiction in place at the time of an event in your ancestor's life. This may become a real challenge, especially if you are searching in "Germany" in the early to mid-1800s. You will soon find out that during most of that time, Germany, as such, did not exist. You will need to understand what records were or could be generated at each jurisdictional level, from the local church parishes to the national governments, as they changed frequently.

Step Five -- Carefully examine and evaluate each record found.
I see too many researchers who, upon finding a record such as a U.S. Census record, just file it away or attach it to their ancestor's file and then forget the record. You need to look at every record for clues as to where additional records may be found. The new researcher thinks the work is done once the record is found and that it is time to move on to the next ancestor. It is not that simple. Every record found suggests additional records that may also be available. Once you grasp this concept, you will be on your way to packing your bags for your first genealogical research trip. Most people think genealogists find people. To some extent that is true, but it is more accurate to say that genealogists find records about people.

Sunday, June 29, 2014

St. Johns, Apache, Arizona Margaret Godfrey Jarvis Overson Photography Collection Unknown Photos

On FamilySearch.org Family Tree Photos, I am uploading many of the photos taken by my Great-grandmother, Margaret Godfrey Jarvis Overson and her father, Charles Godfrey DeFriez Jarvis. Some of the photos were also taken by Henry Victor Overson, my Great-Uncle and my Grandmother, Eva Margaret Overson Tanner. Here is a screenshot of some of the photos:


All of these photos are tagged to Margaret Godfrey Jarvis Overson. You can search for all of the photos with St. Johns in the title by searching for "St. Johns" in quotes. This will bring up all the photos on FamilySearch.org with the town name attached to the photo. I have found that in order to come up in search, the photos need to be tagged to a person. If you find one of your ancestors in a photo tagged to Margaret Overson, just remove the tag and put on the correct tag. But please include St. Johns in the title or description or somewhere so others can find the St. Johns photos.

Thanks for your help in identifying these photos.

A Genealogical Conversation


The following is a hypothetical conversation between four genealogists. As you will see from the conversation, each of these individuals represents an archetypical type of genealogical researcher. I have named the genealogists, M. Black, M. White, M. Gray and M. Red.

M. Black
First, I would like to thank all of you for being here today. I think it important that we chat about the goals and challenges of genealogy from different viewpoints. I am especially glad that M. Red was able to be here, it is such an effort to come to a formal meeting like this.

M. White
Yes, I too am glad to be here. Did you prepare an agenda? I didn't see one being passed out.

M. Black
I thought this would be more of an informal discussion. Let's start by stating some of the goals we think are important in the overall genealogical community. I am recording this discussion and will provide each of you with a copy of the recording.

M. White
You mention a "genealogical community." I am not sure that such a community exists. Don't the members of a community have to identify themselves as members? I would guess that most of the so-called genealogical community members are oblivious to the community's existence.

M. Gray
And of course, you have no clear definition of genealogy. Who is and who is not included in the so-called community?

M. Red
That does raise a question. How do we set some kind of goals and recognize challenges if there are no definitions for the community and even the topic of our conversation?

M. Black
Perhaps, we can come to a consensus on those issues so we can move on to others? What do you think?

M. White
I would think that a definition of genealogy would include something about searching for a person's ancestors. It is this idea that people become acquainted with all of their ancestors that seems to be the common denominator.

M. Red
Then what about the millions of bogus family trees online? It seems to me that the reality of the situation is that people don't really care about accuracy and reality. All they want is some names that they can claim to be ancestors. In fact, if those names include famous people or royalty, all the better.

M. Gray
That seems to be quite a cynical view of the subject. Why not give those online family trees the benefit of the doubt and include them in the community? The number of people involved would certainly be more impressive if we include all those people rather than defining them out of the community. We have opposing choices when it comes to definitions. We can make them so broad as to include practically the whole world or we can make the definitions so exclusive that we exclude most of the people who consider themselves interested in finding their ancestry. What is it going to be?

M. Black
I don't think we have to take either extreme. I'm comfortable in including anyone interested enough to post a family tree online. Although, I personally would not put my genealogy online where others could copy all my hard work.

M. Red
I cannot agree. I think that merely posting a copied family tree online is not and cannot be called genealogy. I don't know what to call it, but simply writing down what you know and copying something from someone else is not actively doing genealogy. I think anything remotely called genealogy has to involve research, documentation and analysis, at least some basics.

M. White
I am not too sure I go along with your definition. What about oral histories? If a culture has no written genealogies, isn't writing down what they know genealogy?

M. Red
Then we get into exception after exception. Perhaps we all just agree that we have no idea what we are talking about and leave it at that?

M. Black
Now, let's not get huffy, M. Red. Coming up with a definition of genealogy cannot be too difficult. As to who is a member of some hypothetical genealogical community, let's let the people themselves decide if they are in or out of the community.

M. Gray
If I understand what you are saying, you would have anyone who merely said "I am a genealogist" automatically be counted as a member of the community?

M. Red
Who is counting? Who decides who is and who is not counted as a genealogist?

M. Gray
It seems to me very presumptuous of us or anyone to deign to define genealogy, much less include or exclude anyone from the genealogical community, assuming, of course, that such a community exists.

M. White
I would think that any definition of genealogy would include, at least, the following:
  1. An effort to learn about a person's ancestors
  2. Recording any information obtained
  3. Sharing that information with others in the family

M. Black
I don't agree with the sharing part at all. I once gave a copy of my family file to a cousin and he immediately posted the entire file as his family on an online database program. I told him to take it down but he never responded to me. Since then I find that dozens of other people, who I don't even know, have copied my work.

M. Red
So are we down to a definition of genealogy that includes anyone even remotely interested in their family or do we limit the definition in some way?

A long discussion is held off the record.

M. Black
Let the record show that we had a discussion off the record and have decided that we are certainly not in a position to define either the term genealogy or the nature of the genealogical community. We also have no way to include or exclude anyone from an ill-defined community. I suggest we table the discussion to another time. Anyone want to make the motion? Is there a second? Those in favor? Motion carried.


This conversation points out two basic problems with talking to people about genealogy. There is no commonly accepted definition of what is or is not involved in genealogy and there is no defined genealogical community. I must take the position, contrary to my hypothetical conversation, that I would only include those in the genealogical community who take an active part. Putting a family tree online and then forgetting about it does not qualify as doing genealogy. I certainly would not exclude anyone who made an effort to find their ancestors, but I don't think merely copying a family tree and posting it online qualifies a person to be called a genealogist.


Ancestry and ProQuest Announce Expanded Distribution Agreement

I had only just recently written a blog post about who owns the major genealogy companies, when Ancestry.com and ProQuest announced an expanded distribution agreement. If you do research in libraries, especially with their online offerings, you should be acquainted with ProQuest. Here is a description of their services:
The company’s cloud-based technologies offer flexible solutions for librarians, students and researchers through the ProQuest®, Bowker®, Dialog®, ebrary®, EBL® and Serials Solutions® businesses – and notable research tools such as the Summon® discovery service, the RefWorks® Flow™ collaboration platform, the Pivot™ research development tool and the Intota™ library services platform. The company is headquartered in Ann Arbor, Michigan, with offices around the world.
Here is a quote from the blog post from ProQuest:
PROVO, UT and ANN ARBOR, MI, June 27, 2014 – Ancestry, the world's largest online family history resource, and ProQuest announced today an expanded agreement to create broader and deeper opportunities to deliver premier genealogy resources to libraries worldwide. Under the new multi-year agreement, ProQuest will be distributor of both existing products, including Ancestry Library Edition, and future Ancestry products. The agreement also allows for significant content and feature improvements to ProQuest’s HeritageQuest Online. These enhancements will be developed in the coming months and powered by Ancestry.
“We are pleased to announce our expanded strategic relationship with ProQuest,” said Brian Hansen, Ancestry vice president of emerging businesses. “We look forward to working together to create new and enhanced resources for libraries that uniquely respond to the needs of this growing market.” 
“This new agreement expands on our 10-year relationship with Ancestry to deliver leading solutions for our customers," said Simon Beale, senior vice president and general manager, ProQuest. "We're leveraging the synergies between our companies, bringing together skills and talents in way that will advance the family history services libraries can offer their patrons.” 
Ancestry Library Edition, powered by Ancestry and distributed by ProQuest, brings the world’s largest online family history resource directly to libraries. Featuring more than 13 billion records in more than 8,000 unique databases, Ancestry Library Edition is an unparalleled online global record collection from North America, the United Kingdom, Europe, Australia, and beyond. 
HeritageQuest Online, which can be accessed from anywhere, is a strong complement to Ancestry Library Edition. HeritageQuest Online is a treasury of American genealogical sources, including the digitized version of the popular ProQuest Genealogy & Local History book collection and other valuable content.
I am very interested in the outcome of this agreement because of its impact on HeritageQuest. This is a valuable website, usually available for free access through a library with your library card number. The website has languished for the past few years with little added content and a poor interface. Perhaps the agreement with Ancestry.com will freshen up this valuable website a bit.

Saturday, June 28, 2014

Basic functions of FamilySearch Family Tree finally explained

Through the kind help and analysis of Gordon Collett, a commenter on my Rejoice, and be exceeding glad... blog, and in response to questions about entering standardized dates and place names in FamilySearch.org's Family Tree program, I was finally able to understand how that program handles these entries. I have included Gordon's comments with some of my own in a post entitled, "Mystery Solved -- How Names and Places in FamilySearch Family Tree Work Revealed."

These comments answer the following questions:

  1. How do I preserve the original place name and still use the standardized place name selections?
  2. What is the difference between what is shown in the date and place fields and the standardized list?
  3. How do we preserve both the accurate date formats and place names?
  4. How do I get the children to show up sorted in birth order?

These and many other questions are answered in Gordon's comments. Please take time to read this rather long blog post and work through the examples given. It will clear up a lot of issues with the program.

A Basic Approach to Giant Economy Sized Pedigrees

Over the years, I have kept hearing huge numbers from people working on their family trees. This is not too surprising given that the basic numerical progression of ancestors is exponential. But some of the numbers are quite unbelievable, over 100,000 or even two or three times that many. My great-grandmother worked on her and her husband's ancestry for about 30 years. I inherited her "files" which, of course, were all on paper. Working pretty steadily, it took me almost 10 years to transcribe all her research and enter it into a succession of computer programs. In the end, she had accumulated just over 16,000 names. But this large number had been accomplished mainly through name extraction. In other words, she copied out anyone in a particular parish or area with the same surname as the people she was searching for. In other instances, with research on my family lines, I found that other relatives, not my great-grandmother, had followed some pedigree lines through convenience, assuming a relationship to the most promising ancestor rather than making the hard choice to admit that they had no more evidence.

Those experiences, finding that many of the people in her files were not remotely related to my family, the unsubstantiated family lines of other relatives and my subsequent experiences in working with people and their online family trees, have helped to make me somewhat skeptical of very large genealogy files.

I am personally acquainted with several people who are doing the same type of name extraction as my great-grandmother did, only on a much larger scale. Don't get me wrong, my great-grandmother did a huge amount of valid and documented genealogical research considering the time and effort it took to do research by mail and with the limited resources she had. I am forever indebted to her for her magnificent work. But what I see today involves copying on a massive scale. My great-grandmother had a very valid reason for doing what she did. The documentary evidence she had was scanty to say the least. Every so often, she would find a distant relative who was able to connect a little more of the family.

Not infrequently, as I pursued the work previously done on my family lines, I found, as I already mentioned, that researchers had done essentially the same thing over and over. They had chosen an ancestor, let's say at random, merely because there was a "documented" pedigree for that person or because the surname was the same. It is also interesting that many of those pedigrees involved some sort of Welsh or English royalty. I have never yet found one of these lines that could be documented. Now, I find the same type of suppositional work in many of the larger online family tree files. I have written several times before about how many wrong people have been connected to my ancestors in FamilySearch.org Family Tree. This is the case even when good, reliable documentary evidence is not only available, but easily obtained.

So my basic approach to very large files is skepticism. I am reasonably sure that many of the owners of these huge files would defend every last name as absolutely verified and reliable. Perhaps that is where we part company. I am never sure about anything in my file. I have personally verified lines that go back quite a ways, but in every single case, the line has "ended" in obscurity and conflicting evidence. I have had people come to me and declare that they have "solved" the mystery of the arrival of my ancestor in America, only to ignore me when I requested the sources for the evidence they claimed to have. Many of the lengthy pedigrees I found in relatives' research were very easily proved to be unsustainable and without foundation.

I have no response to those who claim ancestry back to some remote and famous person. I have spent too many hours trying to "help" people prove that they were related to those same famous people, only to have them refuse to acknowledge my doubts that the relationship they claimed was based in fact.

Now, I fully understand that famous people and European Royalty had children and that there are people who are related to some of them. Given the reality of pedigree collapse, it is not only possible, but highly likely that many of us Americans are related to some royal person or another. My skepticism arises from the practice of then "adopting" that royal person's pedigree without question.
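
The exponential progression I mentioned, and the reason pedigree collapse cannot be avoided, come down to simple arithmetic. Here is a minimal sketch in Python. The function names are my own invention for illustration, and the only assumption is that every person has exactly two parents, so the count is of positions on the pedigree chart, not of distinct people.

```python
def ancestor_slots(generation: int) -> int:
    """Positions on the pedigree chart in one generation (1 = parents)."""
    return 2 ** generation


def cumulative_slots(generations: int) -> int:
    """Total positions from the parents back through the given generation."""
    return 2 ** (generations + 1) - 2


# Print a small table showing how quickly the chart fills up.
for g in (5, 10, 13, 20, 30):
    print(
        f"{g:>2} generations back: "
        f"{ancestor_slots(g):>13,} in that generation, "
        f"{cumulative_slots(g):>13,} in total"
    )
```

Thirty generations back calls for more than a billion positions in that one generation alone, and the world's population is generally estimated not to have reached one billion until around 1800. The same people must therefore fill many positions on the chart, which is all that pedigree collapse means. It also shows that a file of 100,000 or more names cannot consist only of direct ancestors unless it claims to be essentially complete for fifteen or sixteen generations.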

In addition, if I chose to incorporate every person in every family tree that the large companies tell me I am related to, I too would have hundreds of thousands of names in my file.

Every time I raise this subject in any context, I am immediately attacked by someone who claims that every single name in his or her file was put there one by one through their sweat and blood. In those circumstances I defer. But the question really is what do you do with all those names? I am finding it difficult to compile documentation after a reasonably exhaustive search for only a limited number of generations of my own verified ancestors. A job, by the way, of documenting sources that has not been done previously. If I were to go into the Family Tree program or any other family tree program, I would find the same situation. I would find that there was little or no documentation for the names in the trees. So you must excuse me one more time for my skepticism. Perhaps that also came from my many years of listening to my law clients and their opponents.

When I am sincerely asked what to do about a large file inherited from a relative, I always give the same answer. Begin to verify and add sources to the file. You may find, like I did, that some severe pruning is in order.

Friday, June 27, 2014

What do Search Trends Tell Us About Genealogy?

Google has a product called "Trends." Search trends show the relative interest in any search term over time. In the short term, such trends have been used to spot the outbreak of disease, such as flu epidemics. The idea here is that what people search for on Google is a measure of the interest they have in the subject. As an example, here is a screenshot of the Google Search Trends on the term "World Cup."


It may seem trivial and obvious, but the search terms peak every four years when the World Cup Football Championships are held. I have talked about this in previous blog posts some time ago, but thought an update was in order. Google correlates peaks in searches to specific news stories. Here is another generic example using the search term, "presidential election."


Of course, Google searches reflect global interest in a specific topic, but it is still interesting to see the spike in interest corresponding to the periodic nature of the elections. But what about terms relating to genealogy? Here is a comparison search for the terms "genealogy" and "family history."


First off, this does not mean that there has been an overall decline in interest in genealogy. This graph shows relative interest based on all online Google searches. What it does mean is that there is more and more "noise" online. That is, there are more and more searches and any one topic, such as genealogy, gets a smaller piece of the overall attention of the people in the world. Let's look at the trends for the big genealogy companies, Ancestry.com, FamilySearch.org, MyHeritage.com and findmypast.com. Bear in mind that the word "ancestry" can be the basis for searches in lots of different contexts.



The waypoints on the red Ancestry search really do correspond to Ancestry.com. It looks like the search term "ancestry" is a hot topic and getting more popular. Let's take ancestry out of the picture and see what happens with the other three:


Now this starts to get real interesting. You can see the steep upward curve of MyHeritage beginning when they started the company. You can also see a big peak in MyHeritage in February, 2010 corresponding to what? The huge British genealogy conference, Who Do You Think You Are? held on the 26th through the 28th of February, 2010.  FamilySearch shows a dramatic downward trend through 2010 with a slow upward trend since that time.

What if I use the search terms "Ancestry.com" and "FamilySearch.org?" Here are the results of that search:


This is very interesting. It shows that a substantial part of the searches on the term "ancestry" alone might have related to other subjects rather than Ancestry.com. It also shows that Ancestry.com has done a really good job of keeping its name before the public.

It is pretty difficult to come up with genealogy related terms that relate only to genealogy. Any generic terms such as "family" and "history" alone will include a huge number of searches that have nothing to do with "family history" or genealogy. Many of the terms I came up with did not show enough volume to produce a graph.

If you want to see how popular genealogy is compared to any other pursuit, you can do searches on the comparison. For example, how about genealogy vs. football?


That was highly predictable. But what about something such as gardening?


Now this gives us something to think about. At one point in the past, genealogy was much more popular as a search than gardening, but over time, gardening has stayed about the same and genealogy has declined. Also, interest in gardening peaks every year in May. What about something really active such as cycling? How does that compare to genealogy?


Like gardening, interest in cycling depends on the season of the year. You can try this out yourself on Google Trends.
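
When you read any of these graphs, remember the earlier point that they show relative interest, not the raw number of searches. Here is a small sketch in Python of why a line can fall even when nobody has lost interest. The numbers are made up, the function name is hypothetical, and the scaling (each period's share of all searches, with the peak set to 100) is my simplified assumption about how a relative index of this kind behaves, not a description of Google's actual method.

```python
def relative_interest(topic_counts, total_counts):
    """Scale a topic's share of all searches so the peak period equals 100."""
    shares = [topic / total for topic, total in zip(topic_counts, total_counts)]
    peak = max(shares)
    return [round(100 * share / peak) for share in shares]


# Hypothetical data: the topic draws the same number of searches in every
# period, while the total number of searches on all subjects keeps growing.
topic_searches = [1_000, 1_000, 1_000, 1_000]
all_searches = [100_000, 200_000, 400_000, 500_000]

print(relative_interest(topic_searches, all_searches))
```

In this made-up case the topic never loses a single search, yet its line on the graph drops steadily because everything else online is growing around it.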

My conclusions are consistent with what I have written in the past. Genealogy is not a fast growing interest in the world. It is slowly moving into the background noise of the Internet. It is also not one of the most popular pastimes, as is often stated. If you want to see what I mean, just try something totally off the wall such as comparing "genealogy" to "Doctor Who."


Try and figure out what this means.

Thursday, June 26, 2014

What is a Denial of Service Attack and Should I be Worried?

Headlines about the recent denial of service attack on Ancestry.com and FindAGrave.com used the word "hack" to incorrectly describe the event. Neither of the two websites was "hacked" according to the most common definition of the term. Hacking involves using a computer to gain unauthorized access to data in another computer system. Using the word "hack" either as a noun or verb changes the original use of the term. Originally, a hacker was nothing more than a person who was enthusiastic about computing. Over time, the word took on a sinister and negative meaning when it was used to refer to unauthorized entry into data files. The neutral use of the term has all but disappeared except in computer circles.

The problems with the computers owned by Ancestry.com and its subsidiary, FindAGrave.com, were not caused by any kind of unlawful entry. The word "hack" has expanded to mean any externally caused problem with a large computer system. What happened to Ancestry.com was a denial of service attack, which was a successful attempt to flood the network with requests to Ancestry.com to such an extent that their computer servers could not handle the traffic and shut down or stopped working properly. It is like having a million people show up at the same store at the same time and try to get in. There are at least three different types of denial of service attack. See CERT, Software Engineering Institute, Carnegie Mellon University, Denial of Service Attacks. The news accounts of the Ancestry.com attack are not specific enough to explain exactly what happened. Quoting from the Carnegie Mellon University article:
Denial-of-service attacks come in a variety of forms and aim at a variety of services. There are three basic types of attack:
  • consumption of scarce, limited, or non-renewable resources
  • destruction or alteration of configuration information
  • physical destruction or alteration of network components
I have not seen any reports of which of the three methods brought down the Ancestry.com computers. Although there are several ways this type of attack can be executed, the most common involves sending a huge number of messages to the host computer and overwhelming its capacity. The details of how these attacks succeed are quite technical and not easily explained. 
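
The store analogy can at least be put into numbers. The following Python sketch is a toy model only: the function name and all the figures are made up, and it assumes a server that can answer a fixed number of requests per second and cannot tell real visitors from the flood. It does not describe how Ancestry.com's servers actually work or what happened in this particular attack.

```python
def legitimate_served(capacity, legitimate, flood):
    """Expected real visitors answered per second when the server treats
    every incoming request alike and can answer at most `capacity` of them."""
    total = legitimate + flood
    if total <= capacity:
        return legitimate  # everyone gets through
    # Over capacity: real visitors only get their proportional share.
    return legitimate * capacity / total


# Hypothetical figures: room for 10,000 requests a second, 5,000 real visitors.
print(legitimate_served(10_000, 5_000, 0))        # a normal day
print(legitimate_served(10_000, 5_000, 995_000))  # during a flood
```

With these invented numbers, only about 1 real visitor in 100 gets an answer during the flood, even though nothing has been broken into and no data has been touched.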

The question that occurs to individuals is whether or not this is another thing they need to be worried about. The answer is quite simple: no. This is a problem for servers, the machines that provide content to the Internet. Unless you are running a server from your personal computer, and you would know a lot more about this subject if you were, you do not have to worry too much about denial of service. There are lots of other ways your computer can become compromised, with worms, viruses, trojan horses and all the other types of problems sometimes referred to by the umbrella term "malware," but this is not one that usually occurs with personal computers.

There are still very ample reasons to practice safe computing. Here is a good summary from the Massachusetts Institute of Technology (MIT).



Wednesday, June 25, 2014

The Challenge of Genealogical Complexity

Complexity can take a variety of forms. We have systems that are complex because of the number of elements involved. For example, it is a more complex challenge to have a family reunion with 100 people than it is with 10. Looking at that same family reunion, it may also be complex because of the relationships between the various members of the family. These two types of complexity have been defined as disorganized complexity and organized complexity. See Wikipedia:Complexity. In order to discover our ancestors, we must be able to manage both types of complexity.

The Great Divide in genealogy is between those who have the skills to work with complexity of both types and those who do not. We sometimes perceive these skills to be associated with intellectual ability or literacy, but the necessary skill set is far more subtle and also far more complex than any simplistic division. One of the major problems I have been writing about now for years is the failure to view genealogy (i.e. the process of discovering one's ancestors) as a complex, many-layered system. Hence, there are statements made that view parts of the system as "easy" and other parts of the system as "challenging" when, in fact, both evaluations are "true" in the sense that genealogy is a system, some parts of which are less complex than others and therefore can be viewed as "easy" while other parts of the same system are overwhelmingly complex and difficult.

One definition of a system is a set of parts or elements that have relationships among them differentiated from relationships with other elements outside the relational regime. See Wikipedia:Complexity. Here the "relational regime" is the activity of discovering ancestral relationships and the accompanying details of those ancestors' lives. In participating in this activity, we "incorporate" elements from other complex systems. Where the complex system of genealogy intersects with other equally or even more complex systems we have our greatest challenges as genealogists. Because we are investigating "people," in a real sense our research can conceivably encroach on almost every other complex system in the world. Truly, there is no end to genealogy's complexity and its interactions with almost every other complex system.

Looked at from this standpoint, it is practically a miracle that anyone can adequately master the complex genealogical system as a whole. But in reality, this is not necessary. Each person is only required to approach a sub-set of the whole complex genealogical system. Therefore, some individuals can become highly trained and competent in their chosen area of inquiry. What is a major tragedy is the failure to view the entire field of genealogical research and preservation as one complex system. The consequences of this failure include the commonly expressed antipathy by some genealogists towards those who upload their family trees online without proper "documentation." Rather than investigating a way to include this phenomenon in the overall system, they would exclude those "family trees" from the system. In effect, each participant in the overall genealogical system has a tendency to view the entire system as composed only of those parts that they personally understand and with which they have a reasonable familiarity. The classic expression of this tendency is the fable of the blind men and the elephant. In a real sense, we are all blind men examining the genealogical elephant.

The disorganized portions of genealogy involve the huge numbers of elements composing the system. Some of those elements include the number of any one person's ancestors, the number of documents in the world, the number of online websites, and so forth. We are faced with this type of complexity daily. It is also evident that any one individual's ability to handle this type of complexity varies greatly. There are those researchers who feel overwhelmed when they have a family file with 100 people, while others routinely deal with thousands of ancestors at the same time. At the same time, we also deal with organized complexity in the form of the Internet, computer systems, libraries, archives and so forth. It is the individual's capacity to absorb and integrate these varying levels of complexity that determines their ability to make progress in discovering and recording their family history. It is in this area, the ability of individual researchers to adapt to these varying demands of complexity, that we find the true division between those who are frustrated with genealogy and those who thrive in the pursuit.

In fact, the disorganization of genealogy as a system approaches chaos. Genealogy is high-dimensional, non-linear and very difficult to model. Because of these factors, all current attempts to characterize the entire genealogical community fail, some more miserably than others. This effect can be seen in the attempts of many entities within the system, both individuals and business entities, to approach genealogy from an entirely populist viewpoint, making the entire system seem to be something people can do in their "spare time" without any preparation or training. As people become aware of the complexity of the overall system, they feel betrayed by the simplistic portrayals and in a real sense "turn off" further involvement.

The real failure here is not in the individual participants' lack of awareness of the complexity of the genealogical system, but in the failure of those who have the ability to see and appreciate that complexity in communicating the need to allow for that complexity to be acknowledged. It is also a failure on the part of those who ignore the overall complexity and fragment the elements of genealogy in order to make it appear more simplistic than it is in reality. For example, presently there are a number of people and organizations working on the issue of digital preservation. However, these same individuals have no connection with or communication with the mainstream of genealogists. One reason for this is that "digital preservation" impacts genealogy but also impacts many other systems and disciplines.

Another example of this lack of system awareness is the large online genealogical database programs' attempts to exponentially increase the size of their collections without adequately providing for ways to organize those vast online databases in a way that the potential users of the databases can comprehend. In particular, the online database programs provide an automated search technology, but ignore the basic fact that their online search technology cannot search images, so the real effect is that the user tends to believe that the "databases" have been searched when, in fact, only a very small percentage of the huge number of documents online are actually included in such searches. This is evidence of the lack of awareness of the overall genealogical system. This particular problem is then "fixed" by relying on indexes generated by unsophisticated participants who have no idea who will use the index or how it will be used. I use this as an example of the complexity of the system, not to disparage the contributions of the indexers. In fact, the "index" simply adds one more complex and little understood system to the overall complexity of genealogy as a whole.

In writing about this and other related subjects, I almost always feel like I am talking to a brick wall. The genealogical community does not incorporate anything approaching a meta-genealogy where those who are involved in the level of providing services and defining the direction of the separate elements can discuss any type of overall theory about how the whole system works (or doesn't work). The entire community seems to run on its own inertia. I am convinced that there is a need for a review of the overall complexity of the genealogical system and the development of a genealogical complexity theory to account for the interactions of the various components. I am certain that there are those people out there who understand what I am writing about or I would not keep writing about this subject.

A final note, I use Wikipedia for examples because it is a convenient way to get into any particular subject. I would expect that following up on any reference to a Wikipedia article would necessarily involve moving on to the sources cited as well as others.

Tuesday, June 24, 2014

Who owns what in online genealogy? Shuffling the Deck

I find that about as much false information and as many fables circulate in the genealogical community about the ownership of the larger genealogy companies as about any other genealogy topic. I haven't addressed this issue for some considerable time, but I still get really strange comments on blog posts over two or three years old. The amount of misinformation in this area seems monumental. One of the most common specific issues is the ownership of Ancestry.com. This is particularly true with respect to its relationship to FamilySearch, International. Many of the comments about FamilySearch, International display a decided anti-Mormon bias. It is interesting that genealogy, one of the most inclusive persuasions, is not immune from bigotry and prejudice.

First of all, there are four very large genealogy-based companies. None of them share any ownership interest whatsoever. FamilySearch, International has no ownership interest in Ancestry.com. All of this ownership information is freely available online. There is no mystery here. There are no hidden agendas. There is no conspiracy. I am acquainted with the CEOs of all four companies. I have had interviews and meetings with all of them. I find them all dedicated to genealogy, capable and very intense people.

In summarizing the ownership interests of the larger genealogy companies, I am including, to the extent possible, a list of the associated websites they each manage and own. This list is subject to change at any time as evidenced by the recent acquisition of Mocavo.com by D.C. Thomson, Family History (findmypast.com). To answer one question, yes, findmypast.com is written all in lower case letters.

Which of these entities is the "largest"? First of all, this question is impossible to answer. Each of the companies has a completely individual way of measuring its online collections of data. In addition, the question is meaningless. If the database has what you are looking for, the collections are useful. If they do not have what you are looking for, they are of no use to you. It's that simple. Here I go with the analysis.

Ancestry.com, Inc

Ancestry.com is presently owned by Permira, a private equity firm founded in 1985. See this link for the company history. Here is a quote from the Permira website about Ancestry.com:
Ancestry is the undisputed global market leader in online family history, with two million subscribers and 6x the traffic of the nearest competitor. It was acquired by a company owned by the Permira funds and co-investors in December 2012. 
Ancestry is the global leader in online genealogy offering the world’s largest online family history resource, with over 11 billion records and 34 million family trees containing 4 billion profiles. 
Ancestry’s network of websites enables users to discover, preserve and share family history, using an unrivalled data set of digitised historic records from 15 countries. Records include: census; ship passenger lists; military documents; birth, marriage and death certificates; immigration documents; casualty lists and newspaper clippings. This enables subscribers to discover their past, search for ancestors and records, along with sharing what they have found by uploading their own content. 
The offering is delivered via multiple platforms including desktop web, mobile and social media. Ancestry pioneered online family history by converting a time intensive, expensive offline pursuit into an affordable, accessible one online.
As far as I can determine here is the current list of the websites owned by Ancestry.com:


The following Ancestry.com websites are being shut down as of 5 September 2014:


FamilySearch, International

Quoting from the FamilySearch.org website:
FamilySearch International is the largest genealogy organization in the world. Millions of people use FamilySearch records, resources, and services to learn more about their family history. To help in this great pursuit, FamilySearch has been actively gathering, preserving, and sharing genealogical records worldwide for over 100 years. FamilySearch is a nonprofit organization sponsored by The Church of Jesus Christ of Latter-day Saints. Patrons may access FamilySearch services and resources free online at FamilySearch.org or through over 4,600 family history centers in 132 countries, including the main Family History Library in Salt Lake City, Utah.
FamilySearch.org maintains the following websites:


For members of The Church of Jesus Christ of Latter-day Saints, some of the functions of FamilySearch.org are supplemented by LDS.org.

MyHeritage.com

MyHeritage.com is a privately owned company based in Israel. Here is a quote about the company from their website:
MyHeritage was founded by a team of people with a passion for genealogy and a strong grasp of Internet technology. Our vision has been to make it easier for people around the world to use the power of the Internet to discover their heritage and strengthen their bonds with family and friends. 
As of 2005, we were based in the beautiful village of Bnei Atarot, near Tel Aviv, Israel, founded by German Templers in 1902 under the name of Wilhelma. 
Inspired by the surrounding fields and orchards and Templer estates, one of which served as our headquarters, we used the tools of tomorrow for researching the family history of yesterday. In February 2012 following our constant growth, we moved into lovely new offices in Or Yehuda, Israel. We also have offices in the USA in Lehi, Utah and LA, California, and employees and representatives in many countries around the world. 
As a dynamic family history network, our innovations for family tree building and historical content search are constantly evolving to provide families with the most engaging and rewarding experience. Our recent acquisitions of World Vital Records and Geni.com for example, have enabled us to offer billions of historical records and exciting tools for collaboration to a wider and more international audience than ever before.
Here is a list of the MyHeritage.com acquisitions. Some of these companies are still online; others have been absorbed into MyHeritage.com. MyHeritage.com is available in 40 different languages.

Here is the description of D.C. Thomson family history from their website:
DC Thomson Family History is a British-owned world leader in online genealogy, with an unrivalled record of online innovation in the field of family history and 18 million registered users across its family of online brands. It hosts over 1.8 billion genealogical records across these brands, which includes household names like findmypast and Genes Reunited. 
DC Thomson Family History helps partners to digitise their precious collections, providing them with an archive-quality digital surrogate of their records, or publishing existing indexes and transcriptions.
D.C. Thomson family history's most recent acquisition was Mocavo.com. Here is the current list of websites:
As I mentioned above, these lists could change at any time. One thing is certain, there will likely be additional acquisitions in the future. If any of the other large online companies have ownership interests they may be difficult to ferret out. 



Monday, June 23, 2014

findmypast.com buys Mocavo.com

The newest of the large online genealogy database companies buying out a smaller, well-endowed company is definitely not a surprise. The acquisition of Mocavo.com was announced in a findmypast.com press release dated 23 June 2014. The reason this is not a surprise to me is that I was watching Mocavo.com closely and could see that they had the potential to make the "big four" genealogy companies into a "big five." It is my personal opinion that none of the three commercial companies, findmypast.com, MyHeritage.com or Ancestry.com, can afford to have another competitive player in the market at their level. The only question in my mind was which of the three would make the purchase.

Quoting from the press release:
Family history is known for causing incredible outbursts of excitement. The yelp of success in the library when a long-standing mystery is solved, the utter shock at the discovery made in a newspaper article, or the squeal of delight when an e-mail hits your inbox notifying you of the latest record match from a previous search. 
Today, the entire family history industry has a reason to shout from the rooftops – Findmypast and Mocavo have joined together to build the future of family history. 
What does that mean for you? Everything…and so much more. 
From the very beginning, Mocavo established itself as an absolute family history destination. From its innovative search technologies to providing good Karma to the community, Mocavo is a genealogist’s best friend. Bloggers and users agreed – Dick Eastman, founder of the ever-popular Eastman’s Online Genealogy Newsletter went on record, “my future genealogy searches will start on Mocavo.com.” Since that time, the Mocavo machine hasn’t stopped and is now one of the industry’s fastest growing genealogy services.
One salient point here is that findmypast.com has a huge database made up almost entirely of indexes. This acquisition gives it a substantial presence with digitized copies of actual source documents and especially gives the UK-based company a greater presence in the United States.  Continuing on with a further quote from the press release:
Just imagine what you can uncover in Mocavo’s more than 8 million yearbooks, 500 million military records, and a slew of other incredible resources. And if that wasn’t enough, Mocavo also offers a free scanning service to help you preserve your own family’s records. But Mocavo hasn’t stopped there, they continue to release 1,000 new datasets each day – that’s nearly 30,000 datasets each month – and 365,000 datasets a year.
findmypast.com does have a huge investment in digitized newspaper pages but they are all in another separate website and charged separately for access. For example, I am hearing comments from those in The Church of Jesus Christ of Latter-day Saints who have recently been given "free" access to findmypast.com, that they are being asked to pay for access to digitized newspaper pages not included in the basic findmypast.com subscription. It is certainly unclear from the press release whether or not Mocavo.com will continue as a separate website merely owned by findmypast.com (actually D.C. Thomson Family History) or if the datasets will be incorporated into the already existing records on findmypast.com. My guess is that Mocavo.com survives as an entity owned by findmypast.com but separately billed and maintained.

Another interesting speculation is whether Mocavo.com will continue its present "free access" to all the records or if it will conform to the "pay-as-you-go" model preferred by British websites. Depending on how all this works out, the genealogy community may or may not benefit as much as it would have with a competing Mocavo.com.

The Issue of Source Citations in Genealogy

If you want to push a hot topic button for some members of the online genealogy community, all you have to do is take one position or another about the need for source citations. If you really want to get some response, you can bring up "proper citations" as an issue. I must say that I am as opinionated as anyone on the subject and bring the subject up frequently because of the abuses and excesses I see on all sides of the issue.

I think that we need to have some historical perspective in this regard and understand that adding source citations to genealogical data has not been such an active issue as it is today. The genealogical community is, of course, laboring mightily to position itself in the face of the publication of an 887 page definitive book on the subject.

See Mills, Elizabeth Shown. Evidence Explained: Citing History Sources from Artifacts to Cyberspace. Baltimore, Md: Genealogical Pub. Co, 2007.

It is interesting that we have such an amazingly complete treatise on a subject so few genealogists are even aware exists.  Let's see why that may be the case.

Individual participation in genealogical research has historically been extremely limited. If you would like one of the few historical studies of genealogy as a profession and avocation, see

Weil, François. Family Trees: A History of Genealogy in America. 2013.

It is amazing what a little bit of history will do to quell self-righteous indignation over the lack of source citations in online family trees and elsewhere. To give you an idea of where genealogy is coming from and why there may be some problems such as lack of source citations, let me give just one quotation from the Weil book.
From the 1860s to the mid-twentieth century, racial purity, nativism, and nationalism successfully dominated the quest for pedigree and gave genealogy more contemporary ideological relevance than ever before. The language of race, heredity, and later eugenics invaded the genealogical sphere, helping many white Americans describe themselves self-consciously as Anglo-Saxons and claim racial and social superiority over others. This new language was so pervasive that many of these “others” (African Americans and European migrants) came to share some of the tenets of racialized genealogy. In this new context the market for genealogy experienced tremendous, though unregulated, growth, which in turn helped develop frauds on a scale unknown in the United States until then. Some reacted and attempted to regulate the field. Other Americans, true to alternative visions inherited from the antebellum period, persisted in connecting genealogy to moral, religious, and democratic concerns, but by the late nineteenth century they were in the minority.
Only when the racial and nationalist foundations of genealogy were undermined in the middle of the twentieth century did the configuration of the genealogical interest in the United States change once again. It took decades, the civil rights movement, and the new interest in ethnicity and heritage for American genealogical culture as we know it today— popular, multicultural, and multiracial family history— to settle in. As the family history market has developed with the advent of the computer revolution and the Internet, genealogy has become a major component of the American economy of culture. In the age of DNA, the return of biological evidence to genealogy also raises new, fascinating, and troubling questions about the identity of individuals and groups within American society. Weil, François (2013-04-30). Family Trees (Kindle Locations 122-126). Harvard University Press. Kindle Edition.
Truly, genealogy in the United States has a troubled and very complex history. Condemning pedigrees and compiled genealogies from before the 21st Century is a currently popular position, but from this quotation, it is evident that a scholarly, carefully documented approach to genealogy is not only presently a rare commodity, but has been almost since the beginnings of populist genealogy. Converting the huge masses of online family tree submissions to the need for careful source citations would take nothing less than a restructuring of the entire United States educational system and a radical change in the value system. Advocating adding source citations to genealogical submissions online involves much more than simply educating genealogists.

In addition to these cultural and social issues, we have the plain fact that the addition of a source citation in no way assures the accuracy of the online entry. Adding and evaluating the reliability of any particular source is a skill that is learned either through relevant educational experience or through trial and error in practicing research.

The real question is whether or not genealogy is an inclusive or exclusive pursuit. If we take the position that genealogy is an academic discipline, it must needs be exclusive in the extreme. We would have to be like lawyers and pass unauthorized practice of genealogy rules and enforce them. On the other hand, if genealogy is inclusive, then perhaps we need to recognize that errors and bad online family trees are part of the trade-off. Railing against a particular online program because it does not "require" source citations is definitely an exclusive approach. Perhaps those of us who view genealogy as needing sources should be a little more tolerant of those who have not yet reached that point of understanding the process.


Sunday, June 22, 2014

The Rest of the Mystery Photos from France

Here is the explanation of these mystery photos from an earlier post:
One of my friends came to a class at the Mesa FamilySearch Library and brought me a large envelope of photographs. These photos were obviously quite old. From appearances and the type of mounting they dated from the late 1800s. My friend had purchased a used photo album many years ago in Paris, France. All of the photos were in the album. She removed the photos from the album to use for herself, but could not throw away the photos. Now all these years later, I taught a class on digital photography for genealogists and she showed up to bring me the photos. 
I told her I would find an appropriate home for the photos even if I could not identify the family. Since all the photos were in the same album we could conclude that all of the people were related in some way. To make life interesting the photographer logos are from Berlin, Germany, Hannover, Germany, Tolleston, Indiana, Waterbury, Connecticut, New York, New York, and Jersey City, New Jersey.










Are Online Family Trees a Substitute for a Local Desktop Genealogy Program?

For some genealogists, online family trees are anathema. I am guessing, but the number of genealogists who abhor online family trees is probably only an insignificant, tiny percentage of the "genealogists" who have their family tree online. With over 71.5 million members on MyHeritage.com alone, plus the millions more on Ancestry.com and other online family trees claiming huge numbers, such as Geni.com with over 77 million, it is more than obvious that online family trees are "genealogy" to most people interested in their ancestry.

So why the antipathy towards family trees? Errors and bad genealogy. Right now I am looking at my Geni.com family tree, for example. Here is a screenshot for illustrative purposes:


The arrow points to a suggested invitee to help "complete my family tree." The name, identified as a "great uncle," is Ralph Carum Tanner, who is further identified as a "son of Henry Martin Tanner and Eliza Ellen Tanner." This claimed individual is well known to me because he does not exist. I have searched every possible record and no one by that name exists. I also have elaborate documentation of each of the children of Henry and Eliza Tanner. He is not their child, if he exists. How do you get rid of this type of unsupported information? The answer is that you can't: this fictitious person now lives in hundreds of family trees online.

Apparently this core problem with family tree programs is no disincentive to the millions of online users. So, one of the most common reasons given for having your own independent desktop genealogy program is to provide a "clean" copy of your pedigree so that when the rogue online family tree users start messing up everything, you have someplace to go to get "good" information. This reasoning is based entirely on the idea that the genealogist making this point is always and completely right. What if each and every one of these millions of online family tree users had all of their data in their own program? What difference would that make to the accuracy of the overall online family trees?

How would everyone having their own program on their own computer help with the accuracy of the existing online family trees? Wouldn't the desktop programs simply mimic the online family trees? I believe they would. The hard core genealogists take the position that they are right and all the online folks are wrong. This may or may not be true. Let's suppose I have my genealogy in three or four online programs. Isn't that backup enough? Why do I need another copy of the same information on my own computer? Especially if I don't know enough about genealogy to tell if there is a fictitious person in one of my families.

Technically, you could define genealogy to the point that you were the sole person on the earth who was a qualified genealogist. I think those who consider themselves to be genealogists are going to need a new way to "prove" that a person needs a local desktop genealogy program.



Back up your genealogical data files -- a reasoned approach


The recent Denial of Service attack at Ancestry.com and FindAGrave.com points out the vulnerability of the Internet yet again in a very graphic manner. Although this attack is over, there is nothing stopping another such attack at another time and place on the Internet. During the past few years, the "Cloud," a euphemism for the Internet or Web, has been touted as a solution to individual backup challenges. It turns out that storing your data online is subject to some of the same risks as any other method of data storage. It is time to evaluate each type of storage media as to its merits and limitations.

A computer system consists of several integral parts that work together. Traditionally, the computer hardware consisted of a CPU (Central Processing Unit) chip on a board with other processors, a power supply, connecting cables called busses, perhaps a fan, a memory storage device and box to put it all in. Then you start adding things like keyboards, a mouse or pointing device and you have a computer system. Today, all of that can be packaged in a smartphone. I am focusing on storage. We moved from recording tape in cassettes or on spools to floppy disks to hard disks to flash or solid-state memory in just a few short years. At the same time, there were all sorts of variations including hard drive cassettes and CDs and DVDs. Where are we today?

As genealogists, unfortunately we are all over the board. We have a core of people who are still holding on to their floppy disk storage and we have those who use the latest and best storage methods.

Now a word about the "Cloud." There is no magic place to store your information called the "Cloud." The cloud is nothing more or less than a bunch of computers with hard drives attached. So when you are storing your data "on the cloud," you are really just using someone's computer and hard drive, usually in a huge array of computers called a server farm, like the one pictured above. Now, these computers are subject to all the same failure possibilities of your own computer in your own home. The difference is that they have people working 24/7 to replace the computers and hard drives as they fail. They keep the backups. But what if the whole server farm fails? Exactly the problem. The key to safety is redundancy or in other words, multiple backups.

The same rule holds true whether you are storing photos on a smartphone or genealogy on your computer's internal hard drive. You need to have multiple copies of your data on different storage devices. All of this costs money. But you always have to place a value on the time it has taken you to accumulate your data. I have people come to me crying because they lost their data. Upon questioning them, it turns out they have 40 or 50 names (or some other very limited amount) of data and are devastated because of the prospect of reconstructing those 50 names. Think of the real world where people like me have more than 3 Terabytes of data and over 300,000 files accumulated over the last 32 years. How do we back up this data?

Here is where we are today.

A 4 TB (Terabyte) hard drive costs just $150 on Amazon.com. That is how I back up my data. I have multiple hard drives and make copies on each. But what would happen if a meteorite hit my home (or something more predictable)? I make periodic copies of all the data on another hard drive and give a copy to one or more of my children. How often do I back up? Every time I think about the effort it would take to reproduce all that data.

Even if you don't have this massive problem, the answer is the same. You back up your data on multiple drives and keep them in multiple places. I don't rely on Cloud storage so much because of the massive amount of data I have and the cost to keep it online. I do use Internet or Cloud storage for some types of data, particularly those items I use regularly on a variety of computers.
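For those comfortable with a little programming, the "multiple drives" rule can be sketched in a few lines of Python. This is only an illustration of the idea and not the way I make my own copies. The function names and the folder names in the usage note are hypothetical, and I am assuming each backup drive shows up on your computer as an ordinary folder.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 fingerprint of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_file(source, destination_folders):
    """Copy one file to every folder listed and verify each copy.

    Returns the list of verified copies. Raises IOError if any copy
    does not match the original byte for byte.
    """
    source = Path(source)
    expected = sha256_of(source)
    verified = []
    for folder in destination_folders:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)  # create the folder if needed
        copy_path = folder / source.name
        shutil.copy2(source, copy_path)  # copy2 also keeps the file dates
        if sha256_of(copy_path) != expected:
            raise IOError(f"Copy at {copy_path} does not match the original")
        verified.append(copy_path)
    return verified
```

A call such as `back_up_file("family.ged", ["/Volumes/DriveA/backup", "/Volumes/DriveB/backup"])` would place a checked copy on each drive. Those paths are made up for the example. The sketch also does nothing about the second half of the rule, which is getting one of those drives out of your house.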

What are the hardware options? Hard drives are still the most cost-effective way to store computer data. Flash drives (thumb drives, whatever) are reliable and convenient. If your data fits on a flash drive, go with it. But they are easy to misplace so be careful. Flash drives may someday replace hard drives (mechanical spinning media) but that is still a ways off. CDs and DVDs? They are still a way to back up data but they are slow and limited in size. They are also on their way out. Most new computers are not being made with CD or DVD drives, and this is always an indication of the end of a particular form of storage.

What about storing data online (the Cloud)? Good idea but not any more reliable than hard drives or flash drives. The best idea is to use more than one method and make sure the copies are in more than one place. I use all these types of storage for different reasons at different times. I have hard drives (very large ones). I have flash drives (very large ones). I have CDs and DVDs but I am transitioning away from them as fast as I can. I have some storage online but balance the availability against the cost and the size of my storage needs.