Some people eat, sleep and chew gum, I do genealogy and write...

Saturday, November 28, 2009

Search strategy online for ancestors

What is the best strategy for searching online for your ancestors? Surprisingly, there are a few simple rules that will dramatically improve the reliability of the results from any search engine, be it Google, Yahoo, Ask, Alta Vista, Bing or whichever. These rules also work with nearly all of the online databases, including Ancestry.com, WorldVitalRecords and many others.

Before getting into the search rules, you must first understand a tiny bit about a very complex subject: Boolean logic. From Wikipedia, "Boolean logic is a complete system for logical operations, used in many systems. It was named after George Boole, who first defined an algebraic system of logic in the mid 19th century. Boolean logic has many applications in electronics, computer hardware and software, and is the basis of all modern digital electronics. In 1938, Claude Shannon showed how electric circuits with relays were a model for Boolean logic. This fact soon proved enormously consequential with the emergence of the electronic computer."

Why do you need to know this (and, you might add, who cares)? Because relational databases, like the kinds mentioned above, usually use some form of Boolean logic to perform queries. Now, I said above that you only need to know a tiny little bit. Here is an example of the tiny little bit you need to know. Open Google in any browser. Look just to the right of the logo. There is a small link that says "Advanced Search." Click on that link and look at the results. Google search gives you four options: "all these words," "this exact wording or phrase," "one or more of these words," and last, "any of these unwanted words." In Boolean terms, the first, third and fourth options correspond to the operators AND, OR and NOT, while the second matches an exact phrase.
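To make the four Advanced Search boxes a little more concrete, here is a minimal Python sketch of how they might be combined into a single typed query. The function name `build_query` and its parameter names are my own inventions for illustration, and the syntax (quotation marks for a phrase, the word OR between alternatives, a minus sign in front of an unwanted word) is the commonly documented Google shorthand. Other search engines may use different symbols, so treat this as a sketch rather than a rule.

```python
def build_query(all_words=(), exact_phrase="", any_words=(), unwanted_words=()):
    """Combine the four Advanced Search boxes into one query string.

    Hypothetical helper: the operator syntax assumed here is Google-style
    and may not work the same way in other search engines.
    """
    parts = list(all_words)                         # "all these words" (AND)
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')           # "this exact wording or phrase"
    if any_words:
        parts.append(" OR ".join(any_words))        # "one or more of these words" (OR)
    parts.extend(f"-{word}" for word in unwanted_words)  # "unwanted words" (NOT)
    return " ".join(parts)


query = build_query(
    all_words=["Arizona"],
    exact_phrase="Henry Tanner",
    any_words=["pioneer", "settler"],
    unwanted_words=["artist"],
)
print(query)  # Arizona "Henry Tanner" pioneer OR settler -artist
```

Typing that string directly into the ordinary search box should have roughly the same effect as filling in the four boxes on the Advanced Search page.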

One of the problems with the various search engines used by all the different databases is that some support Boolean operators and some don't. If the search function in a database has no Boolean capability, then your ability to find information is really limited. For example, the Family History Library Catalog has a keyword search link, but there is no further way to modify a search. If I search for the word "Arizona" and add the word "pioneer," I have no way to limit the results to titles containing both words, so I get every title containing either "Arizona" or "pioneer" or both, 235 matching titles in all. If I were doing the search in Google, I could narrow down the returns to only those with both the term "Arizona" and the term "pioneer," or exclude the term "pioneer," or whatever.
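The difference between an "either word" search and a "both words" search can be sketched in a few lines of Python. The four titles below are made up purely for illustration (they are not real catalog entries), and the `matches` function is a hypothetical stand-in for whatever a real search engine does internally.

```python
# Made-up sample titles, for illustration only.
titles = [
    "Arizona pioneer families",
    "Pioneer trails of Utah",
    "Arizona mining records",
    "Ohio early settlers",
]


def matches(title, terms, mode):
    """Return True if the title satisfies the terms under AND or OR logic."""
    words = set(title.lower().split())
    hits = [term.lower() in words for term in terms]
    return all(hits) if mode == "AND" else any(hits)


terms = ["Arizona", "pioneer"]

# OR: a title qualifies if it contains either word (the broad catalog behavior).
or_results = [t for t in titles if matches(t, terms, "OR")]

# AND: a title qualifies only if it contains both words (the narrow behavior).
and_results = [t for t in titles if matches(t, terms, "AND")]

print(len(or_results), "titles with OR")    # 3 titles with OR
print(len(and_results), "titles with AND")  # 1 titles with AND
```

With OR logic, every added term can only make the list longer. With AND logic, every added term can only make it shorter.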

So here are the rules:

Rule One: Know your search engine. Do a practice search or two and see how the results change when you include or exclude terms. Determine whether or not the search function supports some kind of Boolean operations. If so, you can immediately begin eliminating a lot of extraneous junk from your searches.

Rule Two: Always start with the fewest possible search terms. For example, if I were looking for Henry Tanner in Arizona, I would put in only those three terms for my first search. Let's try that in Google. I search for "Henry Tanner Arizona." On this particular day, I got 62,200 returns and found the information I was looking for in the first item returned. I find that the same tactic also works best with Ancestry.com. Sometimes, putting in too much information will not bring any results at all, particularly if you diligently fill in all of the fields available in an Ancestry.com search.

Rule Three: Vary the order of your search terms. If you are looking in Google for early settlers in Ohio, try searching for "Ohio early settlers" and then "settlers early Ohio." Notice whether the total number of hits and the order of the hits change. Whether or not changing the order of the terms changes the results tells you a lot about the search function in any particular program.
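If you would rather not shuffle the terms by hand, a short Python sketch can list every possible ordering for you. The function name `term_orderings` is hypothetical, and the sketch only prints the queries. You still have to paste each one into the search engine and compare the results yourself.

```python
from itertools import permutations


def term_orderings(terms):
    """Return every ordering of the search terms as a query string."""
    return [" ".join(order) for order in permutations(terms)]


orderings = term_orderings(["Ohio", "early", "settlers"])
for query in orderings:
    print(query)
```

Three terms give six orderings, and four terms give twenty-four, so this is only practical for short searches, which fits nicely with Rule Two.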

Rule Four: Always think geographically. Adding a geographic location to any genealogical search will help to focus the search. "Henry Tanner" is not a very distinctive name. A Google search for "Henry Tanner" will return hundreds of thousands of results, including the African American artist Henry Ossawa Tanner. If my Henry Tanner came from Arizona, then the rule is to add that to any search. This is one reason to be familiar with how the search function in your program works. Adding a term like Arizona to the Family History Library Catalog search will just add more hits, rather than limiting the hits as it does in Google.

Rule Five: Try searching for related terms or usage. Henry might be Hank, Joseph might be Joe. Maybe the person lived in three different states. Why not search for the name with each or all of the three states? Try variations.
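Combining name variations with places adds up quickly, so here is a small Python sketch that writes out the whole list of searches to try. The function name `query_variations` is hypothetical, and the three states are placeholders I picked for the example, not a statement about where any particular Henry Tanner lived.

```python
from itertools import product


def query_variations(given_names, surname, places):
    """Pair every given-name variant with every place."""
    return [f"{given} {surname} {place}" for given, place in product(given_names, places)]


queries = query_variations(
    given_names=["Henry", "Hank"],
    surname="Tanner",
    places=["Arizona", "Utah", "California"],  # placeholder states
)
for query in queries:
    print(query)
```

Two name variants and three places give six searches, starting with "Henry Tanner Arizona" and ending with "Hank Tanner California."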

These are pretty simple rules. If you really want to get more proficient at searching on a computer, you have to begin to think like a programmer. But, you say, I don't want to think like a programmer. Well, then, at least try some of the rules and see if you can improve your results. I am sure that you will. But I am also sure that the more you learn about computers and programs, the better you can become at searching the Internet. This is a situation where knowledge is power, and more knowledge means more powerful searches. You may hate math (or love it), but there is no way to escape the fact that underneath all the graphic whiz of the computer age lies some pretty complex logic. If you begin to understand even a tiny bit of that logic, you will begin to increase your ability to find things on the Internet.

More later on the Boolean operators.
