Some people eat, sleep and chew gum, I do genealogy and write...

Sunday, February 1, 2009

GEDCOM and XML -- a clarification

In a previous post, I stated that GEDCOM was a form of Extensible Markup Language (XML). Thanks to reader Mark Tucker, I can now correct that statement: GEDCOM 5.5 is text-based, but it is not XML. If you are interested in the technical explanation, please refer to the article "XML Data Migration Case Study: GEDCOM" by Aaron Skonnard, a teacher at Northface University in Salt Lake City, Utah. A beta version of GEDCOM 6.0 is available, and it is based entirely on XML.
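For readers who like to see the difference rather than take it on faith, here is a minimal Python sketch. It assumes the familiar GEDCOM 5.5 line layout of a level number, an optional @xref@ identifier, a tag, and an optional value. The sample record is invented for illustration, and the regular expression is a simplification of mine, not the full grammar from the standard.

```python
import re
import xml.etree.ElementTree as ET

# A small hand-written GEDCOM 5.5-style fragment (illustrative data only).
SAMPLE = """\
0 HEAD
1 GEDC
2 VERS 5.5
0 @I1@ INDI
1 NAME John /Doe/
1 BIRT
2 DATE 1 JAN 1900
0 TRLR
"""

# Simplified line pattern (an assumption, not the complete GEDCOM grammar):
# level, optional @xref@, tag, optional value.
LINE_RE = re.compile(r"^(\d+) (?:(@[^@]+@) )?(\w+)(?: (.*))?$")


def parse_line(line):
    """Split one GEDCOM-style line into (level, xref, tag, value)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a GEDCOM-style line: {line!r}")
    level, xref, tag, value = match.groups()
    return int(level), xref, tag, value


def is_well_formed_xml(text):
    """Return True only if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


records = [parse_line(line) for line in SAMPLE.splitlines()]
for record in records:
    print(record)

# The same text is plain, line-oriented data; an XML parser rejects it.
print("well-formed XML?", is_well_formed_xml(SAMPLE))
```

The nesting in GEDCOM 5.5 comes from the level numbers at the start of each line, not from opening and closing tags, which is why an XML parser cannot read it.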

If you want to pursue this issue further, I suggest starting with the GEDCOM article on Wikipedia and then running a Google search for "GEDCOM 6.0 xml."

However, as the Wikipedia article states:

On January 23, 2002 a beta version of GEDCOM 6.0 was released for developers to study and begin to implement in their software. GEDCOM 6.0 was to be the first version to store data in XML format, and was to change the preferred character set from ANSEL to Unicode. (Uniform use of Unicode would allow for the usage of international character sets. An example is the storage of East Asian names in their original CJK characters, without which they could be ambiguous and of little use for genealogical or historical research.)

Today, lineage-linked GEDCOM is still the de facto common denominator. Since the 2002 publication of the beta version of GEDCOM 6.0, no genealogical software supplier has yet supported it, despite the inherent advantages of an extensible and portable language like XML and its multi-lingual Unicode support.
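To give a feel for what an XML representation with Unicode support could look like, here is a rough Python sketch that turns level-numbered lines into nested XML elements. This is my own illustration: the element names simply reuse the GEDCOM tags, the "gedcom" root element and the "id" attribute are hypothetical, and none of it follows the actual GEDCOM 6.0 beta schema. The sample name is invented.

```python
import xml.etree.ElementTree as ET

# Illustrative record with a name stored in its original CJK characters.
SAMPLE = """\
0 @I1@ INDI
1 NAME 山田 /太郎/
1 BIRT
2 DATE 1 JAN 1900
"""


def gedcom_to_xml(text):
    """Nest level-numbered lines as XML elements (hypothetical layout)."""
    root = ET.Element("gedcom")
    stack = [root]  # stack[n] is the parent element for a level-n line
    for line in text.splitlines():
        level_str, _, rest = line.partition(" ")
        level = int(level_str)
        xref = None
        if rest.startswith("@"):
            xref, _, rest = rest.partition(" ")
        tag, _, value = rest.partition(" ")
        del stack[level + 1:]  # drop elements deeper than this line's parent
        elem = ET.SubElement(stack[level], tag)
        if xref:
            elem.set("id", xref.strip("@"))
        if value:
            elem.text = value
        stack.append(elem)
    return root


root = gedcom_to_xml(SAMPLE)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Serialized as UTF-8, the CJK name survives a round trip unchanged.
xml_bytes = ET.tostring(root, encoding="utf-8")
round_trip = ET.fromstring(xml_bytes)
print(round_trip.find("INDI/NAME").text)
```

The point of the sketch is only that the hierarchy becomes explicit in the markup and that the name needs no special character set beyond Unicode.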

Corrections, additions and comments are always welcome.

2 comments:

  1. Soon XML will take over the world. It does, however, transmit a lot of information efficiently.

  2. Thanks for this post, James. We have been looking at GEDCOM for our site over the last few days and wondering why there has been no uptake of XML. It's a shame for us, as the rest of our site is XML compatible. Any idea why the reluctance? Luckily, our Tree gives users the flexibility to correct the import errors that I have heard occur with the older GEDCOM standards.
