Family Histories .info

Index to Add Meaning!

Beyond just sharing files and stories, it's very useful to "index" your files: identify the people in them and link those names to other documentation. This makes it easier to search for people and correlate them. By modifying your HTML files according to the standards below, you make it possible for programs to automatically link your information with other documents. This is much like what the Family Search indexing project is doing, though they have tools that make it even easier to add the semantics. (I'd love to create/find open-source versions of those tools as well.)

Note that the methods here are specifically for files in HTML format. If you're not accustomed to HTML documents, try an online tutorial or ask a friend for a quick explanation.

Basic Links

Wherever there is a person's name (or other useful data) in your document, add tags that link it to other references for that person; then programs can detect those references and point people to them.

Here is sample text from the start of this document.

Aaron Bracken was born January 26, 1844, at Hartland, Niagara, New York. He was the son of Joshua and Hannah Bell Bracken.

We surround the name of Aaron Bracken with a "span" tag, and also add some "itemprop" definitions:

<span itemscope itemtype="http://historical-data.org/Person">
<meta itemprop="url" content="https://familysearch.org/tree/#view=ancestor&person=KWVP-7PM" />
<meta itemprop="url" content="http://thomas.tolmanfamily.org/genealogy-pages/ppl/e/5/c4fe2696ab1711b005e.html" />
Aaron Bracken
</span>
was born January 26, 1844, at Hartland, Niagara, New York. He was the son of Joshua and Hannah Bell Bracken.

"So what?" Well, here's what: now you can use semantic tools to start browsing around these files.

For example, here's a little bookmarklet that will guide you to those linked sites:

"Activate Ancestry Links" Bookmarklet

Drag that bookmarklet into your bookmarks. (If that's not obvious, see more instructions here.) Then go here to see a sample file and click on that bookmarklet to activate the links that are hidden within it. To see this in action, watch this video.
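
As a rough idea of what a tool like this can do with the markup, here is a minimal JavaScript sketch. It is not the bookmarklet's actual code: the function name collectPersonLinks is made up for illustration, and it assumes each person is marked up exactly as in the sample above (a Person item containing meta elements with itemprop "url").

```javascript
// Collect the "url" itemprops for each Person item under a DOM root.
// `root` is anything that offers querySelectorAll, e.g. `document` in a browser.
// (Hypothetical helper for illustration; not taken from the real bookmarklet.)
function collectPersonLinks(root) {
  const people = root.querySelectorAll(
    '[itemscope][itemtype="http://historical-data.org/Person"]'
  );
  return Array.from(people).map((person) => ({
    // The visible name is the item's text, with runs of whitespace collapsed.
    name: person.textContent.trim().replace(/\s+/g, " "),
    // Each <meta itemprop="url"> carries one linked location in "content".
    urls: Array.from(person.querySelectorAll('meta[itemprop="url"]')).map(
      (meta) => meta.getAttribute("content")
    ),
  }));
}
```

In a browser console, collectPersonLinks(document) would return one entry per tagged person, which a tool could then turn into visible links.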

To see the whole process of adding semantic tags, watch this video.

For those of you passionate about local apps, you can run this locally without a network connection: download all the files and edit the setup.js file to point to those local files, and then you can edit the bookmarklet to point to that setup.js file.

Advanced Concepts

The "url" itemprops are a very basic form of linking, pointing directly to some web location, which I don't think is a good long-term solution. The right way to do this is to reference these shared data with URIs or XRIs instead. The same Aaron Bracken can be found in multiple locations; we know Family Search has some info, so if we want to see the Family Search version then we should be able to simply store his ID (KWVP-7PM) and the owner of that ID (FamilySearch.org), and then the tools can look him up in a browser with their URL scheme (e.g. appending their ID to "https://familysearch.org/tree/#view=ancestor&person=").
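
The lookup described here can be sketched in a few lines of JavaScript. This is only an illustration under assumptions: the registry URL_PREFIXES and the function buildLookupUrl are hypothetical names, and the only prefix in the registry is the FamilySearch one quoted above.

```javascript
// Hypothetical registry: one URL prefix per ID owner. Only the FamilySearch
// prefix comes from the text above; a real tool would need one per system.
const URL_PREFIXES = new Map([
  ["FamilySearch.org", "https://familysearch.org/tree/#view=ancestor&person="],
]);

// Build a browser URL from a stored (owner, id) pair.
// Returns null when the owner is not in the registry.
function buildLookupUrl(owner, id) {
  if (!URL_PREFIXES.has(owner)) return null;
  return URL_PREFIXES.get(owner) + encodeURIComponent(id);
}

const aaronUrl = buildLookupUrl("FamilySearch.org", "KWVP-7PM");
```

The point of the design is that the document only stores the owner and the ID; if the owner changes its URL scheme, only the registry entry needs updating.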

My current problem is that I don't know of the perfect scheme to store these IDs for arbitrary systems. Here are a few possibilities:

  • HTML element of "a" or "link"

(Not ideal because you can only attach one link to your data; also, these are for URLs, which are direct HTTP locations which can change.)

<a href="https://familysearch.org/tree/#view=ancestor&person=KWVP-7PM">Aaron</a>
  • itemprop of 'url', content with location (as used above)

(Not ideal because this is just like a URL... it doesn't track the specific ID or other info that may be important for an API.)

<meta itemprop="url" content="https://familysearch.org/tree/#view=ancestor&person=KWVP-7PM" />
  • itemprop with namespace URL, content with ID

(Not ideal because itemprop values are typically something short and well-known, e.g. "externalId".)

<meta itemprop="https://familysearch.org/tree/ancestor" content="KWVP-7PM" />
  • itemprop of "externalId" or "externalUrn" and content of combination namespace & ID

(Not ideal because the namespace & ID are smushed together.)

<meta itemprop="externalId" content="https://familysearch.org/tree/ancestor&id=KWVP-7PM" />
  • itemprop of "externalId" or "externalUrn" and content of ID and some new attribute of namespace

(Not ideal because "namespace" isn't a generally recognized attribute.)

<meta itemprop="externalId" namespace="https://familysearch.org/tree/ancestor" content="KWVP-7PM" />
  • HTML object with type and data attributes (and maybe param elements?)

(Not ideal because it's generally recognized for other content and not semantic data, and it's more complex.)

  • XRIs

(May be ideal, but I just don't know enough about them... and neither do most people.)
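
To make the trade-off between the two "externalId" variants concrete, here is a small JavaScript sketch of how a tool might recover the namespace and ID from each. Both function names are hypothetical, the inputs are plain objects standing in for an element's attributes, and the "&id=" separator is just the one used in the example above, not an agreed convention.

```javascript
// Variant with a separate "namespace" attribute: nothing to parse.
function fromSeparateAttributes(attrs) {
  return { namespace: attrs.namespace, id: attrs.content };
}

// Variant with namespace and ID combined in "content": split on the last
// "&id=" marker. Returns null if the marker is missing.
function fromCombinedContent(attrs) {
  const marker = "&id=";
  const at = attrs.content.lastIndexOf(marker);
  if (at === -1) return null;
  return {
    namespace: attrs.content.slice(0, at),
    id: attrs.content.slice(at + marker.length),
  };
}

const separate = fromSeparateAttributes({
  itemprop: "externalId",
  namespace: "https://familysearch.org/tree/ancestor",
  content: "KWVP-7PM",
});
const combined = fromCombinedContent({
  itemprop: "externalId",
  content: "https://familysearch.org/tree/ancestor&id=KWVP-7PM",
});
```

Both calls yield the same namespace and ID, but the combined form only works if every tool agrees on the separator, which is the "smushed together" drawback in practice.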

My next problem after that: how to store these IDs for reference in our local, private copies of the data. A good start might be the relative references in URIs; the GEDCOM X file format has a blurb on that kind of usage, albeit inside a self-contained bundle.
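
As a small illustration of the relative-reference idea, the sketch below uses the standard URL constructor available in browsers and Node.js. The file names and locations are made-up example values, and resolveReference is a hypothetical helper; this is not how GEDCOM X itself resolves references, only the general URI mechanism.

```javascript
// Resolve a relative reference against the location of the document that
// contains it. All paths below are invented for illustration.
function resolveReference(reference, baseLocation) {
  return new URL(reference, baseLocation).href;
}

const ref = "people/aaron-bracken.html#KWVP-7PM";

// The same stored reference works for a private local copy...
const local = resolveReference(ref, "file:///home/me/family-history/index.html");
// ...and for a copy published on the web.
const hosted = resolveReference(ref, "https://example.org/family-history/index.html");
```

Because only the relative part is stored, the files can move together between a local folder and a web server without the references being edited.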

Any ideas or other pointers welcome.