FamilySearch Wiki:History of Content Organization, Browsing, and Categories

= The Past =

When FamilySearch Wiki was first published on the Internet in early 2007, we were using a folder structure to organize articles. This was problematic, however, for articles that seemed to fit the domain of multiple folders. Should an article about the Western States Marriage Index be placed in the Idaho folder, the Utah folder, or the United States folder? The whole purpose of using folders was to make it easier for users -- especially authors -- to quickly survey all the articles we had on a topic. But if we constantly had to place an article in one folder when it really fit the domains of several, would that purpose be served? Clearly, we needed a way to make a many-to-one association between topics and articles.

This led us to Keywords. In Plone 2.x, the content management system we were using to begin the wiki, authors could associate articles with keywords. This would offer the required many-to-one association between topics and articles. However, we saw how huge a task it would be to populate the system with FamilySearch's deep and wide topic authorities in multiple languages and train authors in their use. Our tiny team didn't have the resources to tackle the project, and meanwhile, the search engine was doing a pretty good job of finding the articles we needed for production work, so we put this project on hold.

In autumn 2007, FamilySearch directors decided to switch the site to a different platform. We would migrate the content from Plone to MediaWiki because MediaWiki sites doing what we were trying to do had proven able to scale to a large audience. In December 2007, we launched Beta 2, inviting all past contributors of the Plone site to join us in testing the new MediaWiki platform.

= The Present =

MediaWiki has no folder structure for content. It incorporates the use of Categories, which allow users to browse all the articles associated with a topic. Although Categories allow browsing, we do not yet know whether they can be used to filter MediaWiki's search results. We're still learning the new platform.

Since good metadata requires tagging standards, since we have migrated to another platform that allows tagging, and since standardizing our tagging will be a very large project, we aim to address the issue in early 2008.

= Browsing by Place or Topic =

Genealogical topics include two major subsets – those which are associated with a place and those which aren’t. We plan to use MediaWiki’s Categories to classify our system’s content. MediaWiki allows users to create Categories with Subcategories beneath them, which is good for classifying, say, towns within a county. However, MediaWiki’s Special Page for Categories shows all Categories and Subcategories in a flat list, not a hierarchy. We recommend that product management test with users to see whether they would find it useful to have another page which lists all Categories and Subcategories in a hierarchical or outline form.
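
To illustrate the difference between a flat list and an outline, here is a minimal Python sketch. It is not MediaWiki code: the sample data, the `FLAT_CATEGORIES` name, and the `build_outline` function are all hypothetical, and the sketch assumes each category has at most one parent, which MediaWiki itself does not require.

```python
from collections import defaultdict

# Hypothetical sample data: (category, parent) pairs as a flat list.
# None marks a top-level category. Assumes one parent per category.
FLAT_CATEGORIES = [
    ("United States", None),
    ("Idaho", "United States"),
    ("Utah", "United States"),
    ("Salt Lake County", "Utah"),
    ("Census", None),
]

def build_outline(pairs, indent="  "):
    """Return the categories as indented outline lines, siblings sorted by name."""
    children = defaultdict(list)
    for name, parent in pairs:
        children[parent].append(name)

    lines = []

    def walk(parent, depth):
        # Visit each child, then descend into its own subcategories.
        for name in sorted(children[parent]):
            lines.append(indent * depth + name)
            walk(name, depth + 1)

    walk(None, 0)
    return lines

outline = build_outline(FLAT_CATEGORIES)
print("\n".join(outline))
```

With the sample data, subcategories such as Salt Lake County appear indented beneath Utah rather than mixed in alphabetically with every other category.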

= Authorities for Topics and Places =

When seeking information in the wiki, users need to see browsing options and search results which are consistent, unambiguous, and if possible, even familiar. The Library of Congress authorities (subject headings) meet these criteria for subjects other than places. Where the LC catalog fails is in the disambiguation of common place names used for multiple levels of jurisdiction. For instance, place names like Grant, Washington, Montgomery, Jefferson, Lake, and Summit are used for towns, townships, parishes, counties, and sometimes states. If our wiki were to employ a category of “Washington” without including its jurisdiction, and a user typed “Washington” into the wiki’s search engine, the search results would contain entries for all these jurisdictions.

= Filtering =

Users need to be able to filter a search not only by [all places named Washington], but by [Washington that is a county in Washington state].

In MediaWiki, users can create a category United States with a subcategory Washington. However, they can’t add more subcategories named Washington for counties, parishes, townships, and towns with that name. A category name can be used only once.[1]

= Standards Clarify Place Categories =

To create place categories that users and the system will find unambiguous, we must employ standards found in genealogy programs and the Family History Library Catalog. An entry for Washington Township would look like this:

United States, Washington, Washington, Washington

…or this:

Washington, Washington, Washington, United States

…or this:

United States, Washington State, Washington County, Washington Township

We need to choose one of these three standards and implement it.
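
The three candidate standards can be sketched as three small formatting functions over the same list of jurisdiction levels. This is only an illustration in Python: the `PLACE` data layout and the function names are assumptions made for the example, not part of any existing system.

```python
# Jurisdiction levels from largest to smallest. The second item is an
# optional jurisdiction label (hypothetical layout, for illustration only).
PLACE = [
    ("United States", None),
    ("Washington", "State"),
    ("Washington", "County"),
    ("Washington", "Township"),
]

def largest_first(levels):
    """Names only, largest jurisdiction first."""
    return ", ".join(name for name, _ in levels)

def smallest_first(levels):
    """Names only, smallest jurisdiction first."""
    return ", ".join(name for name, _ in reversed(levels))

def labeled(levels):
    """Largest first, with each jurisdiction label appended where one exists."""
    return ", ".join(f"{name} {kind}" if kind else name for name, kind in levels)

print(largest_first(PLACE))
print(smallest_first(PLACE))
print(labeled(PLACE))
```

Whichever form is chosen, the county and the township get distinct names: under the first standard, for example, the county alone would be United States, Washington, Washington, one level shorter than the township.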

= Categorizing by Wiki Code vs. WYSIWYG Interface =

Categorizing an article by adding wiki code is problematic for two reasons. First, it’s not simple. For the same reasons most people prefer a WYSIWYG operating system over DOS or UNIX, most people also prefer WYSIWYG controls over wiki coding. Just as the DOS operating system was a barrier to many people using computers, wiki coding is a barrier to many people categorizing MediaWiki content.

Another problem with categorizing-by-coding is that the author must spell the category name exactly right. If even one character is wrong, the system creates a new category. Since our system’s place category names will be long, such as United States, Washington, Washington, Washington, the probability of error in adding category codes will be high. This necessitates the recruitment and management of a fairly large category cleanup team, which is a high-maintenance solution.
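
One way to picture the cleanup problem is a check that compares a typed category name against the existing ones before a new category is created. The sketch below uses Python's standard `difflib` module. The `EXISTING` list, the `check_category` function, and the 0.9 similarity cutoff are assumptions chosen for illustration. MediaWiki does not perform this check itself.

```python
import difflib

# Hypothetical list of category names already in use.
EXISTING = [
    "United States, Washington, Washington, Washington",
    "United States, Idaho",
    "United States, Utah",
]

def check_category(name, existing, cutoff=0.9):
    """Classify a typed category name as exact, a likely typo, or genuinely new."""
    if name in existing:
        return ("exact", name)
    # get_close_matches returns the best matches at or above the cutoff ratio.
    close = difflib.get_close_matches(name, existing, n=1, cutoff=cutoff)
    if close:
        return ("suggest", close[0])
    return ("new", name)

# A single dropped letter in the last word is flagged as a probable typo.
typo = "United States, Washington, Washington, Washingon"
print(check_category(typo, EXISTING))
print(check_category("United States, Oregon", EXISTING))
```

A check like this would reduce, but not eliminate, the cleanup work, since a cutoff strict enough to avoid false suggestions will also miss some misspellings.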

= Proposed Solution =

Since categorizing articles by adding wiki code increases errors and decreases the number of active authors, other solutions should be considered. Users wanting to categorize an article should be able to choose a category from a hierarchy of all active categories. They should also be able to search for the category name rather than browsing the hierarchy.
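
A category search of this kind could be as simple as matching the typed text against the last level of each category path and showing the full path for each hit. The following Python sketch assumes a hypothetical `CATEGORY_PATHS` list and `search_categories` function. It is not an existing MediaWiki feature.

```python
# Hypothetical category paths, largest jurisdiction first.
CATEGORY_PATHS = [
    ("United States",),
    ("United States", "Washington"),
    ("United States", "Washington", "Washington County"),
    ("United States", "Utah"),
    ("United States", "Utah", "Washington County"),
]

def search_categories(query, paths):
    """Return full paths whose last level contains the query, ignoring case."""
    q = query.casefold()
    return [" > ".join(p) for p in paths if q in p[-1].casefold()]

hits = search_categories("washington county", CATEGORY_PATHS)
for hit in hits:
    print(hit)
```

Because the author picks from paths that already exist, no category name is typed by hand, and two counties with the same name stay distinguishable by their parent state.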

= Disambiguating Article Titles =

Customers who search by a common place name like Washington, but who don’t know how to filter their search by Category, will find articles for all places named Washington in their search results. To make their options less ambiguous, we will recommend that authors follow a standard in titling their articles. Instead of titling an article Vital Records or Vital Records in Washington, we may recommend they use something like Vital Records of Washington (town), Washington County, Washington. Again, we need to select a standard and implement it.

[1] In MediaWiki, categories are identified only by name. To categorize one article for Washington County and another for Washington State, users add category tags to each article’s body text. If the categories for the state, county, and township had the same name, the user (and even the system) wouldn’t know how to tell them apart.