Archive for February 2007


Endeca Guided Navigation vs. Facets In Search Experiences

February 26th, 2007 — 12:00am

A recent question on the mailing list for the Taxonomy Community of Practice asked about search vendors whose products handle faceted navigation, and mentioned Endeca. Because vendor marketing distorts the meaning of accepted terms too often, it’s worth pointing out that Endeca’s tools differ from faceted navigation and organization systems in a number of key ways. These differences should affect strategy and purchase decisions on the best approach to providing high quality search experiences for users.

The Endeca model is based on Guided Navigation, a product concept that blends elements of user experience, administration, functionality, and possible information structures. In practice, guided navigation feels similar to facets, in that sets of results are narrowed or filtered by successive choices from available attributes (Endeca calls them dimensions).

But at heart, Endeca’s approach is different in key ways.

  • Facets are orthogonal, whereas Endeca’s dimensions can overlap.
  • Facets are ubiquitous, so always apply, whereas Endeca’s dimensions can be conditional, sometimes applying and sometimes not.
  • Facets reflect a fundamental characteristic or aspect of the pool of items. Endeca’s dimensions may reflect some aspect of the pool of items (primary properties), they may be inferred (secondary properties), they may be outside criteria, etc.
  • The values possible for an individual facet are flat and equivalent. Endeca’s dimensions can contain various kinds of structures (unless I’m mistaken), and may not be equivalent.
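
To make the facet side of this comparison concrete, here is a minimal sketch in Python. It is an illustration only, not Endeca's API or any vendor's implementation: the catalog, the facet names, and the helpers `narrow` and `facet_counts` are all hypothetical. Every item carries exactly one flat value for every facet, so the facets always apply and can be chosen in any order.

```python
from collections import Counter

# Hypothetical catalog. Every item has exactly one value for every facet,
# which is what makes the facets orthogonal and always applicable.
ITEMS = [
    {"name": "A", "color": "red", "size": "M", "material": "mesh"},
    {"name": "B", "color": "red", "size": "L", "material": "leather"},
    {"name": "C", "color": "blue", "size": "M", "material": "mesh"},
    {"name": "D", "color": "blue", "size": "L", "material": "mesh"},
]
FACETS = ("color", "size", "material")


def narrow(items, selections):
    """Keep the items that match every selected facet value."""
    return [
        item for item in items
        if all(item[facet] == value for facet, value in selections.items())
    ]


def facet_counts(items, facets=FACETS):
    """Count the remaining values of each facet, for display beside results."""
    return {facet: Counter(item[facet] for item in items) for facet in facets}


# Successive choices narrow the result set; the order of choices is irrelevant.
red = narrow(ITEMS, {"color": "red"})
red_medium = narrow(red, {"size": "M"})
medium_red = narrow(narrow(ITEMS, {"size": "M"}), {"color": "red"})
```

Because the facets are independent, `red_medium` and `medium_red` hold the same items. A conditional or overlapping dimension would break that guarantee, which is the practical difference the list above describes.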

In terms of application to various kinds of business needs and user experiences, facets can offer great power and utility for quickly identifying and manipulating large numbers of similar or symmetrical items, typically in narrower domains. Endeca’s guided navigation is well suited to broader domains (though there is still a single root at the base of the tree), with fuzzier structures than facets.

Operatively, facets often don’t serve well as a unifying solution to the need for providing structure and access to heterogeneous collections, and can encounter scaling difficulties when used for homogeneous collections. Faceted experiences can offer genuine bidirectional navigation for users: because of the symmetry built in to faceted systems, they work as well for navigation paths that expand item sets from a single item to larger collections of similar items as for paths that narrow a collection down to a single item.

Guided navigation is better able to handle heterogeneous collections, but is not as precise for identification, does not reflect structure, and requires attention to define correctly (in ways that are not confusing or conflicting) and to manage over time. Because of their structural differences, Endeca’s dimensions do not offer bidirectional navigation by default, though it is possible to create user experiences that support bidirectional navigation using Endeca.

In sum, these differences should help explain two things: the popularity of Endeca in ecommerce contexts, where every architectural incentive for increasing the total value of customer purchases is significant (even those that may not align with user goals), and the relevance of facets to searching and information retrieval experiences that support a broader set of user goals within narrower information domains.

Comment » | Enterprise, Information Architecture, User Experience (UX)

10 Best Practices For Displaying Tag Clouds

February 25th, 2007 — 12:00am

This is a short list of best practices for rendering and displaying tag clouds that I originally circulated on the IXDG mailing list, and am now posting in response to several requests. These best practices are not in order of priority – they’re simply enumerated.

  1. Use a single color for the tags in the rendered cloud: this will allow visitors to identify finer distinctions in the size differences. Employ more than one color with discretion. If using more than one color, offer the capability to switch between single color and multiple color views of the cloud.
  2. Use a single sans serif font family: this will improve the overall readability of the rendered cloud.
  3. If accurate comparison of relative weight (seeing the size differences amongst tags) is more important than overall readability, use a monospace font.
  4. If comprehension of tags and understanding the meaning is more important, use a variably spaced font that is easy to read.
  5. Use consistent and proportional spacing to separate the tags in the rendered tag cloud. Proportional means that the spacing between tags varies based on their size; typically more space is used for larger sizes. Consistent means that for each tag of a certain size, the spacing remains the same. In HTML, spacing is often determined by setting style parameters like padding or margins for the individual tags.
  6. Avoid separator characters between tags: they can be confused for small tags.
  7. Carefully consider rendering in Flash, or another vector-based method, if your users will experience the cloud largely through older browsers / agents: the font rendering in older browsers is not always good or consistent, but it is important that the cloud offer text that is readily digestible by search and indexing engines, both locally and publicly.
  8. If rendering the cloud in HTML, set the font size of rendered tags using whole percentages, rather than pixel sizes or decimals: this gives the display agent more freedom to adjust its final rendering.
  9. Do not insert line breaks: this allows the rendering agent to adjust the placement of line breaks to suit the rendering context.
  10. Offer the ability to change the order between at least two options: alphabetical, and one variable dimension (overall weight, frequency, recency, etc.).
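
Several of these practices (whole-percentage font sizes, proportional spacing, no separators, no forced line breaks, switchable ordering) can be sketched in a few lines of Python that emit HTML. This is one possible rendering under stated assumptions, not a reference implementation: the function names, the 80% to 240% size range, the linear scaling, and the 0.4em padding are all hypothetical choices.

```python
from html import escape


def font_size_percent(count, min_count, max_count, min_pct=80, max_pct=240):
    """Map a tag count onto a whole-number percentage font size (linear scale)."""
    if max_count == min_count:
        return (min_pct + max_pct) // 2
    scale = (count - min_count) / (max_count - min_count)
    return round(min_pct + scale * (max_pct - min_pct))


def render_cloud(tag_counts, order="alpha"):
    """Render a {tag: count} mapping as a single-line HTML tag cloud."""
    if not tag_counts:
        return '<div class="tag-cloud"></div>'
    if order == "alpha":
        tags = sorted(tag_counts.items(), key=lambda kv: kv[0].lower())
    elif order == "weight":
        tags = sorted(tag_counts.items(), key=lambda kv: (-kv[1], kv[0].lower()))
    else:
        raise ValueError("order must be 'alpha' or 'weight'")
    lo, hi = min(tag_counts.values()), max(tag_counts.values())
    spans = []
    for tag, count in tags:
        pct = font_size_percent(count, lo, hi)
        # Padding in em units scales with each tag's own font size, so the
        # spacing is proportional across sizes and consistent within a size.
        spans.append(
            f'<span style="font-size: {pct}%; padding: 0 0.4em;">{escape(tag)}</span>'
        )
    # Plain spaces only: no separator characters and no forced line breaks,
    # so the browser decides where lines wrap.
    return (
        '<div class="tag-cloud" style="font-family: sans-serif;">'
        + " ".join(spans)
        + "</div>"
    )


cloud_alpha = render_cloud({"font": 10, "color": 4, "size": 1})
cloud_weight = render_cloud({"font": 10, "color": 4, "size": 1}, order="weight")
```

Offering both `order="alpha"` and `order="weight"` covers the minimum two orderings suggested in item 10; recency or frequency would need extra data per tag.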

For fun, I’ve run these 10 best practices through Tagcrowd. The major concepts show up well – font, color, and size are prominent – but obviously the specifics of the things discussed remain opaque.

Best Practices For Display as a Text Cloud
best_practices_textcloud.jpg

Comment » | Tag Clouds

Smart Scoping For Content Management: Use The Content Scope Cycle

February 19th, 2007 — 12:00am

Content management efforts are justly infamous for exceeding budgets and timelines, despite making considerable accomplishments. Exaggerated expectations for tool capabilities (vendors promise a world of automagic simplicity, but don’t believe the hype) and the potential value of cost and efficiency improvements from managing content creation and distribution play a substantial part in this. But unrealistic estimates of the scope of the content to be managed make a more important contribution to most cost and time overruns.

Scope in this sense is a combination of the quantity and the quality of content; smaller amounts of very complex content substantially increase the overall scope of needs a CM solution must manage effectively. By analogy, imagine building an assembly line for toy cars, then deciding it has to handle the assembly of just a few full size automobiles at the same time.

Early and inaccurate estimates of content scope have a cascading effect, decreasing the accuracy of budgets, timelines, and resource forecasts for all the activities that follow.

In a typical content management engagement, the activities affected include:

  • taking a content inventory
  • defining content models
  • choosing a new content management system
  • designing content structures, workflows, and metadata
  • migrating content from one system to another
  • refreshing and updating content
  • establishing sound governance mechanisms

The Root of the Problem
Two misconceptions (and two common but unhealthy practices, discussed below) drive most content scope estimates. First: the scope of content is knowable in advance. Second, and more misleading: scope remains fixed once defined. Neither of these assumptions is valid: identifying the scope of content with accuracy is unlikely without a comprehensive audit, and content scope (initial, revised, actual) changes considerably over the course of the CM effort.

Together, these assumptions make it very difficult for program directors, project managers, and business sponsors to set accurate and detailed budget and timeline expectations. The uncertain or shifting scope of most CM efforts conflicts directly with business imperatives to carefully manage IT capital investment and spending, a necessity in most funding processes, and especially at the enterprise level. Instead of estimating specific numbers long in advance of reality (as with the Iraq war budget), a better approach is to embrace fluidity, and plan to refine scope estimates at punctuated intervals, according to the natural cycle of content scope change.

Understanding the Content Scope Cycle
Content scope changes according to a predictable cycle that is largely independent of the specifics of a project, system, organizational setting, and scale. This cycle seems consistent at the level of local CM efforts for a single business unit or isolated process, and at the level of enterprise scale content management efforts. Understanding the cycle makes it possible to prepare for shifts in a qualitative sense, accounting for the kind of variation to expect while planning and setting expectations with stakeholders, solution users, sponsors, and consumers of the managed content.

The Content Scope Cycle
cm_scope_cycle.png

The high peak and elevated mountain valley shape in this illustration tell the story of scope changes through the course of most content management efforts. From the initial inaccurate estimate, scope climbs consistently and steeply during the discovery phase, peaking in potential after all discovery activities conclude. Scope then declines quickly, but not to the original level, as assessments cull unneeded content. Scope levels out during system / solution / infrastructure creation, and climbs modestly during revision and replacement activities. At this point, the actual scope is known. Measured increases driven by the incorporation of supplemental material then increase scope in stages.

Local and Enterprise Cycles

Applying the context-independent view of the cycle to a local level reveals a close match with the activities and milestones for a content management effort for a small body of content, a single business unit of a larger organization, or a self-contained business process.

Local Content Management Scope Cycle
cm_scope_local.png

At the enterprise level, the cycle is the same. This illustration shows activities and milestones for a content management effort for a large and diverse body of content, multiple business units of a larger organization, or multiple and interconnected business processes.

Enterprise Content Management Scope Cycle
cm_enterprise_cycle.png

Scope Cycle Changes
cm_scope_changes.png

This graph shows the amount of scope change at each milestone, versus its predecessor. Looking at the changes for any patterns of clustering and frequency, it’s easy to see the cycle breaks down into three major phases: an initial period of dynamic instability, a static and stable phase, and a concluding (and ongoing, if the effort is successful) phase of dynamic stability.
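
The change-versus-predecessor calculation behind a graph like this is simple enough to sketch. The milestone names and scope values below are invented for illustration (arbitrary units, shaped to follow the cycle described above); they are not the data behind the chart, and `scope_changes` is a hypothetical helper.

```python
# Hypothetical scope readings in arbitrary units. These numbers are made up
# to mimic the shape of the cycle; they are not measurements from a project.
MILESTONES = [
    ("initial estimate", 100),
    ("discovery complete", 260),
    ("assessment complete", 150),
    ("solution built", 150),
    ("revision and replacement", 165),
    ("supplemental material, stage 1", 180),
    ("supplemental material, stage 2", 195),
]


def scope_changes(milestones):
    """Return (milestone, change versus predecessor) for every milestone after the first."""
    return [
        (name, scope - previous_scope)
        for (_, previous_scope), (name, scope) in zip(milestones, milestones[1:])
    ]


changes = scope_changes(MILESTONES)
```

With these made-up values, the large swings come first, a flat reading follows, and small steady increases close the series, which is the clustering the three phases refer to.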

Scope Cycle Phases
cm_scope_phases.png

Where does the extra scope come from? In other words, what’s the source of the unexpected quantity and complexity of content behind the spikes and drops in expected scope in the first two phases? And what drives the shifts from one phase to another?

Bad CM Habits

Two common approaches account for a majority of the dramatic shifts in content scope. Most significantly, those people with immediate knowledge of the content quantity and complexity rarely have direct voice in setting the scope and timeline expectations.

Too often, stakeholders with expertise in other areas (IT, enterprise architecture, application development) frame the problem and the solution far in advance. The content creators, publishers, distributors, and consumers are not involved early enough.

Secondly, those who frame the problem make assumptions about quantity and complexity that trend low. (This is the companion to the exaggeration of tool capabilities.) Each new business unit, content owner, and system administrator’s items included in the effort will increase the scope of the content in quantity, complexity, or both. Ongoing identification of new or unknown types of content, workflows, business rules, usage contexts, storage modes, applications, formats, syndication instances, systems, and repositories will continue to increase the scope until all relevant parties (creators, consumers, administrators, etc.) are engaged, and their needs and content collections fully understood.

The result is clear: a series of substantial scope errors, both underestimates and overestimates in comparison to the actual scope, concentrated in the first phase of the scope cycle.

Scope Errors
cm_scope_error.png

Smart Scoping
The scope cycle seems to be a fundamental pattern; likely an emergent aspect of the environments and systems underlying it, but that’s another discussion entirely. Failing to allow for the natural changes in scope over the course of a content management effort ties your success to inaccurate estimates, and thus to false expectations.

Smart scoping means allowing for and anticipating the inherent margins of error when setting expectations and making estimates. The most straightforward way to put this into practice and account for the likely margins of error is to adjust the timing of a scope estimate to the necessary level of accuracy.

Relative Scope Estimate Accuracy
cm_estimate_accuracy.png

Scoping and Budgeting
Estimation practices that respond to the content scope cycle can still satisfy business needs. At the enterprise CM level, IT spending plans and investment frameworks (often part of enterprise architecture planning processes) should allow for natural cycles by defining classes or kinds of estimates based on comparative degree of accuracy, and the estimator’s leeway for meeting or exceeding implied commitments. Enterprise frameworks will identify when more or less accurate estimates are needed to move through funding and approval gateways, based on each organization’s investment practices.

And at the local CM level, project planning and resource forecasting methods should allow for incremental allocation of resources to meet task and activity needs. Taking a content inventory is a substantial labor on its own, for example. The same is true of migrating a body of content from one or more sources to a new CM solution that incorporates changed content structures such as workflows and information architectures. The architectural, technical, and organizational capabilities and staff needed for inventorying and migrating content can often be met by relying on content owners and stakeholders, or hiring contractors for short and medium-term assistance.

Parallels To CM Spending Patterns
The content scope cycle strongly parallels the spending patterns during CMS implementation that James Robertson identified in June of 2005. I think the scope cycle correlates with the spending pattern James found, and it may even be a driving factor.

Scoping and Maturity

Unrealistic scope estimation that does not take the content scope cycle into account is typical of organizations undertaking a first content management effort. It is also common in organizations with content management experience, but low levels of content management maturity.

Two (informal) surveys of CMS practitioners spanning the past three years show the prevalence of scoping problems. In 2004, Victor Lombardi reported: “Of all tasks in a content management project, the creation, editing, and migration of content are probably the most frequently underestimated on the project plan.” [in Managing the Complexity of Content Management].

And two weeks ago, Rita Warren of CMSWire shared the results of a recent survey on challenges in content management (Things That Go Bump In Your CMS).

The top 5 challenges (most often ranked #1) were:

  1. Clarifying business goals
  2. Gaining and maintaining executive support
  3. Redesigning/optimizing business processes
  4. Gaining consensus among stakeholders
  5. Properly scoping the project

…“Properly scoping the project” was actually the most popular answer, showing up in the top 5 most often.

Accurate scoping is much easier for organizations with high levels of content management maturity. As the error margins inherent in early and inaccurate scope estimates demonstrate, there is considerable benefit in creating mechanisms and tools for effectively understanding the quantity and quality of content requiring management, as well as the larger business context, solution governance, and organizational culture concerns.

Comment » | Enterprise, Ideas, Information Architecture

PEW Report Shows 28% Of Internet Users Have Tagged

February 1st, 2007 — 12:00am

The Pew Internet & American Life Project just released a report on tagging that finds 28% of internet users have tagged or categorized content online such as photos, news stories or blog posts. On a typical day online, 7% of internet users say they tag or categorize online content.

The authors note “This is the first time the Project has asked about tagging, so it is not clear exactly how fast the trend is growing.”

Wow – I’d say it’s growing extremely quickly. Though I am on record as a believer in the bright future of tag clouds, I admit I’m surprised by these results. The fact that 7% of internet users tag daily is what’s most significant: it’s an indication of very rapid adoption for the practice of tagging in many different contexts and many different kinds of audiences, given its brief history.

I’d guess this adoption rate compares to the rates of adoption for other new network-dependent or emergent architectures like P2P music sharing or on-line music buying.

You’re correct if you’re thinking there is a difference between tagging and tag clouds. And if you’ve read the report and the accompanying interview with Dr. Weinberger, you’ve likely realized that neither Dr. Weinberger’s interview nor the report specifically addresses tag cloud usage. But remember the First Principle of Tag Clouds: “Where there’s tags, there’s a tag cloud.” By definition, any item with an associated collection of tags has a tag cloud, regardless of whether that tag cloud is directly visible and actionable in the user experience. So that 7% of internet users who tag daily are by default creating and working with tag clouds daily.

It might be time for tag clouds to look into getting some sunglasses.

Comment » | Tag Clouds
