September 15th, 2005 — 12:00am
JP Morgenthal of DMReview.com offers a snapshot of the process for defining enterprise semantics in Enterprise Architecture: The Holistic View: The Role of Semantics in Business.
Morgenthal says, “When you understand the terms that your business uses to conduct business and you understand how those terms impact your business, you can see clearly how to support and maintain the processes that use those terms with minimal effort.”
No surprise there — the harder questions are how to make it happen, and how to explain it to the business.
Comment » | Architecture, Information Architecture
September 14th, 2005 — 12:00am
In the same way that information architecture takes users’ understandings of the structure, meaning, and organization of information into account at the level of domain-specific user experiences, information spaces, and systems, the complex semantic boundaries and relationships that define and link enterprise-level domains are a natural area of activity for enterprise information architecture.
Looking for some technically oriented materials related to this level of IA – what I call enterprise semantic frameworks – I came across a solid article titled Enterprise Semantics: Aligning Service-Oriented Architecture with the Business in the Web Services Journal.
The authors – Joram Borenstein and Joshua Fox – take a web-services perspective on the business benefits of enterprise-level semantic efforts, but they do a good job of laying out the case for the importance of semantic concepts, understanding, and alignment at the enterprise level.
From the article abstract:
“Enterprises need transparency, a clear view of what is happening in the organization. They also need agility, which is the ability to respond quickly to changes in the internal and external environments. Finally, organizations require integration: the smooth interoperation of applications across organizational boundaries. Encoding business concepts in a formal semantic model helps to achieve these goals and also results in additional corollary benefits. This semantic model serves as a focal point and enables automated discovery and transformation services in an organization.”
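The “formal semantic model” the authors describe can be pictured as a small set of subject–predicate–object assertions that discovery services query. A minimal sketch in plain Python (all class and property names here are hypothetical, invented for illustration):

```python
# A toy semantic model as subject-predicate-object triples.
# "Customer", "placesOrder", etc. are illustrative names only.
model = {
    ("PremiumCustomer", "subClassOf", "Customer"),
    ("Customer", "subClassOf", "Party"),
    ("placesOrder", "domain", "Customer"),
    ("placesOrder", "range", "Order"),
}

def discover(predicate, obj):
    """Automated discovery: find every subject related to obj by predicate."""
    return sorted(s for (s, p, o) in model if p == predicate and o == obj)

print(discover("subClassOf", "Customer"))  # -> ['PremiumCustomer']
print(discover("domain", "Customer"))      # -> ['placesOrder']
```

The point of the article is that once concepts are encoded this way, questions like “which services operate on customers?” become mechanical lookups rather than tribal knowledge.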
They also offer some references at the conclusion of the article:
- Borenstein, J. and Fox, J. (2003). “Semantic Discovery for Web Services.” Web Services Journal. SYS-CON Publications, Inc. Vol. 3, issue 4. www.sys-con.com/webservices/articleprint.cfm?id=507
- Cowles, P. (2005). “Web Service API and the Semantic Web.” Web Services Journal. SYS-CON Publications, Inc. Vol. 5, issue 2. www.sys-con.com/story/?storyid=39631&DE=1
- Genovese, Y., Hayword, S., and Comport, J. (2004). “SOA Will Demand Re-engineering of Business Applications.” Gartner. October 8.
- Linthicum, D. (2005). “When Building Your SOA…Service Descriptions Are Key.” WebServices.Org. March 2005. www.webservices.org/ws/content/view/full/56944
- Schulte, R.W., Valdes, R., and Andrews, W. (2004). “SOA and Web Services Offer Little Vendor Independence.” Gartner. April 8.
- W3C Web Services Architecture Working Group: www.w3.org/2002/ws/arch/
Comment » | Architecture, Information Architecture, Modeling
February 8th, 2005 — 12:00am
How Much Information? 2003 is an update to a project first undertaken by researchers at the School of Information Management and Systems at UC Berkeley in 2000. Their intent was to study information storage and flows across print, film, magnetic, and optical media.
It’s not surprising that the United States produces more information than any other single country, but it was eye-opening to read that about 40% of the new stored information in the world every year comes from the U.S.
Also surprising is the total amount of instant message traffic in 2002, estimated at 274 terabytes, and the fact that email is now the second largest information flow, behind the telephone.
Some excerpts from the executive summary:
“Print, film, magnetic, and optical storage media produced about 5 exabytes of new information in 2002. Ninety-two percent of the new information was stored on magnetic media, mostly in hard disks.”
“How big is five exabytes? If digitized with full formatting, the seventeen million books in the Library of Congress contain about 136 terabytes of information; five exabytes of information is equivalent in size to the information contained in 37,000 new libraries the size of the Library of Congress book collections.”
“Hard disks store most new information. Ninety-two percent of new information is stored on magnetic media, primarily hard disks. Film represents 7% of the total, paper 0.01%, and optical media 0.002%.”
“The United States produces about 40% of the world’s new stored information, including 33% of the world’s new printed information, 30% of the world’s new film titles, 40% of the world’s information stored on optical media, and about 50% of the information stored on magnetic media.”
“How much new information per person? According to the Population Reference Bureau, the world population is 6.3 billion, thus almost 800 MB of recorded information is produced per person each year. It would take about 30 feet of books to store the equivalent of 800 MB of information on paper.”
“Most radio and TV broadcast content is not new information. About 70 million hours (3,500 terabytes) of the 320 million hours of radio broadcasting is original programming. TV worldwide produces about 31 million hours of original programming (70,000 terabytes) out of 123 million total hours of broadcasting.”
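The report’s headline numbers are easy to sanity-check. A quick back-of-the-envelope in Python, assuming the decimal (power-of-ten) definitions of exabyte and terabyte that the report appears to use:

```python
# Back-of-the-envelope checks on the report's figures (decimal units assumed).
EB = 10**18  # exabyte
TB = 10**12  # terabyte
MB = 10**6   # megabyte

new_info = 5 * EB          # new information produced in 2002
loc = 136 * TB             # Library of Congress book collection, digitized
world_pop = 6.3 * 10**9    # Population Reference Bureau world estimate

print(round(new_info / loc))             # ~36,765 Library of Congress equivalents
print(round(new_info / world_pop / MB))  # ~794 MB per person, i.e. "almost 800 MB"
```

Both figures line up with the summary’s “37,000 new libraries” and “almost 800 MB per person” claims.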
Comment » | The Media Environment
February 7th, 2005 — 12:00am
In the latest issue of ACMQueue, Tim Bray is interviewed about his career path and early involvement with the SGML and XML standards. In recounting it, Bray makes four points about the slow pace of RDF adoption, and reiterates his conviction that the current quality of RDF-based tools is an obstacle both to their adoption and to the success of the Semantic Web.
Here are Bray’s points, with some commentary based on recent experiences with RDF- and OWL-based ontology-management tools.
1. Motivating people to provide metadata is difficult. Bray says, “If there’s one thing we’ve learned, it’s that there’s no such thing as cheap meta-data.”
This is plainly a problem in spaces well beyond RDF. I hold the concept and the label meta-data itself partly responsible, since the term explicitly separates the descriptive/referential information from the data itself. I wager that user adoption of meta-data tools and processes will increase as soon as we stop dissociating a complete package into two distinct things with different implied levels of effort and value. I’m not sure what to call the unified construct that combines meta-data and source data (an asset, maybe?), but treating meta-data as an optional or supplemental element means that the time and effort demanded by accurate, comprehensive tagging seem onerous to many users and businesses. Thus the proliferation of automated taxonomy and categorization tools…
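One way to make that unification concrete is to model the asset as a single unit that carries both the source data and its descriptive metadata, so neither can exist without the other. A hypothetical sketch (the class and field names are invented for illustration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Asset:
    """A single unit pairing source data with its descriptive metadata."""
    content: bytes    # the source data itself
    title: str        # descriptive metadata, required at creation time
    tags: tuple = ()  # subject tags travel with the content

    def __post_init__(self):
        # The unit is invalid without metadata: no "tag it later" state exists.
        if not self.title:
            raise ValueError("an Asset cannot exist without metadata")

doc = Asset(content=b"...", title="Q3 sales report", tags=("sales", "finance"))
print(doc.title)  # metadata is never separable from the data
```

The design choice is simply that description is a constructor requirement rather than an afterthought, which is the shift in framing the paragraph above argues for.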
2. Inference-based processing is ineffective. Bray says, “Inferring meta-data doesn’t work… Inferring meta-data by natural language processing has always been expensive and flaky with a poor return on investment.”
I think this is too broad to agree with without qualification. That said, I have seen analyses of a number of inferencing systems, and they tend to be slow, especially when processing and updating large RDF graphs. I’m not a systems architect or an engineer, but it does seem that none of the solutions now available directly supports rapid, real-time inferencing. This is a problem for structures that change frequently, or during high-intensity periods of the ontology life cycle, such as initial build and editorial review.
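The performance point is easy to illustrate: even the simplest RDFS-style entailment, the transitivity of subClassOf, is a fixed-point computation over the whole graph, so every edit can trigger a full re-derivation. A toy forward-chainer in plain Python (no real RDF library is used here; the class names are invented):

```python
def infer_subclass_closure(triples):
    """Forward-chain subClassOf transitivity to a fixed point.

    Each pass rescans every triple derived so far, which is why naive
    inferencing degrades as graphs grow or change frequently.
    """
    derived = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, p1, b) in list(derived):
            if p1 != "subClassOf":
                continue
            for (b2, p2, c) in list(derived):
                if p2 == "subClassOf" and b2 == b:
                    new = (a, "subClassOf", c)
                    if new not in derived:
                        derived.add(new)
                        changed = True
    return derived

graph = {("Sedan", "subClassOf", "Car"), ("Car", "subClassOf", "Vehicle")}
closure = infer_subclass_closure(graph)
print(("Sedan", "subClassOf", "Vehicle") in closure)  # True
```

Production reasoners use far smarter strategies than this nested rescan, but the underlying cost model — derived facts invalidated by each update — is the same pressure point during initial build and editorial review.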
3. Bray says, “To this day, I remain fairly unconvinced of the core Semantic Web proposition. I own the domain name RDF.net. I’ve offered the world the RDF.net challenge, which is that for anybody who can build an actual RDF-based application that I want to use more than once or twice a week, I’ll give them RDF.net. I announced that in May 2003, and nothing has come close.”
Again, I think this needs some clarification, but it brings out a serious potential barrier to the success of RDF and the Semantic Web: the poor quality of existing tools directly depresses user satisfaction. I’ve heard this from users working with both commercial and home-built semantic structure management tools, and at all levels of usage from core to occasional.
To this I would add that RDF was meant for interpretation by machines, not people, and as a consequence the basic user-experience paradigms for displaying and manipulating large RDF graphs and other semantic constructs remain unresolved. Mozilla and Netscape did wonders to make the WWW apparent in a visceral and tangible fashion; I suspect RDF may need the same to really take off and enter the realm of the less-than-abstruse.
4. RDF was not intended to be a Knowledge Representation language. Bray says, “My original version of RDF was as a general-purpose meta-data interchange facility. I hadn’t seen that it was going to be the basis for a general-purpose KR version of the world.”
This sounds like a warning, or at least a strong admonition against reaching too far. OWL and its variants are relatively new, so it’s too early to tell whether Bray is right that the scope and ambition of the Semantic Web effort are too great. But it does point out that the context in which a standard was created bears heavily on what it can achieve in practice. If RDF was never meant to bear its current load, it’s no surprise that an effective suite of RDF tools remains unavailable.
Comment » | Semantic Web, Tools