Public RDF Data Sets at rdfdata.org
rdfdata.org offers a great collection of RDF data sets and services that generate RDF.
In the latest issue of ACMQueue, Tim Bray is interviewed about his career path and early involvement with the SGML and XML standards. While recounting, Bray makes four points about the slow pace of adoption for RDF, and reiterates his conviction that the current quality of RDF-based tools is an obstacle to their adoption and the success of the Semantic Web.
Here are Bray’s points, with some commentary based on recent experiences with RDF- and OWL-based ontology management tools.
1. Motivating people to provide metadata is difficult. Bray says, “If there’s one thing we’ve learned, it’s that there’s no such thing as cheap meta-data.”
This is plainly a problem well beyond RDF. I hold the concept and the label meta-data itself partly responsible, since the term explicitly separates the descriptive/referential information from the data itself. I wager that user adoption of meta-data tools and processes will increase as soon as we stop splitting a complete package into two distinct things, with different implied levels of effort and value. I’m not sure what a unified label for the base-level unit made of meta-data and source data would be (an asset, maybe?), but the implied devaluation of meta-data as an optional or supplemental element means that the time and effort demands of accurate and comprehensive tagging seem onerous to many users and businesses. Thus the proliferation of automated taxonomy and categorization generation tools…
2. Inference-based processing is ineffective. Bray says, “Inferring meta-data doesn’t work… Inferring meta-data by natural language processing has always been expensive and flaky with a poor return on investment.”
I think this isn’t specific enough to agree with without qualification. However, I have seen analyses of a number of inferencing systems, and they tend to be slow, especially when processing and updating large RDF graphs. I’m not a systems architect or an engineer, but it does seem that none of the solutions now available directly solves the problem of rapid, real-time inferencing. This is an issue with structures that change frequently, or during high-intensity periods of the ontology life-cycle, such as initial build and editorial review.
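To make the cost a little more concrete, here is a toy sketch of naive forward-chaining inference over a handful of triples. It is not how any particular product works; the `ex:` names are invented for illustration, and the only rules applied are the two standard RDFS ones for subclass transitivity and type propagation. The point is that every pass rescans the whole graph, so a naive engine like this one has to redo that work whenever the structure changes.

```python
# Toy forward-chaining reasoner over (subject, predicate, object) triples.
# Rules applied (RDFS names): rdfs11, subClassOf is transitive;
# rdfs9, an instance of a subclass is also an instance of the superclass.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Return the closure of `triples` under the two rules above."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            if p not in (SUBCLASS, TYPE):
                continue
            # Join each subClassOf/type triple against every subClassOf triple.
            for s2, p2, o2 in closure:
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, p, o2))
        if new <= closure:
            # Fixed point reached: a full pass produced nothing new.
            return closure
        closure |= new


# Hypothetical example graph; the ex: terms are made up.
graph = {
    ("ex:Poodle", SUBCLASS, "ex:Dog"),
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", TYPE, "ex:Poodle"),
}
closed = infer(graph)
```

Four asserted triples become ten once the implied ones are added, and each pass is quadratic in the size of the graph. Real engines use indexes and incremental techniques to avoid this, but the sketch shows why frequent edits during build and review are the painful case.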
3. Bray says, “To this day, I remain fairly unconvinced of the core Semantic Web proposition. I own the domain name RDF.net. I’ve offered the world the RDF.net challenge, which is that for anybody who can build an actual RDF-based application that I want to use more than once or twice a week, I’ll give them RDF.net. I announced that in May 2003, and nothing has come close.”
Again, I think this needs some clarification, but it brings out a serious potential barrier to the success of RDF and the Semantic Web by showcasing the poor quality of existing tools as a direct negative influencer on user satisfaction. I’ve heard this from users working with both commercial and home-built semantic structure management tools, and at all levels of usage from core to occasional.
To this I would add the idea that RDF was meant for interpretation by machines not people, and as a consequence the basic user experience paradigms for displaying and manipulating large RDF graphs and other semantic constructs remain unresolved. Mozilla and Netscape did wonders to make the WWW apparent in a visceral and tangible fashion; I suspect RDF may need the same to really take off and enter the realm of the less-than-abstruse.
4. RDF was not intended to be a Knowledge Representation language. Bray says, “My original version of RDF was as a general-purpose meta-data interchange facility. I hadn’t seen that it was going to be the basis for a general-purpose KR version of the world.”
This sounds a bit like a warning, or at least a strong admonition against reaching too far. OWL and variants are new (relatively), so it’s too early to tell if Bray is right about the scope and ambition of the Semantic Web effort being too great. But it does point out that the context of the standard bears heavily on its eventual functional achievement when put into effect. If RDF was never meant to bear its current load, then it’s not a surprise that an effective suite of RDF tools remains unavailable.
Here are some snippets from an article in the Web Services Journal that nicely explains some of the business benefits of a services-based architecture that uses ontologies to integrate disparate applications and knowledge spaces.
Note that XML / RDF / OWL (all from the W3C) together make up only part of the story on new tools that make it easy for systems (and users, and businesses…) to understand and work with complicated information spaces and relationships. There are also Topic Maps, which do a very good job of visually mapping relationships that people and systems can understand.
Article: Semantic Mapping, Ontologies, and XML Standards
The key to managing complexity in application integration projects
Snippets:
Another important notion of ontologies is entity correspondence. Ontologies that are leveraged in more of a B2B environment must leverage data that is scattered across very different information systems, and information that resides in many separate domains. Ontologies in this scenario provide a great deal of value because we can join information together, such as product information mapped to on-time delivery history mapped to customer complaints and compliments. This establishes entity correspondence.
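A rough sketch of what entity correspondence could look like in practice, under my own assumptions: three systems that each identify the same product by a different local key, plus a correspondence table that ties those keys to one shared identifier. The system names, field names, and records below are all hypothetical and do not come from the article.

```python
from collections import defaultdict

# Hypothetical records from three separate systems; each uses its own key.
catalog = [{"sku": "A-100", "name": "Widget"}]
deliveries = [
    {"item_no": "100", "on_time": True},
    {"item_no": "100", "on_time": False},
]
feedback = [{"product": "widget-std", "kind": "complaint"}]

# Correspondence table: (system, local identifier) -> shared entity identifier.
CORRESPONDENCE = {
    ("catalog", "A-100"): "product:widget",
    ("logistics", "100"): "product:widget",
    ("crm", "widget-std"): "product:widget",
}


def build_entity_view():
    """Group facts from all three systems under the shared entity identifier."""
    view = defaultdict(lambda: {"name": None, "deliveries": [], "feedback": []})
    for row in catalog:
        entity = CORRESPONDENCE[("catalog", row["sku"])]
        view[entity]["name"] = row["name"]
    for row in deliveries:
        entity = CORRESPONDENCE[("logistics", row["item_no"])]
        view[entity]["deliveries"].append(row["on_time"])
    for row in feedback:
        entity = CORRESPONDENCE[("crm", row["product"])]
        view[entity]["feedback"].append(row["kind"])
    return dict(view)


entity_view = build_entity_view()
widget = entity_view["product:widget"]
# Share of deliveries that arrived on time, now readable next to the feedback.
on_time_rate = sum(widget["deliveries"]) / len(widget["deliveries"])
```

Once the keys are reconciled, product information, delivery history, and complaints sit side by side under one identifier, which is the joining the snippet describes.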
So, how do you implement ontologies in your application integration problem domain? In essence, some technology – either an integration broker or applications server, for instance – needs to act as an ontology server and/or mapping server.
An ontology server houses the ontologies that are created to service the application integration problem domain. There are three types of ontologies stored: shared, resource, and application. Shared ontologies are made up of definitions of general terms that are common across and between enterprises. Resource ontologies are made up of definitions of terms used by a specific resource. Application ontologies are native to particular applications, such as an inventory application. Mapping servers store the mappings between ontologies (stored in the ontology server). The mapping server also stores conversion functions, which account for the differences between schemas native to remote source and target systems. Mappings are specified using a declarative syntax that provides reuse.
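The division of labor between the two servers can be sketched roughly as follows. This is only an illustration under my own assumptions: the class names, method names, and the inventory example are hypothetical, and the article does not prescribe any of them. The mapping itself is written as plain data (source term, target term, conversion function), which is one simple way to get the declarative, reusable quality the snippet mentions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple

ONTOLOGY_KINDS = {"shared", "resource", "application"}

# A term mapping: source term -> (target term, conversion function).
TermMap = Dict[str, Tuple[str, Callable[[Any], Any]]]


@dataclass
class OntologyServer:
    """Stores term definitions per ontology, tagged with one of three kinds."""
    ontologies: Dict[str, Tuple[str, Dict[str, str]]] = field(default_factory=dict)

    def register(self, name: str, kind: str, terms: Dict[str, str]) -> None:
        if kind not in ONTOLOGY_KINDS:
            raise ValueError(f"unknown ontology kind: {kind}")
        self.ontologies[name] = (kind, terms)

    def terms(self, name: str) -> Dict[str, str]:
        return self.ontologies[name][1]


@dataclass
class MappingServer:
    """Stores mappings between ontologies held by the ontology server."""
    ontology_server: OntologyServer
    mappings: Dict[Tuple[str, str], TermMap] = field(default_factory=dict)

    def add_mapping(self, source: str, target: str, term_map: TermMap) -> None:
        # Reject mappings that mention terms neither ontology defines.
        for src_term, (dst_term, _convert) in term_map.items():
            if src_term not in self.ontology_server.terms(source):
                raise KeyError(src_term)
            if dst_term not in self.ontology_server.terms(target):
                raise KeyError(dst_term)
        self.mappings[(source, target)] = term_map

    def translate(self, source: str, target: str, record: Dict[str, Any]) -> Dict[str, Any]:
        """Rename mapped fields and apply their conversion functions."""
        term_map = self.mappings[(source, target)]
        return {
            term_map[key][0]: term_map[key][1](value)
            for key, value in record.items()
            if key in term_map
        }


# Hypothetical setup: one shared ontology and one application ontology.
ontologies = OntologyServer()
ontologies.register("shared", "shared",
                    {"product_id": "Product identifier", "weight_kg": "Weight in kilograms"})
ontologies.register("inventory_app", "application",
                    {"sku": "Stock keeping unit", "weight_lb": "Weight in pounds"})

mapper = MappingServer(ontologies)
mapper.add_mapping("inventory_app", "shared", {
    "sku": ("product_id", str),
    "weight_lb": ("weight_kg", lambda pounds: pounds * 0.45359237),
})
translated = mapper.translate("inventory_app", "shared", {"sku": "A-100", "weight_lb": 10})
```

The conversion function on `weight_lb` plays the role the snippet assigns to conversion functions: it absorbs a difference between the source and target schemas (here, units) so that neither application has to know about the other.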
RDF uses XML to define a foundation for processing metadata and to provide a standard metadata infrastructure for both the Web and the enterprise. The difference between the two is that XML is used to transport data using a common format, while RDF is layered on top of XML defining a broad category of data. When the XML data is declared to be of the RDF format, applications are then able to understand the data without understanding who sent it.
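As a small illustration of what "declared to be of the RDF format" buys you, here is a minimal sketch that reads a tiny RDF/XML document with Python's standard library and pulls out subject, predicate, object statements. It handles only the simplest case (an `rdf:Description` with an `rdf:about` attribute and literal-valued child elements), not the full RDF/XML syntax. The `example.org` resource and its values are made up; the RDF and Dublin Core namespace URIs are the real ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical document; only the namespace URIs are real.
DOCUMENT = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/report">
    <dc:title>Quarterly report</dc:title>
    <dc:creator>Example Author</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, literal) triples from simple RDF/XML."""
    root = ET.fromstring(rdf_xml.strip())
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local";
            # dropping the braces yields the full predicate URI.
            predicate = prop.tag[1:].replace("}", "")
            triples.append((subject, predicate, prop.text))
    return triples


triples = extract_triples(DOCUMENT)
```

The receiving code needs no knowledge of who produced the document: every statement arrives as a resource, a globally named property, and a value, which is the sender-independence the snippet points to.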