There are numerous misconceptions about the Semantic Web, largely caused by a misunderstanding of its aims and technologies. I’ve created this simple FAQ to help dispel some of the myths.
I have basic technical knowledge. Tell me what the Semantic Web is.
The Semantic Web is a vision that hopes to join together dispersed bits of data on the internet, much as web pages are currently joined (linked) together. As with the current web of pages, anyone can create data on the Semantic Web, and anyone can “join” one bit of data to another.
Data isn’t forced into using a specific structure (like web pages are with HTML), so the data can be about anything; people, weather, books, movies, currency exchange rates or the distribution of geese in Europe.
Put simply, it’s like installing a huge relational database on the internet, where anyone can add tables to the database, and anyone can add data to a table. Tables and data can map to one another if you want, like foreign and primary keys.
Note that the Semantic Web (capital S, capital W) is often differentiated from the semantic web (small s, small w), which is basically a smaller-scale vision of making the current web more ‘semantic’ (e.g. marking up the ‘meaning’ of words on a page, rather than how they should look).
What’s the point of that?
If you’ve ever used a relational database — such as MySQL or SQL Server — you’ll know how useful they are, which is why relational databases are used ‘behind the scenes’ on most websites today. By storing distinct data-sets that inter-relate (for example, books, authors, and sales), you can very quickly find data that matches certain rules, such as books by a particular author, the top five best selling books, or top five best selling authors. Although each data-set can be maintained separately, they become increasingly powerful as they are joined together.
Imagine the possibilities if these data-sets were not just inter-related inside one database, but each database in the world was also inter-related. The authors table from a bookstore could be mapped to a birth records table in a government department (so, for example, you could get top five best selling books by nationality of the author). A historical database of world conflict could then be included to show the top five best selling books by authors who had been born in a country during time of war. And so on.
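To make the idea of mapping one data set onto another a little more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the identifiers (book:1, person:a), the property names and the sales figures are assumptions, not real data. It only shows that two independently maintained data sets can be joined once they share an identifier for the same author.

```python
from collections import defaultdict

# Hypothetical data: two independently published sets of
# (subject, predicate, object) statements that share author identifiers.
bookstore = [
    ("book:1", "author", "person:a"),
    ("book:1", "copies_sold", 500),
    ("book:2", "author", "person:b"),
    ("book:2", "copies_sold", 300),
    ("book:3", "author", "person:a"),
    ("book:3", "copies_sold", 100),
]
birth_records = [
    ("person:a", "nationality", "British"),
    ("person:b", "nationality", "French"),
]


def sales_by_nationality(store, records):
    """Join the two data sets on the shared author identifier."""
    nationality = {s: o for s, p, o in records if p == "nationality"}
    author = {s: o for s, p, o in store if p == "author"}
    totals = defaultdict(int)
    for s, p, o in store:
        # Only count sales for books whose author appears in both data sets.
        if p == "copies_sold" and author.get(s) in nationality:
            totals[nationality[author[s]]] += o
    return dict(totals)


totals = sales_by_nationality(bookstore, birth_records)
```

The join only works because both sides use the same identifier for each author, which is the job URIs do on the Semantic Web.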
By treating the many large sets of data on the web as a single database, we’ll be able to create some incredibly powerful and valuable tools. The current trend of ‘mash-ups’ goes some way towards exploring this idea (usually mixing the data from only two data sets).
It sounds like a lot of hard work. I don’t want to have to re-do everything.
If everything goes to plan, you won’t have to do anything more than you already do. The Semantic Web doesn’t rely on any new data being created; there are already millions of suitable databases being used on the web. These databases just need to be ‘made available’ in a Semantic Web friendly format. So it’s more a job for application developers, to enable this to happen. Application databases are already capable of being published to a variety of formats (usually including SQL, XML, CSV, and so on), so adding another format to this list isn’t earth-shattering.
So what is a “Semantic Web friendly format”, as you put it?
This is where it gets a bit tricky… There are a number of “formats” suitable for the Semantic Web, the most popular of which is RDF/XML. It doesn’t really matter that there are other formats also in use; nearly all of them are based on the same ‘data model’ (RDF), so they can be easily combined and made interoperable.
There are two basic approaches for making this data available to the Semantic Web. You could publish the whole database as a big RDF/XML ‘dump’, like dmoz do. Alternatively, you could make your database accessible via a SPARQL interface, which basically allows people to query your database via a web service, and have the relevant results returned in RDF (Microsoft have adopted this approach for their Profile Manager).
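As a rough illustration of the second approach, the sketch below builds (but does not send) the kind of HTTP request a client might make to a SPARQL interface. The endpoint address is a placeholder and the query is just an example using the FOAF vocabulary; a real service will have its own address and data. The SPARQL protocol passes the query text in a parameter named query, and the Accept header asks for RDF/XML back.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the address of a real SPARQL service.
ENDPOINT = "http://example.org/sparql"

# Example query: return name statements as RDF, at most ten of them.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT { ?person foaf:name ?name }
WHERE     { ?person foaf:name ?name }
LIMIT 10
"""


def build_request(endpoint, query):
    """Build (but do not send) an HTTP GET request carrying the query."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/rdf+xml"})


request = build_request(ENDPOINT, QUERY)
```

Passing the request to urllib.request.urlopen would actually send it; that step is left out here because the endpoint above does not exist.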
Remember that — unless you’re an application developer — you won’t have to worry about this, as this functionality will be provided by the software you use.
As it happens, I am an application developer. The Semantic Web technologies are too confusing.
Well, yes they are a bit. The documentation doesn’t help, and there aren’t enough high-level libraries around. Hopefully this will change now that organisations such as Microsoft and Adobe are seriously investing in the technologies.
Don’t be put off by all the talk of OWL, N-triples, graphs and topic maps. The only knowledge you need to get going is the RDF model, the RDF/XML syntax, and maybe RDF Schema.
The RDF model is fairly simple, and basically boils down to three bits of information: something (the ‘subject’) has a something (the attribute or ‘predicate’) of a certain value (often referred to as the ‘object’). So, Dog X has a Height of 2 feet. Dan Zambonini has a Nationality of British. “Don’t Make Me Think” has an Author of Steve Krug. You get the idea; anyone familiar with metadata or databases should recognise the basic model.
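The three examples above can be written down as a simple data structure. This is only a sketch of the model in Python, not an RDF library: the Triple class and the values_of helper are names invented for illustration, and plain strings stand in for the URIs that real RDF uses to identify subjects and predicates.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject has a predicate with a certain object."""
    subject: str
    predicate: str
    obj: str


# The three statements from the text, as plain strings.
statements = [
    Triple("Dog X", "height", "2 feet"),
    Triple("Dan Zambonini", "nationality", "British"),
    Triple("Don't Make Me Think", "author", "Steve Krug"),
]


def values_of(subject, predicate, triples):
    """Return every object recorded for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


authors = values_of("Don't Make Me Think", "author", statements)
```

Because every statement has the same three-part shape, statements about completely different things can sit in the same list and be queried in the same way.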
RDF/XML is where it gets a little tricky… Don’t worry about all the little intricacies; read up a little on the striped syntax, and you’ll be able to start creating some RDF/XML files in next to no time (i.e. these files encode lots of RDF-modelled data into an XML file).
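To give a feel for the striped syntax, here is a small hand-written RDF/XML document and a Python sketch that reads the statements back out of it. The example.org addresses and the ex: vocabulary are made up. The reader deliberately handles only a tiny subset of RDF/XML (nodes identified with rdf:about, text values, and nested nodes), so treat it as an illustration of the alternating node/property ‘stripes’ rather than a real parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ABOUT = "{" + RDF_NS + "}about"

# Hypothetical document: the example.org URIs and ex: terms are made up.
DOCUMENT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <ex:Book rdf:about="http://example.org/books/dont-make-me-think">
    <ex:title>Don't Make Me Think</ex:title>
    <ex:author>
      <ex:Person rdf:about="http://example.org/people/steve-krug">
        <ex:name>Steve Krug</ex:name>
      </ex:Person>
    </ex:author>
  </ex:Book>
</rdf:RDF>
"""


def uri(tag):
    """Turn ElementTree's '{namespace}local' tag into 'namespacelocal'."""
    return tag[1:].replace("}", "", 1)


def triples_from(node):
    """Walk the stripes: node element, property element, node element, ..."""
    subject = node.get(ABOUT)
    # A typed node element such as <ex:Book> implies an rdf:type statement.
    yield (subject, RDF_NS + "type", uri(node.tag))
    for prop in node:
        nested = list(prop)
        if nested:
            # The property's value is another node: link to it, then descend.
            yield (subject, uri(prop.tag), nested[0].get(ABOUT))
            yield from triples_from(nested[0])
        else:
            # The property's value is plain text.
            yield (subject, uri(prop.tag), prop.text)


root = ET.fromstring(DOCUMENT)
triples = [t for node in root for t in triples_from(node)]
```

Reading the XML from the outside in, the elements alternate between things (Book, Person) and properties of those things (title, author, name), which is where the ‘striped’ name comes from.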
If you’re an application developer, you almost certainly know Object Oriented (OO) programming. Keeping the model of OO programming in mind, take a quick glance over the RDF Schema documentation, and you should recognise some of it. It basically lets you define ‘classes’ and ‘sub-classes’ to use in your RDF, along with a level of ‘data-typing’ (so, for example, you could say that ‘British’ could be used as a Nationality, but not as a Height).
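The sketch below mimics that idea in plain Python. The three dictionaries are a made-up mini-schema, loosely playing the roles of rdfs:subClassOf, rdfs:range and rdf:type, and the function names are invented for illustration. One simplification to be aware of: RDF Schema proper treats these declarations as a basis for inferring new facts rather than for rejecting data, so the yes/no check here is closer to the OO intuition than to what an RDFS reasoner does.

```python
# Hypothetical mini-schema, loosely modelled on RDF Schema ideas.
SUBCLASS_OF = {"Dog": "Animal", "Animal": "Thing"}           # like rdfs:subClassOf
RANGE = {"nationality": "Nationality", "height": "Length"}   # like rdfs:range
TYPE_OF = {"British": "Nationality", "2 feet": "Length",
           "Dog X": "Dog"}                                   # like rdf:type


def is_a(cls, ancestor):
    """True if cls is the ancestor class or one of its sub-classes."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def fits_range(predicate, value):
    """True if the value's class is allowed as the object of the predicate."""
    return is_a(TYPE_OF.get(value), RANGE[predicate])


british_as_nationality = fits_range("nationality", "British")
british_as_height = fits_range("height", "British")
```

With this schema, ‘British’ is accepted as a nationality and refused as a height, and a Dog counts as a Thing by way of Animal.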
Don’t worry if this is getting confusing; you can probably start churning out RDF/XML without worrying too much about the schema; just take your lead from existing RDF/XML data (e.g. FOAF data). If, on the other hand, you really dig data-typing and the OO paradigm, you may even want to venture into OWL, which follows on from RDF Schema.
These “top down” approaches never work; you’ll never get everyone to agree on how things relate to one another.
The Semantic Web doesn’t rely on a top down approach. It can start with individual groups, or even individuals, putting their own data out there, without worrying about how it relates to what other people are doing. We’ve seen this with FOAF, RSS did it for a while (when it was RDF based), and there’s no reason that many of the newer formats emerging today couldn’t have made themselves RDF compatible (e.g. the sitemaps protocol wouldn’t have been that much different if it had been RDF based).
If you’re designing a new XML format, it’s worth considering making it RDF/XML compatible; you’ll still get all the benefits of XML, plus you’ll be fully equipped to take advantage of RDF’s extensibility and interoperability.


"There are two basic approaches for making this data available to the Semantic Web."
#3, put it directly in the webpage. Regarding RSS, isn't that a bad sign? It ~WAS~ RDF. That one still leaves me disappointed and a bit hesitant to believe RDF is the best way to apply semantics to the web.
The Semantic Web (SW) never worked, won't ever work and was designed ignoring basic principles that prevent it from ever working. There are good reasons why Google's people chuckle at the SW and chide Tim Berners-Lee (TBL).
SW papers over all previous research (by better people) in ontologies, language, logic and AI. Somehow, if we all put our data out there, we are told that all disparities in different ontologies (definitions of terms) will somehow disappear and the SW will flourish. Of course the SW people will rush in and say, "No, No, that's not what we mean!" but they cannot provide a logical, meaningful description of what the SW should do that truly makes sense. It's all vague puppies and butterflies.
The SW is TBL's vanity project and we should treat it as such. TBL is a nice guy but not nearly bright enough and far too stubborn to admit that, while he got the browser right, he got what he calls the SW all wrong. SW is TBL's bong pipe dream and he's never been able to admit that his idea was stillborn. His message is constant: inhale the smoke and drink the Kool-Aid.
I wrote this some time ago:
===========================
An attempt at a very short explanation of RDF, the important ingredient of the semantic web:
- triple(s): (instance A_thing [of type A_class]) {HAS (some sort of relation Semantic_link) WITH (instance B_thing [of type B_class])} {}:1 or more of these
- Semantic_link is not a hyperlink in the traditional sense, but it is defined by/at a URL and is directional
- Semantic_link, and if used (to make things explicit in the data, not implicitly in the code using the data) A_class and B_class, are to be defined in an ontology, aka RDF Vocabulary, aka RDF Schema
- An ontology is the definition of a group of classes and their relationships (usually for just a certain domain or field of expertise, FOAF being a simple example)
- This definition can be written in a 'standard' way by using the OWL notation.
- When different knowledge-/databases have ontologies that intersect or are the same, you can combine the data by finding A_thing or B_thing mentioned more than once
One could make all relations in a SQL database explicit by transforming it into an RDF database, provided you define the relations between the columns and use those definitions in the transformation script.
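A minimal sketch of such a transformation script, using Python's built-in sqlite3 module, might look like the following. The table layout, the example.org base address and the table_to_triples helper are all assumptions made for illustration; the relations between columns are supplied by hand as a mapping from a foreign-key column to the table it points at.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base address for generated URIs

# A throwaway in-memory database with one foreign-key relation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT,
                        author_id INTEGER REFERENCES authors(id));
    INSERT INTO authors VALUES (1, 'Steve Krug');
    INSERT INTO books VALUES (1, 'Don''t Make Me Think', 1);
""")


def table_to_triples(conn, table, foreign_keys):
    """Yield one triple per non-key column of each row.

    foreign_keys maps a column name to the table it references, so that
    the value becomes a link to another row instead of a bare number.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record['id']}"
        for column, value in record.items():
            if column == "id":
                continue
            if column in foreign_keys:
                value = f"{BASE}{foreign_keys[column]}/{value}"
            yield (subject, f"{BASE}{table}#{column}", value)


triples = list(table_to_triples(conn, "books", {"author_id": "authors"}))
triples += list(table_to_triples(conn, "authors", {}))
```

The foreign key that was implicit in the number 1 becomes an explicit link from the book to the author, which is the point being made above.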
With Prolog or SPARQL taking advantage of the fact that the first-degree relations are already in the data, this allows the programmer to program in a syntax/language that is closer to what (s)he's trying to model.
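To illustrate what querying in a language ‘closer to the model’ can feel like, here is a toy pattern matcher in Python. It is not SPARQL or Prolog, only a sketch of the underlying idea: the query is itself a list of triples in which names starting with ? are variables. The match function and the sample data are invented for this example.

```python
def match(pattern, triples, binding=None):
    """Yield variable bindings for which every pattern triple is in the data.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not pattern:
        yield binding
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        new = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # All three positions matched: carry on with the remaining patterns.
            yield from match(rest, triples, new)


# Hypothetical sample data.
DATA = [
    ("book1", "author", "krug"),
    ("krug", "name", "Steve Krug"),
    ("book1", "title", "Don't Make Me Think"),
    ("book2", "author", "other"),
    ("other", "name", "A. N. Other"),
]

# "Which titles were written by someone named Steve Krug?"
QUESTION = [
    ("?book", "author", "?person"),
    ("?person", "name", "Steve Krug"),
    ("?book", "title", "?title"),
]

results = list(match(QUESTION, DATA))
```

The question reads much like the data it is asked of, with no join conditions or table names to spell out.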
> you'll still get all the benefits of XML
Can I still use XSLT?
Can I still use XML Schema?
Can I still use CSS?
...
Is Microsoft really investing seriously in RDF? Maybe, but I don't know what the evidence is. The Profile Manager you mention is part of their little-known Connected Services Framework which, to date, has been mainly targeted at Telcos and used for implementing provisioning systems. Quite how CSF fits in with Microsoft's 'connected systems' strategy is a matter of some debate. I'm fairly sure that CSF isn't seen as 'mainstream' within the MS Connected Systems Division. I don't know of any other Microsoft initiatives that are using RDF, though a quick Google search suggested they may be using it somewhere inside Vista. Microsoft is not monolithic in terms of its product groups, and the fact that one group is supporting RDF and SPARQL for a straightforward profile database does not suggest a corporate commitment to RDF from Microsoft as a whole. I wouldn't get your hopes up too high on this one. Microsoft is a large company, and it would be statistically highly improbable that there aren't some RDF/SW enthusiasts within its ranks. It will also employ RDF/SW sceptics (of whom there are many). Like any other software company, I wouldn't expect MS to seriously invest in these specifications until it can see a clear business case and emerging market for the RDF-based SW.
Thanks a lot for your opinion and explanation about the semantic web, but why don't you try to write an explanation from an education angle, so that other students are able to understand it too?
Congrats and keep it up!!
Dan, your articles are always excellent reading. A person can learn so much just reading them. I hope that in the future your text will be just as good.
jenny