Author: José A. Senso (Universidad de Granada)
Source: IweTel - Thinkepi
Url document: http://www.thinkepi.net/ ... / http://listserv.rediris.es/ ...
The following text, together with the replies it generated, was originally published on the IweTel mailing list on 10 April 2007. As noted, it gave rise to interesting replies, which can be found scattered through the following days in the IweTel list archive, or grouped together with the initial text in the Thinkepi repository.
[Begin text]
Several years after the emergence of the concept of the Semantic Web, it seems that in certain respects the "invention" is evolving more than favourably. There are many projects that employ some (or most) of the layers of Berners-Lee's famous explanatory diagram. Today we can find many programs that, with varying degrees of success, have managed to capture many of the ideas introduced by this philosophy of data management. But an area where few steps have been taken, or at least not very firm ones, is the practice most visible to users: browsers. It is assumed that all this great mass of information, structured in XML, described with metadata, organized with ontologies and retrieved by means of intelligent agents, should be made visible by some method. And what mechanism is the average user most familiar with, and best integrated into the existing web?
Indeed, browsers provide the link between the surfer and the information, making all this conglomeration of acronyms, protocols and standards transparent. If this happens on the "normal" web, that is, the one we work with today, regardless of whether we call it 2.0, dynamic, the blogosphere or whatever new name is invented, it is logical that something similar should happen with its natural evolution: the Semantic Web.
A quick tour of projects and, especially, of semantic software allows us to distinguish two different approaches by which browsers can display semantic information. The first, which I will call the Semantic Browser, covers browsers designed specifically for the Semantic Web. The second, which I will call semanticizing the browser, adds elements to existing programs to extend their capabilities, giving them the ability to exploit certain semantic features embedded in web pages.
In the group of Semantic Browsers the standout is, of course, an invention of Tim Berners-Lee which, to be honest, has not been as successful as it should be. Tabulator is its name and, although in these early versions it works as a browser within a browser, it is logical that over time it will evolve into an independent piece of software. This open-source program, based on Ajax, works in Firefox (you need to solve a small security issue, as explained in the help) or as an Opera widget. This browser is based on a protocol that its creators have called bread crumbs. The idea is to browse resources in a scalable way, which means there is no need to load into memory all the information contained in the file being viewed (usually RDF, though not exclusively). The information is shown as the user requires it, just as a person slowly picks up crumbs from the ground to reach the desired destination.
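The bread-crumbs idea can be illustrated with a short sketch (the data and function names here are invented for illustration, not Tabulator's actual code): a generator produces triples one at a time, on demand, so the full file never has to sit in memory at once.

```python
# Illustrative sketch of "bread-crumbs" style lazy loading: triples are
# yielded one by one as the user asks for them, instead of parsing the
# whole file into memory up front. Data and names are hypothetical.

def lazy_triples(lines):
    """Yield (subject, predicate, object) triples on demand."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        subject, predicate, obj = line.split(None, 2)
        yield (subject, predicate, obj)

# Simulated file content (simplified N-Triples-like lines).
data = [
    "ex:jose foaf:name 'Jose'",
    "ex:jose foaf:workplaceHomepage http://www.ugr.es/",
]

crumbs = lazy_triples(data)
first = next(crumbs)  # only the first "crumb" has been loaded so far
print(first)
```

Each call to `next()` picks up one more crumb; the rest of the file stays untouched until it is actually needed.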
Along with this new navigation system, the browser can identify on a map a geographical location expressed in a file (for example, if a FOAF file records the coordinates of your workplace, it will show the exact location) via a Google Maps mashup, or query the data using SPARQL, the query language for RDF. Although there is still a long way to go, the proposed platform is very promising. Of course, there are other browsers in this category able to show semantic information in a similar way. Among them, BigBlogZoo, the Haystack client that runs on Eclipse, and AKTive Space stand out.
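To make the SPARQL idea concrete, here is a minimal sketch of what such a query does: match a triple pattern, with variables, against a set of RDF-like triples. This is a hand-rolled toy matcher with invented data, not a real SPARQL engine.

```python
# Toy illustration of SPARQL-style querying: bind the variables of a
# triple pattern (terms starting with "?") against a set of triples.
# The data and the pattern syntax are simplified for illustration.

triples = {
    ("ex:jose", "foaf:name", "Jose"),
    ("ex:jose", "geo:lat", "37.18"),
    ("ex:jose", "geo:long", "-3.60"),
}

def match(pattern, data):
    """Return a list of variable bindings, one per matching triple."""
    results = []
    for triple in data:
        bindings = {}
        for pat, term in zip(pattern, triple):
            if pat.startswith("?"):
                bindings[pat] = term      # variable: bind it
            elif pat != term:
                break                     # constant mismatch: skip triple
        else:
            results.append(bindings)
    return results

# Roughly equivalent to: SELECT ?lat WHERE { ex:jose geo:lat ?lat }
print(match(("ex:jose", "geo:lat", "?lat"), triples))
```

A real engine adds joins, filters and optional patterns on top of this same basic pattern-matching step.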
Although there are many programs that semanticize the browser, the one that is spreading most widely, and that presents the most possibilities for the future, is Piggy Bank. It is an extension for the Firefox browser, written in Java, that makes it possible to extract certain key elements from a website and store them in RDF.
Depending on the information it finds on a web page, Piggy Bank acts in one of two ways. If the site has an RDF file, or any application of RDF such as FOAF, or meta-information, whether Dublin Core or HTML meta tags, the program captures the information and integrates it into a repository, like a local database, organized according to the structure described. If, on the other hand, the site has no information of this type, the software invokes a scraper to extract the information and structure it.
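The two-way behaviour just described can be sketched as a simple dispatch (function names and the page representation are hypothetical, not Piggy Bank's actual code): reuse embedded metadata when the page carries it, otherwise fall back to scraping.

```python
# Sketch of Piggy Bank's two modes of action (names are hypothetical):
# reuse the page's embedded RDF/metadata when present, otherwise invoke
# a scraper to extract and structure the information.
import re

def scrape(html):
    """Crude placeholder scraper: keep the words, drop the markup."""
    return re.sub(r"<[^>]+>", " ", html).split()

def harvest(page):
    if "rdf" in page:                          # site already publishes RDF/FOAF
        return ("metadata", page["rdf"])
    return ("scraper", scrape(page["html"]))   # no metadata: fall back to scraping

print(harvest({"html": "<p>Flat in Granada</p>"}))
```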
Screen scraping is a technique used for the automatic extraction of text, leaving aside binary information (images, multimedia, etc.). Scrapers are programs able to work with any text in order to process and structure it. In fact, they are frequently used by Internet search engines as a complement to the work done by their spiders. Scroogle, for example, uses this technique to search Google without the annoying advertisements alongside the results.
Piggy Bank includes three different scrapers written in JavaScript, which are fully configurable; you only need some minimal knowledge of this programming language. New ones can also be written, for instance to retrieve images from Flickr (FlickrPhotoScraper) or to search for contacts in social networks such as Orkut (Orkut Scraper) or LinkedIn. There is even an explanation of how to write one to search for apartments.
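The core of the screen-scraping technique can be shown in a few lines. This sketch uses Python's standard-library HTML parser purely for illustration; Piggy Bank's actual scrapers are written in JavaScript, and the sample page here is invented.

```python
# Sketch of the screen-scraping idea: pull the text out of an HTML page
# while ignoring binary/embedded content such as images. Standard
# library only; the input page is a made-up example.
from html.parser import HTMLParser

class TextScraper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)
    # <img> tags carry no text data, so images are skipped automatically.

scraper = TextScraper()
scraper.feed("<p>Flat for rent <img src='photo.jpg'> in Granada</p>")
print(" ".join(scraper.chunks))
```

The image element contributes nothing to the output; only the text survives, ready to be processed and structured.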
Tags can be added to the collected information in order to describe it. The technique of having everyone contribute keywords has become very popular thanks to sites such as del.icio.us or CiteULike, since it leads a community to build a taxonomy and publish it, and this is another interesting option we find in Piggy Bank: the semantic bank. A semantic bank is a community repository of RDF descriptions that allows its users to share the information they have collected. It is a very easy way to publish and share structured information, although at present there are only two: a generic one, which is a mess, and another set up specifically for the ISWC2005 conference. The idea of creating semantic banks for professional groups, subject areas, etc. is more than interesting. No less interesting would be studying a mechanism to bring together all the tags created by the system's individual users. If an ontology were able to collect the names provided in a folksonomy and, in addition, allow people to determine the relationships between them, it would facilitate the creation of folksologies (folk ontologies). But that is a topic for another text.
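The first step of such a mechanism, bringing together the tags contributed by individual users, can be sketched very simply (user names and tags are invented for illustration): count how often each keyword appears across the community.

```python
# Sketch of merging individually contributed tags into a shared
# folksonomy: tally each keyword's use across the community.
# User names and tags are hypothetical.
from collections import Counter

user_tags = {
    "ana":  ["semantic-web", "rdf", "browsers"],
    "luis": ["rdf", "foaf"],
    "eva":  ["semantic-web", "rdf"],
}

folksonomy = Counter(tag for tags in user_tags.values() for tag in tags)
print(folksonomy.most_common(2))
```

Relating these aggregated terms to one another is the harder, ontology-building part that the text above leaves for another occasion.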
In addition, there are other programs, presented as Firefox extensions, that expand the possibilities offered by the browser. Among them all, Greasemonkey and Chickenfoot stand out, as they make it easy to include scripts that manipulate web page elements automatically.
I really cannot close this piece with a conclusion. No one can say that the projects that choose to create specific semantic browsers have a more solid basis than those that decide to extend the semantic possibilities of current browsers. Perhaps a hybrid system, combining Tabulator's bread-crumbs protocol with Piggy Bank's automatic generation of RDF descriptions and its semantic bank, together with the browsing and searching capabilities of mSpace (incidentally, after watching the demo anyone can imagine many possibilities for applying this program in a library), would make up the ideal browser.