Sunday, July 29, 2012

Plone and microdata: the "event" case

After reading a few books and some good tutorials about HTML 5, the part I find most promising is its extensibility. In short: one step towards the Semantic Web.

Microdata introduction (very very quickly)

I don't want to put examples of how to use microdata with HTML 5 here: the Web is full of simple and interesting ones.
Let me simply describe in words what you can do: you keep writing your pages normally, for example a page with information about a person. Then, using some additional HTML attributes, you mark pieces of information in the page, saying "Hey there! This DIV is the name of the person, this P is its address..." and so on. You simply need to play with the itemscope, itemtype and itemprop attributes.

What is the useful part of this? Integration!
For example: some search engines understand this new syntax and can index the information in a special way. If the page is about a person, the search engine results page (SERP) can immediately display the name and other data, giving a quick preview.

But it's not only about search engines. Scanning a remote web site to find information about people becomes something that a machine can do.
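
To give an idea of what "something that a machine can do" means, here is a minimal sketch using only the Python standard library. The sample markup, the MicrodataParser class and the extract_microdata function are all made up for illustration; it only handles a single flat item (no nested scopes, no nested tags inside a property), so take it as a toy and not as a real microdata parser.

```python
from html.parser import HTMLParser

# Hypothetical page fragment describing a person with schema.org microdata.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
  <p itemprop="address">12 Example Street</p>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collect the itemtype and the text of each itemprop (flat items only)."""

    def __init__(self):
        super().__init__()
        self.itemtype = None
        self.properties = {}
        self._current_prop = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope is a boolean attribute: it is present with a value of None.
        if "itemscope" in attrs and self.itemtype is None:
            self.itemtype = attrs.get("itemtype")
        if "itemprop" in attrs:
            self._current_prop = attrs["itemprop"]
            self.properties[self._current_prop] = ""

    def handle_data(self, data):
        # Only text found inside an itemprop element is recorded.
        if self._current_prop is not None:
            self.properties[self._current_prop] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current_prop = None


def extract_microdata(html):
    """Return (itemtype, {property: text}) for the first item in the markup."""
    parser = MicrodataParser()
    parser.feed(html)
    parser.close()
    props = {name: text.strip() for name, text in parser.properties.items()}
    return parser.itemtype, props


if __name__ == "__main__":
    print(extract_microdata(SAMPLE))
```

The point is only that the three attributes are enough for a script to know that "Jane Doe" is a name and not just some text in a SPAN.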

Right now the suggested set of standards is the one at schema.org.
You can find there a good set of basic category types and subtypes (Person, Book, Movie, Review, Organization, ...).

Take a look there, it's really interesting.

What about Plone?

Plone 4.2 is here! It was released a few days ago!
Looking at the changes in this amazing release you can find that it's now using the HTML 5 doctype (it's still using an XML-valid template language, so we are now using what is informally called XHTML 5).

What has changed with this? Nothing... HTML 5 lets you use a lot of cool stuff, but HTML 4 (XHTML 1.0) code can still be valid HTML 5 code. You are not forced to do cool stuff.
The real changes start now: future releases of add-ons, or new Plone releases, can start using the new features. Some features are already there but you probably didn't notice them (for example: Archetypes already supports the new placeholder HTML attribute).

Let's go back to the microdata format defined at schema.org. When I first read through all the types defined there, I immediately focused on the Event type, and you can understand why: we have an Event content type in Plone.
How simple can it be to get microdata information out of it?

Getting microdata for Event in Plone

Let me answer the question above immediately: it's very simple. The Plone Event content type contains all the information needed to provide microdata, so it's only a matter of the content view.
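
As a rough idea of what such a view has to output, here is a small Python sketch that renders an event as HTML with schema.org microdata. This is not the template of the add-on: the render_event_microdata function and its parameters are invented for the example, and putting the location as plain text is a simplification (schema.org expects a Place or a PostalAddress there).

```python
from datetime import datetime
from html import escape


def render_event_microdata(title, description, start, end, location):
    """Return an HTML fragment for one event, marked up as schema.org/Event."""
    return (
        '<div itemscope itemtype="http://schema.org/Event">\n'
        f'  <h1 itemprop="name">{escape(title)}</h1>\n'
        f'  <p itemprop="description">{escape(description)}</p>\n'
        # The machine-readable ISO 8601 value goes in the datetime attribute.
        f'  <time itemprop="startDate" datetime="{start.isoformat()}">'
        f'{start:%Y-%m-%d %H:%M}</time>\n'
        f'  <time itemprop="endDate" datetime="{end.isoformat()}">'
        f'{end:%Y-%m-%d %H:%M}</time>\n'
        # Assumption: plain text location, instead of a nested Place item.
        f'  <span itemprop="location">{escape(location)}</span>\n'
        '</div>'
    )


if __name__ == "__main__":
    print(render_event_microdata(
        "Sample sprint",
        "Two days of hacking",
        datetime(2012, 9, 1, 9, 0),
        datetime(2012, 9, 2, 18, 0),
        "Somewhere",
    ))
```

Title, description, start, end and location are exactly the kind of data the Event content type already stores, which is why only the view needs to change.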

I did some tests in a new add-on. This product is embarrassingly simple: pushing this new feature into Plone is really just a matter of replacing the Event view, nothing more is needed.

Problem 1: testing the format

It seems that there aren't many testing tools for your microdata. Google provides the main one, the Rich Snippet Testing Tool, but it is not so simple and clear to use.
Note: the Google results page supports the event entity, but providing the Event microdata does not guarantee that Google will use it.

Testing a Plone Event content with collective.microdata.event using the Rich Snippet Testing Tool gives me some feedback about page validity, but also a lot of warnings about missing information.

Also: the Google documentation doesn't seem to be really up to date right now. Their examples still use other microdata formats, while a warning message explicitly suggests you use schema.org.

Another commonly suggested tool is a JavaScript-based one: Microdata Tool.

Problem 2: changes to Plone templates

The changes to the Plone code are simple: the new event_view template can be a simple copy of the one from CMFPlone.
There's only a single change that I made and didn't like, and it is related to where to put the itemscope and itemtype attributes.

Starting from Plone 4, the "best" way to create views for Plone content types is to use the content-core slot of the main template. This change simplified content views because common fields, like title and description, have been moved out of the views: we no longer need to copy the same code into every template.
However the old-style way used in Plone 3 (filling the main slot) is still available.

What is the problem? The right way is to put itemscope and itemtype on the container element that contains all the event data, but right now the main event container is outside the event_view.pt template.
So: the new view provided with collective.microdata.event goes back to the Plone 3 style, using a template that fills the main slot.

I don't like this. What is missing is a way to mark a content in some way, and let Plone extract the itemtype from there.
This is probably a simple change to make to the Plone core (getting the itemtype of contents through an adapter may be enough).
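
Just to make the idea concrete, here is a plain-Python stand-in for that lookup. It does not use the Zope component architecture and nothing like this exists in Plone core as far as I know: the registry, the provides_itemtype, get_itemtype and container_attributes functions and the sample classes are all hypothetical.

```python
# Hypothetical registry mapping a content class to its schema.org itemtype.
ITEMTYPE_REGISTRY = {}


def provides_itemtype(content_class, itemtype):
    """Declare which schema.org type describes a content class."""
    ITEMTYPE_REGISTRY[content_class] = itemtype


def get_itemtype(obj):
    """Look up the itemtype for an object, honouring class inheritance."""
    for klass in type(obj).__mro__:
        if klass in ITEMTYPE_REGISTRY:
            return ITEMTYPE_REGISTRY[klass]
    return None


def container_attributes(obj):
    """Attributes the main template could put on the content container."""
    itemtype = get_itemtype(obj)
    if itemtype is None:
        # No registration: the container is rendered without microdata.
        return ""
    return f'itemscope itemtype="{itemtype}"'


# Stand-ins for content types; not the real Plone classes.
class Document:
    pass


class Event:
    pass


class CustomEvent(Event):
    pass


provides_itemtype(Event, "http://schema.org/Event")

if __name__ == "__main__":
    print(container_attributes(CustomEvent()))
    print(repr(container_attributes(Document())))
```

With something like this in the main template, a view could keep filling only the content-core slot and still get itemscope and itemtype on the outer container.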

Next step(s)

Let's sum up what other features Plone is missing for contents to be able to use microdata.
  • mark events in folder views, collections... too (note that a page can provide more than one event)
  • do not forget the new plone.app.event!
  • a way to provide other schema.org formats for other content types