Friday, September 21, 2012

Quick note about TAL changes on recent Plone versions

I know that using complex TAL expressions in Zope Page Templates is bad and a symptom of bad software architecture...
...however sometimes you have an old product that... simply works!

Only recently, with Plone 4.2, we found that something changed in the ZPT language rules (I don't know if this is related to the introduction of Chameleon, or if the Zope guys found some security issue): what changed is how you can define TALES string expressions.

The only definition of TALES string expressions I know is the one I learned when I was young from the old-fashioned book "Definitive Guide to Plone" (wow... we are talking about Plone 2.0!).

Roughly speaking: a string expression is something that looks like the following:
"string: statictext ${path expression} another static part"

And by the definition of a path expression, it is possible to chain multiple alternative paths as follows:
"context/foo/bar|context/baz"

So: in old Plone (Zope) you were able to write something like this:
"string: statictext ${context/foo/bar|context/baz} another static part"

In new Plone (Zope) versions this is not possible anymore: you'll get an explicit error (so I really think this is not a bug, but an intended limitation):
$ must be doubled or followed by a simple path in expression

We must use only simple path expressions, with no alternatives.

Fixing old code is simple: just put the path expression in a tal:define, then use the defined variable in the string expression.
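The same two-step idea can be sketched in plain Python (this is only an analogy, not TAL code): first resolve the alternatives into one simple name, then interpolate that name. The helper first_available and the sample context are hypothetical, invented for this illustration.

```python
from string import Template


def first_available(context, *paths):
    """Return the value of the first path that resolves, a rough
    imitation of what 'context/foo/bar|context/baz' does in TALES."""
    for path in paths:
        value = context
        try:
            for step in path.split("/"):
                value = value[step]
        except (KeyError, TypeError):
            continue  # this alternative failed, try the next one
        return value
    raise KeyError(paths)


# hypothetical context: 'foo/bar' is missing, 'baz' is available
context = {"baz": "fallback"}

# step 1 (the "tal:define"): resolve the alternatives into a simple name
label = first_available(context, "foo/bar", "baz")

# step 2 (the string expression): interpolate only the simple name
text = Template("statictext ${label} another static part").substitute(label=label)
```

Here text ends up using the second alternative, because the first path cannot be resolved.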

You have been warned.

Saturday, September 8, 2012

Plone and microdata: adding support to microdata to Plone

My last article was about adding microdata support to the Plone Event content type.
The article also introduced the changes that need to be made to Plone to get microdata support (in general), and the resulting product (collective.microdata.event 0.1) applied all those changes.

However, microdata inside a framework like Plone is not only something you can add, it is also something you can support; so I did my best to create a product that helps people easily support the schema.org vocabulary in Plone.

Introducing collective.microdata.core

The resulting experiment is collective.microdata.core, a base package that provides a minimal set of features I already introduced with collective.microdata.event, but this time without relying on any specific microdata vocabulary.

The product is not for final users, but for developers and integrators. It simply gives you these features:
  • Provide the definition of the Thing schema.org type (the most basic one) for all Plone content types (because every content can be a "thing")
  • Provide a crude adapter for obtaining a Thing definition from a content (a very small set of information)
  • Provide a catalog indexer for saving the most specific microdata type into brains
  • Provide the catalog indexer implementation for Thing.
The package is very small because there isn't a lot of work to do. Unluckily most of the work is inside the content's view (like I said in the previous post) and this is still something not very easy to do right now.

Testing your microdata (directly in Plone)

I already wrote about how difficult testing microdata with online tools can be today, and about the JavaScript Microdata Tool. The collective.microdata.core product can add this little JavaScript library to Plone, just for testing purposes.

This can help you "see" microdata inside Plone pages.
Microdata tool with Events

Microdata information inside folder content listing views

Having microdata in content views is great, however you must know that you can provide more than one microdata snippet inside a page. Going back to the events example: you can provide a list of events, and search engines can index them all.

The optional package collective.microdata.contentlisting does this, but be aware that the task is not very easy.

Once again: we need to customize Plone views (in this case I'm talking about folder listing views). The product (right now) limits itself to customizing the "standard" and "summary" views.

The problem in this task is related to the different types of information we need to put inside those views when we meet different microdata types.
Plone itself puts some custom logic inside views, to display better information when the listed item is an Event (showing start and end date) or a News Item (showing the creator; I think this is a new feature, I never saw it before). This can't be a general-purpose approach.

So: this experimental package simplifies the views a lot, delegating what is displayed to other tiny views that 3rd party products can provide. Obviously, default tiny views are provided by the product, so you won't see any difference.

Once again, collective.microdata.event (new 0.2 version) supports this extension.

Microdata tool with a folder listing view

The new version of collective.microdata.event

Version 0.2 is only a refactoring: all logic has been moved to the core package, and this one has been turned into a working implementation of the other two packages.

Moving on

There's still the main issue: when providing a microdata implementation we need a deep customization of the content's view (if you remember my previous article: customizing through the old-style "main" macro). This is an issue that an add-on product can't fix without customizing the Plone main_template (and this is something I really don't like).

It's impossible to have microdata support in Plone without putting some new features in the Plone core. It is not possible to be sure how important microdata will be in the near future, but I'm quite sure that it can be really useful. I think that Plone needs to directly support microdata (at least, for the content's main view).

What can we do then? What about a PLIP?

Sunday, July 29, 2012

Plone and microdata: the "event" case

After reading some books and good tutorials about HTML 5, one of the most promising parts I found is HTML 5 extensibility. In a few words: one step towards the semantic Web.

Microdata introduction (very very quickly)

I don't want to put examples of how to use microdata with HTML 5 here: the Web is full of simple and interesting examples.
Let me simply describe in words what you can do: you continue writing your pages normally, for example a page where you put information about a person; then, using some additional HTML attributes, you can mark some information in the page, saying "Hey there! This DIV is the name of the person, this P is its address..." and so on. You simply need to play with the itemscope, itemtype and itemprop attributes.

What is the useful part of this? Integration!
For example: some search engines understand this new syntax and can index this information in a special way. If the page is about a person, the search engine result page (SERP) can immediately display the name and other data, giving a quick preview.

But not only search engines. Scanning a remote web site to find information about people becomes something that a machine can do.

Right now the suggested set of standards is the one at schema.org.
You can find there a good set of basic category types and subtypes (Person, Book, Movie, Review, Organization, ...).

Take a look there, it's really interesting.

What about Plone?

Plone 4.2 is here! It was released some days ago!
Looking at the changes in this amazing release you can find that it's now using the HTML 5 doctype (it's still using an XML-valid template language, so we are now using what is informally called XHTML 5).

What changed with this? Nothing... HTML 5 lets you use a lot of cool stuff, but HTML 4 (XHTML 1.0) code can still be valid HTML 5 code. You are not forced to do cool stuff.
Real changes will start now; future releases of add-ons, or new Plone releases, can start using new features. Some features are already there but you probably didn't notice them (for example: Archetypes already supports the new placeholder HTML attribute).

Let's go back to the microdata format defined at schema.org. When I first read all the types defined there, I immediately focused on the Event type, and you can understand why: we have an Event content type in Plone.
How simple can it be to pull microdata information out of it?

Getting microdata for Event in Plone

Let me answer the question above immediately: it's very simple. The Plone Event content type contains all the information needed to provide microdata, so it's only a matter of the content view.

I did some tests in a new add-on. This product is embarrassingly simple: pushing this new feature into Plone is really just a matter of replacing the Event view, nothing more is needed.

Problem 1: testing the format

It seems that there aren't a lot of testing tools for your microdata. Google provides the main one, the Rich Snippet Testing Tool, but it is not so simple and clear to use.
Note: the Google SERP supports the event entity, but providing the Event microdata does not ensure that Google will use it.

Testing a Plone Event content with collective.microdata.event using the Rich Snippet Testing Tool gives me some feedback on page validity, but also a lot of warnings about missing information.

Also: the Google documentation doesn't seem really up to date right now. Their examples still use other microdata formats, while a warning message explicitly suggests you use schema.org.

Another commonly suggested tool is a JavaScript-based one: Microdata Tool.

Problem 2: changes to Plone templates

Changes to Plone code are simple: the new event_view template can be a simple copy of the one from CMFPlone.
There's only a single change that I made and didn't like, and it is related to where to put the itemscope and itemtype attributes.

Starting from Plone 4, the "best" way to create views for Plone content types is to use the content-core slot of the main template. This change simplified content views because common fields, like title and description, have been moved out of the views: we no longer need to copy the same code in every template.
However the old-style way used in Plone 3 (filling the main slot) is still available.

What is the problem? The right way here is to put itemscope and itemtype on the container element that contains all the event data, but right now the main event container is outside the event_view.pt template.
So: the new view provided with collective.microdata.event goes back to the Plone 3 style, using a template that fills the main slot.

I don't like this. What is missing is a way to mark a content in some way, and let Plone extract the itemtype from there.
This is probably a simple change to make to the Plone core (probably getting the itemtype of contents through an adapter would be enough).

Next step(s)

Let's sum up what other features Plone is missing for contents to be able to use microdata.
  • mark events also in folder views, collections... (be aware that, in a page, you can provide more than one event)
  • do not forget the new plone.app.event!
  • a way to provide other formats from schema.org for other content types

Saturday, June 30, 2012

Approaching automated images optimization in Plone

State of a naked Plone
It's not news that images inside your site commonly account for a big percentage of the total size of the page (and probably also a good part of the information provided).
You can obviously optimize your Web server to use the browser cache and take care of a lot of additional tricks, however this probably only solves some kinds of problems (I mean: you must do it... it's important!).

There are sites where images are changed (or added) frequently, maybe every day or hour. You can still think about using only small-sized images, but this is not always simple to do (for a lot of users) or applicable.

Let's think about a Plone site where your main page is a collection that shows news.

What a Plone collection of news looks like
The screenshot above is taken from a basic Plone 4.2 site where I simply used a collection of news with a modified version of the folder_summary_view template (where I show the 400x400 resized format called image_preview).

The original sizes of the images above are not giant but not too small either: the first news has no image, then 168KB for the second news, 266KB for the third and 328KB for the fourth, for a total size of 762KB.

Now: don't forget one of the most useful Plone features: the integration with PIL.
Using the PIL (or Pillow ;-)) library, Plone automatically resizes your image (server side) when you ask for a resized version of it (off topic: when I explain Plone features to users, this is still one of their favorites).

So PIL is doing a really good job here: we are not downloading 762KB that the browser will simply resize after download; we download a resized version of the image instead.

You can see this in the "Document Size" report taken from the Web Developer Toolbar:
Document size: thanks to PIL
Now you can ask: aren't the 226KB of JavaScript the real size problem of the page?
Not exactly:
  • JavaScript sources can be gzip-compressed by the Web server (like Apache in front of Plone)
  • JavaScript sources are probably also cached in the browser; although you update your site's contents frequently (as said above: you add news every hour), JavaScript is always the same. So: it's simpler to use the browser cache with JavaScript than with images.
About the last note above: excluding layout images (logo, icons, ...) you must not forget that in Plone images are contents.
About image optimization
I remember a very interesting chapter about image optimization in one of the last books I read (Even Faster Web Sites, by Steve Souders): I learned a lot about different image formats and the problems of using them.

The main topic of the chapter is lossless image optimization.

When you use images for the Web you are often wasting bytes that you could save instead.
I'm not talking about compressing the images while losing information and giving your users uglier images just to save some kilobytes: I'm talking about saving bytes while keeping the same level of visual information.

The book above talks about a lot of command line tools that do the trick.
What they do is: optimize the image when possible and remove image metadata.

Some weeks before I read the book, Denys Mishunov showed us a cool tool for Mac front-end developers: ImageOptim. What this tool does is nothing more than try to run all of the tools above (and some others) on images that the user provides.

So: front-end developers must take care of providing the best image compression and optimization they can. Tools like YSlow or Google PageSpeed can easily help you find images that you need to optimize.
This should also help your site with search engine optimization.

Let's go back to the example page above: I will probably need to run image optimization on all my Plone theme images once but, as you can see above, layout images are a minimal part of the total size of the page.

What I really can't do is: force my users to optimize images before uploading them!

What we can do in Plone
My first idea:
As all the tools above are command line tools, why not use them inside Plone? Why not call them as external processes before storing data in Plone?
I'm not the first to think about this task: Jon Stahl wrote a couple of articles about Plone and image optimization two years ago.
Inside the article you can read a sentence that says "Doing a mediocre job on this would probably be pretty easy, but it will take some focused effort to really nail the details that will make this sing"...
I quickly understood that my idea was exactly the kind of mediocre job Jon is talking about :-), and he's right.
Let's move on to understand why.

I put my idea in an alpha product for Plone: collective.optimage.
Do not use it in production until you have carefully read the documentation and know what it's doing.
What the product will give you is:
  • react when a new image file is provided to IATBlobImage contents
  • take the mimetype of the file and run all the optimization handlers registered for it on the blob file
  • replace the original blob with the optimized one
The product can also be configured to use some image optimization tools while ignoring others. Indeed, by default the product does nothing and forces you to manually register additional ZCML, one for every external tool.
You can also easily provide your own.
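The flow described above (look up the handlers by mimetype, run them, keep the better result) can be sketched in a few lines of plain Python. This is not the collective.optimage code: the registry, the function names and the fake handler are all hypothetical, and a real handler would call an external command line tool instead.

```python
# Hypothetical registry: mimetype -> list of optimization handlers.
# Each handler takes the raw image bytes and returns new bytes.
HANDLERS = {}


def register_handler(mimetype, handler):
    HANDLERS.setdefault(mimetype, []).append(handler)


def optimize(data, mimetype):
    """Run every handler registered for the mimetype, one after the other.
    A handler's result is kept only when it is smaller than the current data."""
    for handler in HANDLERS.get(mimetype, []):
        candidate = handler(data)
        if candidate and len(candidate) < len(data):
            data = candidate
    return data


def strip_fake_metadata(data):
    # stand-in for an external tool: it just drops a fake metadata marker
    return data.replace(b"METADATA", b"")


register_handler("image/png", strip_fake_metadata)

original = b"METADATA" + b"pixels"
optimized = optimize(original, "image/png")
untouched = optimize(original, "image/gif")  # no handler registered for GIF
```

With no handler registered for a mimetype the data comes back unchanged, which mirrors the "does nothing until you register something" behavior described above.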

Let's repeat the test with the same page. I added collective.optimage to the buildout, then uploaded all the images inside the news again:
Document size: thanks to collective.optimage
We saved 7 kilobytes, but a lot more if we refer to the full size images:
  • 147KB instead of 168KB for the image of the second news
  • 94KB instead of 266KB for the image of the third news
  • 299KB instead of 328KB for the image of the fourth news
Total size now: 540KB instead of 762KB.
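As a quick check of the numbers above, the totals and the overall saving can be recomputed from the per-image sizes (values in KB, taken from the lists in this post; the variable names are only for illustration):

```python
# full size images, in KB, before and after optimization
before = {"second news": 168, "third news": 266, "fourth news": 328}
after = {"second news": 147, "third news": 94, "fourth news": 299}

total_before = sum(before.values())
total_after = sum(after.values())
saved = total_before - total_after
saved_percent = round(100.0 * saved / total_before, 1)
```

That is 222KB saved on the full size images, a bit more than 29% of the original total.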

Problems of the approach
Apart from some technological choices I made (just to avoid being forced to monkey-patch Plone code), the main problem is low performance.

When you use collective.optimage, your Zope thread runs an external process (if you configure more than one provider for the same kind of image, you'll run all of them) and waits for the execution to end.
Depending on the image, format and tool used, this can be a long task: saving the image can become 2/5/10 seconds slower.
Some big images require a lot of time, while some tools are slower than others (for example: I provided an optimization handler for pngout, but it's disabled because I found it really slow).
This is not what you need if your Plone site hosts a lot of concurrent editors.

Other approaches
Why not run this task as a scheduled job during the night? Why not simply try to optimize all image blob files when the server is not working at 100% (like you'd probably do with a static HTML site)?
This can also be done offline (after all: with blob support images are simply files on the filesystem).

This is a possible solution, but you will serve your users the unoptimized version of your images while they are "new":
  • an editor saves the news item
  • first-day visitors download the unoptimized version
  • the night job optimizes the image
  • other visitors get the optimized image from then until the end of time
This can be enough if your Plone site is an image archive but not if your main need is to optimize news, or other images that expire quickly. A news item image on a busy site can be downloaded thousands of times before another news item takes its place. Who cares about old news?

An approach in the middle can be to delay the job for some seconds or minutes, putting the task in a queue of "images to be optimized". This can probably be achieved quickly using plone.app.async (so I bet on this as the "best solution"), but note that this doesn't totally remove the delay before the optimized image is available to visitors (though this delay can be really short, like some seconds, in most cases).
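The queue idea can be sketched with the Python standard library alone. This is not plone.app.async: it is only a stand-in to show the shape of the approach, and the blob paths and function names are hypothetical.

```python
import queue
import threading

pending = queue.Queue()   # the queue of "images to be optimized"
optimized_paths = []      # what the worker has processed so far


def optimize_blob(path):
    # placeholder for the slow call to an external optimization tool
    optimized_paths.append(path)


def worker():
    while True:
        path = pending.get()
        try:
            if path is None:  # sentinel: stop the worker
                break
            optimize_blob(path)
        finally:
            pending.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()


def on_image_saved(path):
    """Called when the editor saves: it only enqueues, so it returns at once."""
    pending.put(path)


on_image_saved("/blobs/news-1.jpg")
on_image_saved("/blobs/news-2.jpg")

pending.join()      # wait until the queue is drained (only for this demo)
pending.put(None)   # ask the worker to stop
thread.join()
```

The editor's request never waits for the external tool; visitors get the unoptimized image only for the short time the job sits in the queue.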

Stop! Why is collective.optimage not working on my News Item?
Because I found (and I was struck when I discovered this) that right now images of the News Item content type in Plone are not stored in blobs. The plone.app.blob product only works for File and Image content types. I hope this will change in the future...

Instead of also supporting non-blob images, I preferred to test those features on a patched plone.app.blob version that supports News Items too.
You can try my fork of plone.app.blob.

Saturday, May 19, 2012

Documentation for Plone products: some bad and good attitudes

Some time ago I wrote some notes about how to release Plone products. I think I can add some more information to that article, mainly about documentation.
The "documentation topic" is already in the old post (section "Deprecated README") but I want to talk about something different.

Once again: all those minor issues are what I find difficult to explain when I give training to customers who want to learn Plone development.

Let's talk about how documentation changes when you release a new version of a Plone product!
HISTORY.txt / CHANGES.txt
Sometimes I surf the Cheeseshop because I need some non-Plone, non-Zope library, and the first thing I miss from the Zope world is a change history given directly on the PyPI page.

I'm obviously not warning about a bad behavior here! I think that every other framework in the universe should follow the example of the Zope and Plone community (what I mean is: the history file also being part of the long_description egg option)!

Commonly, when the history file is well done, it can also help you find new features or dependency version changes... this is great (but: keep reading)!
README: incomplete
There are good products (often, sadly, developer-targeted products) that can give you great things but have very poor documentation. What can I say about them? This is wrong for me... nothing new. I never liked a bare description of the product, maybe with a sentence like "look at the code and tests".

One example: http://pypi.python.org/pypi/Products.contentmigration. A really good product with poor documentation.
README: good at the first release, deprecated later
Another bad attitude is to create a good (great) README file at the first release, then never change it again (yeah, I know... I've already touched on this topic but let's go deeper).

One of the things I don't like is when the HISTORY.txt file also becomes the source of documentation. Sometimes, reading the HISTORY/CHANGES section of a product, you can find additional information or features.
I mean: this is good, but it shouldn't be only in the HISTORY section!

When you learn this about Plone, you'll start to also read the history section of the products you like, but this is sad. Why not update the README too?

Why does this happen? I don't know. Maybe out of laziness, maybe because you think that this new little feature is not important... who knows?
Sometimes I think that the problem is also a consequence of the work of multiple developers. Maybe the Guru Plone Developer wrote the product first, then another Little Plone Developer added a tiny feature and was somehow afraid to also update the README file.

So: if you are a product developer, please update the documentation. If you like to use 3rd party products: look at the HISTORY.txt: you could find good news there.

Example: did you know that, when using LinguaPlone, you can rely on a Plone site root view (@@language-switcher) that will redirect you to the proper language folder? This can be found in the history of version 3.1a1.
Package numbering changes
Have you ever experienced this? You installed a product at version x.y.z, and it worked. Then x.y.z+1 is released. No documentation changes, a few lines in the history that simply say "we fixed this issue..."... but your upgrade was not so easy, or maybe failed.
Looking at the source, you discovered that this release is really different from the one before.

You will commonly suffer from this problem only when you need to keep compatibility with older Plone releases. Another example (still related to Products.contentmigration): what is the last version you can use on Plone 3?
As far as I know it is version 2.0.1. There is also a 2.0.2 or 2.0.3 version (and you may think: this is only a minor release, it's better to use it) but this will not work on Plone 3 anymore.

This is only a problem of how numbers are used in the release version: given x.y.z I like to:
  • keep x for a deep technological change (for example: now your product will require PostgreSQL to work)
  • keep y for a feature change or addition
  • keep z for bugfixes only
You see? It's the same logic that Plone uses!
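The x.y.z convention above can be written down as a tiny helper. This is only a sketch of the rule as I like to apply it; the function name and the three change labels are invented for this example.

```python
def bump(version, change):
    """Return the next version number for a given kind of change.

    change is one of:
      "breaking" -> deep technological change, bump x
      "feature"  -> feature change or addition, bump y
      "bugfix"   -> bugfix only, bump z
    """
    x, y, z = (int(part) for part in version.split("."))
    if change == "breaking":
        return "%d.0.0" % (x + 1)
    if change == "feature":
        return "%d.%d.0" % (x, y + 1)
    if change == "bugfix":
        return "%d.%d.%d" % (x, y, z + 1)
    raise ValueError("unknown kind of change: %r" % change)


next_bugfix = bump("2.0.1", "bugfix")
next_feature = bump("2.0.1", "feature")
next_breaking = bump("2.0.1", "breaking")
```

Following this rule, a release that stops working on an older Plone would never be a plain z bump.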

What about a version that drops support for an old version of Plone?
Probably the most user-friendly choice is to change the x number (like LinguaPlone did: from version 4 onwards you are warned that it is compatible with Plone 4 only... unluckily you need to go to the history section to read this! :-)); maybe changing only y can be acceptable... but in my opinion not z!
Please, keep it for bugfixes only!

Monday, April 23, 2012

Forms in Plone: a simple approach using collective.wtforms

If you follow the Plone-Developers mailing list, you probably already know about a recent thread called "Rewrite old cpt forms to new technology like z3cform". If not: let me simply say that it's about removing old Plone stuff, replacing it with the new way of doing forms in Plone: z3c.form.

Although this is a very interesting discussion (that can also help you understand how things work inside the community), the topic of my article comes from a single comment by Nathan Van Gheem, who introduced me to collective.wtforms.
This is a Plone integration for a framework-independent Python library that generates forms: WTForms.

WTForms in general
Using WTForms in Python seems really easy, as introduced in the Getting started section of the documentation. The concepts behind it are the same ones we already know from Zope and Plone libraries:
  • a schema definition
  • a set of field types
  • a set of widgets
As it is a Python-only framework we don't find the ZCA around us.

Using collective.wtforms
The collective.wtforms package is simple. The Plone integration seems to be simple work. It only gives you a base WTFormView class (a Zope 3 view that easily integrates your form in the Plone layout) and a WTFormControlPanelView class (if you ever need a Plone control panel form).

That's it.

Inside the view definition you must then use the basic WTForms features. Let's see an example:
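from wtforms import Form
from wtforms import TextField
from wtforms import validators
from zope.i18nmessageid import MessageFactory
from collective.wtforms.views import WTFormView

# message factory used for translatable labels
# (the i18n domain name here is an assumption)
_ = MessageFactory('example.wtforms')

class Form1(Form):
    # only the first field is required
    one = TextField("Field One", [validators.required()])
    two = TextField("Field Two")
    three = TextField("Field Three")

class Form1View(WTFormView):
    formClass = Form1
    buttons = ('Create', _(u'Cancel'))
    #label = _(u'Form 1')

    def submit(self, button):
        # called when one of the buttons above is pressed
        if button == 'Create' and self.validate():
            # do fun stuff here
            self.context.value = self.form.one.data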
Then you need a ZCML registration:
  <browser:page
      name="form1"
      for="*"
      class=".forms.Form1View"
      permission="zope2.View"
  />
You can find the example above, and others discussed later, in the package example.wtforms.
Problems
Of course, as always, making simple things is simple.
I found mainly two problems: the widget layout and internationalization.

WTForms widget layout
The template definition done in collective.wtforms is enough to display a form in the Plone way, however when displaying the "real widget" code we are using the WTForms core features. In that case we sometimes see some strange HTML (I mean: strange for Plone users).

One example: when using RadioField fields, the radio set is wrapped in a UL/LI HTML structure.

This is not a big problem, I just want to say that this is uncommon in Plone forms.
Obviously WTForms can be extended and supports custom widgets.

A bigger task: I18N
A bigger problem was internationalization. The current alpha version of collective.wtforms (1.0a3) doesn't support internationalization of the UI, however fixing this is simple (you can find my changes in a fork of the original project).

With small changes you can see a full translation of:
  • form title
  • form general description
  • submit buttons
  • field labels
  • field descriptions
The main problem is that WTForms doesn't support any internationalization.

Recently they added a new i18n module that helps users translate the internal labels (like the error message shown when you don't provide a required field). However this is not usable out of the box in Plone, because the Plone translation mechanism is not the basic Python one.
I tested it by adding an Italian translation to WTForms (it was missing in the core, so I also provided it to the authors and they quickly integrated it. Man: I really love open source!) and I saw no difference.

So what I did was integrate the native ".pot" translation file into the Plone environment and leave these translations to the Zope Page Template engine... and obviously it worked!

Then: I needed some other simple fixes (like: we can't directly render the WTForms label, we need to render it manually using TAL).

Again: see the fork for some code.

Vocabularies
Translating the vocabulary labels for select, multiselect and radio fields was not so simple. WTForms simply wants an iterable argument named choices.

What I was forced to do (better patterns are welcome) is to provide a VocabularyWrapper class where vocabulary labels are translated by directly accessing the translation machinery.
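To give an idea of the pattern, here is a minimal sketch of such a wrapper. It is not the code from the fork: the constructor signature, the fake_translate function and the sample terms are assumptions made for this example, and in Plone the translate callable would go through the real translation machinery.

```python
class VocabularyWrapper(object):
    """Iterable of (value, translated label) pairs,
    usable as the ``choices`` argument of a WTForms select field."""

    def __init__(self, terms, translate):
        self.terms = terms          # list of (value, untranslated label)
        self.translate = translate  # callable: label -> translated label

    def __iter__(self):
        # labels are translated lazily, every time the choices are read
        for value, label in self.terms:
            yield value, self.translate(label)


# hypothetical stand-in for the Plone translation machinery
ITALIAN = {"Open": "Aperto", "Closed": "Chiuso"}


def fake_translate(label):
    return ITALIAN.get(label, label)


choices = VocabularyWrapper([("o", "Open"), ("c", "Closed")], fake_translate)
translated = list(choices)
```

Because the translation happens while iterating, the same wrapper can give different labels to visitors using different languages.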

Conclusions
I'm sure that we can also find other form libraries outside Zope (Deform can be another valid choice, and also YAFOWIL), however I find WTForms really simple to use and easy to learn.

Sunday, March 18, 2012

Views hits counter for videos in Plone

Some years ago we developed a Plone site that mainly provided video and multimedia contents. One of the requirements was a views counter for videos in classic YouTube style, but in the end this feature was cut.
However the idea of getting this into Plone for collective.flowplayer stayed in my mind (another product in the set of "products that I want to develop someday and probably I will never do because I don't have time").

Recently I traveled from Ferrara to Rome and back by train, having forgotten the book I was reading at home, so... why not take some time to investigate this old feature again?

Literature
First of all: what are the known ways of performing this task?
I took some time to look around the Web; unluckily I didn't find any general rules or patterns. I don't think of this as a "visitors counter" feature: if a user visits the page where we are displaying the video player, it doesn't mean that the user will watch the movie.

Now the problem: even if I don't want a perfect system and I don't care if after 100 views the counter shows 90 or 110 (the error will be the same for all videos in the site, so the counter is still statistically useful), I'd like to create a system where:
  • an evil visitor can't spend time manually raising the counter of his video
  • an evil bot can't raise my counter to 10,000 in a few seconds!
Let's move on, step by step, to what we can do.

Monitoring the Play of my clip
The Flowplayer JavaScript APIs are well done and already integrated in collective.flowplayer, and so in Plone. The simplest thing we can do is rely on the clip's onStart event. After this event has been captured by our callback we can think about sending a call to the server, meaning "the user is watching the video".

The question: is this enough? If I provide a 5-minute clip and my lazy visitor only watches a few seconds before moving away, can I count this as a new video view? Even if my super-fast Internet connection already downloaded and buffered it all?

My answer is "no" (however this can't be taken as "the right answer"). Let's try to think about a system that counts only videos that have been watched in full!

Monitor the end of my clip
The next step is to also use the onFinish event. We can monitor whether the user starts watching the clip, but we can also check whether the clip has completed. This way we can send our message to the server only when the video has finished.

This is a step forward, but we still can't be sure that the user really watched the video. He could start watching the clip and then move the slider to a few seconds before the end.

Monitor cuepoints
The two JavaScript events above were already known to me: I used them while developing collective.flowplayer_toolbar.
What I learned looking at the documentation is that the Flowplayer APIs also support cuepoints. Cuepoints are points on the clip timeline where a callback is automatically called; we can place one every n seconds (let me use 5 as n for our example).

Now we send the final message to the server only if the onStart event has been fired and all the cuepoints have been reached too. To do this we simply use a counter that is increased at every cuepoint.
We still rely on the onFinish event, but we can use JavaScript to be sure that the message is sent only if the counter has reached a certain amount. This amount depends on the clip duration, but again: the Flowplayer APIs contain a method to obtain the video duration.
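
A minimal sketch of the counter logic described above, written without any Flowplayer call so it can be read on its own. The name createViewTracker and its parameters are my own invention for this example (they are not part of Flowplayer or of collective.flowplayerclipviews), and I'm assuming cuepoints are placed every stepSeconds, strictly before the end of the clip.

```javascript
// Hypothetical helper: the three methods are meant to be called from the
// player's onStart, cuepoint and onFinish callbacks.
function createViewTracker(durationSeconds, stepSeconds, notifyServer) {
    // Cuepoints expected during a full playback (assumption: one every
    // stepSeconds, none exactly at the end of the clip)
    var expected = Math.ceil(durationSeconds / stepSeconds) - 1;
    var started = false;
    var reached = 0;

    return {
        start: function() {
            started = true;
            reached = 0;
        },
        cuepoint: function() {
            if (started) {
                reached++;
            }
        },
        finish: function() {
            // Notify the server only if playback started and every
            // expected cuepoint has been reached
            var complete = started && reached >= expected;
            started = false;
            if (complete) {
                notifyServer();
            }
            return complete;
        }
    };
}
```

With a 20 seconds clip and n equal to 5 the tracker expects 3 cuepoints: a visitor that moves the slider near the end reaches fewer of them, so finish() returns false and nothing is sent.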

Starting to think about the Evil Guy
In a perfect world, where everyone is a Good Guy and no one ever tries to break the rules, the general description of the code given above would be enough. But we have the Evil Guy.
Who is the Evil Guy? He is a visitor with low technical skills who doesn't know how to write code or hack JavaScript, but simply tries to raise the views counter of a clip. He is a cheater.
How can he do it? First of all he can spend a lot of time clicking the "Play" button again and again, every time the clip reaches the end (the Evil Guy always has a lot of free time).

To protect our site from this type of attack we simply need to record that this clip has already been viewed by that user. When the video starts we send a call to the server and get back a randomly generated token, which we keep secret in the JavaScript environment and on the server itself.
We will not use a cookie for this, because the Evil Guy can quickly learn how to manipulate cookies, so we choose to keep this information in the Zope session (this can lead to problems with multiple Zope instances; in that case we probably need some more complex shared RAM cache). We also store the video path, because we want users to still be able to watch the other clips of the site (and raise their counters normally).

When the video reaches the end we still send our message to the server, but this time we also send the secret token. Only if the token matches do we raise the counter.
Another click on the Play button will call the server again, but this time we can see that there is already a token for that clip stored in the session. This means that the visitor has already seen that clip, so we immediately stop any further operation.

Even if the Evil Guy reloads the browser page, he can't do anything else until the session expires.
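
The token exchange can be sketched as follows. This is only an illustration of the logic, written in JavaScript to stay in the language of this blog's examples: the real code runs on the server, and the plain object called session just stands in for the Zope session of one visitor. The names startView and finishView are hypothetical, not a real Zope or Plone API.

```javascript
// Called when the clip starts: returns a new token, or null if this
// visitor already has a token for that clip in the session.
function startView(session, clipPath, makeToken) {
    if (Object.prototype.hasOwnProperty.call(session, clipPath)) {
        return null;
    }
    session[clipPath] = {token: makeToken(), counted: false};
    return session[clipPath].token;
}

// Called when the clip ends: raises the counter only if the token
// matches and this view has not been counted yet.
function finishView(session, clipPath, token, counters) {
    var known = Object.prototype.hasOwnProperty.call(session, clipPath);
    var entry = known ? session[clipPath] : null;
    if (!entry || entry.counted || entry.token !== token) {
        return false;
    }
    entry.counted = true;
    counters[clipPath] = (counters[clipPath] || 0) + 1;
    return true;
}
```

The makeToken argument is whatever generates the random token; a real implementation should probably use a strong random source on the server. The token is stored per clip path, so the same visitor can still raise the counters of other clips.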

The collective.flowplayerclipviews product
The trip from Ferrara to Rome is long, but not long enough! All I've described so far is more or less what you will find in the first version of collective.flowplayerclipviews. As you can imagine, I'm still really far from the target!

Evil Guy gets smarter
The Evil Guy can also start learning to program, and so learn that JavaScript is a client-side language where he can cheat, skipping the boring wait before he can press the Play button again.
First of all: destroying the browser session is as simple as closing the browser or erasing cookies.
After that he can simply get the secret token from the server and call the server again a few milliseconds later, pretending to have watched the clip!

So now we can start to think about the video duration on the server side. As we said, Flowplayer can give you the video duration inside the JavaScript environment, but what about the server? There collective.flowplayer relies on the hachoir suite: on every multimedia content Plone stores some annotations with video information (mainly: width and height).
Unluckily for us, right now collective.flowplayer does not store the video duration (hachoir supports it, but Plone doesn't need it), so for now let's simply assume that collective.flowplayer will do this in the future... and that we are using this future version in the rest of this article.

Having the video duration on the server side can be used to stop the Evil Guy from raising the clip counters very quickly. Even if he writes a program that takes the token (simulating the Play button) and immediately calls the server (simulating the clip end), we can check that the second request, the one that means "video terminated", arrives a certain amount of time after the "video started" one: this amount of time is the video duration!
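
The time check itself is tiny. Here is a sketch, again in JavaScript only for illustration (on the server it would be Python); the function name and the toleranceSeconds parameter are assumptions of mine, the latter to absorb network delays and rounding of the stored duration.

```javascript
// Returns true only if the "video terminated" request arrives at least
// one video duration (minus a small tolerance) after "video started".
function isPlausibleFinish(startedAtMs, finishedAtMs, durationSeconds, toleranceSeconds) {
    var elapsedSeconds = (finishedAtMs - startedAtMs) / 1000;
    return elapsedSeconds >= durationSeconds - toleranceSeconds;
}
```

For a 5 minutes clip, a finish message that arrives one second after the start is refused, while one that arrives after 298 seconds with a tolerance of 2 seconds is accepted.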

This will be a short-lived victory. The Evil Guy can then start running hundreds of cheating Play operations at the same time, wait for the video duration, then send the finish messages. In this way he must wait for the video duration, but he can still raise the counter by hundreds or thousands.

Can this be avoided? The only way is to remember the address of the Evil Guy and keep it in memory for some time. However the ZODB is not the place for this kind of temporary data: we can think about storing this information outside it, again in a RAM cache, or in an external database, because we need to keep write operations on the ZODB to a minimum.
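
A rough sketch of such a temporary address memory, kept in RAM and never written to the ZODB. Everything here is hypothetical: the name createAddressMemory, the choice of remembering the address per clip, and the ttlMs expiration time are my assumptions, and JavaScript is used only to illustrate the idea.

```javascript
// Remembers for ttlMs milliseconds that an address already started
// a view of a clip; nowMs is passed in to keep the sketch testable.
function createAddressMemory(ttlMs) {
    var seen = Object.create(null);   // "address|clipPath" -> timestamp

    return {
        allow: function(address, clipPath, nowMs) {
            var key = address + '|' + clipPath;
            var last = seen[key];
            if (last !== undefined && nowMs - last < ttlMs) {
                // Same address, same clip, too soon: refuse
                return false;
            }
            seen[key] = nowMs;
            return true;
        }
    };
}
```

With this in place, hundreds of parallel Play operations coming from the same address would be reduced to a single accepted one per clip until the entry expires. Visitors that share an address (for example behind the same proxy) would be counted once too, which is part of the error we accepted at the beginning.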

Conclusions
After all those changes we can say that we have a good system... but we need to keep in mind that we can't be sure that the visitor really watches our video! He can press the Play button, then go and take a shower! The world is not a perfect place.

collective.flowplayerclipviews is right now only a proof of concept, not to be used in production. If you look at the source, you can see that the _getClipDuration method is not implemented. Once that is done all the Plone features will be there (we still need the structure for storing external IP addresses).

During the time I spent writing this article (it commonly takes me several sessions), I searched again for general documentation on this subject. This time I added "plone" to the set of keywords... and I discovered that we already have a Plone product that does this task (my fault: I hadn't googled well)!
I'm talking about collective.piwik.flowplayer! This product is part of the Plumi suite. I had already checked Plumi before starting this article, but I missed that feature. Also note that this product is now deprecated in favor of collective.piwik.mediaelement.

Like many other Plumi internal submodules, those products are usable outside a whole Plumi site. Neither module implements the counter feature in Plone; they smartly rely on external software: Piwik.

Piwik is analytics software, we can say a competitor of Google Analytics, based on a JavaScript snippet that you must put in the page.
Apart from all the other analytics features, looking at the source it seems that it simply tracks presses of the Play button... (let me say that I tend to complicate my own life, and this is probably the right way). If this is enough for you, I strongly suggest relying on this service.

Conclusion (this time, really)
I hope I have shown you that having this feature in Plone is possible.

Sunday, February 19, 2012

Honey I Blew Up My Plone Product!

This article talks about Plone products, but I think this is a very general subject that affects every programming language: the lack of modularity in products.
Note that:
  • I'm not talking about the Plone core, which is obviously going in the right direction (the plone.* and plone.app.* universe is one of the clearest examples), but only about add-on products.
  • I'm also not talking about bad code (no "Spaghetti code"). Maybe your final result is great and will change the world! I don't care.
Your first time
If you are not a genius, the first Plone product you made was probably crappy. Features aside, its structure was probably something like a single product called "Products.MyProjectXy".
What this means is: all the Plone features you need in a single product (a real monolith!).
Maybe you were already an experienced programmer (so your product was well designed... inside) but the final result was not the best.

After some time, when you started looking at other products' code (or if you bought Martin's book), you probably quickly learned that splitting the product is good: the Separation of concerns is really important.

What level of modularity do you need to reach? Into how many different eggs/packages do you need to split your project? I have no real answer to that... do we have a Third Normal Form for software design? I don't remember too much of my software engineering lessons... :-)

Thinking modular
Let me say: splitting a Plone project into more than one egg is "boring". Sometimes you have the sinister feeling that you are losing your time (and I also remember times when ZopeSkel wasn't there to help us) and you feel the temptation to ignore this practice. Maybe a Demon of Bad Software is talking to your soul in that moment.

An example: you are in a very productive state, lines and lines of wonderful code and new features are flourishing under your fingers (Daniel Pink calls this the "Flow State")... then you develop something that you know would be better moved out of the module you are working on, because you know that this choice, one day, will pay you back. But to do that, you need to stop your "Python Frenzy" for some minutes.
I don't know about you, but this sometimes hurts me. However I know this is good and must be done, so (bleeding) I move the code away. Not later! Immediately! Why? Because the path to "Products.MyProjectXy" is easy... and when you are there, it's too late!

Every time a customer asks you for a new feature:
  • think if this feature can be made more "general purpose"...
  • ... or if another customer could like it ...
  • ... or if someone else already asked for something similar.
If at least one of those answers is "yes", you should probably develop something "more general".

Too famous Plone features (sometimes more than the product itself)
Some concrete examples.

Let me say that sometimes, when you create a Plone product, you don't know how useful to the community it can become in the future. Worse, sometimes only a feature inside your product can be useful to others...

... it is not always easy to "think about the future": the list that follows talks about great Plone products that I have used many times and that saved me time. So they were probably made by great developers... however inside them I can still sometimes find the problem I described above.

The Kupu link administration
We have all moved away from the WYSIWYG editor Kupu, preferring the famous TinyMCE.
However Kupu still has a useful feature inside it (I still remember the first time I saw it at a Plone Conference... woah!): an administrative tool that can transform the links inside your documents from "normal" ones to the "resolve UID" form, or back to the normal form.

Unluckily this feature is shipped with the Kupu product (and since it uses AJAX stuff, it needs Kupu to be really installed to work properly).

The Maps location field and widget
The Maps product gives us some simple but direct features for integration with Google Maps. It also contains a new type of field and widget: the LocationField (which holds a pair of coordinate values) and the LocationWidget (which helps filling the LocationField using the Google Maps interface).

This new field/widget pair can be very useful, but the feature lives inside the Maps project and is not provided as an external dependency. Also, you are forced to install the product in your Plone site if you need it (having it only in your buildout is not enough).

Flowplayer metadata extractions
The well-known collective.flowplayer gives simple video features to Plone. Besides the video player integration, it contains a small piece of code that extracts video metadata from uploaded files using the third-party library hachoir.

We have two problems here: if you need the metadata extraction feature you need to add Flowplayer to your buildout (a minor issue: this time you don't need to install it), but you are also forced to use hachoir, which is not a perfect library (first of all: it breaks your pdb!).

Other examples?
I'm sure there are other good products that provide additional features you would like to have outside the product itself.

Conclusion
Keep your products compact but always think about software reuse. The experience gained with your own customers and users can help you to know when a piece of technology needs to be moved away and maintained on its own. Also: small products are simpler to maintain.

So: don't listen to the Demon that says "Enlarge your Egg".

Friday, January 6, 2012

Tabindex: handle focus and JavaScript events on every elements

My latest reading (still in progress) is a book about JavaScript: "JavaScript Cookbook" (Shelley Powers - O'Reilly). It's full of very interesting chapters (it will probably feed many future articles here).

Today I will talk about a JavaScript behavior I learned from one of the examples in the book, related to JavaScript events inside Web pages.

Types of usable event
Using pure JavaScript, not every event can be used on every page element. For example: one of the first things I liked when I started using jQuery years ago was the possibility of using the onclick event on whatever DOM element, even in Internet Explorer (which commonly doesn't allow this event to be fired for all types of element).

The example I found in the book talks about another event: onkeypress.
What kind of elements can react to this kind of event? Commonly only elements able to take the focus (like links, but more commonly form elements like textarea or input).
The example shows how, using another uncommon HTML attribute, we can change this behavior and enable all other page elements. I'm talking about the tabindex attribute.

How to use tabindex
As I'm mainly a Plone programmer, when tabindex was removed from all the templates of the CMS (I don't remember exactly, but I think it was when Plone 3.0 was released) I was happy. Why? Because tabindex is commonly not needed in a well-designed Web page.

The tabindex attribute defines the order in which a user navigates the page using the keyboard (the TAB/SHIFT+TAB keys), but if you order your DOM elements in a logical manner you will simply not need this attribute. However note that it is not deprecated:
  • Some uncommon pages may need this attribute to force the navigation to go first to a page section that isn't at the top (to be honest, I can't find a simple example of this)
  • A tabindex value of "-1" removes an element that is commonly navigable with the keyboard from the TAB sequence.
See this good tabindex reference for more details.

Now the great tabindex magic I learned: when tabindex is used on a DOM element (with a value of 0 or more), it magically gains the power to obtain focus.

Let's see some examples.
Please note:
  • I tested the examples with Firefox, Opera, Chrome and Safari, on MacOS. With older Internet Explorer versions you probably need some fixes in the event handling.
  • If you are on MacOS, use Chrome, because keyboard navigation with TAB is a mess in the other browsers. See below for more on this.
The first example is a simple HTML page with 2 links and one paragraph, where onclick and onkeypress on those elements simply raise an alert message (the text inside the node).

<html>
<head>
<script type="text/javascript">
<!--
window.onload=function() {

    var handleAction = function(event) {
        alert(this.innerHTML);
    }

    var p = document.getElementById('foo');
    p.onclick = p.onkeypress = handleAction;

    var links = document.getElementsByTagName('a');
    for (var i=0;i<links.length;i++) {
        links[i].onclick = links[i].onkeypress = handleAction;
    }

}
//-->
</script>
</head>

<body>
    <a href="javascript:;">Link 1</a>
    <p id="foo">
        Test here!
    </p>
    <a href="javascript:;">Link 2</a>
</body>
</html>
First of all the mouse event: you will be able to click on both links and on the paragraph (I'm not sure about this in all IE versions, as I'm not using jQuery here).

Now let's try the keyboard navigation. Using TAB you will be able to move only onto the link elements. When one of them gets the focus, pressing a key gives you the same message you get for a mouse click.

If you look at the code you'll see that I'm trying to register the onkeypress event on the HTML paragraph too, but it is ignored.
This is right: a world where every DOM element can get the focus would make keyboard navigation painful. But: what can I do if I really need a single DOM element (commonly not focusable) to take focus and keyboard events?

This is where tabindex helps. If you look at the second example you will see that the only difference is the use of tabindex on the P node.
<html>
<head>
<script type="text/javascript">
<!--
window.onload=function() {

    var handleAction = function(event) {
        alert(this.innerHTML);
    }

    var p = document.getElementById('foo');
    p.onclick = p.onkeypress = handleAction;

    var links = document.getElementsByTagName('a');
    for (var i=0;i<links.length;i++) {
        links[i].onclick = links[i].onkeypress = handleAction;
    }

}
//-->
</script>
</head>

<body>
    <a href="javascript:;">Link 1</a>
    <p id="foo" tabindex="0">
        Test here!
    </p>
    <a href="javascript:;">Link 2</a>
</body>
</html>
However, the difference in behavior is big.
You can try the page again with keyboard navigation in all the browsers used before (again: I suggest using Chrome if you are on MacOS).

Now you are able to give the focus to the paragraph! Also, the keyboard event is now raised on the paragraph. You probably get the point...
The tabindex attribute can make whatever element of the page focusable, and you are then able to use keyboard events on it.
Isn't this great? I think it could be helpful sometimes.
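
The same trick can also be applied from JavaScript, using the tabIndex DOM property instead of writing the attribute in the HTML. A minimal sketch: the name makeKeyboardAccessible is just something I made up for this example.

```javascript
// Makes an element reachable with TAB and attaches the same handler
// to mouse clicks and key presses.
function makeKeyboardAccessible(element, handler) {
    // tabIndex is the DOM property behind the tabindex attribute;
    // 0 adds the element to the keyboard navigation
    element.tabIndex = 0;
    element.onclick = element.onkeypress = handler;
    return element;
}
```

In the example page it could be called inside window.onload as makeKeyboardAccessible(document.getElementById('foo'), handleAction), leaving the P node without a tabindex in the markup.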

Browser differences (AKA: MacOS hates TAB)
Let's change the article focus.
Testing examples one and two with all the browsers listed above showed some different behaviors.
The tabindex definition says that elements with a positive tabindex are the first to take the focus (from the lowest value to the highest), keeping the DOM order when values are equal; only then are elements with a tabindex of 0 and all the other focusable elements traversed, in top-down DOM order.

For a reason I never investigated (but workarounds can be found on the Web), on MacOS only form elements seem able to take focus (links don't). Even with Firefox, which on Linux and Windows works like a charm, I can't put the focus on links.
The only tested browser on MacOS that works as expected is Chrome. At first its behavior with the second example surprised me, because the DOM order is kept and the paragraph is not the first to get the focus, but this is what the definition asks for: a tabindex of 0 leaves the element in its DOM position.
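
To make the ordering rule of the tabindex definition concrete, here is a small sketch that computes the TAB sequence of a page. It follows my reading of the HTML definition (positive values first, then tabindex 0 and naturally focusable elements in DOM order, negative values skipped) and works on plain objects, not on a real DOM: the function name tabOrder and the name/tabindex/focusable properties are invented for this example.

```javascript
// elements: array of {name, tabindex, focusable} objects in DOM order;
// tabindex is a number, or missing when the attribute is not used.
function tabOrder(elements) {
    var positive = [];
    var natural = [];
    for (var i = 0; i < elements.length; i++) {
        var el = elements[i];
        var hasTabindex = typeof el.tabindex === 'number';
        if (hasTabindex && el.tabindex > 0) {
            positive.push({el: el, index: i});
        } else if (hasTabindex ? el.tabindex === 0 : el.focusable) {
            natural.push(el);
        }
        // a negative tabindex keeps the element out of the sequence
    }
    // Lowest positive value first; DOM order when values are equal
    positive.sort(function(a, b) {
        return (a.el.tabindex - b.el.tabindex) || (a.index - b.index);
    });
    var names = [];
    for (var j = 0; j < positive.length; j++) {
        names.push(positive[j].el.name);
    }
    for (var k = 0; k < natural.length; k++) {
        names.push(natural[k].name);
    }
    return names;
}
```

Feeding it the second example (a link, a paragraph with a tabindex of 0, another link) gives the plain DOM order: link, paragraph, link.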

But example 2 also works on the other browsers on MacOS: you are able to put the focus onto the paragraph using the TAB key.
So you may think "I can put a tabindex on every DOM element I want to be accessible with the keyboard, to fix this MacOS problem".

If you test example 3 (which puts "tabindex=0" on the A elements too) you'll see that this is false. Links are still not accessible using the keyboard. Strange...