Posted by: felipe | March 3, 2009

Thinking Ubiquity in Portuguese

Making Ubiquity work in different languages is a harder problem than simply translating its strings and making small adjustments so that the translation feels correct in the target language. Since Ubiquity provides a natural language interface between the user and the computer, the way the user interacts with the commands should feel natural in their language, conforming (although not strictly) to the language’s grammar, and especially to how the user thinks and expects to give commands in that language. Here are some thoughts on how the Ubiquity NL interface could work in Portuguese.

In Portuguese, verbs inflect a lot. A verb takes a slightly different form for each combination of tense, mood, person, and number. However, except for the so-called irregular verbs (irregularity is common in linking verbs, which are not very useful for Ubiquity), every inflection keeps the same root (or stem) as the infinitive form. This means that we can let the user write in any form, detect the verb root from it, and match that root against the database of commands.
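To make the root-matching idea concrete, here is a minimal sketch in Python. It is not how Ubiquity's parser works: the suffix list is a small, incomplete sample of regular verb endings, and the command table (ROOT_TO_COMMAND) with its verbs is hypothetical, chosen only for illustration.

```python
# Toy suffix stripper for regular Portuguese verbs (an assumption, not a real stemmer).
# Longer endings are tried first so "amos" wins over "os"-like shorter matches.
VERB_SUFFIXES = sorted(
    ("amos", "emos", "imos", "ando", "endo", "indo",
     "ar", "er", "ir", "as", "es", "am", "em",
     "a", "e", "o"),
    key=len,
    reverse=True,
)

# Hypothetical command database, keyed by verb root.
ROOT_TO_COMMAND = {
    "procur": "search",     # procurar
    "traduz": "translate",  # traduzir
    "envi": "email",        # enviar
}


def verb_root(word):
    """Strip one known verb ending, keeping at least three letters of root."""
    word = word.lower()
    for suffix in VERB_SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def match_command(word):
    """Return the command for any inflection of a known verb, or None."""
    return ROOT_TO_COMMAND.get(verb_root(word))
```

With this, the infinitive "procurar", the imperative "procure" and the gerund "procurando" all resolve to the same hypothetical search command. A real implementation would want a proper Portuguese stemmer and a list of irregular verbs.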

This would work well for most of the verb commands from Jono’s list. Most of those verbs would be written in the imperative form, as a direct interpretation of giving a command to the browser. Some users would prefer to write them in the infinitive form, because that is common in Portuguese, but the parser wouldn’t have a problem with that. So far so good.

However, that does not work for all verb commands. For example, the command map is troublesome. The verb map (pt: mapear) is somewhat rare in Portuguese; its noun form (pt: mapa) is much more common. We would think of the command “map san francisco” much more as:

pt: mapa de são francisco
en: map  of san francisco

rather than

pt: mapeie são francisco [para mim]
en: map    san francisco [for me]

Still, we can take advantage of that. Again, we can use the shared root in our favor. Not only do the inflections of the verb map keep the same root, but since it comes from the same origin, the noun map (pt: mapa) does too. So, when we detect a noun instead of a verb, we can interpret the command “pt: mapa de são francisco” (en: map of san francisco) as:

pt: me dê um mapa de são francisco
en: gimme a  map  of san francisco

or other variations like “show me”, “find me”, “search for me”. Note that this fits well in the concept of the overlord verbs.
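A rough sketch of that noun-to-overlord-verb expansion could look like the following. The noun table, the default overlord verb and the function name are all assumptions made up for this example; nothing here comes from Ubiquity's code.

```python
# Hypothetical table of command nouns, each with the indefinite article it takes.
NOUN_COMMANDS = {
    "mapa": "um",  # "um mapa" = "a map"
}

# Assumed default overlord verb: "me dê" = "gimme".
DEFAULT_OVERLORD = "me dê"


def expand_noun_command(text):
    """If the input starts with a known command noun, prepend the overlord verb.

    Inputs that do not start with a known noun are returned unchanged.
    """
    words = text.split()
    if not words:
        return text
    article = NOUN_COMMANDS.get(words[0].lower())
    if article is None:
        return text
    return f"{DEFAULT_OVERLORD} {article} {' '.join(words)}"
```

Here "mapa de são francisco" becomes "me dê um mapa de são francisco", while a verb-form input such as "mapeie são francisco" passes through untouched. Swapping DEFAULT_OVERLORD for "show me" or "find me" equivalents would give the other variations.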

Now, if we discard the preposition “de” (of) during parsing (which is a sensible thing to do: prepositions are generally considered stop words), we reach a generic point where we can interpret both the verb form and the noun form in the same way (using the shortcuts for parsing the overlord verbs):

pt: mapeie são francisco  (verb form) or
pt: mapa de são francisco (noun form)

After stemming, the command falls into the root map_ in both cases, and can be interpreted either as “gimme a map” (overlord verb + noun) or as “map” (just the verb). I would favor the verb + noun reading, since it sounds much more natural in Portuguese.
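Putting the two steps together (drop the preposition, then stem the first word), a small sketch shows both forms landing on the same command. The suffix list is tuned by hand to the mapa/mapear example and the root table is hypothetical. As an extra assumption of mine, prepositions are only dropped right after the command word, so that place names such as "rio de janeiro" survive intact.

```python
# Prepositions treated as stop words (a small illustrative sample).
STOP_WORDS = {"de", "do", "da", "dos", "das"}

# Endings hand-picked so that mapear, mapeie and mapa all reduce to "map".
SUFFIXES = sorted(("ear", "eie", "eia", "eio", "ar", "a", "e", "o"),
                  key=len, reverse=True)

# Hypothetical table from root to command name.
ROOT_TO_COMMAND = {"map": "map"}


def stem(word):
    """Strip one known ending, keeping at least three letters of root."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def parse(text):
    """Return (command, argument) for a verb-form or noun-form input, else None."""
    words = text.lower().split()
    if not words:
        return None
    head, rest = words[0], words[1:]
    # Drop prepositions sitting between the command word and its argument.
    while rest and rest[0] in STOP_WORDS:
        rest = rest[1:]
    command = ROOT_TO_COMMAND.get(stem(head))
    if command is None:
        return None
    return command, " ".join(rest)
```

Both parse("mapeie são francisco") and parse("mapa de são francisco") give the same pair, the map command with the argument "são francisco". Whether that pair is then read as “gimme a map” or as plain “map” is a presentation choice on top of the shared parse.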

So here we have a possible solution to tackle both verb commands and noun commands. Hyphenated-phrase commands can receive the same treatment as verbs (if they are kept; I believe they will be gone in favor of the overlord verbs, right?).

Transitive verbs in Portuguese follow the same order as in English (i.e. verb + direct object + possible indirect objects) and are similar in nature, so the parsing order would need no different treatment. The most important thing that changes the order is adjectives (in Portuguese they come after the noun), but adjectives are usually not part of the command itself, only of the content (usually in an arb_text), so they wouldn’t need treatment at the Ubiquity parser level.

As for commands that are names of websites or services, I’m still giving that some thought and will post a follow-up soon.


Responses

  1. […] a Ubiquity user, put together a wonderful look at what Ubiquity might look like in Portuguese. He has some great points here particularly regarding the “map” verb used in […]

  2. […] I read this post about “Thinking Ubiquity in Portoguese” and Mitcho’s blog, I started asking to […]

  3. […] Thinking Ubiquity in Portugese, by Felipe […]

  4. I am against translations of any Firefox add-on, but especially against a translation of Ubiquity.

    See, the point is:
    Translating extensions bloats them!
    Particularly in the case of a program like Ubiquity, this will bloat so much it will blow up.
    And the more bloated extensions you have, the slower your Firefox gets.

    I really don’t wanna be an asshole, but in this case, I really have to say that people who are too stupid to remember 23 English imperative forms should stay away from a computer anyway.

  5. I don’t feel that translating Ubiquity to Portuguese or any other language is of such great value, but I certainly don’t agree with A+’s arguments. I can’t see why the plugin would be bloated. I can’t imagine the idea would be for a single plugin object to contain all the languages; there would certainly be different versions of the plugin. That said, it seems to me that in the case of Ubiquity the effort to convert it to another language would be too great. This is because Ubiquity is very particular in that the way it works depends on natural language. Plus, translating Ubiquity would actually mean translating all the commands. Since Ubiquity allows anyone to create and share their own commands in a simple way, I doubt that the effort to get the commands in different languages would be very successful.

  6. você me entende? (en: do you understand me?)

