Signature

Working with WebKit as a UI compositing engine

UILayer

UILayer provides a JavaScript API on top of WebKit for working with the concept of layers. Instead of manipulating DOM elements using a myriad of mixed concepts, you go through a single, well-defined API.

If you’re running a WebKit browser (Chrome, Safari, iPhone, iPad, etc), have a go with one of these:

Read more at rsms.me/uilayer →

—Oct 16, 2011

February 24, 1955 – October 5, 2011

Here’s to the crazy ones

The misfits. The rebels. The troublemakers. The round pegs in the square holes.

The ones who see things differently. They’re not fond of rules. And they have no respect for the status quo. You can quote them, disagree with them, glorify or vilify them.

About the only thing you can’t do is ignore them. Because they change things. They invent. They imagine. They heal. They explore. They create. They inspire. They push the human race forward.

Maybe they have to be crazy.

How else can you stare at an empty canvas and see a work of art? Or sit in silence and hear a song that’s never been written? Or gaze at a red planet and see a laboratory on wheels?

We make tools for these kinds of people.

While some see them as the crazy ones, we see genius. Because the people who are crazy enough to think they can change the world, are the ones who do.

Steve, you’ve been a great inspiration and changed so many lives for the better. Your legacy will live on.

—Oct 05, 2011

Designing a modern web-based application — Dropular.net

One and a half years ago, Andreas and I released a new version of dropular.net, a new kind of web app that runs completely in the browser. Today, this approach to designing web-based applications running client-side has become popular, so I thought I’d share some of the issues, approaches and design choices made during the development of Dropular.

I designed Dropular just as I would design a desktop application: the UI and related logic run on the host computer (client). The host knows how to present a GUI and the host knows about user input, the end-user’s environment state and so on, making UI code running on the client side the natural choice. Then again, there’s always data. Dropular.net communicates with one or more backend access points to read and write data, verify authentication and so on.

Basically, we serve only data from the access point and run almost all code in the client web browser:

Figure 1

When a client visits dropular.net, three files are sent as the response: index.html, index.css and index.js — view, layout and logic, respectively.

If you view the source of any of these files you might notice that the code looks suspiciously computer generated. That’s because it is computer generated. As part of Dropular, I wrote a web-app client-server kit dubbed oui which, given a source tree, compiles and produces a runnable index.html file (together with an index.js and index.css file).

Oui provides a CommonJS module interface and groups together LESS/CSS, JS and HTML into logical modules which are name-spaced.

Demo and source code

First, let’s have a look at the actual product and experience (since it’s invite-only). This is a screen cast of me using the current, live website:

Now, here’s a redacted snapshot of the Dropular source code: https://github.com/rsms/dropular-2010. Note that this code depends on oui and a few other open source projects.

Authentication

In a model where the logic lives in the client, security is a different nut to crack. You need to deal with automatic re-authentication, network reconnection, server fail-over, etc.

Authentication is performed in a two-step process, allowing an intermediate representation to be cached in the client, enabling automatic fail-over to other access points and automatic login when later visiting the site.

Figure 2

It goes a little something like this:

[Step 1] Client sends a request for challenge:

← GET /session/sign-in?username=John

[Step 2] The server:

  1. Verifies and fetches information about the user related to username
  2. Generates a UUID that it uses as a session ID
  3. Creates a temporary session object in memory, associated with that session ID
  4. Generates a nonce using: BASE16( SHA1_HMAC( server_secret, timestamp ":" random_data ) )
  5. Puts the nonce in the user’s session and registers a hook to clear the nonce upon the next request containing the associated session ID.
  6. Responds to the client with the nonce and the user’s canonical username:
→ 200 OK {"nonce":<nonce>, "sid":<session_id>, "username":"john"}

[Step 3] The client:

  1. Stores the session_id locally, to be used for future requests
  2. Displays a user interface where the user inputs her username and password
  3. Calculates the passhash: BASE16( SHA1( username ":" password ) )
  4. Calculates the auth_response: BASE16( SHA1_HMAC( nonce, passhash ) )
  5. Sends another request, this time with a payload, to the server:
← POST /session/sign-in {"sid":<session_id>, "username":"john", "auth_response":<auth_response>}

[Step 4] The server:

  1. Calculates an auth_token: timestamp ":" BASE62( SHA1_HMAC( passhash, server_secret ":" timestamp ) )
  2. Saves the auth_token in the user’s session object
  3. Verifies the auth_response sent by the client: BASE16( SHA1_HMAC( nonce, passhash ) ) == auth_response
  4. Deletes the nonce from the user’s session
  5. Responds with the auth_token and a complete description of the user (name, email… things like that):
→ 200 OK {"auth_token":"xyz", "sid":"xyz", "user":{<user details>}}

The client stores the auth_token locally, to be used for future automatic re-authentication.
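The hashing in steps 2 to 4 can be sketched in JavaScript on Node.js with the built-in crypto module. This is only a minimal sketch, not Dropular's actual code: the function names are made up for illustration, and it assumes that the first argument of SHA1_HMAC is the key and the second is the message.

```javascript
'use strict';
const crypto = require('crypto');

// BASE16( SHA1( data ) )
function sha1Hex(data) {
  return crypto.createHash('sha1').update(data).digest('hex');
}

// BASE16( SHA1_HMAC( key, message ) ); the argument order is an assumption
function hmacSha1Hex(key, message) {
  return crypto.createHmac('sha1', key).update(message).digest('hex');
}

// [Step 2] server: nonce from the server secret, a timestamp and random data
function makeNonce(serverSecret) {
  const randomData = crypto.randomBytes(16).toString('hex');
  return hmacSha1Hex(serverSecret, Date.now() + ':' + randomData);
}

// [Step 3] client: passhash and auth_response
function makePasshash(username, password) {
  return sha1Hex(username + ':' + password);
}

function makeAuthResponse(nonce, passhash) {
  return hmacSha1Hex(nonce, passhash);
}

// [Step 4] server: recompute the expected response and compare
function verifyAuthResponse(nonce, storedPasshash, authResponse) {
  const expected = Buffer.from(makeAuthResponse(nonce, storedPasshash));
  const actual = Buffer.from(String(authResponse));
  return expected.length === actual.length &&
    crypto.timingSafeEqual(expected, actual);
}
```

Note that only the passhash and the nonce go into the auth_response, never the plain password, which is what lets a client cache the intermediate representation and repeat the exchange without asking the user again.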

Later, when the client sends any request to the server and includes its session_id, the server will evaluate the following logic:

if session = Session.get(session_id):
  if session.auth_token:
    allow_request()
  else:
    perform [Step 4]
else:
  session = Session.new()
  → 401 {"sid":session.id}
  perform [Step 3], starting at point 4, and finally re-send original request (client)

Since [Step 3], from point 4 onward, does not require user input, re-authentication (and thus backend fail-over) can be done completely in the background. The user experience will be that the original request (e.g. a click on a button to show some content) takes a little longer than it usually would.
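The server-side session check can be written out as a small JavaScript function. This is a sketch under stated assumptions: the in-memory Map, the counter-based session IDs and the action names are all hypothetical stand-ins, not the real access point code.

```javascript
'use strict';

// In-memory session store; a stand-in for whatever the access point really uses.
const sessions = new Map();
let nextId = 0; // stand-in for the UUID generation described in [Step 2]

function newSession() {
  const session = { id: 'sid-' + (++nextId), authToken: null };
  sessions.set(session.id, session);
  return session;
}

// Decide what to do with an incoming request that carries `sessionId`.
function checkSession(sessionId) {
  const session = sessions.get(sessionId);
  if (!session) {
    // Unknown session (for instance after switching access point):
    // answer 401 with a fresh session ID so the client can re-authenticate.
    return { action: 'challenge', status: 401, body: { sid: newSession().id } };
  }
  if (session.authToken) {
    return { action: 'allow' };
  }
  // Known session without a token: continue with [Step 4]
  return { action: 'verify-auth-response' };
}
```

On a `challenge` result the client would redo [Step 3] from point 4 and then re-send the original request, which is why the user only notices a slightly slower response.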

User interface

Since the user interface is created, rendered and maintained solely by the client (i.e. web browser), there needs to be some structure. The HTML DOM is actually a great view representation and in combination with CSS gives you control of each pixel on the screen, and at the same time provides a nice separation between view structure and layout. However, these kinds of websites tend to have many different “screens”, or view states if you will, quickly making your regular HTML and CSS code a freaking mess.

What we did was to adopt the mindset of writing a desktop application: we define logical components in folders and files that reflect the structure of these components. We then process, or compile, these sources into machine-and-network optimized code (HTML, CSS & JavaScript), just like you do with “regular” software development.

A nice side-effect of having an intermediate “compile” step is that you can write your source code in whatever language suits you and your project; you’re no longer limited to the languages and coding styles dictated by web browsers. For instance, you can define your layout code in LESS instead of CSS and write your logic in Move instead of JavaScript.

Downsides to a “compiling” approach

This approach obviously has some downsides, one of them relatively painful: debugging. Since the code you’re running does not directly reflect the source files you have in your structure of logical modules, finding and fixing a problem becomes harder as you need to back-track and search your source for certain things. In the end, we took a pragmatic approach to this and simply generated human-readable code that’s annotated with the path names of the source.

Templates in the DOM

For views that aren’t permanent (most views aren’t) we are using HTML templates, kept inside the DOM as data-only node trees:

<module id="drops-drop">
  <drop>
    <h1></h1>
    <img>
      ...
  </drop>
</module>

With CSS making any module node tree a data node tree, exempt from layout and display:

module { display:none; }

Some module logic (e.g. JavaScript code) then clones the appropriate parts of its template, which is made available to a module through the __html convenience variable:

// Register a hook for a certain URL
oui.anchor.on(/^drops\/(<id>[a-zA-Z0-9]{25,30})/, function(params, path, prevPath) {
  // Load data for drop with <id>
  oui.app.session.get('drops/drop/'+params.id, params, function(err, drop) {
    // Make a new instance of the <drop> child node tree of our HTML template:
    var view = __html('drop');
    // Configure the view
    view.find('h1').text(drop.title || drop.origin);
    ...
    // Finally add the view to an active part of the DOM
    mainView.setView(view);
  });
});

Data storage and the problem of many-to-many

Since we have a very clean separation of data and presentation, CouchDB made a lot of sense to us. In CouchDB, data is represented by logical structured blobs called “documents”; basically it’s a key-to-JSON store.

Dropular.net has a feature where you can follow any number of other users and look at a feed of images created by all those users. In data terms, this is a many-to-many relationship, which is computationally expensive when using an RDBMS like MySQL. With CouchDB on the other hand, many-to-many relationships are very easy to define and they are cheap to maintain!

Basically, we lazily define a CouchDB view per user _design/user-drops-USERNAME/_view/from-following with the following map function:

function (doc) {
  var following = %FOLLOWING;
  var user, createdBy, created, i; // find lowest timestamp
  for (user in doc.users) {
    var t = doc.users[user][0];
    if (!created || t < created) {
       created = t;
       createdBy = user;
    }
  }
  for (user in doc.users) {
    for (i=following.length; --i > -1;) {
      if (following[i] === user)
        emit([created, user], doc._id);
    }
  }
}

Example: GET http://dropular.net/api/users/rsms/following/drops

{ "drops": [
    { "id": "9uQyA5tTraIkiVHG5l1XdPRHwfg", "key": [1274772176060,"suprb"] },
    { "id": "cytCrjVJPQOCXiSqF1MV6GqPEt1", "key": [1273706969465,"suprb"] },
    ...
  ],
  "total": 2701,  "offset": 0
}

Now, CouchDB will make sure to run this function every time any related document is modified, added or removed, effectively keeping all relevant “from-following” indexes up-to-date. CouchDB is very good at these incremental updates, so even though this looks complex and slow, this function is compiled to an internal representation and only run on the modified values, not the complete data set, making an update both atomic and complete within a few milliseconds.
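To make the %FOLLOWING placeholder more concrete, here is a sketch of how such a per-user map function could be generated and tried out locally. It is an illustration only: `buildMapSource` and `runMap` are hypothetical helpers, the template is a condensed copy of the map function above, and `runMap` merely imitates what CouchDB does internally when it calls the function with a document.

```javascript
'use strict';

// Map function source with the %FOLLOWING placeholder (condensed copy).
const MAP_TEMPLATE = [
  'function (doc) {',
  '  var following = %FOLLOWING;',
  '  var user, created, i;',
  '  for (user in doc.users) {',
  '    var t = doc.users[user][0];',
  '    if (!created || t < created) created = t;',
  '  }',
  '  for (user in doc.users) {',
  '    for (i = following.length; --i > -1;) {',
  '      if (following[i] === user) emit([created, user], doc._id);',
  '    }',
  '  }',
  '}'
].join('\n');

// Substitute the followed usernames as a JSON array literal.
function buildMapSource(following) {
  return MAP_TEMPLATE.replace('%FOLLOWING', function () {
    return JSON.stringify(following);
  });
}

// Run a generated map function against one document and collect the rows.
function runMap(mapSource, doc) {
  const rows = [];
  const emit = function (key, value) { rows.push({ key: key, id: value }); };
  const map = new Function('emit', 'return (' + mapSource + ');')(emit);
  map(doc);
  return rows;
}

const rows = runMap(buildMapSource(['suprb']), {
  _id: 'abc',
  users: { suprb: [100], rsms: [200] }
});
console.log(JSON.stringify(rows)); // [{"key":[100,"suprb"],"id":"abc"}]
```

Because the list of followed users is baked into the view source, the view has to be regenerated whenever that list changes, which fits the description of the views being defined lazily per user.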

As part of Dropular, I wrote a Node.js module for dealing with CouchDB that has a very low level of abstraction: https://github.com/rsms/node-couchdb-min.

Access points aka The Server

As with any app that centralizes data, authentication, etc, you need something to serve as the hub. We call these access points, as they are the contact surface between the client application and whatever lives in the central backend (CouchDB, AWS services, etc). Since I’ve been involved in Node.js for a long time, Node.js was a given choice. Actually, this was such a successful solution that we sustained over 1000 API requests/second on one single small AWS EC2 instance with less than 1.0 in load (during our initial release which caused a thundering-herd-like wave of visitors). Even for a commercial website, that number is considered good.

Scalability as an effect

The Oui app kit supports a virtually unlimited number of access points, making this approach of running all the logic in the client an extremely scalable solution.

Figure 3

Just add as many access points as you need, effectively scaling close to linearly (at least as close to linear as your backend dependencies allow).

A model which relies on persistent sessions or rendering of the user interface in a central location (i.e. on the server) can never reach this level of scalability-to-price/complexity ratio.

Today & tomorrow

Today, 21 months later, current browser technology allows for even more sophisticated client-and-access-point solutions, where everything from complex image processing (canvas 2D and 3D) to data processing (WebWorkers) can be done client-side. DOM manipulation is much cheaper, JavaScript runs much faster and OAuth 2.0 is an easy-to-use (in contrast to 1.0), suitable authentication scheme for these kinds of approaches. 3D transforms provide hardware-accelerated, high-performance 2D and 3D user interface effects, and host-native, fluid animations can be defined in simple CSS.

I’m really curious to see what’s next — the web is rapidly transforming from a “hacky” document presentation technology to a rich application development and distribution platform with standards that make sense. No more live hacking on FTP servers or behemoth HTML-generating Java servers.

—Sep 10, 2011

An update on the programming language Move

A few months back I wrote a programming language called Move. Before the advent of Move, JavaScript (on Node.js) was my universal language of choice. Two years earlier it was Python. During the last four months I have basically been exclusive with Move: quick hacks, data mangling scripts, network services, websites, iPhone apps… you name it. What initially was a fun week of language research and interviews with people turned into a very usable programming language and library.

Today Move has evolved — from what was first released on March 2, 2011 — over nine releases, making the language even simpler and listening to user feedback.

The web is the future

It’s inevitable. The web (well, stuff over HTTP viewed in a web browser) is the app platform of tomorrow. Move provides a normalized environment, a uniform library with a standard CommonJS module system. This is actually a huge deal, since the feeling that JavaScript was a second-class programming environment can largely be attributed to the vast number of bugs, API differences and library discrepancies implied by the web browser landscape. Although the APIs of today’s web browsers are generally coherent, there are still a bunch of differences your everyday JavaScript web browser engineer needs to be painfully aware of and feature-test for.

For example, a common thing to do is to iterate over items in a list:

someArray.forEach ^(item, i) {
  print i+':', item
}
someObject.forEach ^(key, value) {
  print key+':', value
}

On modern browsers, that code will be very fast as it’s implemented natively.

In JavaScript, you would need to do some feature testing…

var i, key, apply = function (item, i) {
  console.log(i+':', item);
};
if (typeof Array.prototype.forEach === 'function') {
  someArray.forEach(apply);
} else {
  for (i = 0; i < someArray.length; ++i)
    apply.call(someArray, someArray[i], i);
}
var HOP;
if (typeof Object.prototype.hasOwnProperty === 'function') {
  HOP = Object.prototype.hasOwnProperty;
} else {
  HOP = function (name) { return this[name] !== undefined; };
}
apply = function (key, value) {
  console.log(key+':', value);
};
for (key in someObject) {
  if (HOP.call(someObject, key))
    apply.call(someObject, key, someObject[key]);
}

I would choose the Move way any day of the week.

“But I’ll just use jQuery” you say and wonder if this normalization thingy in Move is just yet another library. No. Move provides an ES5 environment, equivalent to a modern version of Chrome or Safari. As time progresses, the ES5 standard will be completely implemented on all platforms. When that happens you no longer need a 3rd party library (theoretically speaking). You’re investing in learning the future standard instead of the API of a 3rd party compatibility library.

Note that some features of ES5 require compiler, VM or otherwise host-level support in older browsers, and are thus impossible to “glue together”. On the topic, Annotated ECMAScript 5 is a great and accessible documentation of the ES5 standard.

Another pretty awesome feature of Move is that you get a standard CommonJS module system. That basically means that slicing up your code into modules is easy peasy:

import foo, bar
x = foo.someFunction()
y = bar x
export z = y * x

Part of the Move language are the two keywords import and export. Import is a convenience preprocessor for the CommonJS require function.

import foo

Compiles to:

foo = require 'foo'

The export keyword similarly converts a statement like:

export foo = 5

To:

exports.foo = foo = 5

The exports variable is part of the CommonJS module specification and represents the API that your module (source file) provides (exports) to other source code.

It’s completely optional to use these convenience keywords.

Here’s a more complete example of modules in a web browser environment:

<script src="http://movelang.org/move.js"></script>
<script type="text/move" module="bar">
import foo, capitalize
export sayHello = ^(name) {
  print foo.makeHello capitalize name
}
</script>
<script type="text/move" src="capitalize.mv"></script>
<script type="text/move" module="foo">
export makeHello = ^(name) { 'Hello '+name+'!' }
</script>
<script type="text/move">
import bar
bar.sayHello 'worlds'
</script>

Slicing and dicing collections

Move falls into the category of “end-user data” programming languages (I just made that up), thus dealing with text and lists of items is a very common task. Move provides a slice syntax which should come natural to e.g. Python programmers:

print "hello"[1:3]  # "el"

x = [1,2,3,4]
print x[1:3]        # [2, 3]
x[1:3] = 9
print x             # [1, 9, 4]
x[1:] = [9, 10]
print x             # [1, 9, 10]

Since Move is heavily based around the core concept of first-class functions, this slice syntax compiles to simple function calls:

foo[1:3]
foo[1:3] = 9
foo[1:] = 9

Which is equivalent to:

foo.slice 1, 3
foo._move_setSlice 1, 3, 9
foo._move_setSlice 1, undefined, 9

This means that any object can support slices by simply implementing both or one of slice(startIndex, endIndex) → list (getter) and _move_setSlice(startIndex, endIndex, value) → list.
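As an illustration of the setter side, here is a JavaScript sketch of what a slice setter for arrays might do, written as a free function rather than as the actual `_move_setSlice` method. The behavior is inferred from the examples above (a non-array value replaces the range with a single item, an array replaces it with its items); returning the modified list and ignoring negative indices are assumptions.

```javascript
'use strict';

// Hypothetical stand-in for _move_setSlice on arrays.
function setSlice(list, startIndex, endIndex, value) {
  // A missing end index means "to the end of the list"
  const end = endIndex === undefined ? list.length : endIndex;
  const items = Array.isArray(value) ? value : [value];
  list.splice(startIndex, Math.max(0, end - startIndex), ...items);
  return list;
}

const x = [1, 2, 3, 4];
setSlice(x, 1, 3, 9);
console.log(x); // [ 1, 9, 4 ]
setSlice(x, 1, undefined, [9, 10]);
console.log(x); // [ 1, 9, 10 ]
```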

Embedded HTML and compiler preprocessor API

This is a pretty awesome feature: HTML literals.

url = "http://movelang.org/res/logo.png"
img = <img src="{url}"/>
img.width = 500
document.body.appendChild img

With this feature comes the ability to plug in preprocessors to the Move compiler. Embedded HTML (or EHTML for short) is currently the only plugin that ships with Move, but the preprocessor API is pretty simple: Create a module which exports a process function process(string moveSource, object compilerOptions) → string moveSource:

export process = ^(source, options) {
  # Transform source
  source
}

Then the preprocessor needs to be registered with the compiler:

move.preprocessors['my-preprocessor'] = process

Finally, specifying the preprocessor when compiling:

move.compile {source:source, preprocess:['ehtml', 'my-preprocessor']}

The order in which preprocessors are specified in the “preprocess” argument to move.compile decides which is applied first. By default, Move enables the “ehtml” (Embedded HTML) preprocessor when run in an environment that provides a DOM (i.e. a web browser).
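Why the order matters can be shown with a small JavaScript sketch of such a pipeline. This does not use Move's compiler; the registry, `applyPreprocessors` and both example preprocessors are hypothetical and only follow the process(source, options) → source shape described above.

```javascript
'use strict';

// Hypothetical registry, playing the role of move.preprocessors
const preprocessors = {};

// Removes lines that only contain a "#" comment
preprocessors['strip-comments'] = function (source, options) {
  return source.split('\n').filter(function (line) {
    return !/^\s*#/.test(line);
  }).join('\n');
};

// Prepends a marker comment
preprocessors['banner'] = function (source, options) {
  return '# generated\n' + source;
};

// Applies the named preprocessors from left to right
function applyPreprocessors(source, names, options) {
  return names.reduce(function (current, name) {
    const preprocess = preprocessors[name];
    if (!preprocess) throw new Error('Unknown preprocessor: ' + name);
    return preprocess(current, options);
  }, source);
}

const input = '# note\nprint 1';
console.log(applyPreprocessors(input, ['strip-comments', 'banner'], {}));
// # generated
// print 1
console.log(applyPreprocessors(input, ['banner', 'strip-comments'], {}));
// print 1
```

In the second call the banner is added first and then stripped again, so the two orderings give different results.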

Classes — Object factories with prototype chains

Since the birth of Move “class definition” has been a common feature request: Ability to define prefab prototype chains.

Move leans toward the “good parts of JavaScript”, thus the “new” keyword (added to JavaScript simply to make it look like Java — oh politics) should be avoided. Object.create is the recommended way of creating new objects based on custom prototypes. Still, Object.create is limited to creation (which is a good thing). From the eyes of Object.create, there’s no notion of constructor, or rather; there’s no difference between a constructor function or any other function.

As Douglas Crockford states it:

JavaScript is a prototypal language, but it has a new operator that tries to make it look sort of like a classical language. That tends to confuse programmers, leading to some problematic programming patterns.

Avoid the “new” keyword and use literals or factory functions instead.

Say hello to Move’s class construction function:

Animal = class {
  age: 1,
  toString: ^{ "I'm a "+@kind }
}

elephant = Animal {kind:"slow and kind fella"}
print Text elephant  # "I'm a slow and kind fella"
print elephant.age   # 1

In the above case we define the factory Animal, having a prototype with two values: age and toString. The Animal factory produces objects with a prototype of Animal.prototype.

We can create another factory whose prototype inherits from the Animal prototype:

Cat = class Animal, {
  constructor: ^(name, age) {
    @kind = "furry little creature"
    name && (@name = name)
    age && (@age = age)
  },
  toString: ^{
    s = Animal.prototype.toString.call this
    s + " named " + @name
  }
}

cat = Cat {name:"Busta", age:10}
print Text cat  # "I'm a furry little creature named Busta"
print cat.age   # 10

Note that we defined a constructor function on the prototype. In this case, calls to the factory will invoke that function (instead of the implicit and generic create function).

Constructor functions (or you could think of them as initialization functions) need not be defined on subclasses in order to invoke a superclass’s constructor:

Zelda = class Cat, {
  name: "Zelda",
  toString: ^{ "I'm awesome and my name is "+@name }
}

Since the above Zelda prototype does not define a constructor, the parent prototype’s constructor will be called (that is, Cat.prototype.constructor) when the Zelda factory is invoked:

zelda = Zelda {age:5}
print Text zelda  # "I'm awesome and my name is Zelda"
print zelda.age   # 5

As usual with Move, class is simply a runtime function (__move.runtime.__class).

Actually, the Zelda factory and prototype chain can be described (and traversed) like this:

Zelda                                # [function]
Zelda.prototype                      # { name:"Zelda", toString:[function] }
Zelda.prototype.prototype            # -> Cat.prototype
            Cat.prototype            # { constructor:[function], toString:[function] }
            Cat.prototype.prototype  # -> Animal.prototype
                   Animal.prototype  # { age:1, toString:[function] }
                   Animal.prototype.prototype  # Object.prototype
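The factory idea can be approximated in plain JavaScript with Object.create. The following is a rough sketch and not the implementation of `__move.runtime.__class`: `createClass` is a made-up name, and because JavaScript has no keyword arguments the constructor here simply receives the object passed to the factory.

```javascript
'use strict';

// Builds a factory whose objects share `proto`, optionally chained
// to `parent.prototype`. Hypothetical helper for illustration.
function createClass(parent, proto) {
  if (proto === undefined) { proto = parent; parent = null; }
  const prototype = Object.create(parent ? parent.prototype : Object.prototype);
  Object.keys(proto).forEach(function (key) { prototype[key] = proto[key]; });

  // Nearest user-defined constructor on the chain, stopping before
  // Object.prototype (whose own `constructor` is Object itself).
  function findConstructor() {
    for (let p = prototype; p && p !== Object.prototype; p = Object.getPrototypeOf(p)) {
      if (Object.prototype.hasOwnProperty.call(p, 'constructor')) return p.constructor;
    }
    return null;
  }

  function factory(props) {
    const obj = Object.create(prototype);
    const init = findConstructor();
    if (init) init.call(obj, props || {});
    else Object.assign(obj, props); // generic create: copy the given values
    return obj;
  }
  factory.prototype = prototype;
  return factory;
}

const Animal = createClass({
  age: 1,
  toString: function () { return "I'm a " + this.kind; }
});

const Cat = createClass(Animal, {
  constructor: function (props) {
    this.kind = 'furry little creature';
    if (props.name) this.name = props.name;
    if (props.age) this.age = props.age;
  },
  toString: function () {
    return Animal.prototype.toString.call(this) + ' named ' + this.name;
  }
});

const Zelda = createClass(Cat, {
  name: 'Zelda',
  toString: function () { return "I'm awesome and my name is " + this.name; }
});

console.log(String(Cat({ name: 'Busta', age: 10 }))); // I'm a furry little creature named Busta
console.log(String(Zelda({ age: 5 })));               // I'm awesome and my name is Zelda
```

Note that in plain JavaScript the chain is traversed with Object.getPrototypeOf(Zelda.prototype) rather than a `prototype` property on the prototype object.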

Helpful command line interface

The move CLI tool (a program with a text interface) acts both as an operating system entry-point for Move programs run directly on a system and as a utility for dealing with and processing Move code.

Run a Move program:

move foo.mv
move run foo.mv

Run Move code from stdin:

echo 'print "hello"' | move run
move run < foo.mv

Output the parser’s Abstract Syntax Tree that represents your code, as JSON:

move compile --ast foo.mv

Show the JavaScript generated by the compiler:

move compile foo.mv

Show the list of global options and available commands:

move -h
move --help

Show documentation for a specific command (“compile” in this example):

move help compile

Create a stand-alone web-browser compatible JavaScript file from one or more source files:

move compile --bundle-standalone foo.mv bar.mv leet.js
move compile --bundle-standalone --basedir lib lib/*.{mv,js}
move compile --bundle-standalone --basedir lib --output bundle.js lib/*.{mv,js}

This article and the examples in it assume the latest release of Move at the time of publishing (0.4.2).

More information on Move can be found at movelang.org

—Jul 30, 2011

Air Shaffer

Went flying with Shaffer and Lee this Friday evening, starting at Palo Alto, flying up over SFO, through San Francisco, over Oakland and finally back to Palo Alto.

—Jun 25, 2011

3 months at Facebook

I’ve now been at Facebook in Palo Alto, California for almost three months. And I love it.

This little textual outlet of mine has been silent for a while, mainly because I’ve been so caught up in a very exciting thing we’re making at Facebook, and probably will continue to be for a while.

Me in the Facebook HQ backyard

What really blows my mind about this place is how small it feels, yet we are thousands of people working at Facebook. The organizational structure is very flat and most responsibility is distributed, which is a very interesting concept. I work as a product designer—in our small but amazing Product team—meaning I do everything from conceptual development and management, to interaction design and graphic design.

Chris, Brandon, Francis and Joey

We generally have one product designer and one product manager pair up to form a “mini product team” in each project. This gives me a feeling close to that of a small start-up: “let’s do this together!”.

At the end of this year we will be moving into our own little town, a new totally awesome campus in Menlo Park, which is currently in its last stages of construction.

You can find a couple of images from my first two months in “My first two months”…

Moving to San Francisco

Moving to San Francisco, California from Stockholm, Sweden is a whole different story.

San Francisco seen from Twin Peaks

First off, this is a totally amazing place, full of life.

There’s tons of paperwork that is obscure, boring and tricky. For instance, while in the USA, no one will tell you that you need to file an AR-11 “Change of Address” form within 10 days, or you are breaking federal law and might get kicked out of the country. Or file for an SSN using old physical paper and pen which is then manually handled and processed by a bunch of humans.

Yes, most things here in the USA are still on physical paper, traveling in physical envelopes, just to be scanned or re-entered into a computer again, by a human.

Compared to Sweden, I’d say the infrastructure of California is about 25 years behind. Checks are still heavily used, banks are immature, etc. Being in San Francisco, you’re lucky if you get 500kbit/s over 3G, if you can even get a stable enough connection. 4G here is more a myth than something that actually exists (I have a 4G modem and have yet to find a connection after 2 months of use). In Stockholm, you basically never go below 1Mbit/s over 3G and connections are very stable.

Then we have basic infrastructure, like cars, public transportation, landline connectivity, etc. Everything barely working. For instance, public transportation buses are old and technically inefficient with their giant tires and old diesel engines, spewing out black smoke. Buses like that would never even pass the minimum environmental requirements in Sweden. Even Stanford University runs buses like these.

But all that stuff is just an itch — people here are amazing!

I happily trade this lack of modern infrastructure for the brilliant openness and warmth of these people.

Highway 1 road trip makes us jumpsy

Now, back to changing the world.

—Jun 15, 2011

Spotify box by Jordi Parra

Fellow designer Jordi Parra recently finished his master’s degree project “Spotify box”, a beautiful little radio-like device which plays music through the Spotify platform. What’s really neat about Jordi’s Spotify box, apart from its gorgeous design, is that it brings back the physical interaction with music as an object, but adjusted to the 21st century. A playable item is represented by a small token, conveying a link by wireless RFID technology.

See for yourselves:

Some more pictures of the Spotify box, grabbed from Jordi’s work log:

—Mar 24, 2011

My take on Firefox 4

In the fast-paced world of web browsers Mozilla Firefox owns the second largest market share. Yesterday the much anticipated version 4 of Firefox was released and these are my reflections. I’ll mostly be comparing Firefox 4 to Google Chrome 11 on Mac OS X, since that’s what I use day-to-day.

Firefox 4

Startup time and general UI responsiveness

Firefox has traditionally been the “slow one” on Mac OS, mainly due to heavy disk I/O and CPU usage as an effect of the XUL technology and other tech details, like the absence of a font cache in earlier versions. Firefox 4 feels much more responsive than Firefox 3.6 (the previous version), but is still relatively slow to start when compared to Safari or Chrome, both of which are native applications that don’t need to parse a ton of XML when they start (like Firefox does).

This is a minor issue and somewhat subjective.

Windows & tabs

As with most modern browsers, multitasking in Firefox is based around a tabbed user interface. The window of Firefox 4 looks like this:

Screen shot 2011 03 23 at 16x

Like Chrome, tabs have a maximum width and will uniformly shrink (horizontally) when more room is needed for new tabs. However, the window header feels rather large and a bit clumsy. Compared to Chrome, it’s actually just a few pixels higher, but the tabs in Firefox are smaller and the window contains the current tab’s title.

Since most people using these web browsers use a mouse or trackpad to navigate tabs, having small tabs (and thus hit areas) is a bad idea since the user will have a harder time hitting the right tab. Mozilla (the creators of Firefox) probably decided to make a compromise on the size of the tabs in order to fit a title into the window header, something that Chrome lacks.

I personally don’t believe in long and descriptive window (or page) titles in this context. They tend to be a mere repetition of a (in most cases) better and richer title displayed in the actual website. I would rather see that Firefox went down the same lane as Chrome and Internet Explorer 9 and skipped the window title. Something like this:

Alternative without window title

Update: Abhijit Shylanath pointed out that removing the title bar is possible in the Microsoft Windows version of Firefox 4.

Some details related to window UX where Firefox 4 got it right but Chrome fails:

Closing tabs

A really nice feature in Chrome is how you can quickly close a bunch of tabs because of how tabs align. Basil Safwat has a good write-up about this. I expected Firefox 4 to sport the same nifty UX after reading “Making tab closing as easy as click, click, click”. But no.

Closing multiple tabs

Mozilla is already on to this, and it appears it will be fixed in the next release. (Thanks to Erik Möller for the tip).

Switching tabs

A really neat feature, new in Firefox 4, is the ability to switch to already open tabs by simply typing (partially matching) text or a URL into the location bar.

Update: There’s an experimental feature in Google Chrome 11 (and newer) which can be enabled to provide similar functionality. Visit about:flags and enable “Focus existing tab on open”.

Developer’s console

The interactive console primarily used during website development is called “Web Console” in Firefox 4 and — unlike Chrome, Safari and previous versions of Firefox — it appears on the top of the window rather than at the bottom. Albeit still in a split-screen pane.

Web Console in Firefox 4

The position of the console is probably mostly a matter of taste. However, this “Web Console” is much simpler than the one found in Safari and Chrome as it only provides an interactive JavaScript console and event log. The console found in WebKit browsers features a broad set of different kinds of developer tools, including script debugger, network monitor, DOM inspector, etc. These features can however be added to Firefox by installing various extensions.

A big downside with the “Web Console” in Firefox 4 is that it dramatically slows down the loading of any website (feels like a factor of 5).

System integration

Any modern computer system (i.e. operating system, applications and settings) is a carefully balanced circus act. You have color calibration, text input, language dictionaries, UI behavior, etc. — all shared between applications to provide an intuitive, low-barrier and powerful user experience.

Poor system integration

Unfortunately, due to how Firefox is built (it’s like it has its own little operating system which runs the actual app), system integration is relatively poor.

For instance, Firefox is unable to use the user’s spelling dictionary. If I open up an application (for instance TextMate or Pages), write some text and the spell checker finds a word it doesn’t recognize, e.g. my name “Rasmus”, I can tell it that “This is a correctly spelled word, please learn it” and all other applications will later know how “Rasmus” is spelled. Except for Firefox, which uses its own, entirely separate spelling system. This means that, as in the screenshot to the left, you will need to teach the Firefox spell-checker and fill the Firefox spelling dictionary from scratch.

This is how the user’s spelling dictionary (shared among all apps, but not supported by Firefox) works:

OS X user spelling dictionary

Although Firefox 4 fails horribly on integrating with the shared spelling system, it does get it right with word dictionary lookups (Ctrl+Cmd+D by default), correctly showing the dictionary word-pop-over. Chrome does not show this pop-over, but instead launches the Dictionary application, relieving Chrome itself from user input focus (a bad thing). Update: This will likely be fixed in Chrome 12 (thanks to Christopher Berlusconi Quackenbush for the tip).

Another relatively irritating lack of system integration shows up when dragging stuff from Firefox to e.g. the desktop. When you drop something it will not appear where you dropped it, but is instead added at the default location (somewhere on the rightmost part of your desktop). I tend to use my desktop as a landscape of semi-temporary piles of documents, visually grouped. Using Firefox, I must re-position the file each time after I’ve dropped it on the desktop. Tedious and definitely unnecessary.

File selection/opening

An excellent and very useful feature in both Chrome and Safari is the ability to drop a file onto an <input type="file"> control:

Dropping files in Chrome

Unfortunately, this is not possible in Firefox 4. In fact, if you try to drop a file on a file control, Firefox will replace the whole website by browsing to the file you dropped, effectively causing any unsaved data (e.g. text typed into a form) to vanish into the void of the interwebs.
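For page authors who want to protect form data from this kind of accidental navigation, one common workaround is to cancel the browser’s default handling of drag-and-drop events that land outside a file control. The following is a minimal sketch using standard DOM events; the function names are made up for illustration, and it is a page-level workaround rather than anything the browsers discussed here do for you:

```javascript
// Returns true when the element is an <input type="file"> control.
function isFileInput(el) {
  return Boolean(el) && el.tagName === "INPUT" && el.type === "file";
}

// Cancel the default action of "dragover" and "drop" unless the event
// targets a file control, so a stray drop does not navigate away.
function guardAgainstStrayDrops(root) {
  function cancelUnlessFileInput(event) {
    if (!isFileInput(event.target)) {
      event.preventDefault();
    }
  }
  root.addEventListener("dragover", cancelUnlessFileInput);
  root.addEventListener("drop", cancelUnlessFileInput);
}

// In a page: guardAgainstStrayDrops(window);
```

In a browser that navigates away even when the drop lands on the file control itself, you would have to cancel unconditionally, at the cost of losing the drop-on-control convenience in browsers that support it.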

Also, Firefox uses a new window when browsing for files rather than the OS X standard “sheet” display (according to the Apple HIG and Apple’s own products). Here we see Google Chrome (left) displaying a standard OS “open file sheet” and Firefox (right) opening a new window (with a mysterious “Hide extensions” checkbox which did nothing when I checked it):

File browser dialogs

The problems with displaying a new window are:

  • There’s no apparent attachment/link between the “open file” window and the web page’s window which requests your attention to select a file.

  • The position of the “open file” window is not visually synchronized with the calling web page’s window, thus leading to a higher cognitive load for the user.

Miscellaneous

When quitting (terminating) Firefox, the state of tabs and open windows is saved, but when later starting Firefox the previous state is not restored. To restore a previous state, you have to manually select “History” → “Restore Previous Session” in the main menu. Edit: Automatic restoring is possible by setting “Preferences” → “General” → “When Firefox starts” to “Show my windows and tabs from last time”. I much prefer how Chrome does this: it automatically restores any saved state at launch. This makes restarting Chrome (when restarting the computer or upgrading Chrome) and later resuming your work less of a pain.

When dragging tabs around you don’t see what you are dragging. Instead, an image representing part of the contents of the tab is used for the “drag image”. There’s also no feedback (except for a tiny blue “insertion arrow”) when moving a tab to a different position in the same window. Both Chrome and Safari follow OS standards and display a visual replica of the object you are moving: a full tab with all its GUI parts. When moving a tab around you also get instant feedback as other tabs rearrange themselves, backed by an aiding animation, to give you a hint of where you are about to place your tab.

Now what?

Although Firefox 4 is a clear improvement over the previous version, it’s not the ultimate sports car you’ve been dreaming about taking to the streets. If you are hooked on some of the many extensions available for Firefox which do not exist for Chrome, stick with Firefox. Otherwise I recommend you go with Google Chrome.

—Mar 23, 2011

A template for setting up Node.js-backed web apps on EC2

Quick web hacks are great fun: getting an idea, realizing it and publishing it within a day or three. What usually sucks the fun out of these things, when building websites, is the whole “server setup” dance. You need to fix access to a server, install an operating system, register a domain name, configure software, etc.

I’ve become quite fond of the Amazon Elastic Compute Cloud (EC2) — a widely popular service for creating virtual servers. So I’ve found myself repeating practically the same steps for every site launched on EC2 (dropular.net and spotni.cc, for instance). This is something I think many people could benefit from, so I’ve put together a sort of template for quickly setting up a web site on EC2:

https://github.com/rsms/ec2-webapp

Key features include:

  • It takes ~10 minutes to build Node.js and about 10 minutes of actual work on your part.

What’s really nice with this setup is that you deploy changes with git, automagically giving you the power to roll back to previous versions when you break stuff. The common workflow (or hackflow) is as easy as:

cd myapp
bin/myapp-httpd.mv
# hack hack test hack test...
git commit
git push
myapp-update restart

The myapp-update command simply ssh’s to your server and makes it pull and checkout the latest version, optionally restarting services (like Node.js servers or daemons).
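The real script lives in the ec2-webapp repository; what follows is only a rough Node.js sketch of the idea, not a copy of it. The host name, key path, application directory and restart command are all hypothetical placeholders, and the sketch assumes paths without spaces.

```javascript
// Build the argument list for an `ssh` invocation that updates the
// checkout on the server and optionally restarts the service.
// All option values used below are hypothetical placeholders.
function buildUpdateArgs(options) {
  const remoteSteps = ["cd " + options.appDir, "git pull"];
  if (options.restart) {
    remoteSteps.push(options.restartCommand);
  }
  return [
    "-i", options.keyPath,
    options.user + "@" + options.host,
    remoteSteps.join(" && "),
  ];
}

const updateArgs = buildUpdateArgs({
  host: "XXX.compute.amazonaws.com",          // your instance's Public DNS
  user: "ubuntu",
  keyPath: "path/to/myapp.pem",
  appDir: "/var/myapp",                        // hypothetical location
  restart: true,
  restartCommand: "sudo service myapp restart" // hypothetical command
});

// To actually run it:
// require("child_process").spawnSync("ssh", updateArgs, { stdio: "inherit" });
```

Chaining the remote steps with `&&` means the restart only happens if the pull succeeded.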

Minimal “technical bureaucracy” yields more time for creative focus, just the way it should be, and we’re not compromising on versioning or orthogonality.

Getting started with EC2

Let’s start by creating an account at Amazon Web Services: Visit https://aws-portal.amazon.com/gp/aws/developer/registration/index.html and log in or create an account.

When you have created your account, head over to the AWS Management Console — a relatively easy-to-use web interface for starting and managing virtual servers. It should look something like this:

Screenshot: the AWS Management Console

Depending on where in the world you and/or your users are, you can choose one of several geographic regions. A rule of thumb is that the farther away a server is located, the slower it will be to access. In the top-left corner you find a selection box labelled “Region”. Click it to switch to any of the available regions.

  • If you live in Asia, pick one of the Asia regions
  • If you live in Europe and have mostly European visitors, pick the EU region
  • If you live in the eastern parts of the USA (or in western Europe with visitors from around the world), pick the “US east” region
  • If you live in the western parts of the USA, pick the “US west” region

After choosing your geographical region, click the alluring “Launch instance” button, choose “Community AMIs” and type one of the following AMI IDs (codes identifying a specific operating system image) into the filter text box:

  • US west: ami-ad7e2ee8
  • US east: ami-ccf405a5
  • EU west: ami-fb9ca98f
  • Asia Pacific (Singapore): ami-0c423c5e

Screenshot: selecting a community AMI

Click the “Select” button of the machine and it’s time to enter some “Instance details”. Note that we will use the term “instance” from here on — it’s the name Amazon uses for “virtual machine” or “server”.

Screenshot: the “Instance details” step

Let the “Number of instances” and “Availability Zone” be at their default values (“1” and “No Preference”). For “Instance Type”, choose “Micro” and click the “Continue” button.

For the next step, the only thing we want to change is the last setting, “Shutdown Behavior”. Set this to “Stop”; otherwise your server will disappear into the void of cyberspace if you accidentally type sudo shutdown when logged in:

Screenshot: the “Shutdown Behavior” setting

Then click the meaty “Continue” button.

During the next step, simply give the instance a name of your choice and once again click “Continue”.

We are now going to “Create a new Key Pair”. Enter a name for the key and click the “Create & Download your Key Pair” link:

Screenshot: creating a new key pair

Important: This is the one key providing access to your new server. If you lose it you will no longer be able to access the server, so make sure to make a secure backup (e.g. send yourself an email with the key attached using a secure email provider like Gmail, or put it on an encrypted USB drive).

Then continue to the next step where we will “Create a new Security Group”. Name it “webapp” and add three of the pre-defined rules available in the “Create a new rule” drop-down box: SSH, HTTP and HTTPS:

Screenshot: creating the “webapp” security group

Click our favorite “Continue” button and you should get a summary of your configuration. Review the details and when feeling like a happy little puppy, press firmly on the “Launch” button.

Your instance will start to launch. Close the “wizard”, wait a few seconds and you should see something like this (select your instance in the list if the bottom part is empty):

Screenshot: the launched instance in the instance list

In the bottom part “Description” you will find the address of your instance labelled “Public DNS” (it should look similar to ec2-123-123-123-123.us-west-1.compute.amazonaws.com). Select, copy!

Now, let’s log in to our new server:

ssh -i path/to/myapp.pem ubuntu@XXX.compute.amazonaws.com

Here, path/to/myapp.pem should be replaced by the actual path of your private key (which we downloaded during the “Create a new Key Pair” step) and XXX.compute.amazonaws.com with the “Public DNS” of your instance.

You should now be logged in to the server as the “ubuntu” user. Note that you should not (and cannot, by default) log in as the “root” user. Instead, use the sudo command to execute stuff with super-user privileges.

It’s time to get busy — Head over to INSTALL.md “Install software” →

—Mar 23, 2011

An expensive photo frame

This weekend’s hack is an iPad app which displays your Facebook news feed in a “photo frame” fashion — large text, automatic, simplistic and suitable for passive viewing.

Photo

I never really got the iPad (my laptop or smartphone is always around) and have probably used it for about a total of 10 hours, so better make good use of the thing.

The app is actually a web app. iOS web apps are perfect for these simple hacks as no super-mega performance is required and the hassle of getting a native app on the App Store would yield more work than actually building the app.

Visit rsms.me/projects/fnews on your iPad, add the app to your Home Screen and start it.

It’s built on the Facebook Graph API which provides a smooth and low-barrier user experience.

Technical hurdles

The streaming/real-time API does not currently support the news feed, thus a polling technique is used instead. This works really well with a polling interval of around 1 minute.
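The polling approach can be sketched roughly as follows. This is not the Fnews source: `fetchFeed` is a hypothetical function returning a Promise of feed items (each assumed to carry a unique `id`), and `onNewItems` is whatever renders them.

```javascript
// Keep only the feed items whose id has not been seen before,
// remembering them in `seenIds` (a Set) for the next poll.
function selectNewItems(seenIds, items) {
  const fresh = [];
  for (const item of items) {
    if (!seenIds.has(item.id)) {
      seenIds.add(item.id);
      fresh.push(item);
    }
  }
  return fresh;
}

// Poll `fetchFeed` immediately and then every `intervalMs` milliseconds.
// Returns a function that stops the polling.
function startPolling(fetchFeed, onNewItems, intervalMs) {
  const seenIds = new Set();
  function poll() {
    fetchFeed()
      .then((items) => {
        const fresh = selectNewItems(seenIds, items);
        if (fresh.length > 0) onNewItems(fresh);
      })
      .catch((err) => console.error("feed poll failed:", err));
  }
  poll();
  const timer = setInterval(poll, intervalMs);
  return () => clearInterval(timer);
}

// Usage, with the roughly one-minute interval mentioned above:
// const stop = startPolling(fetchFeed, render, 60 * 1000);
```

Tracking seen ids keeps each poll cheap to process: the same items come back on every request, and only the unseen ones are handed to the renderer.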

The “like” button is unfortunately somewhat defunct as the only API for “like” is the Open Graph one, which causes the app, rather than the user, to like something. Update: I’ve added a request for the publish_stream permission. If you have already connected Fnews with Facebook, you’ll need to log out (tap the hard-to-see gear in the bottom-right corner) and then log in again to give Fnews the rights to “like” something on your behalf. I really wish there was a separate publish_like permission, as publish_stream sounds really scary to most users and will probably lower the user count. Anyhow, “Like” to the people!

Foursquare check-ins use maps from Google Maps (the Static Map API) rather than Foursquare as their (Foursquare’s) API is too complex for this quick hack. Google Maps are prettier anyway. I never managed to “snapshot” any Gowalla check-ins during the weekend (they are relatively rare from my perspective), so currently no Gowalla “candy additions”.
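Building a static map image URL for a check-in might look something like the sketch below. The parameter names follow Google’s Static Maps API, but the default zoom and size are arbitrary choices, and Google’s API key requirements have changed over time, so `key` is left optional here. None of this is taken from the Fnews source.

```javascript
// Build a Google Static Maps URL centred on a check-in location,
// with a marker at the same position.
function staticMapUrl(lat, lng, options = {}) {
  const position = lat + "," + lng;
  const params = new URLSearchParams({
    center: position,
    zoom: String(options.zoom || 15),
    size: options.size || "600x300",
    markers: position,
  });
  if (options.key) params.set("key", options.key);
  return "https://maps.googleapis.com/maps/api/staticmap?" + params.toString();
}

// The result can be used directly as the src of an <img> element.
```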

Logging in to Facebook using the Facebook JavaScript API should be avoided in “web apps” as it relies on window.open, which will cause a blank white screen to take over your app. This is broken behavior in the iOS “web app” framework. Instead, when logging in to Facebook while in “web app” mode, Fnews simply sends you to Facebook, which eventually sends you back to Fnews (through HTTP 302). It’s not pretty but it works. Note that although the FB.ui API supports iframe dialogs, there’s an exception for authentication dialogs, which must be opened in separate windows.
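A full-page redirect login could be sketched like this. The endpoint and parameter names are assumptions based on Facebook’s OAuth dialog and should be checked against the current documentation; `APP_ID` and the permission list in the usage comment are placeholders, and this is not necessarily how Fnews builds its URL.

```javascript
// Build the URL of Facebook's OAuth dialog for a full-page redirect.
// `scope` is an array of permission names.
function buildLoginUrl(appId, redirectUri, scope) {
  const params = new URLSearchParams({
    client_id: appId,
    redirect_uri: redirectUri,
    scope: scope.join(","),
    response_type: "token",
  });
  return "https://www.facebook.com/dialog/oauth?" + params.toString();
}

// In the page, navigate instead of calling window.open when running
// as a Home Screen web app (navigator.standalone is iOS-specific):
// if (window.navigator.standalone) {
//   window.location.href =
//     buildLoginUrl(APP_ID, window.location.href, ["publish_stream"]);
// }
```

Because the whole page navigates away and back, any state the app needs after login has to survive a reload.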

Instagram pictures turned out to be rather simple to acquire, as Instagram provides a simpler “embedding” API (in addition to their full-blown OAuthed behemoth API) which allows you to simply concatenate a regular link URL with “media” to build a higher-res image URL. Neat.
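The concatenation is about as small as it sounds. A minimal sketch, assuming the link URL may or may not end in a slash; the function name and the example link are placeholders:

```javascript
// Append "media" to an Instagram link URL, adding a slash if needed.
function instagramMediaUrl(linkUrl) {
  const base = linkUrl.endsWith("/") ? linkUrl : linkUrl + "/";
  return base + "media";
}

// e.g. instagramMediaUrl("http://instagr.am/p/EXAMPLE/")
```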

Oh sweet, sweet source code

Available through GitHub: https://github.com/rsms/rsms.github.com/tree/master/projects/fnews

—Mar 13, 2011