
Web APIs: a moving target

Google have recently announced version 3 of their YouTube API. This is great news for libgdata: it means we can have access to all the same functionality as before, just with a JSON flavour, rather than Atom.

Sarcasm aside, the last few years of working (on and off) on libgdata have made a number of things obvious about web APIs. Here are some ideas I’ve had for best practices for writing code which interacts with them. Some of these have made their way into libgdata; others would require an API break to implement. References to relevant examples of APIs in libgdata are given inline, but if something isn’t clear please leave a comment. As always, this list is probably incomplete and any additions or alterations to it would be appreciated.

  • Have a very general, flexible core API (example), and add a layer of specialisation on top of it (example). This allows client programs to use the general APIs to access new features in the web API if your library hasn’t yet caught up.
  • Use objects liberally. Objects can be extended with new properties without breaking API. Structs and function parameter lists cannot. Even if you end up creating objects with a single property (example), don’t create them as structs! (See the first sketch after this list.)
  • Don’t worry about CPU efficiency. The cost of creating objects or doing some ‘unnecessary’ extra processing to give your API more flexibility is nothing compared to the cost of a network round trip. Network round trips and memory consumption are the main costs.
  • As a corollary to the previous point, the API should be zero-copy for consumers. If possible, try to design the API so that programs using it won’t have to take copies of all the data they access, as this will end up doubling the memory consumption of the application unnecessarily. One way to do this (which is what libgdata does) is to make the objects returned by network requests effectively immutable — e.g. a query will return a new set of result objects each time it’s performed, rather than updating an existing set of them.
  • Always try to think one step ahead of the web API designers. This is part of making your API flexible: if you’re thinking about the directions the web API could go in and the features which could be added to it in future, you’ll be more prepared when the web API designers suddenly spring them on you. libgdata managed this with its authentication API, but didn’t manage it with the core feed/entry API.
  • Report bugs against the web API. In the case of libgdata, many of the bugs we reported have been ignored, but that’s not the point. By reporting bugs, you help other consumers of the web API, and give (a little) feedback to the web API designers as to how people are using, or expecting to use, the web API. (And also how broken it is.)
  • Make everything asynchronous (example). Absolutely everything which could result in a network request should be asynchronous, cancellable, and able to return errors (even if cancellation isn’t initially implemented and no errors are initially returned). This prevents having to break API in the future to make a method asynchronous. Methods which will result in network requests should be clearly separated from non-networking methods, e.g. by using a different naming scheme for them (my_object_request_property() versus my_object_get_property(), for example; see the second sketch after this list).
  • Design the API with batch processing in mind. Wherever possible, allow sets of objects to be passed to methods, rather than individual objects. If the web API doesn’t support batch processing, the method can just implement a loop internally. If it does, batching turns n network round trips into one (see the third sketch after this list). libgdata failed at this, having to tack batch operations onto the API as an afterthought (example). Fortunately (or perhaps unfortunately) this hasn’t been much of an issue, because Google’s batch API never really went anywhere. Clients of libgdata have wanted batch functionality, however, and it would have been best implemented from the start.
  • Integrate concurrency control in the core of your API (example). Web APIs are interfaces to large distributed systems. As we’ve found with libgdata, concurrency control is important, both in managing conflicts between different clients (e.g. when concurrently modifying an object) and in managing conflicts between clients and internal server processes. For example, just after a client creates a document on Google Docs, the server will modify it to add missing metadata. These modifications (and the accompanying change in the object’s version number) are exposed to clients. Google’s APIs (and hence libgdata) implement optimistic concurrency control using HTTP ETags, so all operations in libgdata take an ETag parameter (see the final sketch after this list). This works fairly well (ignoring the fact that some web API operations inexplicably don’t support ETags).
  • Don’t expose specifics of the web API in your API. Take a look at all the functionality exposed by the web API (and all the functionality you think might be added in future), then design an API for it without reference to the existing web API. Once you’re done, try to reconcile the two APIs to make sure yours is actually implementable. This means your API isn’t tied to some esoteric behaviour of the current web API which later gets fixed. However, if done incorrectly this can backfire and leave your API unable to map to future changes in the web API. Your mileage may vary.
  • Testing is tricky. You want to test your code against the web API’s production servers, since that’s what it’ll be used against. However, this requires that the machine running the tests is connected to the Internet (which often isn’t the case). It also means your unit tests can (and will) spuriously fail due to transient network problems. The alternative is to test your code against an offline mock-up of the web API. This solves the issues above, but means that you won’t notice changes and incompatibilities in the web API as they’re introduced by the web API developers. libgdata has never managed to get this right. I suspect the best solution is to write unit tests which can be run against either a mock-up or the real web API. Automated regression testing would run the tests against the mock-up, but developers would also regularly manually run the tests against the real web API.
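To make the objects point concrete, here’s a minimal sketch (in GObject-style C, since that’s what libgdata is written in) of wrapping a single value in an object rather than a public struct. The type and property names are invented for illustration, not taken from libgdata; the point is that a second property can later be installed on the class without breaking API or ABI, whereas adding a field to a public struct would.

    #include <glib-object.h>

    #define EXAMPLE_TYPE_RATING (example_rating_get_type ())
    G_DECLARE_FINAL_TYPE (ExampleRating, example_rating, EXAMPLE, RATING, GObject)

    /* A hypothetical object with a single property. */
    struct _ExampleRating {
        GObject parent_instance;
        guint value;
    };

    G_DEFINE_TYPE (ExampleRating, example_rating, G_TYPE_OBJECT)

    enum { PROP_VALUE = 1 };

    static void
    example_rating_get_property (GObject *object, guint prop_id,
                                 GValue *value, GParamSpec *pspec)
    {
        ExampleRating *self = EXAMPLE_RATING (object);

        switch (prop_id) {
        case PROP_VALUE:
            g_value_set_uint (value, self->value);
            break;
        default:
            G_OBJECT_WARN_INVALID_PROPERTY_ID (object, prop_id, pspec);
        }
    }

    static void
    example_rating_class_init (ExampleRatingClass *klass)
    {
        GObjectClass *object_class = G_OBJECT_CLASS (klass);

        object_class->get_property = example_rating_get_property;

        /* The object’s single property. Installing a second property here
         * later is not an API break; adding a field to a public struct
         * would be an ABI break. */
        g_object_class_install_property (object_class, PROP_VALUE,
            g_param_spec_uint ("value", "Value", "The rating value.",
                               0, G_MAXUINT, 0,
                               G_PARAM_READABLE | G_PARAM_STATIC_STRINGS));
    }

    static void
    example_rating_init (ExampleRating *self)
    {
    }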
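For the asynchrony point, this is roughly the shape such method pairs might take using GIO-style asynchronous calls. MyObject and the property name are invented; this is a sketch of the convention, not real libgdata API.

    #include <gio/gio.h>

    typedef struct _MyObject MyObject;

    /* Non-networking accessor: returns state already held in memory and
     * never blocks. */
    const gchar *my_object_get_property_name (MyObject *self);

    /* Networking request: asynchronous, cancellable and able to report
     * errors, even if cancellation and error reporting aren’t implemented
     * at first. */
    void my_object_request_property_name_async (MyObject *self,
                                                GCancellable *cancellable,
                                                GAsyncReadyCallback callback,
                                                gpointer user_data);
    gchar *my_object_request_property_name_finish (MyObject *self,
                                                   GAsyncResult *result,
                                                   GError **error);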
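For the batch-processing point, a sketch of what designing for batches from the start might look like: the public method takes a list of entries, and quietly falls back to a per-item loop if the server offers no batch endpoint. All names are hypothetical, and the per-item primitives are stubbed so the sketch compiles.

    #include <gio/gio.h>

    typedef struct _MyService MyService;
    typedef struct _MyEntry MyEntry;

    /* Assumed primitives, stubbed out for the sketch. */
    static gboolean
    my_service_supports_batch (MyService *self)
    {
        return FALSE;
    }

    static gboolean
    my_service_delete_entry (MyService *self, MyEntry *entry,
                             GCancellable *cancellable, GError **error)
    {
        return TRUE;  /* one network round trip per entry */
    }

    static gboolean
    my_service_delete_batch (MyService *self, GList *entries,
                             GCancellable *cancellable, GError **error)
    {
        return TRUE;  /* one network round trip for the whole list */
    }

    /* The public API takes a set of entries from the start, so switching
     * the implementation over to a real batch endpoint later is not an API
     * break. */
    gboolean
    my_service_delete_entries (MyService *self, GList *entries,
                               GCancellable *cancellable, GError **error)
    {
        GList *l;

        if (my_service_supports_batch (self))
            return my_service_delete_batch (self, entries, cancellable, error);

        /* Fallback: n round trips instead of one, but the API is unchanged. */
        for (l = entries; l != NULL; l = l->next) {
            if (!my_service_delete_entry (self, l->data, cancellable, error))
                return FALSE;
        }

        return TRUE;
    }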
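And for the concurrency-control point, a sketch of threading an ETag through a modifying operation. Again, the names are invented; the shape mirrors the approach described above, where the client passes the version it last saw and a mismatch surfaces as a conflict error.

    #include <gio/gio.h>

    typedef struct _MyService MyService;
    typedef struct _MyEntry MyEntry;

    /* Update an entry, passing the ETag from the last time the client
     * fetched it. If the ETag no longer matches the server’s version of the
     * entry (HTTP 412 Precondition Failed), the call fails with a conflict
     * error, and the client should re-query, merge its changes and retry
     * with the fresh ETag. */
    MyEntry *my_service_update_entry (MyService *self,
                                      MyEntry *entry,
                                      const gchar *etag,
                                      GCancellable *cancellable,
                                      GError **error);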

libgdata

It's about time to announce something I've been working on for about three months now: libgdata. It's a GLib-, libsoup- and libxml2-based library for accessing GData APIs, as used by most Google services. There already exist several such libraries in a variety of languages, but as far as I'm aware this is the first one written in C — and thus the first which is widely accessible to the GNOME stack. So far it has decent support for YouTube video queries, and the beginnings of Google Calendar support.

Having ported the Totem YouTube plugin to use libgdata, I next plan to port the evolution-data-server Google Calendar backend as well. With that done, libgdata will hopefully be stable and fully-featured enough for people to start fulfilling Rob Bradford's dream of tighter desktop integration with web services.

Matchstick train finished

The matchstick train with its coal tender.

I have been led to believe that I started this model matchstick train, a present from my grandmother, nine years ago (when I was eight). I'm pleased to say it's now finished!

Of course, that wasn't nine years of hard work. Anybody would be hard-pressed to get me to do that; it was an initial spurt of activity, followed by years of bits of the model just sitting on a shelf. I picked up construction again last year, and I've been working on it on and off since then.

Unfortunately, it's not as well-built as it could be, with several of the structural parts of the model not being square (or even flat, in places) due to mistakes I made all those years ago. Still, I think it's turned out OK!

Next, I think I'll finish off all those Airfix models I've neglected over the years. There are about three helicopters in various stages of misassembly or decay waiting for some love.

In GNOME news, I'm hacking on getting YouTube upload support into Conduit, with the eventual aim of adding a Conduit plugin in Totem to allow video upload to any supported video website. I've been having awful trouble with the Python GData API (again), but I think it's just about sorted now. I got the first video uploaded ten minutes ago, and it's cleanup from here on!

High-resolution YouTube videos

It took a little longer than I'd have liked, considering the simplicity of the patch, but I've finally added support for high-resolution videos in Totem's YouTube plugin.

Compare:

Korpiklaani's Wooden Pints in low resolution.

Korpiklaani's Wooden Pints in high resolution.

On the top is the old, low-resolution video, and on the bottom is the spiffy new high-resolution video, automatically downloaded using YouTube's fmt=18 option if your connection speed is set to 1.5 Mbps or higher. As per Bastien's testing, we can't use the fmt=6 option, because it isn't supported for many videos.
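The selection logic amounts to something like the following sketch (in C; the helper and its names are invented, but the threshold and parameter are as described above):

    #include <glib.h>

    /* Build a watch URI, requesting the high-resolution stream with fmt=18
     * if the user's configured connection speed allows it. A hypothetical
     * helper, not the actual plugin code. */
    static gchar *
    build_video_uri (const gchar *video_id, guint connection_speed_kbps)
    {
        /* 1.5 Mbps or higher: ask for the high-resolution version. */
        const gchar *fmt = (connection_speed_kbps >= 1500) ? "&fmt=18" : "";

        return g_strdup_printf ("http://www.youtube.com/watch?v=%s%s",
                                video_id, fmt);
    }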

The difference may not be immediately obvious due to the window being so small for the screenshot, but it is visible — just look at the left-hand side of the hut's roof. Enjoy.

Totem YouTube plugin

Update (2011-08-27): A lot has happened to the Totem YouTube plugin since this blog post. It's been ported to C, extended to do HD videos, trimmed down to no longer do HD, and then moved to Grilo, where it now lives. It can be used in Totem from the Grilo plugin, which provides a unified UI for accessing video websites like YouTube from Totem. For more information on the history of the plugin, see my blog posts about Totem.

YouTube has come to Totem…

…in the form of a plugin I've written which allows you to browse YouTube from the comfort of everyone's favourite movie player. It allows searching for videos, and when you play a video, it displays its related videos.

The feature I'm most proud of, though, is the fact that it automatically paginates when you scroll down the search results, loading more results as you go. With a hint from Patrys in the comments on my blog post about it (and thanks to the other guys who left comments :)), it works by loading results immediately if you scroll 80% or more of the way down the treeview with the mouse or keyboard, or if you release the scrollbar handle more than 80% of the way down when dragging it with the mouse. Query pages are loaded in a separate thread, and the results are then brought in with an idle function. This isn't quite as lag-free as I'd have liked, but I can't see much more I can do to improve things.
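The loading pattern is the classic worker-thread-plus-idle-callback one in GLib; roughly like this sketch, with all the names invented:

    #include <glib.h>

    typedef struct {
        GList *results;  /* entries parsed from the newly loaded query page */
    } QueryData;

    /* Runs in the main loop: safe to touch the treeview's model here. */
    static gboolean
    add_results_idle_cb (gpointer user_data)
    {
        QueryData *data = user_data;

        /* ...append data->results to the treeview's model... */

        g_list_free (data->results);
        g_free (data);

        return FALSE;  /* remove the idle source */
    }

    /* Runs in a worker thread: the blocking network query happens here, so
     * the UI stays responsive while the next page of results loads. */
    static gpointer
    query_thread_cb (gpointer user_data)
    {
        QueryData *data = g_new0 (QueryData, 1);

        /* ...fetch and parse the next page of results into data->results... */

        g_idle_add (add_results_idle_cb, data);

        return NULL;
    }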

Anyway, try things out by downloading and compiling SVN Totem, then enabling the "YouTube browser" plugin. You'll need the GData Python library (that's what the YouTube API uses), PyGTK 2.12 and Python 2.5. Before anyone asks, there's no way to sanely support video uploading yet, as Google unfortunately haven't yet exposed a public upload API. :(

Oh, and the cake is a lie.