Category Archives: GNOME

Posts about GNOME and programming for it.

Mini-GUADEC 2022 Berlin: retrospective

I’m really pleased with how the mini-GUADEC in Berlin turned out. We had a really productive conference, with various collaborations speeding up progress on Files, display colour handling, Shell, adaptive apps, and a host of other things. We watched and gave talks, and that seemed to work really well. The conference ran from 15:00 to 22:00 in Berlin time, and breaks in the schedule matched when people got hungry in Berlin, so I’d say the timings worked nicely. It helped to be in a city where things are open late.

c-base provided us with a cool inside space, a nice outdoor seating area next to the river, reliable internet, quality A/V equipment and support for using it, and a big beamer for watching talks. They also had a bar open later in the day, and there were several food options nearby.

At least from our end, GUADEC 2022 managed to be an effective hybrid conference. Hopefully people in Guadalajara had a similarly good experience?

Tobias and I spent a productive half a day working through a load of UI papercuts in GNOME Software, closing a number of open issues, including some where we’d failed to make progress for months. The benefits of in-person discussion!

Sadly despite organising the mini-GUADEC, Sonny couldn’t join us due to catching COVID. So far it looks like others avoided getting ill.

Travel

Allan wrote up how he got to Berlin, for general reference and posterity, so I should do the same.

I took the train from north-west England to London one evening and stayed the night with friends in London. This would normally have worked fine, but that was the second-hottest day of the heatwave, and the UK’s rails aren’t designed for air temperatures above 30°C. So the train was 2.5 hours delayed. Thankfully I had time in the plan to accommodate this.

The following morning, I took the 09:01 Eurostar to Brussels, and then an ICE at 14:25 to Berlin (via Köln). This worked well — rails on the continent are designed for higher temperatures than those in the UK.

The journey was the same in reverse, leaving Berlin at 08:34 in time for a 18:52 Eurostar. It should have been possible to then get the last train from London to the north-west of England on the same day, but in the end I changed plans and visited friends near London for the weekend.

I took 2 litres of water with me each way, and grabbed some food beforehand and at Köln, rather than trying to get food on the train. This worked well.

Within Berlin, I used a single 9EUR monatskarte for all travel. This is an amazing policy by the German government, and subjectively it seemed like it was being widely used. It would be interesting to see how it has affected car usage vs public transport usage over several months.

Climate

Overall, I estimate the return train trip to Berlin emitted 52kgCO2e, compared to 2610kgCO2e from flying Manchester to Guadalajara (via Houston). That’s an impact 50× lower. 52kgCO2e is about the same emissions as 2 weeks of a vegetarian diet; 2610kgCO2e is about the same as an entire year of eating a meat-heavy diet.

(Train emissions calculated one-way as 14.8kgCO2e to London, 4.3kgCO2e to Brussels, 6.5kgCO2e to Berlin.)

Tobias gave an impactful talk on climate action, and one of his key points was that significant change can now only happen as a result of government policy changes. Individual consumer choices can’t easily bring about the systemic change needed to prevent new oil and coal extraction, trigger modal shift in transport use, or rethink land allocation to provide sufficient food while allowing rewilding.

That’s very true. One of the exceptions, though, is flying: the choices each of the ~20 people at mini-GUADEC made resulted in not emitting up to 50 tonnes of CO2e in flights. That’s because flights each have a significant emissions cost, and are largely avoidable. (Doing emissions calculations for counterfactuals is a slippery business, but hopefully the 50 tonne figure is illustrative even if it can’t be precise.)

So it’s pretty excellent that the GNOME community supports satellite conferences, and I strongly hope this is something which we can continue to do for our big conferences in future.

Tourism

After the conference, I had a few days in Berlin. On the recommendation of Zeeshan, I spent a day in the Berlin technical museum, and another day exploring several of the palaces at Potsdam.

It’s easy to spend an entire day at the technical museum. One of the train sheds was closed while I was there, which is a shame, but at least that freed up a few hours which I could spend looking at the printing and the jewellery making exhibits.

One of the nice things about the technical museum is that their displays of old machinery are largely functional: they regularly run demonstrations of entire paper making processes or linotype printing using the original machinery. In most other technical museums I’ve been to, the functioning equipment is limited to a steam engine or two and everything else is a static display.

The palaces in Potsdam were impressive, and look like a maintenance nightmare. In particular, the Grotto Hall in the Neues Palais was one of the most fantastical rooms I’ve ever seen. It’s quite a ridiculous display of wealth from the 18th century. The whole of Sanssouci Park made another nice day out, though taking a picnic would have been a good idea.

Thanks!

Thanks again to everyone who organised GUADEC in Guadalajara, Sonny and Tobias for organising the mini-GUADEC, the people at c-base for hosting us and providing A/V support, and the GNOME Foundation for sponsoring several of us to go to mini-GUADEC.

Sponsored by GNOME Foundation

Looking at project resource use and CI pipelines in GitLab

While at GUADEC I finished a small script which uses the GitLab API to estimate the resource use of a project on GitLab. It looks at the CI pipeline job durations and artifact storage for the project and its forks over a given period, and totals things.

You might want to run it on your project!

It gives output something like the following:

Between 2022-06-23 00:00:00+00:00 and 2022-07-23 00:00:00+00:00, GNOME/glib and its 20 forks used:

  • 4592 CI jobs, totalling 17125 minutes (duration minimum 0.0, median 2.3, maximum 65.0)
  • Total energy use: 32.54kWh
  • Total artifact storage: 4426 MB (minimum 0.0, median 0.2, maximum 20.9)

This is useful for giving a rough look at the CI resources used by a project, which could be useful for noticing low-hanging fruit for speeding things up or reducing resource waste.

What can I do with this information?

If total pipeline durations are long, either reduce the number of pipeline jobs or speed them up. Speeding them up almost always has no downsides. Reducing the number of jobs is a tradeoff between convenience of development and resource usage. Two ideas for reducing the number of jobs are to make some jobs manual-only, if they are very unlikely to find problems. Or run them on a schedule rather than on every commit, if it’s OK for them to catch problems up to a week after they’re introduced.

If total artifact storage use is high, store fewer artifacts, or expire them after a week (or so). They are likely not so useful after that point anyway.

If artifacts are being used to cache build dependencies, then consider moving those dependencies into a pre-built container image instead. It may be cached better between CI runners.

This script is rubbish, how do I improve it?

Merge requests welcome on https://gitlab.gnome.org/pwithnall/gitlab-stats, or perhaps you’d like to integrate it into cauldron.io so that the data could be visualised over time? The same query code should work for all GitLab instances, not just GNOME’s.

How does it work?

It queries the GitLab API in a few ways, and then applies a very simple model to the results.

It can take a while to run when querying for large projects or for periods of over a couple of weeks, as it needs to make a REST request for each CI job individually.

Mini-GUADEC 2022 in Berlin

GUADEC 2022 has been happening in person for the first time in two years, in Guadalajara. Twenty of us in Europe met up in Berlin for a mini-GUADEC, to attend the main conference remotely. There have been several talks given from here using the nice A/V setup in c-base, who are hosting us.

I gave my talk this afternoon, on the threading rework which is ongoing in gnome-software. The slides are here, the notes are here (source is here), and the recording should be available soon on the GUADEC YouTube channel.

As part of the question and answer session afterwards, it was suggested that it might be helpful to write a blog post about strategies for making async code in C more readable. I’ll try and write something about that soon.

Thanks to the GUADEC organising team for hosting the conference and integrating remote participation so well, to Sonny and Tobias for organising the Berlin mini-GUADEC, and to the c-base technical people for setting things up nicely here.

State persistence for apps and sessions: Endless Orange Week

The second part of my project in Endless Orange Week was to look at state persistence for apps and sessions. At its core, this means providing

  • some way for apps to save and restore their state (such as which part of the UI you’re looking at, what’s selected, and unsaved content);
  • code in the session manager to save and restore the state of all applications, and a list of which applications are running, when shutting down/restarting/logging out/logging in/starting up.

Those two bullet points hide a lot of complexity, and it’s not surprising that I didn’t get particularly far in this project! It requires coordinated changes in a lot of components: GLib, GTK, gnome-session and applications themselves.

A lot of these changes have been prototyped or worked on before, by various people, but nothing has yet come together. In fact, gnome-session used to fully support restoring apps to a certain degree — before it was ported away from XSMP, it used to support saving the set of apps when closing a session, and re-starting those apps when starting the session again. It did not support restoring the state of each app, though, just the fact that it was running.

Breaking down the problem

Updating and building on the proposals and branches from others so far, we end up with:

  • Changes to GLib to add a ‘restart data’ property to GApplication, which allows the application to expose its current state to the session manager (to be saved), and which is initialised with the restored state from the session manager on startup. These build heavily on changes proposed by Bastien Nocera, but my tweaks to them are still pretty exploratory and not ready for review.
  • Code in GTK to support serialising the widget tree. This implements saving the state of the UI, and was prototyped by Matthias Clasen. My additions (also not yet ready for review) tie it in to the gnome-session API. Further work (but not very much of it) would have to be done to tie Matthias’ proposals into the final shape of the GLib API.
  • Preparatory cleanups of old code in gnome-session (this one is ready to review and hopefully merge!).
  • Work to re-support session restore in gnome-session. This is mostly ready, but needs tidying up and testing (which is likely to be a significant amount of work). It ties in with systemd transient scope work which Benjamin Berg and Iain Lane have been working on.

The final two pieces of the puzzle above took most of my week, and included a lot of time spent learning the architecture of gnome-session, and working out a bit of a protocol for apps to register for session restore and, in particular, for gnome-session to be able to re-launch apps the right way when restoring a session. That’s something which deserves a blog post of its own at some point in the future, once I’m sure I’ve got everything straight in my head.

In summary, there’s not been much progress in terms of creating merge requests. But the week was, I think, well spent: I’ve learned a lot about the shape of the problem, and feel like I have a better idea of how to split it up into chunks which it might be possible to land over time to get the feature working. Many thanks to Endless for giving me the opportunity to do so.

Certainly, I think this project is too big to easily do in a single GNOME release. There’s too much coordination required between different projects with different cadences and development resources.

The next step will be to land the preparatory gnome-session cleanups, and to discuss and land the GLib API so that people can start building with that.

Runtime control of debug output: Endless Orange Week

Recently at Endless we had a week of focused working on projects which are not our day-to-day work. It was called ‘Endless Orange Week’, and everyone was encouraged to explore a topic of their choosing.

I chose to look at two projects, both of which included a D-Bus/API component. My thinking was that review of the new interfaces on each project might take a while, so it would make sense to have two projects running in parallel so I could switch between them when blocked.

I’m going to blog about the two projects separately, to avoid one mega-long post.

The first project was to add a D-Bus debug interface for applications. This would allow debug output from an application to be turned on and off at runtime, rather than just being set with a command line argument or environment variable when the application is first started.

This would allow users and developers to get debug output from long-running applications without having to restart them, as quite often restarting a process will destroy the state you were hoping to debug.

What I came up with is GDebugController, which is awaiting review in GLib now. It’s an interface, currently implemented by GDebugControllerDBus. When instantiated, GDebugControllerDBus creates a new D-Bus object and interface for controlling the debug output from the application. It hooks into the standard g_debug() message functions, and can be hooked into a custom log writer function if your application uses one of those.

It essentially exists to expose one D-Bus property and allow that to be hooked in to your log writer. It has to be a bit more complex than that, though, as it needs to be able to handle authorisation: checking that the D-Bus peer who’s requesting to enable debug output on your application is actually allowed to do so. For services in particular, this is important, as allowing any peer to enable debug output could count as a privilege escalation. They might be able to slow your process down due to the volume of debug output it produces; fill the disk up; or look at the outputted logs and see private information. GDebugControllerDBus has an authorize signal to support this, and it works quite similarly to the GDBusInterfaceSkeleton::g-authorize-method signal.

Using it in an application

Firstly, you need to wait for it to be reviewed and land in GLib. The API might change during review.

Once it’s landed, assuming nothing changes, you just need to create an instance of GDebugControllerDBus. It will create the D-Bus object and hook it all up. When peers change the debug level in your application, the default handler will call g_log_set_debug_enabled() which will change the behaviour of GLib’s default log writer function.

If you have a custom log writer function, you will need to change it to check g_log_get_debug_enabled() and output debug messages if it’s true.

Using it in a service

Using it in a service will typically involve hooking up authorisation. I’ve implemented support for it in libgsystemservice, so that it will be enabled for any user of libgsystemservice after version 0.2.0.

To use polkit for authorisation, set the GssService:debug-controller-action-id property to the ID of the polkit action you want to use for authorising enabling/disabling debug mode. libgsystemservice will handle the polkit checks using that. Here’s an example.

If that property is not set, a default policy will be used, where debug requests will be accepted unconditionally if your service is running on the session bus, and rejected unconditionally if it’s running on the system bus. The thinking is that there’s no security boundary on the session bus (all peers are equally trusted), whereas there are a lot of security boundaries on the system bus so libgsystemservice is best to fail closed and force you to write your own security policy.

That’s it! Reviews and feedback welcome. Many thanks to Endless for running this week and actively encouraging everyone to make use of it.

Add metadata to your app to say what inputs and display sizes it supports

The appstream specification, used for appdata files for apps on Linux, supports specifying what input devices and display sizes an app requires or supports. GNOME Software 41 will hopefully be able to use that information to show whether an app supports your computer. Currently, though, almost no apps include this metadata in their appdata.xml file.

Please consider taking 5 minutes to add the information to the appdata.xml files you care about. Thanks!

If your app supports (and is tested with) touch devices, plus keyboard and mouse, add:

<recommends>
  <control>keyboard</control>
  <control>pointing</control>
  <control>touch</control>
</recommends>

If your app is only tested against keyboard and mouse, add:

<requires>
  <control>keyboard</control>
  <control>pointing</control>
</requires>

If it supports gamepads, add:

<recommends>
  <control>gamepad</control>
</recommends>

If your app is only tested on desktop screens (the majority of cases):

<requires>
  <display_length compare="ge">medium</display_length>
</requires>

If your app is adaptive and works on mobile device screens through to desktops, add:

<requires>
  <display_length compare="ge">small</display_length>
</requires>

Or, if you’ve developed your app to work at a specific size (mostly relevant for mobile devices), you can specify that explicitly:

<requires>
  <display_length compare="ge">360</display_length>
</requires>

Note that there may be updates to the definition of display_length in appstream in future for small display sizes (phones), so this might change slightly.

Another example is what I’ve added for Hitori, which supports touch and mouse input (but not keyboard input) and which works on small and large screens.

See the full specification for more unusual situations and additional examples.

How your organisation’s equipment policy can impact the environment

At the Endless OS Foundation, we’ve recently been updating some of our internal policies. One of these is our equipment policy, covering things like what laptops and peripherals are provided to employees. While updating it, we took the opportunity to think about the environmental impact it would have, and how we could reduce that impact compared to standard or template equipment policies.

How this matters

For many software organisations, the environmental impact of hardware purchasing for employees is probably at most the third-biggest contributor to the organisation’s overall impact, behind carbon emissions from energy usage (in building and providing software to a large number of users), and emissions from transport (both in sending employees to conferences, and in people’s daily commute to and from work). These both likely contribute tens of tonnes of emissions per year for a small/medium sized organisation (as a very rough approximation, since all organisations are different). The lifecycle emissions from a modern laptop are in the region of 300kgCO2e, and one target for per-person emissions is around 2.2tCO2e/year by 2030.

If changes to this policy reduce new equipment purchase by 20%, that’s a 20kgCO2e/year reduction per employee.

So, while changes to your organisation’s equipment policy are not going to have a big impact, they will have some impact, and are easy and unilateral changes to make right now. 20kgCO2e is roughly the emissions from a 150km journey in a petrol car.

What did we put in the policy?

We started with a fairly generic policy. From that, we:

  • Removed time-based equipment replacement schedules (for example, replacing laptops every 3 years) and went with a more qualitative approach of replacing equipment when it’s no longer functional enough for someone to do their job properly on.
  • Provided recommended laptop models for different roles (currently several different versions of the Dell XPS 13), which we have checked conform to the rest of the policy and have an acceptable environmental impact — Dell are particularly good here because, unlike a lot of laptop manufacturers, they publish a lifecycle analysis for their laptops
  • But also gave people the option to justify a different laptop model, as long as it meets certain requirements:

All laptops must meet the following standards in order to have low lifetime impacts:

All other equipment must meet relevant environmental standards, which should be discussed on a case by case basis

If choosing a device not explicitly listed above, manufacturers who provide Environmental Product Declarations for their products should be preferred

  • These requirements aim to minimise the laptop’s carbon emissions during use (i.e. its power consumption), and increase the chance that it will be repairable or upgradeable when needed. In particular, having a replaceable battery is important, as the battery is the most likely part of the laptop to wear out.
  • The policy prioritises laptop upgrades and repairs over replacement, even when the laptop would typically be coming up for replacement after 3 years. The policy steers people towards upgrading it (a new hard drive, additional memory, new battery, etc.) rather than replacing it.
  • When a laptop is functional but no longer useful, the policy requires that it’s wiped, refurbished (if needed) and passed on to a local digital inclusion charity, school, club or similar.
  • If a laptop is broken beyond repair, the policy requires that it’s disposed of according to WEEE guidelines (which is the norm in Europe, but potentially not in other countries).

A lot of this just codifies what we were doing as an organisation already — but it’s good to have the policy match the practice.

Your turn

I’d be interested to know whether others have similar equipment policies, or have better or different ideas — or if you make changes to your equipment policy as a result of reading this.

Don’t (generally) put documentation license in appdata

There have been a few instances recently where people have pointed out that GNOME Software marks some apps as not free software when they are. This is a bug in the appdata files provided by those applications, which typically includes something like

<project_license>GPL-3.0+ and CC-BY-SA-3.0</project_license>

This is generally an attempt to list the license of the code and of the documentation. However, the resulting SPDX expression means to apply the most restrictive interpretation of both licenses. As a result, the expression as a whole is considered not free software (CC-BY-SA-3.0 is not a free software license as per the FSF or OSI lists).

Instead, those apps should probably just list the ‘main’ license for the project:

<project_license>GPL-3.0+</project_license>

and document the license for their documentation separately. As far as I know, the appdata format doesn’t currently have a way of listing the documentation license in a machine readable way.

If you maintain an app, or want to help out, please check the licensing is correctly listed in your app’s appdata.

There’s an issue open against the appdata spec for improving how licenses are documented in future — contributions also welcome there.

(To avoid doubt, I think CC-BY-SA-3.0 is a fine license for documentation; it’s just problematic to include it in the ‘main’ appdata license statement for an app.)

Simple HTTP profiling of applications using sysprof

This is a quick write-up of a feature I added last year to libsoup and sysprof which exposes basic information about HTTP/HTTPS requests to sysprof, so they can be visualised in GNOME Builder.

Prerequisites

  • libsoup compiled with sysprof support (-Dsysprof=enabled), which should be the default if libsysprof-capture is available on your system.
  • Your application (and ideally its dependencies) uses libsoup for its HTTP requests; this won’t work with other network libraries.

Instructions

  • Run your application in Builder under sysprof, or under sysprof on the CLI (sysprof-cli -- your-command) then open the resulting capture file in Builder.
  • Ensure the ‘Timings’ row is visible in the ‘Instruments’ list on the left, and that the libsoup row is enabled beneath that.

Results

You should then be able to see a line in the ‘libsoup’ row for each HTTP/HTTPS request your application made. The line indicates the start time and duration of the request (from when the first byte of the request was sent to when the last byte of the response was received).

The details of the event contain the URI which was requested, whether the transaction was HTTP or HTTPS, the number of bytes sent and received, the Last-Modified header, If-Modified-Since header, If-None-Match header and the ETag header (subject to a pending fix).

What’s that useful for?

  • Seeing what requests your application is making, across all its libraries and dependencies — often it’s more than you think, and some of them can easily be optimised out. A request which is not noticeable to you on a fast internet connection will be noticeable to someone on a slower connection with higher request latencies.
  • Quickly checking that all requests are HTTPS rather than HTTP.
  • Quickly checking that all requests from the application set appropriate caching headers (If-Modified-Since, If-None-Match) and that all responses from the server do too (Last-Modified, ETag) — if a HTTP request can result in a cache hit, that’s potentially a significant bandwidth saving for the client, and an even bigger one for the server (if it’s seeing the same request from multiple clients).
  • Seeing a high-level overview of what bandwidth your application is using, and which HTTP requests are contributing most to that.
  • Seeing how it all ties in with other resource usage in your application, courtesy of sysprof.

Yes that seems useful

There’s plenty of scope for building this out into a more fully-featured way of inspecting HTTP requests using sysprof. By doing it from inside the process, using sysprof – rather than from outside, using Wireshark – this allows for visibility into TLS-encrypted conversations.

GNOME Software performance in GNOME 40

tl;dr: Use callgrind to profile CPU-heavy workloads. In some cases, moving heap allocations to the stack helps a lot. GNOME Software startup time has decreased from 25 seconds to 12 seconds (-52%) over the GNOME 40 cycle.

To wrap up the sporadic blog series on the progress made with GNOME Software for GNOME 40, I’d like to look at some further startup time profiling which has happened this cycle.

This profiling has focused on libxmlb, which gnome-software uses extensively to query the appstream data which provides all the information about apps which it shows in the UI. The basic idea behind libxmlb is that it pre-compiles a ‘silo’ of information about an XML file, which is cached until the XML file next changes. The silo format encodes the tree structure of the XML, deduplicating strings and allowing fast traversal without string comparisons or parsing. It is memory mappable, so can be loaded quickly and shared (read-only) between multiple processes. It allows XPath queries to be run against the XML tree, and returns the results.

gnome-software executes a lot of XML queries on startup, as it loads all the information needed to show many apps to the user. It may be possible to eliminate some of these queries – and some earlier work did reduce the number by binding query parameters at runtime to pre-prepared queries – but it seems unlikely that we’ll be able to significantly reduce their number further, so better speed them up instead.

Profiling work which happens on a CPU

The work done in executing an XPath query in libxmlb is largely on the CPU — there isn’t much I/O to do as the compiled XML file is only around 7MB in size (see ~/.cache/gnome-software/appstream), so this time the most appropriate tool to profile it is callgrind. I ruled out using callgrind previously for profiling the startup time of gnome-software because it produces too much data, risks hiding the bigger picture of which parts of application startup were taking the most time, and doesn’t show time spent on I/O. However, when looking at a specific part of startup (XML queries) which are largely CPU-bound, callgrind is ideal.

valgrind --tool=callgrind --collect-systime=msec --trace-children=no gnome-software

It takes about 10 minutes for gnome-software to start up and finish loading the main window when running under callgrind, but eventually it’s shown, the process can be interrupted, and the callgrind log loaded in kcachegrind:

Here I’ve selected the main() function and the callee map for it, which shows a 2D map of all the functions called beneath main(), with the area of each function proportional to the cumulative time spent in that function.

The big yellow boxes are all memset(), which is being called on heap-allocated memory to set it to zero before use. That’s a low hanging fruit to optimise.

In particular, it turns out that the XbStack and XbOperands which libxmlb creates for evaluating each XPath query were being allocated on the heap. With a few changes, they can be allocated on the stack instead, and don’t need to be zero-filled when set up, which saves a lot of time — stack allocation is a simple increment of the stack pointer, whereas heap allocation can involve page mapping, locking, and updates to various metadata structures.

The changes are here, and should benefit every user of libxmlb without further action needed on their part. With those changes in place, the callgrind callee map is a lot less dominated by one function:

There’s still plenty left to go at, though. Contributions are welcome, and we can help you through the process if you’re new to it.

What’s this mean for gnome-software in GNOME 40?

Overall, after all the performance work in the GNOME 40 cycle, startup time has decreased from 25 seconds to 12 seconds (-52%) when starting for the first time since the silo changed. This is the situation in which gnome-software normally starts, as it sits as a background process after that, and the silo is likely to change every day or two.

There are plans to stop gnome-software running as a background process, but we are not there yet. It needs to start up in 1–2 seconds for that to give a good user experience, so there’s a bit more optimisation to do yet!

Aside from performance work, there’s a number of other improvements to gnome-software in GNOME 40, including a new icon, some improvements to parts of the interface, and a lot of bug fixes. But perhaps they should be explored in a separate blog post.

Many thanks to my fellow gnome-software developers – Milan, Phaedrus and Richard – for their efforts this cycle, and my employer the Endless OS Foundation for prioritising working on this.