Seedvault Backup: Why not add a launcher icon?

Hooray for Seedvault Backup! At last there is an open-source backup solution that can be built in to de-Googled Android-based phones such as those running LineageOS or CalxyOS. I should write more about why this is a fantastic development, but not right now.

This article is about something else: the importance of keeping a very clean default launcher experience for the Ordinary User, and how surprising the habits of the Ordinary User may seem to us if we are a developer or a Power User.

This story starts with Seedvault Backup currently (mid-2021) being relatively hard to find and awkward to access in the Android system settings menus. A developer proposed making it easier to find and access by adding a launcher icon for it. This would certainly make it easier to find and access. But here is the long version of my response to that proposal.

TL;DR: In my opinion it is probably best NOT to add a launcher icon by default, because that is not in the best interest of the ordinary user in their ordinary every-day activities. Basically, my argument is that an Ordinary User will treat it as set-and-forget settings, and will want to ignore it for nearly all of their life, so we shouldn’t insert an icon for it among their user apps.

The rest of this comment is rather long. It is not meant to be a rant, just a fuller explanation of why I make this suggestion, and I thought I might as well write it all down as it may not all be obvious to everyone.

I understand, it would indeed be nice to make the Seedvault UI a bit easier to find and access. It would especially be nice during the time I (the user) am setting up and testing the backup procedure, when I need to access it again and again. Let’s come back to this later, towards the end of this long comment. But I am not the ordinary user and this is not the usual way of using my phone, so let’s first think about the ordinary user in their ordinary usage.

I have had the privilege of watching a real Ordinary User recently and this is what I found in real life. The “ordinary user” I speak of is quite surprising to us techies: it is a person who is focused on their own activities and does not care to spend time understanding or interfering with how their phone is working. In fact we might consider them to be neglecting their phone’s needs and health. This ordinary user does not want to customize their phone: they do not even add favourite apps to their home screen! They just find them in the launcher or in recents. This ordinary user does not care about keeping the software up to date, they only do so if a pop-up forces them to, otherwise they ignore it. This ordinary user does not even bother to dismiss their notifications! That’s an optional extra thing they don’t need to do, they don’t want to waste their brain space on it, they just want to do something particular with some particular app right now and then shut the phone.

A typical default launcher such as on LineageOS shows all the user apps, plus just one “Settings” icon. That seems an intentional and user-focused way to organize things. The ordinary user likes to get some help setting up the phone, and then forget about it. They don’t need or want to visit what they regard as low-level settings like these again, so they don’t need easy access to them. Additional icons added to the launcher are just clutter that get in the way of them finding their user apps.

I noticed that LineageOS-for-microG adds a second settings icon for microG settings. In a small way, that is already prioritising the interests of power users and developers and techies while starting to degrade experience for the ordinary user. Let us not copy that mistake.

Backup, and microG-settings, are far from the only system settings that some users might sometimes like shortcuts for. They are just two of many. It should certainly be made possible for a ROM maker or a power user to create shortcuts to settings and settings-related apps such as Seedvault, and also to particular activities within it, so that such developers or power users can make a page of settings shortcuts if they want to. That would make sense, for example, with a launcher that allows creating sub-groups of launcher icons. But while making such things possible, let us not underestimate the importance of getting the default experience right for ordinary users.

Now back to the topic of making Seedvault easier to access for those users and those times when that’s needed, a few minor suggestions:

  • Enhancement: I noticed that searching the system settings for “backup” finds Seedvault’s UI, while “seedvault” does not find it. Conversely, searching for “seedvault” does find Seedvault’s “App info” screen, whereas searching for “backup” does not. Let’s at least add “seedvault” to the searchable string used for its settings UI, and add “backup” to the searchable string used by its “App info”, to make these consistent and more discoverable.
  • Enhancement: when the “Seedvault: backup running” notification is showing, clicking this should open the Seedvault UI. It currently doesn’t do anything when I touch it.
  • Minor enhancement of interest to power users: without adding this to the launcher, just to mark the UI activity as launchable, so that an “Open” button appears on its “app info” screen where currently there are only the Uninstall and Force Stop buttons.

Maybe we can think of more ways to make it findable when it’s needed without showing anything when it’s not needed.

(My versions: Seedvault 11-1.2, on LineageOS-for-microG 18.1 3-September-2021, on a OnePlus 6 “enchilada”)

Introducing Trax Cam

I made my first Android app. Well, I didn’t write it from scratch, I took an open source camera app and just renamed and modified it a bit, and rebuilt it.

I called it Trax Cam.

It is a fork of Open Camera. The initial, test version of Trax Cam makes just two small improvements over Open Camera:

  1. improving the visual difference of the shutter buttons between video and photo mode, recording and not recording, by contrasting colour and style;
  2. not forgetting the zoom level when temporarily switching to another app and back, switching between photo and video modes, and the like.

Impressive? Is this everything you have ever wanted from a camera app? No, I didn’t think so. That’s not the point of it. Dear reader, it’s not for you. Not yet, anyway. Sorry!

Trax Cam is not yet intended as a product for ordinary users. Rather, the purpose of this project, at least initially, is as a learning exercise for me. The status of it, at the time of writing, is it is stable, being based on a stable version of Open Camera, and I am running it as my main camera app on two different phones, but I am not committing to making further fixes or updates. I might or might not continue developing or updating it. Switching to other projects will give me a broader and probably more useful learning experience.

Default Camera for /E/-OS

/e/-foundation‘s /e/-OS comes with OpenCamera as its default camera app. I had previously installed OpenCamera on my main LineageOS-powered phone, alongside the default LineageOS camera code-named ‘snap’.

I would sometimes switch between OpenCamera and ‘snap’. I like different things about them, mainly the range of options in OpenCamera and the simplicity of ‘snap’, with neither of them managing to bring the best possible combination, and both still having annoyances and room for improvement.

It was while evaluating /e/ that I noticed again several of the shortcomings of OpenCamera, and decided it could be time for me to do something about it. I had some ideas noted in my head, and have now written them down in a bug tracker.

Trax Cam Issues: https://lab.trax.im/trax.im/traxcam/-/issues . Issue #1 is Remember my zoom level , #2 is Boldly indicate photo/video/recording states (both still “open” as I am not satisfied that I have yet completed them as well as could be done), and I have opened several more.

At the same time, I discovered /e/-foundation is recruiting among other positions a Camera developer: “[You want] to make open source camera apps as good as high-end camera apps?” (If you are an “experienced (5+ years) Android developer”, check it out.) It’s good to see that /e/-foundation recognizes the need and seems to have some resources to tackle it.

In my opinion, one thing /e/ could do to encourage volunteer developers to develop OpenCamera (or another) in a useful direction for /e/ and other libre Android uses, would be to publish a road map or a prioritization of issues for what they consider needs to be done. I feel that projects providing a summary of what they want done is often missing and often a surprisingly effective driver in open source development. Developers tend to be quicker and more efficient at implementing something that is at least loosely specified (along with rationale to explain why) than creating something that they need to invent from scratch.

Comparing with Other Cameras

Besides OpenCamera and Lineage ‘snap’ I also briefly tried FreeDCam and OxygenOS camera (proprietary). Those each have some very slick UI designs which bear studying and would be nice to bring in. In particular, if I recall correctly they both use swipe to review the last and earlier taken pictures, which seems to me more intuitive and easier to use.

I noticed the OxygenOS camera used slick-looking rotary dials for analogue settings such as exposure compensation. I really liked this at first sight. In use, I found their behaviour a bit fussy and skiddy, easy to leave it on a random in-between value, hard to return quickly to the previously used value. It would be great if we could create something along those lines of analogue beauty, but with more positive and robust behaviour to set and leave a desired value. For example, I think it is common and so needs to be easy to switch between “auto” and the last used non-auto value. (Not just for camera controls, indeed, but for all kinds of settings in all apps.)

Available on My F-Droid App Store

You can try out Trax Cam if you like, even though it is not (currently) aimed at general use. https://fdroid.foad.me.uk/fdroid/repo/im.trax.cam_80.apk is a direct link to the installer file for my initial test version. It’s not published on Google Play store. You will need your “install from unknown sources” option enabled in Android settings.

I also put up my own F-Droid app store, and put Trax Cam up on it, so we can install it on phones through the f-droid app. This too is not currently intended for general end users. However, you are welcome to try it. First install the f-droid app (https://f-droid.org/), then add my repository’s URL (https://fdroid.foad.me.uk) under settings, repositories, add.

If you are unfamiliar with F-Droid, it comprises:

  • an Open/Libre app store system by which anyone can set up and serve their own app repository, and anyone can browse and install apps from such repositories; and
  • the F-Droid app, for browsing and installing apps from any such repositories; and
  • a repository of Free/Libre apps for Android, which is the default repository that the F-Droid app connects to until you tell it about other ones.

Putting up my own app store is part of my ambition to support independent creation of free/libre software and self-owned services.

Matrix needs: Bring Your Own Domain

Owning Your Identity in matrix is currently not as cheap and easy as it should be. Hopefully this will be changing in the not too distant future.

The current options for using your own identity in Matrix are, roughly speaking:

  • rent a personal matrix homeserver as a managed service “in the cloud”
    • around £1 a month for your own domain name, plus
    • around £10 a month for the matrix service
  • run your own server
    • around £1 a month for your own domain name; plus
    • rent a virtual server from under £5 a month, or provide your own; plus
    • your skill and time

The sticking point is that currently you have to run a whole matrix “homeserver” for each DNS domain name that you want to use in a matrix user id. To register an account on the matrix.org server, for example, you must use an identifier like @some-name:matrix.org which is an identifier owned by matrix.org. There is no option to Bring Your Own Domain and register yourself as @myself:myself.org on that server.

The one domain per server model works well where the users are already members of an organisation with its own domain name, such as a government, a school or a business. For the ordinary individual who wants to own their own online identity by bringing their own domain name, however, the requirement to run their own matrix homeserver is currently too onerous.

The need for one (virtual) server per domain is a limitation of the current server software. It could be lifted in future. Then we could see commercial services offering Bring Your Own Domain accounts on their servers. This would be cheaper to run than a server per domain, and would bring a useful hybrid model of:

  • terms and conditions set by the service provider; with
  • self-ownership of one’s matrix account’s identity, giving the ability to move one’s account to another service provider with different TOCs, (or, in the extreme, self-hosting) without any loss of service or contacts

The matrix community needs to break that barrier somehow. Options include:

  • making tiny simple homeservers so it’s no burden for each user to run their own;
    • needed also for p2p Matrix, and beginning to happen with Dendrite
  • making homeservers that offer a Bring Your Own Domain option.

Besides Dendrite, the Matrix community has some other small-footprint homeservers under development which may be ready for use within the next year or so. I hope we will see Bring Your Own Domain (aka multi-tenant) capability being developed in at least one of this new generation of homeservers.

The good news is the options are looking likely to open up. The crucial fact is that you can own your identity in matrix, already.

Related:

Jitsi-Meet: Excuse Me, Who Are You?

I fired up https://meet.jit.si/ just to explore how it works. Following the hint, I chose a simple phrase for the meeting name, something like “MyFamily”, and hit “Go”.

What did I expect?

A blank meeting screen.

What did I see?

Six elderly people whom I don’t know, talking to each other, looking surprised to see me and saying “Oh, who is that?!”

Walking In on Strangers

I left the meeting after a few seconds. Perhaps they worked out what happened and how they could change their settings to stop strangers intruding. Perhaps they were worried. Perhaps I could have gone back in and explained that I walked in on them unintentionally and meant no harm. Perhaps.

There is a UI/UX problem here. Jitsi-Meet is insecure by default. Certainly it’s possible to use it in a more secure way, but that is not good enough.

The UI here is supposed to suggest that I can use a randomly generated phrase as a meeting name. The designers clearly intended that as a security measure: a sufficiently unguessable name is secure against accidental visits, and forms a part of the security measures against intentional attacks. I know that from my background knowledge of security practices and software design. From further background knowledge and exploration, I also know that I can press “GO” to accept the suggestion. But still I fell into this trap.

As a new user, it looks like I need to enter something, and I would expect to enter a simple and meaningful name, and would absolutely not expect that doing so would immediately remove the expected privacy.


Filed as a bug: https://github.com/jitsi/jitsi-meet/issues/5407 “Choosing a meeting name is insecure by default”

## Description
In real life just now, I chose a simple meeting name, something like "MyFamily" (but not that) and hit "Go". I saw and heard six elderly people whom I don’t know, talking to each other, looking surprised to see me and saying “Oh, who is that?!”

Their privacy was violated to me, and mine to them.

This is insecure by default, and inconsistent with claiming "security".

Reported here: https://blog.foad.me.uk/2020/03/26/jitsi-meet-excuse-me-who-are-you/

---

## Current behavior

On entering a simple meeting name there is a real chance of walking in on someone else's meeting, uninvited.

---

## Expected Behavior

On entering a simple meeting name, there should be some protections in place to prevent walking in on someone else's meeting, uninvited.

---

## Possible Solutions
There are several possible solutions and measures, including...
* explain about the random phrase and its security implications and make accepting the suggested phrase a more discoverable option;
* have the "create meeting" UI user choose whether they want a public/open meeting or a private/closed meeting;
* if a user-specified meeting name is chosen for a new meeting, then very strongly encourage setting a password (unless explicitly chose to start a public meeting);
* when "start a new meeting" UI leads to joining an existing meeting, that is a violation of expectation, so interpose a second step, e.g. "this already exists; try to join it or choose another name?"
* interpose a "new user is knocking on your door" step whereby existing participants have to explicitly accept the new user;
etc.

---

## Steps to reproduce
* one user uses the "Start a new meeting" UI using a common phrase like "MyFamily" as the name;
* another, supposedly unrelated user uses the "Start a new meeting" using the same common phrase like "MyFamily" as the name;
* see that they join, without protection;

---

# Environment details
Using https://meet.jit.si/ on 2020-03-26.

Improving Matrix.to Links

Sharing links to Matrix users or rooms or messages is broken. The “matrix.to” service currently does not “cut it” for federated use.

A standard matrix: URI scheme is proposed as a superior solution, but that is far from ready and we need a usable interim solution.

Where are we with having links like https://matrix.to/#/@nukeador:mozilla.org to suggest chat.mozilla.org? …

Rubén Martín (nukeador)

To make this interim matrix.to service work satisfactorily with a federated matrix network, while it lives at a centralized URL, we need to make it support decentralized use cases as far as possible.

Current problems

Problems, in terms of design issues:

  • users can’t customize it
    • e.g. hide unwanted options, add custom options, set a default
  • federated server admins can’t customize it
    • e.g. completely replace it, or customizations as for user
  • always points to riot.im
    • should point to user’s or federated server’s preferred Riot-like web client
  • can’t click through to any other client the user has installed
    • where client registered a URI scheme handler on the device
  • doesn’t remember the user’s choice and get out of their way
    • it could redirect automatically on future visits
  • the UI is not simple, clean, helpful
  • doesn’t offer the proposed standard matrix: URI scheme
    • should offer matrix: scheme links, initially as an advanced option, for those who are starting to test or use them

Many of these issues can be improved by changes to the central “matrix.to” web site.

I propose the following changes to the matrix.to service:

Together, these might address the most pressing of the usability issues.

(The last item, adding “matrix:” URIs, is important in a different way. Rather than solving an immediate user need, it promotes development of the matrix ecosystem towards replacing this interim system with a better one.)

One issue remains. The fact that federated server admins can’t customize the experience is hard to avoid. Some options are:

  • The page could let the user redirect to their preferred “matrix.to” replacement page, and remember the preference. That works for power users on a repeat visit, not for casual or first-time users. Better than nothing so I’m starting here.
  • The page could perform a client-local search for config settings, that’s wider scope than browser-local; some kind of Zero-configuration networking, perhaps; I haven’t looked into it.
  • An admin could set local DNS (within an organization’s internal network, or a home user’s router) to point the DNS name “matrix.to” to a local page. Only useful for users sitting on such a network; may be useful in offices etc.

Mock-ups

I am trying out these ideas. I will make source code and a demo available when I get around to it. I will update this post below, as and when I do so. I will listen to your feedback (in the Matrix room #matrix.to:matrix.org please, rather than blog comments) and incorporate your suggestions where they make sense to me.

Some early draft mock-ups:

Please join myself and others discussing this in the Matrix room #matrix.to:matrix.org .

Svn Shelving Development Review

This is a review of the development of experimental support for shelving in Subversion. Shelving v1 was released in Subversion 1.10, shelving v2 in Subversion 1.11, and shelving v3 in Subversion 1.12. These versions are not a linear sequence of improvements but rather three different designs, each developing a different aspect, with different pros and cons.

The Start

Shelving has been on the Subversion developers’ wish list for a long time. This item was initially picked by my employer to be a bite-sized task to show off creating a new feature. The idea was that we could quickly make a wrapper around “svn diff” and “svn patch” that would save the user’s changes into a file, revert those changes from the working copy, and later apply the changes back to the working copy. A built-in light-weight patch management feature.

Other popular version control systems have their own kinds of shelving. For example, git has a simple and powerful “git stash”, and perforce neatly treats “shelved” as one possible storage state of a changeset, like a variant of “committed”.

I started by reviewing the kinds of shelving that have been proposed for Subversion in the past, and the facilities offered by other version control systems, and sketching some ideas of what a shelving design for Subversion could look like, ranging from light-weight management of patch files through to DVCS-like local branching. One notable idea, yet to be developed, was that we should integrate the existing “changelist” feature into shelving: “shelve” would mean moving a changelist from the WC to a shelf, and “unshelve” applying a changelist from a shelf to the WC, all the while keeping track of it by its name. But first we would need something as simple as possible, a place to store shelved changes, and mechanisms to shelve and unshelve changes.

Shelving and Checkpointing

In the wish list along with shelving was also the idea of checkpointing one’s work in the working copy. The idea was along the lines of taking a snapshot of the working copy before making changes, so that we could roll back to an earlier state if the changes didn’t work out, particularly if the changes were something like an “update” that changes the WC base state, or a merge that might result in conflicts.

I had been involved in previous discussions around this idea, when the focus was on how we could implement snapshots within the WC DB. My own view of the WC DB is that its design and implementation is far too hairy and fragile to attempt something like this at the low level. I could imagine a design at a higher level where snapshots are saved more or less like commits in a repository. The architecture of Subversion’s client side and working copy is too far removed from the repository architecture to make that a feasible integration in the current code base. I did explore embedding an actual Subversion repository, or some layers of a repository, to use as a shelf storage unit. There are some possibilities there but not in a short term development.

Instead I re-imagined checkpointing as referring to taking snapshots of work in progress in the form of a series of numbered patches, capturing only the committable changes but not the WC base state. This would not provide the ability to roll back after a bad update, which is a pity. However, it would be useful. The evidence of this is that developers including myself already tend to save a series of numbered patch files to capture iterations of our work in progress, in certain cases where we feel committing is not appropriate or not convenient.

Therefore I ended up lumping the terms “shelving” and “checkpointing” together, “shelf” referring to a series of snapshots sharing a common name, description, and WC base; and “checkpoint” referring to one numbered snapshot within a series. Thus these would become light-weight UI mechanisms on top of the same underlying implementation which I refer to generically as shelving.

Shelving v1: Patches

We recognized from the outset that “svn diff” and “svn patch” do not support the full range of changes that Subversion can represent in a commit. They have no way to represent “binary” file content, nor copies and moves, nor creation and deletion of a directory. Defining representations for all types of change has been on the wish list for ever, and was something I thought we could come back to later, after implementing shelving of the currently supported change types. It is almost an orthogonal issue, with little bearing on the patch management layer other than defining a suitable failure mode when unsupported change types are present.

Shelving v1 therefore is basically a user interface for writing the output of “svn diff” into a named and numbered patch file, optionally running “svn revert” on the affected files, and later replaying a selected patch file into “svn patch”. Minor additions include saving a descriptive message, like a commit log message, for the user’s reference.

Handling of unsupported change types is rudimentary. A shelving attempt fails if unsupported changes are present in the requested scope. An unshelving attempt fails if any local modifications are present in the working copy among the paths to be unshelved. At least that is what the command-line interface implements. It was necessary to provide some checking or “dry run” APIs so that the client code could fail gracefully and let the user know that unsupported changes were going to prevent shelving or unshelving. Especially the latter, as we do not want the client to start applying changes to the WC and then hit a failure part way through, as the WC interface provides no roll-back mechanism to recover cleanly in such a case.

A more sophisticated client might want to offer the option of skipping unsupported changes instead of completely refusing to shelve anything. However, skipping some changes can lead to further difficulties. For example, if we skip shelving a “rename directory”, that makes it awkward to shelve and unshelve any changes that occurred inside that directory. Therefore keeping the options simple and crude was the most expedient decision to get something useful working.

This implementation was released as an experimental feature in Subversion 1.10.

Shelving v2: Binary Snapshots

Patch files are all very well for text changes, but a large class of customers who pay for Subversion hosting are games developers who use Subversion because of its support for large binary files such as images and videos. What can we do to enable shelving those?

We could upgrade the patch file format to include binary data, perhaps encoded into some text-compatible form.

Or we could store file blobs as separate files.

There are trade-offs. Speed is a concern, as “binary” files tend to be large. Using whole files brings the possibility of moving files on the filesystem when the user does not need to keep a copy, and perhaps using linking or copy-on-write semantics when they do. Using a patch, we would be encoding and decoding the binary blobs into a patch stream, which means linear processing on every use. The patch format is not natively indexed for quick seeking, so that limits all accesses to the patch, even accesses to small files, unless we invent and add an index.

A second concern was patch application versus 3-way merging. A patch is a context diff. It stores the before and after version of each changed hunk, and a little context around each hunk, but not the whole original file. Applying a patch works reliably when the file we’re applying it to is the same original file as when the change was shelved. If not, perhaps because we updated and fetched changes that somebody else committed in the mean time, then applying a patch works only so far. It works when the changes being unshelved neither overlap nor functionally interact with changes that were brought in by the update. There is no way to convert the patch application to a 3-way merge and review the two sets of changes independently, unless we were to read the original version numbers from the patch file and fetch those versions from the repository.

A better position to be in, when resolving conflicts, is to have immediate access to all the versions we need for a three-way merge. The original base version of the change, and the whole changed file (which we can either store explicitly, or reconstruct from base + patch), and the new base version onto which we want to apply the patch.

The three-way merge enables (1) more robust merging of text changes, because the whole context is available to inspect, not just bits of context; and (2) the possibility of merging non-text files such as images, or arbitrary document file types, by using an external three-way merge tool that understands those file types.

We changed the shelf storage implementation to store the ‘before’ and ‘after’ version of each changed file, for all files. (Also for property changes.) No more patch file.

To apply changes, it uses a 3-way merge per file. It does not yet use a 3-way merge for tree changes. It is not yet integrated with the merge conflict handling mechanism that ‘update’ and ‘merge’ commands use.

This version was released in Subversion 1.11.

Shelving v3: Piping Changes between WC and Shelf

For the next version of shelving I determined to address the architectural problem. What problem? The storage for a shelf needs the same core semantics as the WC: a base layer plus some committable modifications. The WC had no generic API for “read the committable changes”, nor for “write these changes to the WC”. The core operations of shelving should be as simple as two pipes:

The core of “shelve”:

  • wc.read_changes() | shelf.write_changes()

The core of “unshelve”:

  • shelf.read_changes() | wc.write_changes()

What was it like before?

Shelving v2 walked the tree of WC paths “manually”, calling separate API functions to read and write each little bit of user data (file text, properties, directories). The API for the shelf storage was completely different from the WC API. For one conceptually simple job, copying changes this way or that way, there were two large chunks of manually written code, with no re-use and lots of room for bugs.

Shelving v1 used the API of “svn diff” and of “svn patch”:

  • wc.diff() > shelf_patch_file
  • wc.patch() < shelf_patch_file

That had the desired properties of re-using generic APIs and using a single API entry point to process the whole tree, but it did not have the right semantics for the stored changes. As a context diff, it lacked the base version, and the patch format lacked support for binary data, mkdir/rmdir, copies and moves.

Shelving v3 adds APIs to libsvn_wc that are analogous to “diff” and “patch” but instead of streaming the changes to or from patch file format they connect to the existing “delta editor” API which is what Subversion uses to transfer changes to and from the repository, in commit and update operations. The delta editor API is the “pipe” in the core operations of shelve and unshelve:

  • wc.read_changes() | shelf.write_changes()
  • shelf.read_changes() | wc.write_changes()

Providing the new wc.read_changes() involved a relatively simple extraction of part of the existing WC “commit” logic, removing all the repository-specific details to leave just the core function of sending changes to a delta editor. (This new API is actually named “svn_client__wc_replay”.)

The new wc.write_changes() API was written from scratch. (This new API is actually named “svn_client__wc_editor”.) Ideally this might have been extracted from some existing code such as “merge”, “patch”, or “update”; but “merge” and “patch” both work with context diffs rather than the delta editor interface, and the implementation within “update” was not readily re-usable. One place was found where this new API could be re-used as a drop-in replacement: in the implementation of repository-to-WC copy, which previously incorporated a dedicated delta editor that only knew how to apply “add” operations.

Next we needed to re-implement shelf storage in such a way that we could store the base of the changed files/paths, and write and read changes on top of that base via a delta editor API.

Shelving v3: Storing the Base

The quickest ways to create a WC base layer are the same ways we currently get a WC: either “checkout”, or copy an existing WC and revert the local modifications. Neither is ideal, but both work.

Ideally, what we want for the shelf base storage is some sort of skeleton tree that includes only the paths that have shelved changes, and storing for each such path just a reference to the existing WC base storage. This would take very little space and time. The APIs to accomplish this have not yet been developed. (They are needed not only for shelving but also for “view specs”, a requested feature for saving and restoring the WC base shape.)

In order to prove the concept of using the delta editor API for shelving, “checkout” is initially used to create a WC base for each shelf. The down-sides to this are of course that unless there is a very fast connection to the server this can take a long time, and it is heavy on disk space because it is not constrained to just the shelved paths.

Shelving: Work Needed

Currently Subversion the project has insufficient developer resources to continue these developments. If and when a volunteer or paid developer wants to take up the development, here are some suggestions.

Improving Shelving v3

Making a copy of the WC base state more efficiently is the first priority. The WC design does not seem amenable to cleanly creating a WC whose base layer is a reference to the base layer in another WC, nor to cleanly creating a “shelf” working layer inside the main WC that refers to the existing base layer. A quick-and-dirty approach could be to copy the whole WC and revert all the changes. To do the same but copying only the metadata “.svn” directory without copying the working files, might require modifying the revert code to make it work in fully “metadata-only” mode, which would be a useful upgrade anyway.

Shelving v3 currently fails some of its tests. Some are related to “mkdir”, which may require a relatively minor fix. Some are related to merging when applying changes into a WC that no longer matches the original base state. This is because the “apply” code does not attempt merging. Somehow the “apply” code needs to be hooked up to a merge. Subversion’s main merge code is unfortunately not in a good state to be re-used like this. It may be desirable to implement a “merge” alternative to the simple “apply” API (svn_wc__editor).

Shelving: Current State

[UPDATED 2020-03-24] Subversion 1.14 LTS is due to be released soon. Subversion 1.14 looks set to include shelving v2 and v3, disabled by default and opt-in by setting an environment variable. The reason for making them disabled by default is that 1.14 is intended to be a long-term support release focusing on stability rather than experimental features.


Also published at: https://cwiki.apache.org/confluence/x/MxbcC

Matrix Bridges Working Together

I am testing SmsMatrix, an SMS-to-Matrix bridge on Android. It is one of many Matrix bridges. But this post is not mainly about SmsMatrix, it is mainly the start of a train of thought about how we can manage communications with one person bridged over different channels.

This is a write-up of thoughts I first posted to the SmsMatrix discussion room, which would be better continued on #bridges:matrix.org.

Let’s say my friend has telephone number 123456 and is in my phone’s address book as “Ray”.

When my phone receives an SMS from Ray, SMSMatrix sees that message arrive in the Android SMS subsystem, and (if it hasn’t previously done so) it creates a new Matrix direct-chat room which it will associate with the SMS sender’s phone number, and invites me to join the room.

The other participant in this room is the bridge bot. Its username is @sms-bot:my_hs, and it sets its own display name to “Ray”. I assume that is a per-room display name, so it will show as a different name in each SMS conversation. The bot does not set a room name, and so Riot displays it as “Ray” (the other participant’s display name), and the bot sets the room “topic” field to “123456”, which Riot displays next to the room name. The idea is that in the user interface it looks like I am having a chat with “Ray” and it works like I am having a chat with Ray.

In Riot-web’s message view, the sender of this first SMS message shows as (precisely) “sms-bot”. The display name of @sms-bot does seem to be correctly set to “Ray” — I can see that in the “users” panel. Perhaps that’s a Riot bug that it doesn’t always display the user’s display name.

This bot works without admin privileges. It can do so because it doesn’t try to create and control a new Matrix user for each SMS sender, nor does it try to find and puppet an existing “real” matrix user account corresponding to the sender.

I wonder if it would be nice to upgrade it to a puppeting bridge if I one day have enough time to devote to it. I’m wondering exactly what we could achieve if we did so.

I also have some other bridges, and let’s say Ray also has her own Matrix account on her own homeserver, where she calls herself by her full name “Radiator”. Now I have Ray’s messages coming in to me in these Matrix rooms/users:

  • @sms-bot:my_hs “Ray”
    • in room (unnamed) displayed as “Ray 123456”;
  • @whatsapp_123456:my_hs “Ray (WA)”
    • in room (unnamed) displayed as “Ray (WA) WhatsApp private chat”;
  • @telegram_123456:my_hs “Ray (TG)”
    • in room (unnamed) displayed as “Ray (TG)”;
  • @ray:rays_hs “Radiator”, Ray’s real Matrix user account
    • in room named “Ray”

The mautrix-telegram and mautrix-whatsapp bridges each create a new user id for Ray in their own namespace of matrix user-ids, which is different from the SMS bridge.

I am interested in researching how we could improve this situation. Would we want to make all bridges to the same person somehow bridge to the same Matrix user?

@pwr22 asked, “How would the bridge know which method to use when sending a message?” That’s a good UX question. The first thought coming to my mind is that’s easy if we dedicate a separate room for each bridge for that user. (Ray’s SMS chat room, Ray’s Telegram chat room, …).

“In that case what do we gain from having one puppet user instead of separate ones per service?” We gain things like,

  • “search for all messages from @ray [containing the word ‘zoo’]”,
  • “search for the rooms that @ray is in”.

I expect users would want to choose, by their own preferences and perhaps per contact, whether a contact’s messages are all grouped in the same room or a separate room per bridge.

If I want to group all Ray’s messages into one room, then the sending UX issue must be solved a different way. Perhaps the client would allow me to set a preference. (That could be stored as client-managed metadata, like how Riot stores some settings in “account data” under “im.vector.riot.xxx” keys.) And/or there could be bot commands so I can tell a bridge "!sms-bot: enable [or disable] sending outgoing messages in this room through SMS".

In response to my thought about making all bridges bridge to the same Matrix user, @swedneck replied, “i’m not sure that’d make sense. i would probably have all those bridges in the same room if possible. you’d probably want his bots to ignore each other though.” So:

  • one room (“My direct chat room with Ray”) containing
    • myself,
    • the virtual user @sms-bot:my_hs (“Ray”),
    • the virtual user @telegram_123456:my_hs (“Ray”),
    • …, and
    • the Matrix user @ray:rays_hs (“Radiator”).

I see how that would work well for incoming messages: I would be able to see which bridge each message came through, if I wanted, while keeping all the display names the same if I mostly didn’t care to see the difference.

We would need a way to make the bots know about each other. When Ray sends me a message through one bridge, we (probably) don’t want all the other bridges to send him copies of it. They’re all “my” bots running against my Matrix home server (HS), so I could add something to their configs easily enough. Alternatively, we could define common metadata that they could each set, so they could detect the situation automatically. The latter would be more flexible, in case some of them are not configured directly alongside my HS.

What about when I send messages to Ray? If I just type without a mention, then the same applies as I said before: some way of configuring in advance which bridge(s) is/are to send it.

When I Mention Ray, there are different matrix usernames I might mention, depending how I do it: if I click on one of the received msgs, I might select a bridge username that isn’t the currently configured preferred send method; how should that all work?

Maybe bridged usernames should be specially treated: all treated as something like aliases, and all resolved to the same single “real” matrix username?

Needs more thought.

MSC2346: Bridge information state event

As part of a solution, we will need a mechanism to allow multiple bridges in a room to communicate with each other. This proposal may achieve it:

Going to build a Matrix

I am leaving my latest job and moving on to my new passion, Matrix.

Recently I have become passionate about the need for modern communications systems that are Open in the sense of freedom to talk to anyone. The currently dominant silos like WhatsApp only let me talk to the friends who are willing and able to subscribe to that company’s terms and restrictions, with no way to get around them when they decide to display advertising to me or stop supporting my mother’s not-terribly-old iPhone. To me, that is like some historical feudal system in which I live as a tenant and I must obtain agreement from the lord of the estate if I want to invite any of my friends to visit me. We need and deserve better than that: an Open way to communicate to our friends and business contacts. Email served that role for the first part of the 21st century, and Matrix now serves that role for the era of instant messaging.

My software development in recent years has been mostly on the open source Subversion version-control system, and I have particularly enjoyed helping to create something so widely used and appreciated. Nowadays its popularity is eclipsed by Git in small to medium sized projects, while Subversion still enjoys a strong following in certain fields such as games development due to its strengths in versioning large data sets and simplicity of usage. Participating in the development of Open Source software has given me the greatest satisfaction in my professional life, and I intend to keep it that way. That is why developing the Matrix communication system is so exciting.

p.s. I am still contracting on Subversion support work, so get in touch if you need any bug fixing or problem diagnosis.

Svn Code Developments in my Head

Thoughts on where Subversion should be going, code-wise.

A companion page to What’s In My Head which is about community and project management aspects of Subversion.

General remarks:

  • Progress is slow: attempting to do anything with svn code is slow going. Reasons?
  • Technical debt: low level C code doing business logic, compatibility (almost bug-for-bug) trumps rewrites.
  • Functionality that may not suit all users (diff, patch, merge, externals, keywords, eol-style, WC storage, repo storage, …) has usually been built in, where pluggable interfaces would encourage modularity. Pluggable “diff” is crude, incomplete. Pluggable RA layers are our best example.
  • Writing third-party software that uses Svn is not as easy as it should be. See: Bindings; etc.
  • Companies writing commercial systems do not interact much; how can we improve that?
  • Rewrite clients (svn) and high level libs (libsvn_client) in a more suitable language?

Internal improvements needed:

  • WC APIs. The WC layer needs API that reflects operational units. For example, “a tree in the repository” should function interchangeably as the base of a copy or of a checkout; and “the modifications to a tree”; but not “merge” and “update” which are client level concepts; thus both higher and lower level than current APIs. See: SVN-4786 “WC working-mods editor”, for a start.
  • Diff in particular shows up many inconsistencies due to bitty implementation. Once generic changes APIs are in place, rewrite ‘diff’ to use them: diff(WC, repo): {a=tree_open(WC), b=tree_open(repo); diff_trees(a,b)}.
  • Diff: distinguish “diff a commit” from “diff two trees”.
  • Bindings. State is poor, inconsistent. What’s needed is an OO generic interface, implemented in several languages. An implementation technique: write C++ OO layer on top of existing C API, then wrap it in other languages. Problem: underlying svn API design is not OO. Opportunity: design the APIs we would like to provide; reimplement libs to provide them. See Brane’s current “svncxx” work.
  • Repository API: Eliminate no-op changes.
  • Repository API: Copy content without recording ‘copy-from’.

Feature improvements:

  • Server-side “pull request” merging.
  • Renames. TODO: Write up how my strict element-id scheme (see svnmover) can improve handling renames in update and merge, even in a primitive form without server-side support. To integrate it in existing svn, depends on: WC APIs; rewrite of merge code. Develop a variation where directories are ephemeral, for better real-life semantics (like splitting and joining dirs).
  • Query language (Hg revsets; Paul Hammanthttps://graphql.org/learn/GraphQL).
  • Repo syncing (revprops and locks as versioned/sequenced events; Merkle trees).

General Remarks

Subversion’s client side code is particularly obtuse.

The server side is considerably cleaner. Its user is the network protocols, it is focused on doing its job quickly and accurately, and its API surface is relatively compact and stable. The API semantics are not perfectly clean and the implementation is complicated by a lot of optimization and caching which sometimes introduces bugs, but overall it is a lot better than the client.

Rewrite clients (svn) and high level libs (libsvn_client) in a more suitable language? This code consists of simple logic, filters, configuration, hashes, composition, etc., which in C is tedious, error-prone, inconsistent, contributing to buggy inconsistent UX. Python would make sense for the client itself. For the libs, difficulty is how to provide bindings; C is good base language for that. Rewrite might not make a lot of sense for the existing client, because it has too many idiosyncrasies which would have to be either recreated in excruciating detail or smoothed out losing compatibility. However, a rewrite dropping a lot of oddities and little-used features such as externals might make make a lot of sense if we want a new client for e.g. Android and IOS.

WC APIs

The WC layer needs an API that reflects the granularity and shape of the operations we want to perform on it. Usually these operations are reading or modifying a given subtree to a given depth, with criteria such as selecting items with a certain set of statuses.

When the client commit code is scanning a WC, looking for items to commit according to the user’s given criteria, it should not itself have to iterate over a list of children, deciding which ones fit the criteria, looking up the child’s schedule status and deciding whether to recurse into a subdirectory based on its schedule and other options. Instead it should be able to invoke a suitable tree walker which does all that, and just perform commit processing on each item found. This same tree walker should be shared with other client functions such as diff and shelve that also want to operate on a similar selection.

The base of a checkout and the base of a copy are both “a tree in the repository”, more or less. There should be APIs to put such a tree into the WC storage, read it out, use it as the base of a checkout, use it as the base of a copy, use it as the base of a diff, and so on, and these operations should all use a common, interchangeable “tree” object. At present there is no such “tree” object concept in the APIs, and each of these operations is implemented implicitly by code that is specific to each use case, all different from each other, so that there is no chance of obtaining consistent results and no easy way to spin up a new instance of such behaviour in the implementation of shelving.

To improve this, I plan to create a set of interfaces for accessing the WC and other WC-like data such as shelves. One part of this interface is dedicated to accessing the local modifications that can be committed; this must be compatible with the interface used to commit changes to a repository and read them out. Another part is dedicated to accessing the base state of the WC; this will have something in common with the repository access interfaces too, but also will have notions such as “switched”. A third part is dedicated to accessing the uncommittable WC state such as changelists, conflicts, obstructed, missing, and unversioned items.

For the “committable changes” interface, svn_delta_editor_t is a good starting point as by definition it must support the necessary operations for a commit.

For the “uncommittable changes” interface, something new must be invented, perhaps starting with the WC “status” APIs.

For the “base state” API, the recent experimental “viewspec” output feature built on top of the WC “reporter” shows how we can describe the base state. To that must be added some way of transferring the necessary content. The present implementation directly contacts the repository URL through the RA layer when it needs to fetch base content; this needs to be abstracted out to the new interface.

Subversion’s “copy” operation comprises not just local changes but also a request for a new base subtree to be fetched. Therefore there is some dependency between the “committable changes” interface and the “base state” interface. Either the “committable changes” interface will need to depend directly on “base state” in order to fetch more base content when it needs to, or the caller will need to ensure that the necessary base content has been provided in advance and the “copy” method will need to be able to find it.

WC Tree-of-Changes Walker

(May be needed by: “Revert”, “unshelve”, …)

Need a tree-of-changes walker that visits nodes in a depth-first order with: (1) visit a dir before its children and also afterwards; and (2) visit a replaced subtree twice, once for the delete and then again for the add. Useful especially but not only for WC.

Diff a Commit vs. Diff Two Trees

Repository API: Eliminate No-op Changes

A curse of inconsistency. Some operations in the commit API, such as “open a file”, leave a trace in the FS even if no change in content is made, which then manifests in a similar API method call sequence when the commit is replayed. We often call these “no-op changes”, though “touches” may be more descriptive.

Arguing from a written description of the issue, without concrete details, it has proven hard to convince people that this is a bug: they can argue that storing and retrieving an extra flag that “the object was touched” is a feature. Using two approaches I am confident we can show it is a bug. (1) Show that the behaviour is inconsistent. (2) Argue that the Subversion data model is supposed to be based on stored state, and this amounts to “hidden state”.

  1. There are different potential cases where a no-op change might be considered: open a file and close it, open and apply a null text-delta, copy and apply a null text-delta, change existing property to its same value, delete nonexistent property, etc. TODO: test many cases and report the inconsistencies between them.
  2. There are different methods for information retrieval: list changes, replay revision, delta-dirs, update editor, etc. There are different API levels to try: FS, repos, RA. TODO: test retrieval methods and report the inconsistencies between them.

(2) Data model. In the general model of versioning a series of snapshots of user’s data, we should be able to extract any two stored states and calculate the difference between them, but a “touched” flag does not work like that; it is a function of some hidden state.

Repository API: Copy Content Without ‘Copy-From’

Should be able to:

  • commit a reintegrate merge from WC to repo efficiently, in O(1) time and size.
  • perform a reintegrate merge server-side, in O(1) time and size (new feature)

When committing changes, it is sometimes the case that the target content (of a file, subtree, or branch) is a content that already exists somewhere else in the repository. This is common in merging, where the merged content of a file (or subtree) on the target branch is identical to a content that exists on the source branch. In a reintegrate (or “copy up”) merge, the entire content of the target branch becomes equal to the current content of the source branch.

It is inefficient for the client to transmit changes against its working copy base to recreate the desired content on the server, when in principle it could instead tell the server where else that content may be found.

The existing “copy” method in the editor API copies the referenced content. At the same time it rewrites the history pointer to point to the “copied from” location. What we need is a variant of “copy” which keeps the default “natural predecessor” history.

(There is a “link” method in the FS API, which is not exposed in the RA API. It appears to be somehow related to this. TODO: investigate whether that is of any use.)

Server-side “pull request” merging

Thoughts about Svn Design and Code

A companion page to What’s In My Head.

Some specific code enhancements

  • WC subtree read/write APIs. The WC layer needs API that reflects operational units such as “a tree in the repository” to function consistently as the base of a copy or of a checkout; and “the modifications to a tree”; but not “merge” and “update” which are client level concepts; thus both higher and lower level than current APIs. See: SVN-4786, for a start.
  • Diff in particular shows up many inconsistencies due to bitty implementation. Once generic changes APIs are in place, rewrite ‘diff’ to use them: diff(WC, repo) := {a=open(WC), b=open(repo); diff(a,b)}.

System-level topics

  • Renames
    • My strict element-id scheme (see svnmover) can improve handling renames in update and merge. Write it up, esp. how it can be used primitively without server-side support. To integrate it in existing svn, depends on: WC subtree read/write APIs. Develop a variant where directories are ephemeral, for better real-life semantics (like splitting and joining dirs).
  • Query language
  • Repo syncing
    • The design of repo syncing (svnsync) is incomplete: especially revprops and locks
    • revprops and locks as versioned/sequenced events; Merkle trees?
  • Rewrite high level libs (libsvn_client) and clients (svn) in Python
    • They consist of simple logic, filters, config, hashes, composition, etc., which in C is tedious, error-prone, inconsistent, contributing to buggy inconsistent UX. For libs, difficulty is how to provide bindings; C is good base language for that.
  • OO Bindings
    • Bindings state is poor, inconsistent. What’s needed is an OO generic interface, implemented in several languages. An implementation technique: write C++ OO layer on top of existing C API, then wrap it in other languages. Problem: underlying svn API design is not OO. Opportunity: design the APIs we would like to provide; reimplement libs to provide them.
  • Plug-ins
    • Functionality is built-in first, where pluggable interfaces would be better (diff, patch, merge, externals, keywords, eol-style, WC storage, repo storage, …)
    • example: ‘diff’ plug-in support exists but is weak: want different diff tools for different file types, want to configure an external tool for tree diffs
  • A new ‘svn’ client
    • Consider writing an alternative ‘svn’ client from scratch with consistent functionality built from layers of blocks, not attempting to emulate CVS, and taking inspiration from git/hg/bzr. For example it should provide a function to output a tree (or file) which should encompass all of the existing ‘ svn list’, ‘svn cat’, ‘svn proplist’, ‘svn export’, ‘svn diff -r0:REV’, ‘svnrdump dump’; and a function to input a tree (or file) which should provide the inverse of all of those output modes, encompassing ‘svn propset’, ‘svnmucc put’, the hypothetical ‘svn addremove’, etc.

General observations

Progress seems to be stifled. Why?

  • It’s a “maturing” project with decreasing volunteer activity.
  • Technical debt: low level C code doing business logic, compatibility (almost bug-for-bug) trumps rewrites.
  • Writing third-party software that uses Svn is not as easy as it should be. See: Bindings; etc.
  • Companies writing commercial systems do not interact much; how can we improve that?