Why cut-copy-paste is hard

[This note is primarily a response to a recent LWN thread. I'm posting it here because it got a bit long for a discussion forum.]

Several years ago, I joined the X.Org Foundation Board of Directors. One of my campaign pledges was to organize a project to finally "fix" X cut-copy-paste (CCP).

It has long been acknowledged by anyone paying attention that the user experience around CCP on the X desktop is horrible (although it has improved a bit in the last couple of years). I'm smart, I could fix it, right?

Uh, no. Here are some reasons why CCP is hard, especially on X…

  1. Users want a seamless multimedia cut-copy-paste experience. Heck, they expect to CCP audio. This implies being able to identify the media type of CCPed content, including distinguishing direct content from content references. It also implies a good system for coercing and converting media types during CCP.

    The best means we have for identifying media type right now is MIME types. Unfortunately, they are too incomplete and disorganized for CCP purposes: their ontology is only two levels deep and has many gaps. Heck, they can't even deal with compressed or packaged content reasonably.

    Even for text, UTF-8 doesn't solve the problem. Users expect to be able to CCP formatted text with the formatting preserved. Sometimes. Maybe.

  2. Persistence and related semantics have been an issue since the 128K Mac introduced CCP to the world in the mid-1980s. The Mac had a literal software "clipboard" app, with pages and everything. This was a nice metaphor, but gave little support for the kind of "quick-transfer" CCP that users wanted; the model has since been minimized. At any rate, CCP-ing content that took a substantial fraction of storage was and is a problem for persistent models. Managing long-persisting state is a problem for the user as well: sooner or later, everyone inadvertently pastes something saved and forgotten long ago into a really bad place.

    X took things in a different direction with its "pasting is between apps" semantic. This has the advantage that the transfers are quite efficient and typically require little or no extra storage. It has the disadvantage that when an app exits, its state is no longer available for pasting, although this does protect the user from long-gone state.

    So we have two models, both of which have problems. Awkward.

  3. Users have strong UI expectations around CCP. These expectations are similar but not the same between Mac, Windows and X desktop users. There's a resulting "uncanny valley" effect that's only exacerbated by the fact that each of these platforms (but especially X) has individual applications that also behave a little differently from the norm.

    On the Mac, CCP behavior is at least codified by the Mac HIG. I don't know for sure the situation for Windows, but I've never found anything that looked definitive from Microsoft. On X, Gnome has an HIG document that says some things about CCP, but they're pretty incomplete and not too useful. Worse yet, X has two almost-but-not-quite-peacefully coexisting CCP models. Old people like me expect one; people coming from Windows or Mac expect the other.

    For example, how many selections should be allowed to be active (ready-to-CCP) at the same time? One? One per display? One per application? One per window? When the user tries to copy, which one should be the source?

    Another example: Does your favorite app support making selections using the keyboard? Do different apps handle this slightly differently? (Hint: yes.)

    Standardizing UI behavior that nobody knows about or agrees on is hard.

  4. CCP is full of state. This is related to the points above, but different. To get a solid specification, you really want to specify states and state transitions of the whole CCP environment. "When the selections are like this, and the clipboard is like this, and the user performs this CCP action, then here's what happens."

    Such specifications are notoriously hard to write. To my knowledge, it has never been done for CCP.

  5. Imagine that you have the perfect solution to the above problems. You know exactly what should be done, and everybody who matters agrees with you. (This should take about…one day, maybe two. Right?)

    Now what? It has taken 10 years and full backward-compatibility to start to get XCB into the world as a replacement for Xlib. By the time we're close to being done, if ever, X will be long dead.

    CCP is embedded in everything. Most people who have a "working" implementation will not bother to upgrade it just because yours is better. You probably have to wait for the turnover of every application on the desktop, at least to a new major version.

    It has been only a few years since Xemacs abandoned a CCP scheme that was outdated 20 years ago. Why? Because no one could be bothered, as far as I can tell.

    You of course will have to have an easy-to-use library. It will have to provide a C interface, since that's the closest thing to a lingua franca that we have for the desktop in 2010. But this won't be enough. You'd better make it easy for the common toolkits on your desktop—at least Qt and Gtk for X—by providing even more mechanism. Don't even think about going without some kind of browser support and JavaScript library support; client-side JavaScript has no access to your C library.

    You're getting started now. Barely. Enjoy.
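
Point 4 above asks for a specification written as states and transitions. As a rough illustration of the shape such a thing might take, here is a minimal Python sketch. It assumes a toy model with only two pieces of state, loosely modeled on the two conventional X selections (one set by highlighting, one set by an explicit copy). The names `CcpState` and `step`, the action names, and the rules themselves are invented for illustration; they are one plausible reading of common behavior, not a specification anyone has agreed on.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class CcpState:
    """Toy CCP state: which app (if any) owns each selection."""
    primary: Optional[str] = None    # owner of the highlighted selection
    clipboard: Optional[str] = None  # owner of the explicitly copied content


def step(state: CcpState, action: str, app: str) -> Tuple[CcpState, Optional[str]]:
    """Apply one user action performed in `app`.

    Returns (new_state, paste_source), where paste_source is the app whose
    content would be pasted, or None if nothing is pasted.
    """
    if action == "select":        # highlighting text takes the primary selection
        return replace(state, primary=app), None
    if action == "copy":          # explicit copy takes the clipboard
        return replace(state, clipboard=app), None
    if action == "paste":         # explicit paste reads from the clipboard owner
        return state, state.clipboard
    if action == "middle_click":  # middle-click paste reads from the primary owner
        return state, state.primary
    if action == "exit":          # non-persistent model: an exiting owner takes its content with it
        return CcpState(
            primary=None if state.primary == app else state.primary,
            clipboard=None if state.clipboard == app else state.clipboard,
        ), None
    raise ValueError(f"unknown action: {action}")
```

Even this toy leaves most of the questions from point 3 unanswered (per-window selections, keyboard selection, what "copy" means in an app with nothing selected), which is rather the point.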

Of course, X CCP could be vastly improved. A "CCP strike force" would be a good idea, and I would support it. However, it would be a lot of slow, incremental work, without much of any payoff for the participants in terms of glory, wealth or even user acknowledgment. Until I find a band of folks excited by that description, I think I'll spend my time doing other things.

Requirements are always hard

Thanks for the nice examples! I'd argue that these are situations where it's clear that the standard specification of CCP is incomplete: it doesn't tell the user what will happen. Mostly, I think the user would be happy enough with any predictable behavior, especially if it's maximally information-preserving. What drives the user nuts is getting a behavior that wasn't expected, then being unable to correct the behavior.

Yes, user input is sometimes needed

If you look at Excel and its various clones and workalikes, for example, you'll see a lot of options for pasting that have to do with how the paste should be handled; see in particular the "Paste Special" commands.

However, many programs offer a lot of possible source formats for a given copy command, and so there are still issues. Assume that the user has somehow successfully indicated what they want to do, and assume that there are multiple source formats to support this.

First of all, one problem with a persistent CCP model like the one Mac and Windows use is that the app may have had to leave all those formats on the clipboard. If the object is a giant image copied from Gimp, for example, that is a lot of clipboard storage. Presumably, one could get by with a single lossless format, as long as it was truly lossless, but this would require some sort of ontology to decide that there need be only one, and to choose which one it is. Lossy formats each lose in different ways, so it's hard to pick a winner there.

Things are better in the X non-persistent model, but some kind of ontology is still desirable; once the user has indicated what the destination format should be, they probably want the system to work out which of the offered source formats will be best for this. This is still hard.
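
To make the negotiation problem concrete, here is a minimal Python sketch of the naive approach: the destination gives a ranked list of formats it accepts, and the system picks the first one the source can supply. The function name `choose_format` and the `type/*` wildcard convention are assumptions made up for this example, not part of any existing CCP mechanism.

```python
from typing import List, Optional


def choose_format(offered: List[str], accepted: List[str]) -> Optional[str]:
    """Pick a source format for a paste.

    `offered` is what the source app can supply; `accepted` is the
    destination's preference list, best first. An entry like "image/*"
    matches any offered format with that top-level type.
    Returns the chosen offered format, or None if nothing matches.
    """
    for want in accepted:
        for have in offered:
            if want == have:
                return have
            # Wildcard on the top-level type only; MIME has no deeper levels to match on.
            if want.endswith("/*") and have.split("/", 1)[0] == want[:-2]:
                return have
    return None
```

Usage might look like `choose_format(["image/png", "image/jpeg", "text/plain"], ["image/svg+xml", "image/*", "text/plain"])`. Note what the wildcard case does: it takes whichever image format the source happened to list first. Nothing in the type names says that one is lossless and another is not, which is the missing ontology.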

In any case, deciding what the source and destination programs should "understand" is hard. As noted above, MIME types really aren't sufficient for many common cases in practice. Apparently Apple has something better, but I haven't looked at it much.

Thanks much for your comments!

I think Windows CCP was entirely persistent

The prompt you mention occurred as far back as the Mac 128K. As far as I know, it was never intended to indicate an alternate copy mechanism, but merely to let you know that you might be leaving a humongous pile of droppings on the persistent clipboard, and give you a chance not to do that.

I'm not an expert, but I don't think Windows has any non-persistent clipboard mechanism; multitasking is too recent there. Indeed, this is why it happened originally on X, but not on Mac or Windows; X was already operating on top of multitasking UNIX, without which there's no real graceful way to do handoffs. Note that the Mac could always hand off to applets IIRC—that's how they got the clipboard to work.

Corrections welcome.

Thanks much for the information!

The URLs given above were quite informative, thanks!

So Microsoft solves the persistence problem in a semi-sane way as long as the supplying application doesn't put any giant formats fully on the clipboard and both applications are up at the time of the transfer.

Note that Microsoft's spec requires an exiting application to "place valid memory handles on the clipboard for all clipboard formats that it provides. This ensures that these formats remain available after the clipboard owner is destroyed." Thus, the persistence problem post-source-exit remains. I haven't found any real guidance about what formats need to remain, although I haven't looked too hard yet.

In any case, this looks like the sort of model X could adopt pretty painlessly. With a proper ontology, one could even leave only one format on the clipboard in the normal case, which would make it plausible given modern disk and memory sizes in relation to application object sizes. (Please don't copy your database and exit.)
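
For what it's worth, the model described above (promise formats up front, produce the data only when asked, and fill everything in before the owner exits) can be sketched as a toy in Python. This is not the Windows API; the class `LazyClipboard` and its method names are hypothetical, invented only to show where the storage cost lands.

```python
from typing import Callable, Dict, Optional, Union

Renderer = Callable[[], bytes]


class LazyClipboard:
    """Toy clipboard whose entries are either real bytes or a promise to render them."""

    def __init__(self) -> None:
        self._formats: Dict[str, Union[bytes, Renderer]] = {}

    def offer(self, fmt: str, data: Union[bytes, Renderer]) -> None:
        """Source app offers a format, either as data or as a callable that renders it later."""
        self._formats[fmt] = data

    def get(self, fmt: str) -> Optional[bytes]:
        """Destination app asks for a format; a promised format is rendered on first request."""
        entry = self._formats.get(fmt)
        if callable(entry):
            entry = entry()
            self._formats[fmt] = entry  # keep the rendered bytes for later pastes
        return entry

    def owner_exiting(self) -> None:
        """Before the source app exits, every promised format must become real data."""
        for fmt in list(self._formats):
            self.get(fmt)

    def stored_bytes(self) -> int:
        """Bytes actually held by the clipboard right now (promises cost nothing)."""
        return sum(len(v) for v in self._formats.values() if isinstance(v, bytes))
```

While the source app is alive, `stored_bytes()` stays small no matter how many formats were offered. Calling `owner_exiting()` is where the bill comes due: every promised format is rendered and stored, which is the post-exit persistence problem again.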

Interesting stuff.

Still thinking cut-copy-paste is hard

Thanks much for the detailed reply here!

I think we can agree to disagree on how much of what needs to be part of the cut-copy-paste system. Like I say, I've tried to go there in the past, which is what motivated my original comments. Maybe I'm just not bright enough to figure out how to do it.

I've told you a little about my background; I'm really curious about yours. Anything you'd like to share?