[This note is primarily a response to a recent LWN thread. I'm posting it here because it got a bit long for a discussion forum.]
Several years ago, I joined the X.Org Foundation Board of Directors. One of my campaign pledges was to organize a project to finally "fix" X cut-copy-paste (CCP).
It has long been acknowledged by anyone paying attention that the user experience around CCP on the X desktop is horrible (although it has improved a bit in the last couple of years). I'm smart, I could fix it, right?
Uh, no. Here are some reasons why CCP is hard, especially on X…
Users want a seamless multimedia cut-copy-paste experience. Heck, they expect to CCP audio. This implies being able to identify the media type of CCPed content, including distinguishing direct content from content references. It also implies a good system for coercing and converting media types during CCP.
The best means we have for identifying media type right now is MIME types. Unfortunately, they are really too incomplete and disorganized for CCP purposes. Their ontology is only two levels deep (type and subtype) and full of gaps. Heck, they can't even deal with compressed or packaged content reasonably.
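To make the two-level point concrete, here is a minimal Python sketch of how a paste target might pick a format from what a copy source offers. The names `split_mime` and `pick_format` and the example type lists are made up for illustration; this is not an existing X or toolkit API.

```python
def split_mime(mime):
    """Split 'type/subtype' into its two levels; that is all the hierarchy there is."""
    major, _, minor = mime.partition("/")
    return major, minor


def pick_format(offered, accepted):
    """Return the first offered type matching an accepted one, or None.

    `accepted` is in preference order and may use wildcards such as 'image/*'.
    """
    for want in accepted:
        want_major, want_minor = split_mime(want)
        for have in offered:
            have_major, have_minor = split_mime(have)
            if want_major == have_major and want_minor in ("*", have_minor):
                return have
    return None


# Hypothetical formats offered by a copy source.
offered = ["text/html", "text/plain", "image/png"]

print(pick_format(offered, ["text/plain", "text/*"]))  # exact match preferred
print(pick_format(offered, ["audio/*", "image/*"]))    # falls through to a wildcard
print(pick_format(offered, ["audio/*"]))               # nothing usable
```

Note what this kind of matching cannot express: a gzip-compressed HTML page is just `application/gzip`, so the text inside it is invisible to the negotiation.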
Even for text, UTF-8 doesn't solve the problem. Users expect to be able to CCP formatted text with the formatting preserved. Sometimes. Maybe.
Persistence and related semantics have been an issue since the 128K Mac introduced CCP to the world in the mid-1980s. The Mac had a literal software "clipboard" app, with pages and everything. This was a nice metaphor, but gave little support for the kind of "quick-transfer" CCP that users wanted; the model has since been minimized. At any rate, CCP-ing content that took a substantial fraction of storage was and is a problem for persistent models. Managing long-persisting state is a problem for the user as well: sooner or later, everyone inadvertently pastes something saved and forgotten long ago into a really bad place.
X took things in a different direction with its "pasting is between apps" semantic. This has the advantage that the transfers are quite efficient and typically require little or no extra storage. It has the disadvantage that when an app exits, its state is no longer available for pasting, although this does protect the user from long-gone state.
So we have two models, both of which have problems and issues. Awkward.
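A toy Python sketch of the two models may help show the tradeoff. The classes `PersistentClipboard`, `OwnerSelection` and `App` are hypothetical and deliberately simplified; real X selections involve a server round trip and format negotiation that are left out here.

```python
class PersistentClipboard:
    """Mac-style model: copying stores the data in the clipboard itself."""

    def __init__(self):
        self.data = None

    def copy(self, data):
        self.data = bytes(data)  # a full extra copy is held until replaced

    def paste(self):
        return self.data


class App:
    """A toy application that can own a selection."""

    def __init__(self, content):
        self.content = content
        self.running = True

    def exit(self):
        self.running = False


class OwnerSelection:
    """X-style model: the selection only records which app owns it."""

    def __init__(self):
        self.owner = None

    def copy(self, app):
        self.owner = app  # nothing is transferred yet

    def paste(self):
        # Data is requested from the owner at paste time.
        if self.owner is None or not self.owner.running:
            return None
        return self.owner.content


editor = App(b"hello")

clipboard = PersistentClipboard()
clipboard.copy(editor.content)

selection = OwnerSelection()
selection.copy(editor)

editor.exit()
print(clipboard.paste())  # still available, and will stay around indefinitely
print(selection.paste())  # gone along with the app that owned it
```

Neither result is obviously the right one, which is the awkward part.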
Users have strong UI expectations around CCP. These expectations are similar but not the same between Mac, Windows and X desktop users. There's a resulting "uncanny valley" effect that's only exacerbated by the fact that each of these platforms (but especially X) has individual applications that also behave a little differently from the norm.
On the Mac, CCP behavior is at least codified by the Mac HIG. I don't know the situation on Windows for sure, but I've never found anything that looked definitive from Microsoft. On X, GNOME has an HIG document that says some things about CCP, but it's pretty incomplete and not too useful. Worse yet, X has two almost-but-not-quite-peacefully coexisting CCP models. Old people like me expect one; people coming from Windows or Mac expect the other.
For example, how many selections should be allowed to be active (ready-to-CCP) at the same time? One? One per display? One per application? One per window? When the user tries to copy, which one should be the source?
Another example: Does your favorite app support making selections using the keyboard? Do different apps handle this slightly differently? (Hint: yes.)
Standardizing UI behavior that nobody knows about or agrees on is hard.
CCP is full of state. This is related to the points above, but different. To get a solid specification, you really want to specify states and state transitions of the whole CCP environment. "When the selections are like this, and the clipboard is like this, and the user performs this CCP action, then here's what happens."
Such specifications are notoriously hard to write. To my knowledge, it has never been done for CCP.
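For a sense of what even a tiny piece of such a specification looks like, here is a Python sketch of a transition function over a two-part state: which window owns the selection and which owns the clipboard. The rules encoded here are one arbitrary choice made up for illustration, not what X or any HIG actually specifies, and `step` and `paste_source` are hypothetical names.

```python
# State is (selection_owner, clipboard_owner); None means "unowned".
# An action is a (kind, window) pair.

def step(state, action):
    """Return the new state after one user action."""
    selection, clipboard = state
    kind, window = action
    if kind == "select":   # user highlights something in `window`
        return (window, clipboard)
    if kind == "copy":     # explicit copy command issued in `window`
        # Assumption: copy only works if `window` holds the selection.
        return (selection, window) if selection == window else state
    if kind == "close":    # `window` goes away, taking its state with it
        return (None if selection == window else selection,
                None if clipboard == window else clipboard)
    raise ValueError(f"unknown action: {kind}")


def paste_source(state, gesture):
    """Which window supplies the data for a paste gesture (an assumed rule)."""
    selection, clipboard = state
    return selection if gesture == "middle-click" else clipboard


state = (None, None)
for action in [("select", "A"), ("copy", "A"), ("select", "B")]:
    state = step(state, action)

print(state)                                # the two owners now differ
print(paste_source(state, "middle-click"))  # one gesture pastes from B
print(paste_source(state, "paste-key"))     # the other pastes from A
```

Three actions and two gestures already produce a state where "paste" means two different things; a real specification would have to cover every window, every toolkit and every format on top of this.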
Imagine that you have the perfect solution to the above problems. You know exactly what should be done, and everybody who matters agrees with you. (This should take about…one day, maybe two. Right?)
Now what? It has taken 10 years and full backward-compatibility to start to get XCB into the world as a replacement for Xlib. By the time we're close to being done, if ever, X will be long dead.
CCP is embedded in everything. Most people who have a "working" implementation will not bother to upgrade it just because yours is better. You probably have to wait for the turnover of every application on the desktop, at least to a new major version.
It has been only a few years since XEmacs abandoned a CCP scheme that was outdated 20 years ago. Why did it take so long? Because no one could be bothered, as far as I can tell.
You of course will have to have an easy-to-use library. It will have to provide a C interface, since that's the closest thing to a lingua franca that we have for the desktop in 2010. But this won't be enough. You'd better make it easy for the common toolkits on your desktop—at least Qt and Gtk for X—by providing even more mechanism. Don't even think about going without some kind of browser support and JavaScript library support; client-side JavaScript has no access to your C library.
You're getting started now. Barely. Enjoy.
Of course, X CCP could be vastly improved. A "CCP strike force" would be a good idea, and I would support it. However, it would be a lot of slow, incremental work, without much of any payoff for the participants in terms of glory, wealth or even user acknowledgment. Until I find a band of folks excited by that description, I think I'll spend my time doing other things. (B)