Why can't free software GUIs be empowering instead of limiting?

It's one of the more popular culture wars in the free software community: GUI versus CLI (graphics versus the command line). Programmers, by selection, inclination, and long experience, are understandably attracted to textual interactions with the computer, but the text interface was originally imposed by technological limitations. The GUI was introduced as an answer to those problems, but has undergone very little evolution from 1973 (when it was invented at Xerox PARC) to today. So why can't we do better than either of these tired old systems?

How we got where we are today... the CLI versus the GUI

Back in the early 1980s we did not use command line interfaces 'because we were macho'. We did it because we had no choice. Back then, color was only a property of terminals in the sense of choosing between "amber on black" and "green on black" (I favored the green, even though it was going out of style—to this day I still set my terminal windows to green-on-black for nostalgia's sake).

We did not use command line interfaces because we were macho. We did it because we had no choice

But when the graphical user interface finally did become available, it was a fantastic improvement. With a well-designed GUI, you don't have to memorize a whole micro-language of commands and options to get things done. The trade-off, at least with the classic "Windows-Icons-Menus-Pointers" (WIMP) GUI, is that it isn't as expressive: it's much easier to say the common things you need to say, but much harder to say things that the programmer didn't expect you to need. The surface simplicity comes at a terrific price in underlying complexity, and that creates practical limits on how flexible the system can be.

The typical desktop GUI hasn't changed much in 30 years!

However, it's been thirty years! In all this time, there's been a lot of computer science innovation: graphics, programming libraries, and artificial intelligence have all improved drastically. But very little improvement has been made at the most fundamental level of interaction between humans and machines. We still have quaint little arguments over "which" is better: the GUI or the CLI. It's become a kind of knee-jerk battle over computer professionalism—the crotchety "geek" versus the clueless "luser" (and who says name-calling and cliquishness end in high school?). Which conveniently whitewashes the point that, by now, we really ought to have more than just these two choices.

We still have quaint little arguments over "which" is better: the GUI or the CLI

We have the original motivation behind the GUI concept (as explained on Wikipedia); we have "Master Foo Discourses on the Graphical User Interface"; and we have a recent article claiming that GUIs are anti-sharing (a rather poorly-written blog, but with a good point at its core).

I think perhaps the last one gets to the core of the problem for me, especially as regards free software desktop systems: GUI interfaces as we have hitherto conceived them do not provide the same induction into the world of real power over the computer that you get when you are forced to interact with the system in the same way as the people who programmed it.

And yet, it's undeniable that free software would be dead in the water without GUIs to make software accessible to new users. And let's not kid ourselves that newbies are the only ones who benefit: I've edited my share of arcane configuration files, and unless it's something you do every day, it is never pleasant. GUI configuration is a big improvement.

Why can't we do better than either?

Which raises a very good question, to which I have yet to see a good answer: why don't we fix it?

If CLIs are so impenetrable, why don't we use some GUI technology to make them less so? Empowered users are supposed to be the life-blood of free software development, so why do we cling so desperately to the barrier created by crusty old text interfaces? Are we afraid that some of the "lusers" might just be able to hack it if only things were a little more inviting?

Empowered users are supposed to be the life-blood of free software development

On the other side, if GUIs are so excluding, why don't we re-think them and introduce features to make them invite users into the development experience? Free software is better when it's easy to contribute, right? So let's pull some of the barriers down and build interfaces that empower users rather than wall them in. Or are we really that terrified of knowing all that "geek" stuff about how the system works?

Or are we just insanely conservative, clinging to the way things were for no better reason than that they've always been that way? The present system was designed by proprietary industry professionals. We know they had a stake in protecting their jobs and isolating the developers from the users. Supposedly, we don't have that problem, but for some reason we're still playing the same game. Why? Free software developers are supposed to have more freedom to try out new ideas, right?

Some ideas of my own...

Well, there are a lot of ideas about improving user interaction; I recommend Wikipedia's Post-WIMP page as a starting point. But I'd like to suggest some very minor ideas that might go some way towards finding a middle ground between the "impenetrable" CLI and the "excluding" GUI.

First of all, let's tackle this from the CLI side. It's been a long time since I used an actual terminal. Most CLI work today is really done through a GUI widget called a "terminal emulator": you know, something like Gnome Terminal (which IMHO is the best available one nowadays). It's not just a plain terminal. It even has "tabbed" terminals, color schemes to keep track of which session you're in, and if you're really into the eye-candy, you can even set the background to an image or set it up with transparency.

Which is all cool, and I use it pretty heavily. But can't we do more? Terminal interfaces are easier to use, for example, when you have an experienced user nearby. Maybe we could provide an intelligent agent on screen to follow what you're typing and fill in some useful information. I originally had in mind writing something like this for kids, so I hope you'll pardon the cuteness (and yes, I have heard of the fiasco called Microsoft Bob, but I'm also of the opinion that just because Microsoft can't do something right doesn't mean it can't be done). With appropriate provision of settings and options, it could be set up to provide just the right amount of help, so that it is neither incessantly annoying nor cryptically unhelpful.

A concept for a new-user-friendly terminal environment. I originally thought of this for kids, but with appropriately flexible settings, it could be useful even for fairly experienced users, as it would keep man pages and other system information at your fingertips. The avatar would probably be optional and changeable

Most of the data, of course, already exists in the form of man pages, help options, and so on. It wouldn't be hard to write an expert system that "knows" the basic ways to tease help out of your system, and then just provide that in a helper window so you can keep track of, for example, all the picky little options that "ls" can take. Or remember recent commands you typed, or commands you use a lot, or look up which command it was that you were trying to remember (maybe by a keyword search).
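Just to make that concrete, here's a minimal sketch of such a helper in Python. Everything here is an assumption for illustration: the "helper>" prompt, the reliance on the whatis database, and the bash history file are all placeholders for whatever a real agent would hook into.

#!/usr/bin/env python
# Sketch of a terminal helper: given the command you're typing, gather
# the kind of context an experienced user sitting next to you would
# rattle off. Assumes "whatis" is installed and a bash history file
# exists; both are just assumptions for the sketch.
import os
import subprocess

def summary(cmd):
    """One-line description from the whatis database, if any."""
    try:
        out = subprocess.run(["whatis", cmd],
                             capture_output=True, text=True)
        return out.stdout.strip() or "(no whatis entry)"
    except OSError:
        return "(whatis not available)"

def recent_uses(cmd, limit=5):
    """Recent shell-history lines that started with this command."""
    path = os.path.expanduser("~/.bash_history")
    if not os.path.exists(path):
        return []
    with open(path, errors="replace") as f:
        hits = [line.strip() for line in f
                if line.startswith(cmd + " ")]
    return hits[-limit:]

while True:
    words = input("helper> ").split()
    if words:
        print(summary(words[0]))
        for use in recent_uses(words[0]):
            print("  earlier:", use)

A real version would live inside the terminal emulator and watch keystrokes as you type, rather than running as a separate prompt loop.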

And then again, we could come at the problem from the GUI side. One of my biggest pet peeves about GUI applications is that they don't tell you what they are called. I mean sure, I call my mail client "Thunderbird", but what you actually have to type to get it to start from the command line is "mozilla-thunderbird". Which is okay once you know, but was a real pain to figure out. I would so like it if every application had a "Help" menu option to show you the command line that actually invoked the program.
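In the meantime, on Linux you can at least dig this information out of the kernel, which records every process's argument vector in /proc. A quick sketch (Linux-specific; you'd get the process ID from something like pidof or a system monitor):

# Recover the command line that started a running program. On Linux,
# /proc/<pid>/cmdline holds the argument vector, NUL-separated.
import sys

def invocation(pid):
    with open("/proc/%d/cmdline" % pid, "rb") as f:
        args = f.read().split(b"\0")
    return " ".join(a.decode() for a in args if a)

print(invocation(int(sys.argv[1])))

A "Help" menu entry could show exactly this for the application's own process.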

There are, of course, some programs that definitely push the envelope with the flexibility of a GUI interface. Blender's interface is absolutely fearless in its innovation. This has cost it some points with new users, but as it is really intended to be optimized for serious "power users", this isn't so much of a problem. The idea behind the Blender interface is definitely empowerment of the user: experienced users can get a lot out of the interface, even if they still rely heavily on hotkeys. The Blender interface isn't really all that unique, but it is huge, and also very consistent internally, which helps the user keep track of the zillions of menus that the program needs.

Blender, showing the results of my first go at the Gingerbread man tutorial, and a whopping lot of menus and buttons. This is definitely not a "luser" interface; it's designed for power users only

Of course, it would be nice if the graphical environment provided some of the combinatorial flexibility that the command line gives you. For example, consider the power of Unix "pipes": you can string together a series of applications, combining the output of one program as the input of another. This can lead to some nifty idioms, like this one:

$ find . -name "*.py" -exec grep "FIXME" {} \;

or this

$ du -sk * | sort -n

or this

$ mpg123 -w - foo.mp3 | oggenc -q 7 -o foo.ogg -

and so on.

But the command line is depressingly linear. Obviously, with the 2D space of the GUI environment, we ought to be able to express more complex pipe relationships. So how about a "stream/flow" interface concept, where the flow of information is literalized as "pipes" or "wires" on the screen? There have been a couple of programming environments like this, so why not use the idea for a desktop GUI environment? It should be quite powerful.

A GUI based on streams of information and a "component architecture" with pins and wires connecting different applications so that they can be chained to do certain kinds of combined tasks, much as you would with Unix pipes, but with more flexibility

As a side benefit, such an arrangement would even give you a kind of graphical analog to writing scripts. You'd just group a collection of filter operators into a container object, then you could name and use that container as another GUI object.

A side-benefit of such a system is that there is an obvious way to create script objects using it
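To suggest how that might work under the hood, here's a minimal Python sketch. All of the names are invented for illustration, and a real implementation would stream data between stages concurrently instead of buffering it; this only shows the shape of the idea:

import subprocess

class Filter:
    """One node in the flow graph: wraps a single command."""
    def __init__(self, *argv):
        self.argv = list(argv)
    def commands(self):
        return [self.argv]

class Chain:
    """A named container of filters, usable anywhere a Filter is:
    this is the "script object" of the figure."""
    def __init__(self, name, *nodes):
        self.name = name
        self.nodes = nodes
    def commands(self):
        return [argv for node in self.nodes for argv in node.commands()]

def run(node, data=b""):
    # Wire stdout to stdin down the chain. This buffers between
    # stages for brevity rather than streaming.
    for argv in node.commands():
        data = subprocess.run(argv, input=data,
                              stdout=subprocess.PIPE).stdout
    return data

# The du | sort idiom from above, packaged as a reusable, nameable object:
disk_usage = Chain("disk_usage", Filter("du", "-sk", "."),
                   Filter("sort", "-n"))
print(run(disk_usage).decode())

The point is that a Chain can be wired in anywhere a Filter can, which is exactly what the graphical container would let you do.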

Let your imagination go

Of course, these are just a couple of goofy ideas off the top of my head. I'm sure there are many other alternative ways that the desktop user environment could work. You could take advantage of new input devices, for example, such as digitizer pens (much more affordable than they used to be), space manipulators, virtual reality input devices (e.g. gloves), and so on. Or, you could just use the good old mouse and keyboard, but in new ways.

We should be empowered by what has gone before, not limited by it!


Comments

BenedictArf

HCI is an area that I'm very interested in. I've read a few books and essays that others may find useful in understanding the problems and how to approach them.

http://worrydream.com/MagicInk/ This article discusses many things of interest to HCI design, but the part that stuck with me is the analysis of the forms in which HCI interaction occurs. It then proposes how to approach the different types of interaction.

The Design of Everyday Things, D. Norman. This book discusses how people interact with objects. It doesn't deal with HCI specifically, but all the points made are relevant to good interface design. This book really has changed the way I look at everything. For example, I now know why the handle to raise and lower my car window is annoying.

The Humane Interface, J. Raskin. This book is incredible. It first explains the physiological background to HCI and then explains the concepts for good interface design. The last section of the book explains ZUIs (zooming user interfaces). Personally I love the idea of ZUIs and would like to see some working examples. However, I fear that they are fatally flawed, as I can't see how they can support multitasking without undermining the whole premise of the ZUI.

Cheers Benedict
2008-02-21 edited by FSM to tidy and liven links

Terry Hancock

By the way, since your reference is a book, here's the Wikipedia entry on Zooming User Interfaces.

I observe that Blender implements something like this internally, because every window, even the buttons windows, is zoomable using the mouse wheel on scrolling mice (I'm sorry, I've forgotten what the alternative is, but there is one; it may be pressing the middle mouse button and moving up and down, but I'm not sure).

That would absolutely work with virtual desktops, and with OpenGL graphics, it shouldn't be all that hard to do.

Virtual desktops are one of the features I really like in KDE (I assume Gnome has something similar). I have 20 desktops configured on this computer, and regularly use the desktop selector to move between different tasks that I'm working on in parallel.

ISTM that the ZUI would be a more fluid version of that basic concept.

Beatbug

The idea of using graphic blocks to represent discrete sections of code in a stream/flow interface is quite well established and already in place...for musicians working with synthesizers. A Swedish friend of mine has been using a simple-looking synthesizer hooked up to his Dell and Apple laptops in order to construct instruments entirely within a standard GUI. Different visual blocks are assembled on a field and attached with wires -- output to input -- and each block's function is tweaked with sliders to represent the desired effect on the audio signal. These definitions can be plugged into one another in any manner Magnus desires. Once programmed, the synthesizer can be entirely disconnected from the laptops and carried to a gig, ready to go.

We absolutely should already have this sort of interface available to us. How hard would it be to integrate something like this with KDevelop, perhaps as an alternative view to the standard code? I would find it useful to be able to have the code available alongside a graphical representation of it, especially for object-oriented programming. A class that was broken (say, missing a closing }) could be indicated by rendering the associated graphic object in red; or the graphic view could be manipulated to verify that a certain set of object methods are private or public.

A good read! Thanks!

Purple Hexagon

Your wire programming reminds me of Scratch (http://scratch.mit.edu/).

You say "With a well-designed GUI, you don’t have to memorize a whole micro-language of commands and options to get things done" but it simply isn't true.

We don't carry around everything we want to communicate printed on little cards using instantly recognisable glyphs: because we would need so many, we build messages from words instead. Where we do use images, it's because they convey information well. The simple truth, proven over thousands of years of recorded history, is that there is no clear winner between words and pictures.

We don't draw commands to convey complexity to the computer, because we need precision. But we do use graphics as labels that capture a whole operation.

But the reality is that memorising the micro-language, and building more complex actions from it, is much more powerful than dragging around lots of little pictures to link up. Again, going back to drawing representations for the computer: with auto-completion and typing we can link commands and achieve much higher and more flexible code density than with dragged boxes. Any dragged box can be represented by typing a function name, or by dragging one in from the same palette you'd use for a diagram.

The final factor is level of experience. Yes, the GUI rules for my mum. She uses the machine less frequently; the time spent learning runes is wasted because she doesn't use it often enough to make it stick. Plus, the actions she wants are few and limited, meaning you can catch those few and attach convenient labels. But she doesn't draw pictures of her day to send me; she types, because she can convey her message more accurately and quickly that way.

Terry Hancock

A set of "little cards using instantly recognisable glyphs" is pretty much what a character set is, isn't it? Writing has certainly been a successful alternative to speech for thousands of years. And ultimately, writing derives from iconic representations.

And just as there are many concepts which are difficult to communicate without language, there are an enormous number which are easier to depict with graphics (ask any engineer about that one!).

Another piece of literature to consider when thinking about this: Understanding Comics by Scott McCloud (it's a print book, I don't think there's an online version).

"GUI" has come to mean not just a "graphical interface", but a very specific and very limited graphical interface. "CLI" has a similarly narrow definition. My point is that the distinction is artificial and there's plenty of room to explore in between these extremes.

As for the benefit of learning verbal or textual languages to communicate and why we do that with natural languages instead of, say, drawing pictures to communicate (let's not forget that some people do just that!), the reasons are complex.

One point is frequency of use. For a professional programmer, learning a programming language is an obvious requirement, and not onerous, because if you use it every day, you don't have to worry about forgetting it. Natural language is like this.

But there are many applications which are infrequently used (e.g. how many times a week do you set up a printer? Or a web server?). For these applications, having to learn a specialized language is extremely limiting.

Some initiatives, like standardizing on XML syntax, are useful in this regard, because they reduce the chore of learning new language rules. You are essentially in the position of having to learn new vocabulary, but not new grammar, which is a big step in the right direction. But even XML can be really obtuse.

Of course, the rules are complex. I think that one of the attractions of Python for engineers and scientists (as opposed to professional programmers) is that, because most of the commands and operators are words instead of symbols (unlike Perl, for example), it's easier to look up code that you don't understand when you see it. That certainly is an interesting contrast with the general theory that graphic/symbolic modes are more "intuitive" and therefore easier for infrequent users.

Ryan Cartwright

One of the downsides--for me anyway--of GUIs is the lack of auto-completion and the inability to use them entirely from the keyboard. I find it a lot easier to type two characters and hit tab than scan through a list of commands on a menu. Also I find it frustrating to get so far with GUI keyboard control only to have to use the mouse to make the final selection or option.

All is not lost though and a proper combination is possible--just perhaps not for an entire window manager. A CAD software firm I once worked for had a dual interface. The main drawing window had a point-and-click menu of cascading and hierarchical items. The same commands could be reached by typing in a shell window, which always remained behind but two or three rows below the graphic window.

As you typed, the menu would follow you; similarly, as you clicked, the commands appeared in the shell window. E.g. the command to put a line on the screen was "create line", often aliased to "crl". If you typed "crl", the create menu and the line sub-menu both appeared with a host of options. Most users would type "crl" and use the mouse to place the end-points on the drawing (co-ordinates were displayed on screen). It was not unknown for users to spend the whole time with one hand on the keyboard and the other on the mouse. It made for very speedy working and helped users learn the syntax very quickly.

andrewg

Terry, you and your commenters have ploughed up an issue that's been bugging me for years. There are many examples of programs that duplicate functionality between CLI and GUI versions, often developed independently. While it is easy to write a GUI wrapper to a CLI program, the reverse is far from trivial. Even when CLI and GUI alternatives exist there is rarely a feature-for-feature correspondence.

In a similar vein, attempts have been made to provide event-driven scripting interfaces to GUI software, most notably AppleScript. The disadvantage is that AppleScript adds yet another inconsistent interface to the developer's workload, and thus is rarely implemented outside of Apple's own software.

The example of the CAD program given above illustrates what can be achieved when alternative interfaces are developed hand-in-hand and self-consistently. This is of course more easily done within a monolithic application than across the cacophony of different software installed on a typical desktop system.

Wouldn't it be great if there was a way for application developers to write their code against an abstracted HCI, and then have the Gnomes and KDEs of the world provide a selection of prebuilt but flexible interfaces (CLI, GUI, CAD-like, event-driven) which would each be applicable to any conformant application? This would mean that each GUI action would always have a corresponding, predictable, CLI option or IPC event. This would have the added advantage that new interface systems could be easily grafted onto existing software.
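As a toy illustration of that (entirely hypothetical names, with a numbered text menu standing in for the GUI front end), an application could declare its actions once and let thin front ends render the same table:

import sys

ACTIONS = {}

def action(name, help_text):
    """Register a function once; front ends discover it from the table."""
    def register(func):
        ACTIONS[name] = (func, help_text)
        return func
    return register

@action("open", "open a file")
def do_open(arg):
    print("opening", arg)

@action("save", "save the current file")
def do_save(arg):
    print("saving", arg)

def cli(argv):
    # CLI front end: "app.py open foo.txt"
    func, _ = ACTIONS[argv[0]]
    func(argv[1] if len(argv) > 1 else "")

def menu():
    # Stand-in for a GUI front end: same table, rendered as a menu.
    names = sorted(ACTIONS)
    for i, name in enumerate(names, 1):
        print("%d. %s - %s" % (i, name, ACTIONS[name][1]))
    choice = names[int(input("? ")) - 1]
    ACTIONS[choice][0](input("argument: "))

if __name__ == "__main__":
    cli(sys.argv[1:]) if len(sys.argv) > 1 else menu()

Every menu action then gets a predictable command-line twin for free, which is exactly the property I'm after.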

inputexpert

GUIs and CLIs are limiting because of a hardware problem.

Because of the stand-alone keyboard and mouse, the debate between GUI and CLI was a big issue. The user had to repeatedly move their hand from keyboard to mouse.

When the features and performance of the mouse are integrated into the keyboard, the debate no longer exists. The GUI and CLI can peacefully coexist.

I have developed and been using an advanced keyboard for the last three years, which enables me to point, click, type, and scroll instantaneously, all while my fingers are on the home row. I have total control over the computer screen.

Having solved the hardware problem behind the GUI vs. CLI issue, I am now working on advanced interfaces for a PhD in advanced input, interface, and interaction technology.

The interface of the future I am working on is one that is fully customizable by the user, using GUI, CLI, and a search box equally. It sits on top of all applications. It is a master interface/personal interface.

We all work differently on the computer, so an advanced interface should be fully customizable to the user’s preferences and skill level.

Conclusion

The limiting factor in GUI and CLI advances was a hardware problem.

from the “father of the perfect keyboard”

Terry Hancock

I'm not certain how your "perfect keyboard" is designed, but your description sounds a lot like IBM's "TrackPoint" on their early "ThinkPad" laptops.

I had one of those, and I loved it. It was much more comfortable than either a mouse or the touch pad.

I was really sad when they retired the idea to go with the standard "touch pad" arrangement on later laptop models (I understand the reason was that they tended to wear out quickly).

Ryan Cartwright

I also love the touchpoint and where possible try to use laptops with one instead of the glidepoint--although they are getting harder to find these days.

Much as I love it though I'm not sure it resolves the problem you discuss in this piece. A touchpoint is just another mouse-like device--albeit one which requires less movement of your hands away from the keyboard. A lot of GUIs tend to favour the mouse user (obviously). Some give keyboard shortcuts but the end-result feels more like an afterthought. Mostly this is because things like auto-completion are missing. There are of course some apps which are the other way around--where the GUI feels more like an accommodation of mouse lovers. GVim springs to mind.

My experience of the CAD app I mention above (and this was in the late 80s) has left me wanting more of the same ever since. Perhaps some clever sorts could build a WM which has a permanent shell window open and which enables compatible shell commands/GUI apps to be operated using both keyboard and mouse. For example if you start typing a series of piped shell commands, the GUI could give you a visual representation of them and enable you to right click and select further options/view man pages etc.--handy if you can't remember all of the options for the command.
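Even a toy sketch hints at what that feedback could look like (plain ASCII here standing in for what the window manager would actually draw):

# A typed pipeline, re-rendered as a chain of stages:
line = "du -sk * | sort -n | tail -5"
stages = [s.strip() for s in line.split("|")]
print("  -->  ".join("[ %s ]" % s for s in stages))
# prints: [ du -sk * ]  -->  [ sort -n ]  -->  [ tail -5 ]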

@inputexpert: I would love to see more about this keyboard of yours - sounds very interesting. A little mean to tempt us with snippets like that and not provide links to further info/images(!) though :o)

cheers Ryan

Terry Hancock

"Much as I love it though I’m not sure it resolves the problem you discuss in this piece."

No, I don't think so either. I think the problems are more conceptual than physical.

But one thing it does do is make the transition between mouse and keyboard smoother. This should be a big benefit in something like Blender, for example, where you use both a lot. With a conventional setup, this means using your keyboard with your left (or less dextrous) hand while using the mouse with your right (or more dextrous) hand. That can be a little awkward. With the touchpoint, though, the transition is smoother, because your fingers never get far from their touch-typing "home" position, so you can continue to type two-handed.

Of course, this remains hypothetical because my TP-380D had nowhere near the CPU and video hardware to run Blender!

But I often wished I had that touchpoint keyboard on my desktop system. I'd seriously consider buying one if I could find it (well, and if the drivers would work).

mish

You may also want to check out the hotwire shell project - they are working on a shell that is built from GUI elements - for example, if you do "ls" then you can double-click on the listing to open the file.

There's lots more to it, so go read

http://hotwire-shell.org/

They also have a list of related projects

http://code.google.com/p/hotwire-shell/wiki/RelatedProjectsAndIdeas

Author information

Terry Hancock

Biography

Terry Hancock is co-owner and technical officer of Anansi Spaceworks. Currently he is working on a free-culture animated series about space development, called Lunatics, as well as helping out with the Morevna Project.