Digital archaeology of the microcomputer, 1974-1994


(Or, how to prevent the Dark Ages of computing through free software)

In a few years' time, it will be impossible to study the history of home computers, because everything from that era was proprietary: both the physical hardware and the software that ran on it, most of which is encumbered by software “protection” to prevent copying.

To compound the problem, the hardware is (literally) dying and, being proprietary, can’t be rebuilt in any equivalent manner. In some cases the software is physically disintegrating too: for many 8-bit micros of the 1980s the storage medium was cassette tape, a temperamental mechanism at the time, let alone now. It’s not that no computer innovation took place in the 1980s, just that none of it will be recorded.

What follows is a ten-point plan outlining the primary issues of digital archaeology, the methods necessary to preserve the legacy, and how free software can lead this endeavour.

1. Keep the ROM image

Amstrad were very good in making the (proprietary) Spectrum ROM available for all emulator users, and other companies should be encouraged to follow suit. Most developers of that era wrote code that directly called machine code routines at specific addresses in ROM (or even jumped halfway into an instruction), which makes the development of a binary-compatible replacement ROM very difficult. This contrasts with software developed for DOS and later operating systems, which reaches the system indirectly through interrupts or dynamically linked API calls, and so allows an interface-compatible replacement to be developed, such as Free-DOS or Wine.

The ROM images must be made publicly available, along with documents describing the file format in which they’re stored and the memory location(s) into which they should be loaded. An MD5 checksum is also useful to differentiate between the different versions of seemingly identical machine.rom files that exist.
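As a minimal sketch of the checksum idea, the following computes an MD5 digest for a ROM image using Python's standard hashlib module. The function names file_md5 and same_rom are illustrative and not taken from any existing archive tool.

```python
import hashlib


def file_md5(path, chunk_size=64 * 1024):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so a large image need not fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_rom(path_a, path_b):
    """True when two image files have the same MD5 digest."""
    return file_md5(path_a) == file_md5(path_b)
```

Publishing the digest alongside each image lets anyone confirm which version of machine.rom they actually hold, whatever the file happens to be called.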

2. Document the audio structure

This means both the baud rate of the cassette recording and the method used to encode each byte. Knowing the frequency of the intended recording, and how it should sound, will help a lot. This is especially true if the only audio recording available is a copy made on a broken tape recorder that runs slightly too fast or too slow and suffers from wow and flutter. Software can only fix this if the operator knows what to fix.

Complete sampled audio files must exist for each known platform. If the platform has a different save format for BASIC and machine code programs, for example, then two audio files should be available. Each sampled file should have annotations describing how the header and data block markers are stored, and what the resultant data should look like.
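To illustrate why the documented frequency matters, here is a rough sketch that estimates the pitch of a steady tone in a sampled recording and compares it with the frequency the documentation says it should have. It assumes the recording contains a steady tone to measure, and the 800 Hz figure in the usage note is purely illustrative, not taken from any platform's specification. Counting zero crossings is a deliberately simple stand-in for proper signal analysis, and it does nothing about wow and flutter.

```python
def estimate_tone_hz(samples, sample_rate):
    """Estimate the frequency of a steady tone by counting rising zero crossings.

    `samples` is a sequence of signed sample values centred on zero.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    crossings = 0
    for previous, current in zip(samples, samples[1:]):
        if previous < 0 <= current:
            crossings += 1
    duration_seconds = len(samples) / sample_rate
    return crossings / duration_seconds


def speed_ratio(measured_hz, documented_hz):
    """How fast the tape was running: above 1.0 is too fast, below 1.0 too slow."""
    return measured_hz / documented_hz
```

If a tone documented at (say) 800 Hz measures 840 Hz, speed_ratio gives roughly 1.05, telling the operator the copy was made about 5% fast and needs resampling before decoding. Without the documented figure there is nothing to compare against.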

3. Use the highest resolutions

It is obvious to most, but always store data in the highest resolution that exists. Don’t try to save disk space by compressing the 400 KB WAV file of a program down to an MP3 file, since that introduces loss. And information loss costs future generations more than the few pence saved in disk space. Every ZX Spectrum game ever written will fit onto a single DVD, so no one has an excuse.

Additionally, back up the data to a remote site in case the worst happens. So if you must hoard, give others access to your hoard.

Oh, and if you use compression to store files, make sure a free software implementation of the compressor/decompressor exists, and that it works on each file.
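Checking that the decompressor “works on each file” can be automated. The sketch below uses gzip only as an example of a lossless format with free software implementations (here, Python's standard gzip module). The function name compress_verified is hypothetical.

```python
import gzip


def compress_verified(data):
    """Compress `data` with gzip and confirm it decompresses to identical bytes.

    Raises ValueError instead of returning an archive that cannot be trusted.
    """
    packed = gzip.compress(data)
    if gzip.decompress(packed) != data:
        raise ValueError("round trip failed: keep the uncompressed original")
    return packed
```

The point of the design is that the original is only ever replaced by a compressed copy that has already been proven to restore it byte for byte.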

4. Document each piece of software

A cassette tape bearing the name “Jet Set Willy” will mean little to future generations. Furthermore, no historian will know which of the myriad versions of Jet Set Willy the recording holds: ZX Spectrum, Commodore 64, or even Dragon 32.

At a bare minimum, all software should be accompanied by a description of the platform, memory requirements, language, and method by which it should be loaded, as some machines load programs differently depending on whether they are BASIC or machine code.
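One possible shape for such a description is a small record stored as JSON next to the recording. The field names below are assumptions made for illustration, not an established cataloguing standard, and the example values are illustrative too.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SoftwareRecord:
    """Minimum description of one archived program (hypothetical schema)."""
    title: str
    platform: str
    memory_kb: int
    language: str
    load_method: str

    def to_json(self):
        # Plain-text JSON stays readable without any special tool.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


# Illustrative values only; check them against the actual tape.
example = SoftwareRecord(
    title="Jet Set Willy",
    platform="ZX Spectrum",
    memory_kb=48,
    language="machine code",
    load_method='LOAD ""',
)
```

Calling example.to_json() yields a text file that can sit beside the WAV file, so the recording never travels without its description.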

5. Use network-aware content management systems

Information is only of use if it can be read. Store all such information in accessible systems, connected by an open standard such as TCP/IP. That way even an old, proprietary system can hold the information if necessary, and it is easy to find out whether a PlayStation 2-Linux kit, or a ZX81, needs a special sync-on-green display in order to work.

Free software has provided many good solutions to this problem already, and I shall start no flame wars by recommending any specific one.

6. Circumvent copy protection

Imagine the faces of generation now+n when they’ve finally got their emulator working, loaded a snapshot of software X, and discovered a screen saying “Now enter the fourth word on page 27 of the manual”. Most people don’t have the manual to their VCR, let alone a computer game from 20 years ago.

Also, think of all the clever techniques used in copy “protection” through the 1980s: missing file headers, double speed loading, and so on. Perhaps there are five people currently in the world who know how any specific variation of copy protection worked: the one who wrote it, and the four who cracked it for pirate disks or to make it emulator-friendly. In a few years, all this knowledge will be lost.

Save all snapshots in a position after any copy protection methods, and/or indicate how the system works so it can be bypassed.

7. Free software emulators

Releasing emulators as free software helps port them to the latest and greatest platforms. It also stops bit rot from setting in when the only emulator available for platform X runs on MS-DOS 5.0 and has not been updated since 1998. The adage that “the only complete software is obsolete software” does not hold true when it comes to software archiving.

Having the source code also helps users when it comes to the methods of loading software into the emulator, as documentation can be scant.

If you have emulator source code, release it under a free license now before it’s too late! (It’s also easier than writing documentation!)

8. Free software tools

No emulator stands alone. There’s always a selection of conversion and snapshot management tools to help get the data into the right format. If no one knows what this format is intended to be, the problem of deciphering it is no different to understanding the original format.

Information should also be provided on suggested settings for the conversion processes, covering “how” as well as “why”, and describing both the host and target machines. This is time-critical: it needs doing now, while MS-DOS can still be coaxed into working. Because the timings on DOS machines were often implemented with hacks based on processor speed, problems can and will occur in this area. No matter how last century Free-DOS might appear, it serves a useful purpose in digital archaeology.

9. Latest is not greatest

As most historians will tell you, go back to the original documents. So, at any time you find a new version of some old software, remember to keep the original. Bugs may have been introduced, or features might have been removed, or any number of other issues.

Keep two credos in mind: “keep everything” and “if it ain’t broke, don’t fix it”.

10. The source chain

When considering how to archive material for the future, consider “the source chain”. That is, you must have the source for every step of the archival process for it to be available to future generations. If the entire chain is not available in source form, store the material in the format produced by the last tool in the chain for which source is available. The typical chain is:

  • Cassette tape (physical)
  • Sampled audio (WAV file)
  • Virtual cassette file (VCF), for a specific emulator
  • Emulator-specific format, as a program
  • Emulator-specific snapshot, as a memory dump

So, if you have the source for the conversion tool that translates WAV into Virtual Cassette File, but not the source for the emulator that uses it, the Virtual Cassette File is the last future-safe format. If you do not have that, and the emulator (or its snapshot format) is closed, then the last safe format is sampled audio, preferably WAV, because plenty of source code is available that describes it and some modern editors do not support the RAW, AU or SND formats.
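The rule can be written down as a short walk along the chain. The sketch below is one reading of it: each stage after the physical tape is paired with a flag saying whether source is available for the tool or format description that produces it, and the walk stops at the first gap. The function name and the stage labels are illustrative.

```python
def last_future_safe(stages):
    """Return the name of the last stage reachable using only tools with source.

    `stages` is a list of (name, source_available) pairs, ordered from the
    sampled audio onwards. Returns None if even the first stage has no source.
    """
    safe = None
    for name, source_available in stages:
        if not source_available:
            break  # everything past a closed step cannot be relied upon
        safe = name
    return safe


# Example: free WAV-to-VCF converter, closed emulator.
chain = [
    ("sampled audio (WAV)", True),
    ("virtual cassette file (VCF)", True),
    ("emulator-specific program format", False),
    ("emulator-specific snapshot", False),
]
```

Here last_future_safe(chain) returns the virtual cassette file stage, matching the first case described above. A later stage that does have source is still ignored once an earlier link is closed, because there would be no documented way to reach it.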

So there you have it. Simple, obvious, common-sense ideas to ensure there is a history of computing to look back upon. After all, if the machines of our past are truly held in such high esteem, shouldn’t we expend the effort to preserve them?

Comments

Submitted by Anonymous visitor (not verified) on

While the old TRS-80 is not directly mentioned by this entry, I would note that much emulator information is available, and an open-source emulator, xtrs, is available, under a free license. See Tim Mann's TRS-80 page at http://www.tim-mann.org/trs80.html and www.trs-80.com for many more details.

Submitted by Anonymous visitor (not verified) on

You're right, there are a number of Free and open source emulators available (I've written some myself), but there are several large gaps in the "product line." Furthermore, there is no correlation between a machine's importance, originality, or design and whether a free emulator exists for it.

Submitted by Anonymous visitor (not verified) on

I actually ran into this situation once. I supported a group of scientists doing waste management modelling of high level nuclear waste. All very rigorous science, replicable experiments, stress tested, all heavily documented ... in a primitive, proprietary-format CMS, which I very much doubt still exists.

I've kept all my documentation from that time forward as ASCII text dumps. I can't say the same for others.

Submitted by Anonymous visitor (not verified) on

Sometimes, when you read an article you get a WOW, what sense, and look at the implications! This is such an article. Governments archive their email for posterity; historians look at libraries burnt and wish it had not happened. The value of "missing archives", be it software or any media, cannot be thought of just in monetary terms; we owe the future as much as, if not more than, we owe the past.

As this article quite rightly points out: "And information loss costs future generations more than the few pence saved in disk space. Every ZX Spectrum game ever written will fit onto a single DVD, so no one has an excuse."

I am not one for a "law for this and a law for that"; however, it is a duty of us all to preserve for future generations their past.

Perhaps all proprietary, copyrighted material should not just be made available, but a duty should be placed on the companies that, while they own the rights, they agree to preserve the whole for the future, and not ten or twenty years later say "Oh sorry, an intern burnt that part".

My 2 cents worth and thank you for a good article that said WOW

Ian Macdonald
Ian.macdonald@aibh-ye.com

Submitted by Anonymous visitor (not verified) on

I'm on the latter end of the author's timespan: 1994-ish. Last year, I got bitten by the nostalgia bug and decided to recreate the hardware and software experiences of a series of typical early WWW users. So, I kept a log of all activities, scoured eBay for tech books from 1993-1995, and started off by buying x86 and Mac 68k laptops from the period. They work fine, but it was a nightmare trying to get a NIC to work in a 1995 Compaq desktop that was supposed to become a Linux 1.2 server with CERN httpd. That's when I discovered the joy of VMware Player and installed it on my Linux laptop. Now, I can boot up Win 3.11 with Mosaic and Slackware 2.1 and pretend I'm a 1994 websurfer. Getting the OSs and software took many nights of deep Google searches and FTP archive hunting. But it's worthwhile to make an archive of these historic programs and their OS environments.

Submitted by Anonymous visitor (not verified) on

I'm pretty sure the Spectrum was manufactured by Sinclair, not Amstrad.

Submitted by Anonymous visitor (not verified) on

Sinclair built the original Spectrum, but Amstrad bought the rights in 1986. It was Amstrad who then permitted the use of the ROMs.

Author information

Biography

When builders go down to the pub they talk about football. Presumably therefore, when footballers go down to the pub they talk about builders! When Steven Goodwin goes down the pub he doesn’t talk about football. Or builders. He talks about computers. Constantly...

He is also known as the angry man of open source.

Steven Goodwin has a blog that no one reads, and a beer podcast that no one listens to :)
