Linux distributions, focus on a 1GB flash drive, not on a CD ISO

I’m looking at you Ubuntu.

For some time now, the standard form for distributing a free OS (i.e. Linux, *BSD) has been as a CD-ROM or DVD ISO file. You burn it to a CD, and you can boot and install from that, and also use the disk as a live CD.

There are a variety of pages with instructions on how to convert such an ISO into a bootable flash drive, and scripts and programs for linux and even for windows — for those installing linux on a windows box.

These are great, and I used one to make a bootable Ubuntu stick on my last install. And wow! It's such a nicer, faster experience than using a CD that it's silly to use one on any system that can boot from a USB drive, and that's most modern systems. With zero seek time, it's a much more pleasant install.

So I now advocate going the other way. Give me a flash image I can dd to my flash drive, and a tool to turn that into an ISO if I need an ISO.

This has a number of useful advantages:

  • I always want to try the live CD before installing, to make sure the hardware works in the new release. In fact, I even do that before upgrading most of the time.
  • Of course, you don’t have old obsolete CDs lying around.
  • Jumping to 1 gigabyte allows putting more on the distribution, including some important things that are missing these days, such as drivers and mdadm (the RAID control program.)
  • Because flash is a dynamic medium, the install can be set up so that the user can, after copying the base distro, add files to the flash drive, such as important drivers — whatever they choose. An automatic script could even examine a machine and pull down new stuff that’s needed.
  • You get a much faster and easier to use “rescue stick.”
  • It’s easier to carry around.
  • No need for an "alternate install," and it would perhaps also be easier to have the upgrader use the USB stick as a cache of packages during upgrades.
  • At this point these things are really cheap. People give them away. You could sell them. This technique would also work for general external USB drives, or even plain old internal hard drives temporarily connected to a new machine being built if boot from USB is not practical. Great and really fast for eSata.
  • Using filesystems designed not to wear out flash, the live stick can have a writable partition for /tmp, installed packages and modifications (with some security risk if you run untrusted code.)

The perils of recalls of electronic products

Product recalls have been around for a while. You get a notice in the mail. You either go into a dealer at some point, any point, for service, or you swap the product via the mail. Nicer recalls mail you a new product first and then you send in the old one, or sign a form saying you destroyed it. All well and good. Some recalls are done as “hidden warranties.” They are never announced, but if you go into the dealer with a problem they just fix it for free, long after the regular warranty, or fix it while working on something else. These usually are for items that don’t involve safety or high liability.

Today I had my first run-in with a recall of a connected electronic product. I purchased an "EyeFi" card for my sweetie for Valentine's Day. This is an SD memory card with a wifi transmitter in it. You take pictures, and it stores them until it encounters a wifi network it knows. It then uploads the photos to your computer or to photo sharing sites. All sounds very nice.

When she put in the card and tried to initialize it, up popped a screen. "This card has a defect. Please give us your address and we'll mail you a new one, and you can mail back the old one, and we'll give you a credit in our store for your trouble." All fine, but it refused to let her register and use the card. We can't even use it for a few days to try it out (knowing it may lose photos.) What if I wanted to try it out to see whether I was going to return it to the store? No luck. I could return it to the store as-is, but that's work and may just get me another one on the recall list.

This shows us the new dimension of the electronic recall. The product was remotely disabled to avoid liability for the company. We had no option to say, “Let us use the card until the new one arrives, we agree that it might fail or lose pictures.” For people who already had the card, I don’t know if it shut them down (possibly leaving them with no card) or let them continue with it. You have to agree on the form that you will not use the card any more.

This can really put a damper on a gift, when it refuses to even let you do a test the day you get it.

With electronic recall, all instances of a product can be shut down. This is similar to problems that people have had with automatic "upgrades" that actually remove features (like adding more DRM) or that undo your iPhone jailbreak. You don't own the product any more. Companies are very worried about liability. They will "do the safe thing," which is to shut their product down rather than let you take a risk. With other recalls, things happened on your schedule. You were even able to just decide not to do the recall. The company showed it had tried its best to convince you to do it, and could feel satisfied for having tried.

This is one of the risks I list in my essays on robocars. If a software flaw is found in a robocar (or any other product with physical risk) there will be pressure to “recall” the software and shut down people’s cars. Perhaps in extreme cases while they are driving on the street! The liability of being able to shut down the cars and not doing so once you are aware of a risk could result in huge punitive damages under the current legal system. So you play it safe.

But if people find their car shutting down because of some very slight risk, they will start wondering if they even want a car that can do that. Or even a memory card. Only with public pressure will we get the right to say, “I will take my own responsibility. You’ve informed me, I will decide when to take the product offline to get it fixed.”

Better forms of Will-Call (phone and photo)

Most of us have had to stand in a long will-call line to pick up tickets. We probably even paid a ticket "service fee" for the privilege. Some places are helping by having online printable tickets with a bar code. However, that requires networked bar code readers at the gate which can detect things like duplicate bar codes, and venues seem to prefer giant lines and lots of staff to installing such machines.

Can we do it better?

Well, for starters, it would be nice if tickets could be sent not as a printable bar code, but as a message to my cell phone. Perhaps a text message with a coded string, which I could then display to a camera that does OCR on it. It's the same as a bar code, but I can actually get it while I am on the road and don't have a printer. And I'm less likely to forget it.

Or let's go a bit further and have a downloadable ticket application on the phone. The ticket application would use bluetooth and a deliberately short-range reader. I would go up to the reader, push a button on the cell phone, and it would talk over bluetooth with the ticket scanner and authenticate the use of my ticket. The scanner would then show a symbol or colour, and my phone would show that symbol/colour to confirm to the gate staff that it was my phone that synced. (Otherwise it might have been the guy in line behind me.) The scanner would be just an ordinary laptop with bluetooth. You might be able to get away with just one (saving the need for networking) because it would be very fast. People would just walk by holding up their phones, and the gatekeeper would look at the screen of the laptop (hidden) and the screen of the phone, and, as long as they matched, wave through the number of people shown on the laptop screen.

Alternately you could put the bluetooth antenna in a little faraday box to be sure it doesn’t talk to any other phone but the one in the box. Put phone in box, light goes on, take phone out and proceed.

Photo will-call

One reason many will-calls are slow is they ask you to show ID, often your photo-ID or the credit card used to purchase the item. But here’s an interesting idea. When I purchase the ticket online, let me offer an image file with a photo. It could be my photo, or it could be the photo of the person I am buying the tickets for. It could be 3 photos if any one of those 3 people can pick up the ticket. You do not need to provide your real name, just the photo. The will call system would then inkjet print the photos on the outside of the envelope containing your tickets.

You do need some form of name or code, so the agent can find the envelope, or type the name in the computer to see the records. When the agent gets the envelope, identification will be easy. Look at the photo on the envelope, and see if it’s the person at the ticket window. If so, hand it over, and you’re done! No need to get out cards or hand them back and forth.

A great company to implement this would be PayPal. I could pay with PayPal, not revealing my name (just an E-mail address), and PayPal could have a photo stored and forward it on to the ticket seller if I check the box to do this. The ticket seller never knows my name, just my picture. You may think it's scary for people to get your picture, but in fact it's scarier to give them your name. They can collect and share data on you under your name. Your picture is not very useful for this, at least not yet, and if you like you can use one of many different pictures each time — you can't keep using different names if you need to show ID.

This could still be done with credit cards. Many credit cards offer a “virtual credit card number” system which will generate one-time card numbers for online transactions. They could set these up so you don’t have to offer a real name or address, just the photo. When picking up the item, all you need is your face.

This doesn’t work if it’s an over-21 venue, alas. They still want photo ID, but they only need to look at it, they don’t have to record the name.

It would be more interesting if one could design a system so that people can find their own ticket envelopes. The guard would let you into the room with the ticket envelopes, and let you find yours, and then you can leave by showing your face is on the envelope. The problem is, what if you also palmed somebody else’s envelope and then claimed yours, or said you couldn’t find yours? That needs a pretty watchful guard which doesn’t really save on staff as we’re hoping. It might be possible to have the tickets in a series of closed boxes. You know your box number (it was given to you, or you selected it in advance) so you get your box and bring it to the gate person, who opens it and pulls out your ticket for you, confirming your face. Then the box is closed and returned. Make opening the boxes very noisy.

I also thought that for Burning Man, which apparently had a will-call problem this year, you could just require all people fetching their ticket be naked. For those not willing, they could do regular will-call where the ticket agent finds the envelope. :-)

I’ve noted before that, absent the need of the TSA to know all our names, this is how boarding passes should work. You buy a ticket, provide a photo of the person who is to fly, and the gate agent just looks to see if the face on the screen is the person flying, no need to get out ID, or tell the airline your name.

Making RAID easier

Hard disks fail. If you prepared properly, you have a backup, or you swap out disks when they first start reporting problems. If you prepare really well you have offsite backup (which is getting easier and easier to do over the internet.)

One way to protect yourself from disk failures is RAID, especially RAID-5. With RAID, several disks act together as one. The simplest protecting RAID, RAID-1, just has 2 disks which work in parallel, known as mirroring. Everything you write is copied to both. If one fails, you still have the other, with all your data. It’s good, but twice as expensive.

RAID-5 is cleverer. It uses 3 or more disks, and uses error correction techniques so that you can store, for example, 2 disks worth of data on 3 disks. So it's only 50% more expensive. RAID-5 can be done with many more disks — for example with 5 disks you get 4 disks worth of data, and it's only 25% more expensive. However, having 5 disks is beyond most systems and has its own secret risk: if 2 of the 5 disks fail at once, and this does happen, you lose all 4 disks worth of data, not just 2 disks worth. (RAID-6, for really large arrays of disks, survives 2 failures but not 3.)
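To make the parity idea concrete, here is a minimal sketch in Python (an illustration of XOR parity only, not a real RAID driver): the parity block is the XOR of the stripe's data blocks, so any single lost block can be rebuilt from the survivors.

    def parity(blocks):
        """XOR equal-sized byte blocks together."""
        result = bytearray(len(blocks[0]))
        for block in blocks:
            for i, b in enumerate(block):
                result[i] ^= b
        return bytes(result)

    data1 = b"block on disk 1."          # 16 bytes of pretend data
    data2 = b"block on disk 2."          # 16 bytes of pretend data
    p = parity([data1, data2])           # parity block, stored on disk 3

    # Disk 1 dies: rebuild its block from the survivor and the parity block.
    assert parity([data2, p]) == data1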

Now most people who put in RAID do it for more than data protection. After all, good sysadmins are doing regular backups. They do it because with RAID, the computer doesn’t even stop when a disk fails. You connect up a new disk live to the computer (which you can do with some systems) and it is recreated from the working disks, and you never miss a beat. This is pretty important with a major server.

But RAID has value to those who are not in the 99.99% uptime community. Those who are not good at doing manual backups, but who want to be protected from the inevitable disk failures. Today it is hard to set up, or expensive, or both. There are some external boxes like the “readynas” that make it reasonably easy for external disks, but they don’t have the bandwidth to be your full time disks.

RAID-5 on old IDE systems was hard; they usually could only truly talk to 2 disks at a time. The new SATA bus is much better, as many motherboards have 4 connectors, though one of those will soon be claimed by Blu-ray drives.

A near-ZUI encrypted disk, for protection from Customs


Recently we at the EFF have been trying to fight new rulings about the power of U.S. customs. Right now, it’s been ruled they can search your laptop, taking a complete copy of your drive, even if they don’t have the normally required reasons to suspect you of a crime. The simple fact that you’re crossing the border gives them extraordinary power.

We would like to see that changed, but until then what can be done? You can use various software to encrypt your hard drive — there are free packages like truecrypt, and many laptops come with this as an option — but most people find having to enter a password every time you boot to be a pain. And customs can threaten to detain you until you give them the password.

There are some tricks you can pull, like having a special inner-drive with a second password that they don’t even know to ask about. You can put your most private data there. But again, people don’t use systems with complex UIs unless they feel really motivated.

What we need is a system that is effectively transparent most of the time. However, you could take special actions when going through customs, or at other times when your laptop will be out of your control.

Windows needs a master daemon

It seems that half the programs I try to install under Windows want to have a "daemon" process with them, which is to say a portion of the program that is always running and which gets a little task-tray icon from which it can be controlled. Usually they want to be run at boot time too. In Windows parlance this is called a service.

There are too many of them, and they don't all need to be there. Microsoft noticed this, and started having Windows detect whether task tray icons were too static. If they are, it hides them. This doesn't work very well — they even hide their own icon for removing hardware, which of course is going to be static most of the time. And of course some programs now play games to make their icons appear non-static so they will stay visible. A pointless arms race.

All these daemons eat up memory, and some of them eat up CPU. They tend to slow the boot of the machine too. And usually they do not do very much — mostly they wait for some event, like being clicked, or hardware being plugged in, or an OS/internet event. And the worst of them don't even have a menu option to shut them down.

I would like to see the creation of a master daemon/service program. This program would be running all the time, and it would provide a basic scripting language to perform daemon functions. Programs that just need a simple daemon, with a menu or waiting for events, would be strongly encouraged to prepare it in this scripting language, and install it through the master daemon. That way they take up a few kilobytes, not megabytes, and don't take long to load. The scripting language should be able to react at least in a basic way to all the OS hooks, events and callbacks. It need not do much with them — mainly it would run a real module of the program that would have had a daemon. If the events are fast and furious and don't pause, this program could stay resident and become a real daemon.
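As a rough illustration of the shape this could take (Python standing in for the scripting language, with invented event names and plugin conventions), the master daemon might be little more than a registry of event handlers:

    import importlib

    class MasterDaemon:
        def __init__(self):
            self.handlers = {}          # event name -> list of callables

        def register(self, event, handler):
            self.handlers.setdefault(event, []).append(handler)

        def load_plugin(self, module_name):
            # A would-be daemon ships a tiny module with a setup() function
            # instead of its own always-running process.
            importlib.import_module(module_name).setup(self)

        def dispatch(self, event, payload=None):
            for handler in self.handlers.get(event, []):
                handler(payload)

    # An example plugin module might contain only:
    #
    #   def setup(daemon):
    #       daemon.register("usb_device_plugged", lambda dev: print("sync", dev))
    #       daemon.register("daily_update_check", check_for_updates)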

But having a stand alone program would be discouraged, certainly for boring purposes like checking for updates, overseeing other programs and waiting for events. The master program itself could get regular updates, as features are added to it as needed by would-be daemons.

Unix started with this philosophy. Most internet servers are started up by inetd, which listens on all the server ports you tell it, and fires up a server if somebody tries to connect. Only programs with very frequent requests, like E-mail and web serving, are supposed to keep something constantly running.

The problem is, every software package is convinced it’s the most important program on the system, and that the user mostly runs nothing but that program. So they act like they own the place. We need a way to only let them do that if they truly need it.

OCR Page numbers and detect double feed

I’m scanning my documents on an ADF document scanner now, and it’s largely pretty impressive, but I’m surprised at some things the system won’t do.

Double page feeding is the bane of document scanning. To prevent it, many scanners offer methods of double feed detection, including ultrasonic detection of double thickness and detection when one page is suddenly longer than all the others (because it’s really two.)

There are a number of other tricks they could do. I think a paper feeder that used air suction or gecko-foot van-der-waals force pluckers on both sides of a page, to try to pull the sides in two different directions, could help not just detect but eliminate such feeds.

However, the most the double feed detectors do is signal an exception to stop the scan, which means work re-feeding and a need to stand by.

Many documents, however, have page numbers, and we're going to OCR them, and the OCR engine is pretty good at detecting page numbers (mostly out of a desire to remove them). So it seems to me a good approach would be to look for gaps in the page numbers, especially combined with the other signs of a double feed. Then don't stop the scan, just keep going, and report to the operator which pages need to be scanned again. Those would be scanned, their numbers extracted, and they would be inserted in the right place in the final document.
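A minimal sketch of that gap check (Python, assuming simple sequential numbering and that the OCR engine hands back whatever page numbers it found):

    def missing_pages(seen_numbers):
        """Return the page numbers absent from the scanned sequence."""
        present = set(seen_numbers)
        return [n for n in range(min(present), max(present) + 1)
                if n not in present]

    # e.g. the OCR engine reported these page numbers, in scan order:
    scanned = [1, 2, 3, 4, 6, 7, 9, 10]
    print(missing_pages(scanned))   # -> [5, 8]: rescan those sheets, re-insert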

Of course, it’s not perfect. Sometimes page numbers are not put on blank pages, and some documents number only within chapters. So you might not catch everything, but you could catch a lot of stuff. Operators could quickly discern the page numbering scheme (though I think the OCR could do this too) to guide the effort.

I'm seeking a maximum convenience workflow. I think to do that the best plan is to have several scanners going, and the OCR after the fact in the background. That way there's always something for the operator to do — fixing bad feeds, loading new documents, naming them — for maximum throughput. Though I also would hope the OCR software could do better at naming the documents for you, or at least suggesting names. Perhaps it can; the manual for OmniPage is pretty sparse.

While some higher end scanners do have the scanner figure out the size of the page (at least the length) I am not sure why it isn’t a trivial feature for all ADF scanners to do this. My $100 Strobe sheetfed scanner does it. That my $6,000 (retail) FI-5650 needs extra software seems odd to me.

How about standby & hibernate together

PCs can go into standby mode (just enough power to preserve the RAM and do wake-on-lan) and into hibernate mode (where they write out the RAM to disk, shut down entirely and restore from disk later) as well as fully shut down.

Standby mode comes back up very fast, and should be routinely used on desktops. In fact, non-server PCs should consider doing it as a sort of screen saver since the restart can be so quick. It’s also popular on laptops but does drain the battery in a few days keeping the RAM alive. Many laptops will wake up briefly to hibernate if left in standby so long that the battery gets low, which is good.

How about this option: Write the ram contents out to disk, but also keep the ram alive. When the user wants to restart, they can restart instantly, unless something happened to the ram. If there was a power flicker or other trouble, notice the ram is bad and restart from disk. Usually you don’t care too much about the extra time needed to write out to disk when suspending, other than for psychological reasons where you want to be really sure the computer is off before leaving it. It’s when you come back to the computer that you want instant-on.

In fact, since RAM doesn’t actually fail all that quickly, you might even find you can restore from RAM after a brief power flicker. In that case, you would want to store a checksum for all blocks of RAM, and restore any from disk that don’t match the checksum.
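Here is a toy sketch of that checksum-and-restore step (Python, with a made-up block size and hibernation-file layout, purely to show the shape of the idea):

    import hashlib

    BLOCK = 4096   # made-up block size

    def checksums(image):
        """Hashes recorded at suspend time, one per RAM block."""
        return [hashlib.sha1(image[i:i + BLOCK]).digest()
                for i in range(0, len(image), BLOCK)]

    def restore(ram, hibernation_file, saved_sums):
        """ram: bytearray of current RAM contents after the power event.
        Re-read from disk only the blocks whose checksum no longer matches."""
        restored = 0
        for idx, expected in enumerate(saved_sums):
            start = idx * BLOCK
            if hashlib.sha1(ram[start:start + BLOCK]).digest() != expected:
                ram[start:start + BLOCK] =
                restored += 1
        return restored   # how many blocks had decayed and came from disk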

To go further, one could also hibernate to newer generations of fast flash memory. Flash memory is getting quite cheap, and while older generations aren’t that quick, they seek instantaneously. This allows you to reboot a machine with its memory “paged out” to flash, and swap in pages at random as they are needed. This would allow a special sort of hybrid restore:

  1. Predict in advance which pages are highly used, and which are enough to get the most basic functions of the OS up. Write them out to a special contiguous block of hibernation disk. Then write out the rest, to disk and flash.
  2. When turning on again, read this block of contiguous disk and go “live.” Any pages needed can then be paged in from the flash memory as needed, or if the flash wasn’t big enough, unlikely pages can come from disk.
  3. In the background, restore the rest of the pages from the faster disk. Eventually you are fully back to ram.

This would allow users to get a fairly fast restore, even from full-off hibernation. If they click on a rarely used program that was in ram, it might be slow as stuff pages in, but still not as bad as waiting for the whole restore.

Virtual machines need to share memory

A big trend in systems operation these days is the use of virtual machines — software systems which emulate a standalone machine so you can run a guest operating system as a program on top of another (host) OS. This has become particularly popular for companies selling web hosting. They take one fast machine and run many VMs on it, so that each customer has the illusion of a standalone machine, on which they can do anything. It’s also used for security testing and honeypots.

The virtual hosting is great. Typical web activity is “bursty.” You would like to run at a low level most of the time, but occasionally burst to higher capacity. A good VM environment will do that well. A dedicated machine has you pay for full capacity all the time when you only need it rarely. Cloud computing goes beyond this.

However, the main limit to a virtual machine’s capacity is memory. Virtual host vendors price their machines mostly on how much RAM they get. And a virtual host with twice the RAM often costs twice as much. This is all based on the machine’s physical ram. A typical vendor might take a machine with 4gb, keep 256mb for the host and then sell 15 virtual machines with 256mb of ram. They will also let you “burst” your ram, either into spare capacity or into what the other customers are not using at the time, but if you do this for too long they will just randomly kill processes on your machine, so you don’t want to depend on this.

The problem is when they give you 256MB of ram, that’s what you get. A dedicated linux server with 256mb of ram will actually run fairly well, because it uses paging to disk. The server loads many programs, but a lot of the memory used for these programs (particularly the code) is used rarely, if ever, and swaps out to disk. So your 256mb holds the most important pages of ram. If you have more than 256mb of important, regularly used ram, you’ll thrash (but not die) and know you need to buy more.

The virtual machines, however, don’t give you swap space. Everything stays in ram. And the host doesn’t swap it either, because that would not be fair. If one VM were regularly swapping to disk, this would slow the whole system down for everybody. One could build a fair allocation for that but I have not heard of it.

In addition, another big memory saving is lost — shared memory. In a typical system, when two processes use the same shared library or same program, it is loaded into memory only once. It's read-only so you don't need to have two copies. But on a big virtual machine host, we have 15 copies of all the standard stuff — 15 kernels, 15 MySQL servers, 15 web servers, 15 of just about everything. It's very wasteful.

So I wonder if it might be possible to do one of the following:

  • Design the VM so that all binaries and shared libraries can be mounted from a special read-only filesystem which is actually on the host. This would be an overlay filesystem so that individual virtual machines could change it if need be. The guest kernel, however, would be able to load pages from these files, and they would be shared with any other virtual machine loading the same file.
  • Write a daemon that regularly uses spare CPU to scan the pages of each virtual machine, hashing them. When two pages turn out to be identical, release one and have both VMs use the common copy. Mark it so that if one writes to it, a duplicate is created again. When new programs start it would take extra RAM, but within a few minutes the memory would be shared.
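To illustrate the second approach, here is a toy sketch of such a page-scanning deduplicator, working over a plain dictionary of page contents (a real version would live in the hypervisor; the Linux kernel's later KSM feature implements essentially this idea):

    import hashlib
    from collections import defaultdict

    def dedupe(pages):
        """pages: dict of (vm_id, page_no) -> bytes (the page contents).
        Returns a map saying which pages can share one physical copy."""
        by_hash = defaultdict(list)
        for key, data in pages.items():
            by_hash[hashlib.sha256(data).digest()].append(key)

        shared = {}   # page key -> canonical page it now shares with
        for keys in by_hash.values():
            canonical = keys[0]
            for other in keys[1:]:
                # Hashes can collide, so compare the bytes before merging.
                if pages[other] == pages[canonical]:
                    shared[other] = canonical   # mark read-only, copy-on-write
        return shared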

These techniques require either a very clever virtualizer or modified guests, but their savings are so worthwhile that everybody would want to do it this way on any highly loaded virtual machine. Of course, that goes against the concept of “run anything you like” and makes it “run what you like, but certain standard systems are much cheaper.”

This, and allowing some form of fair swapping, could seriously improve the performance and cost-effectiveness of VMs.

Laptops could get smart while power supplies stay stupid

If you have read my articles on power you know I yearn for the day when we get smart power, so we can have universal supplies that power everything. This hit home when we got a new Thinkpad Z61 model, which uses a new power adapter that provides 20 volts at 4.5 amps and uses a new, quite rare power tip which is 8mm in diameter. For almost a decade, Thinkpads used 16.5 volts and a fairly standard 5.5mm plug. It got so that some companies standardized on Thinkpads and put cheap 16 volt TP power supplies in all the conference rooms, allowing employees to just bring their laptops in with no hassle.

Lenovo pissed off their customers with this move. I have perhaps 5 older power supplies, including one each at two desks, one that stays in the laptop bag for travel, one downstairs and one running an older ThinkPad. They are no good to me on the new computer.

Lenovo says they knew this would annoy people, and did it because they needed more power in their laptops, but could not increase the current in the older plug. I’m not quite sure why they need more power — the newer processors are actually lower wattage — but they did.

Here's something they could have done to make it better.

Steps closer to more universal power supplies

I’ve written before about both the desire for universal dc power and more simply universal laptop power at meeting room desks.

Today I want to report we're getting a lot closer. A new generation of cheap "buck and boost" ICs which can handle more serious wattages with good efficiency has come to the market. This means cheap DC to DC conversion, both increasing and decreasing voltages. More and more equipment is now able to take a serious range of input voltages, and also to generate them. Being able to use any voltage is important for battery powered devices, since batteries start out with a high voltage (higher than the one they are rated for) and drop over their discharge to around 2/3 of that before they are viewed as depleted. (With some batteries, heavy depletion can really hurt their life. Some are more able to handle it.)

With a simple buck converter chip, at a cost of about 10-15% of the energy, you get a constant voltage out no matter what the battery is putting out. This means more reliable power and also the ability to use the full capacity of the battery, if you need it and it won't cause too much damage. These same chips are in universal laptop supplies. Most of these supplies use special magic tips which fit the device they are powering and also tell the supply what voltage and current it needs.

A way to leave USB power on during standby

Ok, I haven't had a new laptop in a while so perhaps this already happens, but I'm now carrying more devices that can charge off the USB power, including my cell phone. It's only 2.5 watts, but it's good enough for many purposes.

However, my laptops, and desktops, do not provide USB power when in standby or off. So how about a physical or soft switch to enable that? Or even a smart mode in the OS that lets you list which devices you want to keep powered and which ones you don't? (This would probably keep all devices powered if any one such device is connected, unless you had individual power control for each plug.)

This would only be when on AC power of course, not on battery unless explicitly asked for as an emergency need.

To get really smart a protocol could be developed where the computer can ask the USB device if it needs power. A fully charged device that plans to sleep would say no. A device needing charge could say yes.

Of course, you only want to do this if the power supply can efficiently generate 5 volts. Some PC power supplies are not efficient at low loads and so may not be a good choice for this, and smaller power supplies should be used.

Do we need a time delay after password failures?

Most programs that ask for a password will put in a delay if you get it wrong. They do this to stop password crackers from quickly trying lots of passwords. The delay makes brute force attacks impossible, in theory.

But what does it really do? There are two situations. In one situation, you have some state on the party entering the password, such as IP address, or a shell session, or terminal. So you can slow them down later. For example, you could let a user have 3 or 4 quick tries at a password with no delay, and then put in a very long delay on the 5th, even if they close off the login session and open another one. Put all the delay at the end of the 4 tries (or at the start of the next 4) rather than between each try. It's all the same to a cracking program.
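A sketch of that policy (Python, with made-up numbers and a placeholder check_password function; the point is that the burst of quick tries costs nothing and the one long delay comes at the end of the burst):

    import time

    FREE_TRIES = 4          # quick tries before the one long delay
    LOCKOUT_SECONDS = 60    # made-up number

    failures = {}           # key (IP, session or username) -> (count, locked_until)

    def attempt_login(key, password, check_password):
        count, locked_until = failures.get(key, (0, 0.0))
        if time.time() < locked_until:
            return "locked out"                # still serving the long delay
        if check_password(password):
            failures.pop(key, None)            # success clears the record
            return "ok"
        count += 1
        if count >= FREE_TRIES:
            # Burst used up: one big delay at the end, not 4 small ones.
            failures[key] = (0, time.time() + LOCKOUT_SECONDS)
        else:
            failures[key] = (count, 0.0)       # typo-friendly: no delay yet
        return "failed"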

Alternately, you have no way to identify them, in which case rather than sit through a delay, they can just open another session. But you can put a delay on that other session or any other attempt to log into that user. Once again you don't have to make things slow for the user who just made a typo. And of course, typos are common since most programs don't show you what you're typing. (This turns out to be very frustrating when logging in from a mobile device where the keyboards are highly unreliable and you can't see what you are typing!)

The dark ages of lost data are over

For much of history, we’ve used removable media for backup. We’ve used tapes of various types, floppy disks, disk cartridges, and burnable optical disks. We take the removable media and keep a copy offsite if we’re good, but otherwise they sit for a few decades until they can’t be read, either because they degraded or we can’t find a reader for the medium any more.

But I now declare this era over. Disk drives are so cheap (25 cents per gigabyte and falling) that it no longer makes sense to do backups to anything but hard disks. We may use external USB drives that are removable, but at this point our backups are not offline, they are online. Thanks to the internet, I even do offsite backup to live storage. I sync up over the internet at night, and if I get too many changes (like after an OS install, or a new crop of photos) I write the changes to a removable hard disk and carry it over to the offsite hard disk.
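As one concrete way to do that nightly sync (a sketch assuming rsync over ssh to an offsite machine you control; the host and paths are placeholders):

    import subprocess

    SOURCES = ["/home/brad/photos/", "/home/brad/documents/"]     # placeholders
    DEST = "backup@offsite.example.com:/backups/"                 # placeholder

    def nightly_sync():
        for src in SOURCES:
            # -a preserves metadata, -z compresses over the wire, and
            # --delete mirrors removals so the offsite copy tracks the original.
            subprocess.run(["rsync", "-az", "--delete", "-e", "ssh", src, DEST],
                           check=True)

    if __name__ == "__main__":
        nightly_sync()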

Of course, these hard drives will fail, perhaps even faster than CD-ROMs or floppies. But the key factor is that the storage is online rather than offline, and each new disk is 2 to 3 times larger than the one it replaced. What this means is that as we change out our disks, we just copy our old online archives to our new online disk. By constantly moving the data to newer and newer media, and storing it redundantly with online, offsite backup, the data are protected from the death that removable media eventually suffer. So long as disks keep getting bigger and cheaper, we won't lose anything, except by being lazy. And soon, our systems will get more automated at this, so it will be hard to set up a computer that isn't backed up online and remotely. We may still lose things because we lose encryption keys, but it won't be because of the media.

Thus, oddly, the period of the latter part of the 20th century will be a sort of “dark ages” to future data archaeologists. Those disks will be lost. The media may be around, but you will have to do a lot of work to recover them — manual work. However, data from the early 21st onward will be there unless it was actively deleted or encrypted.

Of course this has good and bad consequences. Good for historians. Perhaps not so good for privacy.

Standardize computer access in hotels, and vnc everywhere

Hotels are now commonly sporting flat widescreen TVs, usually LCD HDTVs at the 720p resolution, which is 1280 x 720 or similar. Some of these TVs have VGA ports or HDMI (DVI) ports, or they have HDTV analog component video (which is found on some laptops but not too many.) While 720p resolution is not as good as the screens on many laptops, it makes a world of difference on a PDA. As our phone/PDA devices become more like the iPhone, it would be very interesting to see hotels guarantee that their room offers the combination of:

  • A bluetooth keyboard (with USB and mini-USB as a backup)
  • A similar optical mouse
  • A means to get video into the HDTV
  • Of course, wireless internet
  • Our dreamed of universal DC power jack (or possibly inductive charging.)

Tiny devices like the iPhone won't sport VGA or even 7-pin component video out connectors, though they might do HDMI. It's also not out of the question to go a step further and do a remote screen protocol like VNC over the wireless ethernet or bluetooth.

This would engender a world where you carry a tiny device like the iPhone, which is all touchscreen for when you are using it in the mobile environment. However, when you sit down in your hotel room (or a few other places) you could use it like a full computer with a full screen and keyboard. (There are also quite compact real-key bluetooth keyboards and mice which travelers could also bring. Indeed, since the iPhone depends on a multitouch interface, an ordinary mouse might not be enough for it, but you could always use its screen for such pointing, effectively using the device as the touchpad.)

Such stations need not simply be in hotels. Smaller displays (which are now quite cheap) could also be present at workstations on conference tables or meeting rooms, or even for rent in public. Of course rental PCs in public are very common at internet cafes and airport kiosks, but using our own device is more tuned to our needs and more secure (though using a rented keyboard presents security risks.)

One could even imagine stations like these randomly scattered around cities behind walls. Many retailers today are putting HDTV flat panels in their windows instead of signs, and this will become a more popular trend. Imagine being able to borrow (for free or for a rental fee) such screens for a short time to do a serious round of web surfing on your portable device with high resolution, and local wifi bandwidth. Such a screen could not provide you with a keyboard or mouse easily, but the surfing experience would be much better than the typical mobile device surfing experience, even the iPhone model of seeing a blurry, full-size web page and using multitouch to zoom in on the relevant parts. Using a protocol like vnc could provide a good surfing experience for pedestrians.

Cars are also more commonly becoming equipped with screens, and they are another place we like to do mobile surfing. While the car’s computer should let you surf directly, there is merit in being able to use that screen as a temporary large screen for one’s mobile device.

Until we either get really good VR glasses or bright tiny projectors, screen size is going to be an issue in mobile devices. A world full of larger screens that can be grabbed for a few minutes use may be a good answer.

E-mail programs should be time-management programs

For many of us, E-mail has become our most fundamental tool. It is not just the way we communicate with friends and colleagues, it is the way that a large chunk of the tasks on our “to do” lists and calendars arrive. Of course, many E-mail programs like Outlook come integrated with a calendar program and a to-do list, but the integration is marginal at best. (Integration with the contact manager/address book is usually the top priority.)

If you’re like me you have a nasty habit. You leave messages in your inbox that you need to deal with if you can’t resolve them with a quick reply when you read them. And then those messages often drift down in the box, off the first screen. As a result, they are dealt with much later or not at all. With luck the person mails you again to remind you of the pending task.

There are many time management systems and philosophies out there, of course. A common theme is to manage your to-do list and calendar well, and to understand what you will do and not do, and when you will do it if not right away. I think it’s time to integrate our time management concepts with our E-mail. To realize that a large number of emails or threads are also a task, and should be bound together with the time manager’s concept of a task.

For example, one way to “file” an E-mail would be to the calendar or a day oriented to-do list. You might take an E-mail and say, “I need 20 minutes to do this by Friday” or “I’ll do this after my meeting with the boss tomorrow.” The task would be tied to the E-mail. Most often, the tasks would not be tied to a specific time the way calendar entries are, but would just be given a rough block of time within a rough window of hours or days.

It would be useful to add these “when to do it” attributes to E-mails, because now delegating a task to somebody else can be as simple as forwarding the E-mail-message-as-task to them.

In fact, because, as I have noted, I like calendars with free-form input (i.e. saying "Lunch with Peter 1pm tomorrow" and having the calendar understand exactly what to do with it), it makes sense to consider the E-mail window as a primary means of input to the calendar. For example, one might add calendar entries by emailing them to a special address that is processed by the calendar. (That's a useful idea for any calendar, even one not tied at all to the E-mail program.)
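As a toy sketch of that kind of free-form parsing (Python; it only understands a time like "1pm" plus "today"/"tomorrow", and leaves out everything else a real parser would need):

    import re
    from datetime import datetime, timedelta

    TIME = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", re.IGNORECASE)

    def parse_entry(text, now=None):
        now = now or datetime.now()
        day = now.date()
        if "tomorrow" in text.lower():
            day += timedelta(days=1)
        hour, minute = 12, 0                       # default to noon
        m =
        if m:
            hour = int( % 12
            minute = int( or 0)
            if"pm":
                hour += 12
        title = TIME.sub("", re.sub(r"\btomorrow\b|\btoday\b", "", text,
                                    flags=re.IGNORECASE)).strip()
        return title, datetime(day.year, day.month,, hour, minute)

    print(parse_entry("Lunch with Peter 1pm tomorrow"))
    # -> ('Lunch with Peter', datetime for 13:00 tomorrow)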

One should also be able to assign tasks to places (a concept from the “Getting Things Done” book I have had recommended to me.) In this case, items that will be done when one is shopping, or going out to a specific meeting, could be synced or sent appropriately to one’s mobile device, but all with the E-mail metaphor.

Because there are different philosophies of time management, all with their fans, one monolithic e-mail/time/calendar/todo program may not be the perfect answer. A plug-in architecture that lets time managers integrate nicely with E-mail could be a better way to do it.

Some of these concepts apply to the shared calendar concepts I wrote about last month.

A Linux takeover distro pushed as anti-virus

Here’s a new approach to linux adoption. Create a linux distro which converts a Windows machine to linux, marketed as a way to solve many of your virus/malware/phishing woes.

Yes, for a long time linux distros have installed themselves dual-boot alongside an existing Windows system. And there are distros that can run in a VM on windows, or look windows-like, but here's a set of steps to go much further, thanks to how cheap disk space is today.

  • Yes, the distro keeps the Windows install around dual boot, but it also builds a virtual machine so it can be run under linux. Of course hardware drivers differ when running under a VM, so this is non-trivial, and Windows XP and later will claim they are stolen if they wake up in different hardware. You may have to call Microsoft, which they may eventually try to stop.
  • Look through the Windows copy and see what apps are installed (a rough sketch of this scan follows after this list). For apps that migrate well to linux, either because they have equivalents or run at silver or gold level under Wine, move them into linux. Extract their settings and files and move those into the linux environment. Of course this is easiest to do when you have something like Firefox as the browser, but IE settings and bookmarks can also be imported.
  • Examine the windows registry for other OS settings, desktop behaviours etc. Import them into a windows-like linux desktop. Ideally when it boots up, the user will see it looking and feeling a lot like their windows environment.
  • Using remote window protocols, it’s possible to run windows programs in a virtual machine with their window on the X desktop. Try this for some apps, though understand some things like inter-program communication may not do as well.
  • Next, offer programs directly in the virtual machine as another desktop. Put the windows programs on the windows-like “start” menu, but have them fire up the program in the virtual machine, or possibly even fire up the VM as needed. Again, memory is getting very cheap.
  • Strongly encourage the Windows VM be operated in a checkpointing manner, where it is regularly reverted to a base state, if this is possible.
  • The linux box, sitting outside the windows VM, can examine its TCP traffic to check for possible infections or strange traffic to unusual sites. A database like the siteadvisor one can help spot these unusual things, and encourage restoring the windows box back to a safe checkpoint.
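A rough sketch of the application scan mentioned in the second step (Python; the mount point, folder names and the migration map are all invented for illustration):

    import os

    WINDOWS_MOUNT = "/mnt/windows"        # where the old Windows partition sits

    # Hypothetical map: Windows application folder -> how we migrate it.
    APP_MAP = {
        "Mozilla Firefox": ("native", "install firefox, import profile"),
        "": ("native", "install, import documents"),
        "Microsoft Office": ("wine", "try running under Wine"),
        "iTunes": ("vm", "leave it in the Windows virtual machine"),
    }

    def migration_plan(mount=WINDOWS_MOUNT):
        program_files = os.path.join(mount, "Program Files")
        plan = []
        for name in sorted(os.listdir(program_files)):
            action, detail = APP_MAP.get(name, ("vm", "unknown app, keep in the VM"))
            plan.append((name, action, detail))
        return plan

    for name, action, detail in migration_plan():
        print(f"{name:25} -> {action:6} {detail}")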

A Package packager to compartmentalize my system changes

First, let me introduce a new blog topic, Sysadmin where I will cover computer system administration and OS design issues, notably in Linux and related systems.

My goal is to reduce the nightmare that is system administration and upgrading.

One step that goes partway in my plan would be a special software system that would build for a user a specialized operating system “package” or set of packages. This magic package would, when applied to a virgin distribution of the operating system, convert it into the customized form that the user likes.

The program would work from a modified system, and a copy of a map (with timestamps and hashes) of the original virgin OS from which the user began. First, it would note what packages the user had installed, and declare dependencies for these packages. Thus, installing this magic package would cause the installation of all the packages the user likes, and all that they depend on.

In order to do this well, it would try to determine which packages the user actually used (with access or file change times) and perhaps consider making two different dependency setups — one for the core packages that are frequently used, and another for packages that were probably just tried and never used. A GUI to help users sort packages into those classes would be handy. It must also determine that those packages are still available, dealing with potential conflicts and name change concerns. Right now, most package managers insist that all dependencies be available or they will abort the entire install. To get around this, many of the packages might well be listed as "recommended" rather than required, or options to allow installing the package with missing 1st level (but not 2nd level) dependencies would be used.
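A tiny sketch of that first step on a Debian-style system (the dpkg-query call is real; the saved virgin package list and the output format are assumptions for illustration):

    import subprocess

    def installed_packages():
        out = subprocess.run(["dpkg-query", "-W", "-f", "${Package}\n"],
                             capture_output=True, text=True, check=True)
        return set(out.stdout.split())

    def metapackage_depends(virgin_list_path):
        """virgin_list_path: package list saved from the original, unmodified OS."""
        with open(virgin_list_path) as f:
            virgin = set(
        added = sorted(installed_packages() - virgin)
        # A real tool would list most of these as Recommends rather than
        # Depends, so that one unavailable package cannot abort the install.
        return "Depends: " + ", ".join(added)

    print(metapackage_depends("virgin-packages.txt"))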

A Posix (universal API) for package management

As part of my series on the horrors of modern system administration and upgrading, let me propose the need for a universal API, over all operating systems, for accessing data from, and some control of the package management system.

There have been many efforts in the past to standardize programming APIs across all the unix-like operating systems, some of them extending into MS Windows, such as Posix. Posix is a bit small to write very complex programs fully portably, but it's a start. Any such API can make your portability easier, even if it can't make it trivial the way it's supposed to.

But there has been little effort to standardize the next level, machine administration and configuration. Today a large part of that is done with the package manager. Indeed, the package manager is the soul (and curse) of most major OS distributions. One of the biggest answers to “what’s the difference between debian and Fedora” is “dpkg and apt, vs. rpm and yum.” (Yes you can, and I do, use apt with rpm.)

Now the truth is that from a user perspective, these package managers don’t actually look very different. They all install and remove packages by name, perform upgrades, handle dependencies etc. Add-ons like apt and GUI package managers help users search and auto-install all dependencies. To the user, the most common requests are to find and install a package, and to upgrade it or the system.
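To show the shape such an API might take from the caller's side, here is a sketch (the interface and class names are invented; only the underlying apt-get and yum commands it shells out to are real):

    import subprocess
    from abc import ABC, abstractmethod

    class PackageManager(ABC):
        """The universal interface an application would code against."""
        @abstractmethod
        def install(self, name): ...
        @abstractmethod
        def remove(self, name): ...
        @abstractmethod
        def upgrade_all(self): ...

    class AptBackend(PackageManager):
        def install(self, name):
  ["apt-get", "install", "-y", name], check=True)
        def remove(self, name):
  ["apt-get", "remove", "-y", name], check=True)
        def upgrade_all(self):
  ["apt-get", "upgrade", "-y"], check=True)

    class YumBackend(PackageManager):
        def install(self, name):
  ["yum", "install", "-y", name], check=True)
        def remove(self, name):
  ["yum", "remove", "-y", name], check=True)
        def upgrade_all(self):
  ["yum", "upgrade", "-y"], check=True)

    # Application code is then distro-agnostic:
    #   pm = AptBackend()       # or YumBackend(), chosen at runtime
    #   pm.install("mdadm")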

Virtual Machine Image library at EC2

The use of virtual machines is getting very popular in the web hosting world. Particularly exciting to many people is Amazon's EC2 — which means Elastic Compute Cloud. It's a large pool of virtual machines that you can rent by the hour. I know people planning on basing whole companies on this system, because they can build an application that scales up by adding more virtual machines on demand. It's decently priced and a lot cheaper than building it yourself in most cases.

In many ways, something like EC2 would be great for all those web sites which deal with the “slashdot” effect. I hope to see web hosters, servers and web applications just naturally allow scaling through the addition of extra machines. This typically means either some round-robin-DNS, or a master server that does redirects to a pool of servers, or a master cache that processes the data from a pool of servers, or a few other methods. Dealing with persistent state that can’t be kept in cookies requires a shared database among all the servers, which may make the database the limiting factor. Rumours suggest Amazon will release an SQL interface to their internal storage system which presumably is highly scalable, solving that problem.

As noted, this would be great for small to medium web sites. They can mostly run on a single server, but if they ever see a giant burst of traffic, for example by being linked to from a highly popular site, they can in minutes bring up extra servers to share the load. I’ve suggested this approach for the Battlestar Galactica Wiki I’ve been using — normally their load is modest, but while the show is on, each week, predictably, they get such a huge load of traffic when the show actually airs that they have to lock the wiki down. They have tried to solve this the old fashioned way — buying bigger servers — but that’s a waste when they really just need one day a week, 22 weeks a year, of high capacity.

However, I digress. What I really want to talk about is using such systems to get access to all sorts of platforms. As I’ve noted before, linux is a huge mishmash of platforms. There are many revisions of Ubuntu, Fedora, SuSE, Debian, Gentoo and many others out there. Not just the current release, but all the past releases, in both stable, testing and unstable branches. On top of that there are many versions of the BSD variants.
