I’ve been writing recently about the Linux upgrade nightmares that continue to trouble the world. The next in my series of ideas is a suggestion that we try to measure how well upgrades go, and make a database of the results available.
Millions of people are upgrading packages every day. And it usually goes smoothly. However, when it doesn’t, it would be nice if that were recorded and shared. Over time, one could develop an idea of which upgrades are safer than others. Thus, when it’s time to upgrade many packages, the system could know which ones always go well, and which ones might deserve a warning, or should only be done if you don’t have something critical coming up that day.
We already know some of these. Major packages like Apache are often a chore to upgrade, though they have improved a lot by adopting a philosophy of configuration files I heartily approve of: dividing up the configuration so that settings written by different people go in different files.
Some detection is automated. For example, the package tools detect if a configuration file is being upgraded after it’s been changed and offer the user a chance to keep the new one, their old one, or hand-mix them. What choice the user makes could be noted to measure how well the upgrades go. Frankly, any upgrade that even presents the user with questions should get some minor points against it, but if a user has to do a hand merge it should get lots of negative points.
Upgrades that get no complaint should be recorded, and upgrades that get an explicit positive comment (i.e. the user actively says it went great) should also be noted. Of course, any time a user makes an explicit negative comment, that’s the most useful info of all. Users should be able to browse a nice GUI of all their recent upgrades, even months later, and make notes on how well things are going. If you discover something broken, it should be easy to file the report.
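To make the scoring idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `UpgradeOutcome` record, its field names, and the point values (1 per question, 10 for a hand merge, 20 for an explicit complaint) are made up, and no package tool reports data in this shape today.

```python
from dataclasses import dataclass
from enum import Enum


class ConfigChoice(Enum):
    """What the user did when offered a changed configuration file."""
    NOT_ASKED = "not_asked"
    KEPT_OLD = "kept_old"
    TOOK_NEW = "took_new"
    HAND_MERGE = "hand_merge"


@dataclass
class UpgradeOutcome:
    """One upgrade of one package on one machine (hypothetical record)."""
    package: str
    from_version: str
    to_version: str
    questions_asked: int = 0
    config_choice: ConfigChoice = ConfigChoice.NOT_ASKED
    user_rating: int = 0  # -1 explicit complaint, 0 no comment, +1 explicit praise


# Hypothetical weights; real values would need tuning against actual reports.
POINTS_PER_QUESTION = 1
POINTS_HAND_MERGE = 10
POINTS_COMPLAINT = 20


def penalty_points(outcome: UpgradeOutcome) -> int:
    """Higher means the upgrade went worse; 0 means nobody noticed it."""
    points = outcome.questions_asked * POINTS_PER_QUESTION
    if outcome.config_choice is ConfigChoice.HAND_MERGE:
        points += POINTS_HAND_MERGE
    if outcome.user_rating < 0:
        points += POINTS_COMPLAINT
    return points


if __name__ == "__main__":
    merged = UpgradeOutcome("apache2", "2.2", "2.4", questions_asked=1,
                            config_choice=ConfigChoice.HAND_MERGE)
    print(merged.package, penalty_points(merged))
```

A silent upgrade scores zero, a single question costs a little, and a hand merge or an explicit complaint costs a lot, which matches the ranking I argued for above. Explicit praise is kept in the record but does not reduce the penalty in this sketch; whether it should is an open design choice.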
Then, when it comes time to do a big upgrade, such as a distribution upgrade, certain of the upgrades can be branded as very, very safe, and others as more risky. In fact, users could elect to do only the safe ones. Or they could even elect to do safe upgrades automatically, particularly if there are lots of safety reports matching their exact conditions (former and current version, dependencies in place). Automatic upgrading is normally a risky thing, because a problem can accidentally spread like wildfire, but once you have lots of reports about how safe an upgrade is, you can make it more and more automatic.
Thus the process might start with upgrading the 80% of packages that are safe, and then the 15% that are mostly safe. Then allocate some time and get ready for the ones that probably will involve some risk or work. Of course, if everything depends on a risky change (such as a new libc) you can’t get that order, but you can still improve things.
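Here is a rough sketch of how that staging might be computed from the shared reports. The thresholds (99% and 95% trouble-free, at least 50 reports) and the input shape, a mapping from package name to `(trouble_free, total)` counts, are assumptions I picked for illustration. The sketch also ignores dependency ordering entirely, so it does not handle the libc case.

```python
# Tiers in the order they should be attempted.
TIER_ORDER = {"safe": 0, "mostly_safe": 1, "risky": 2}


def classify(trouble_free: int, total: int, min_reports: int = 50) -> str:
    """Place one upgrade in a tier based on how often it went smoothly.

    With too few reports we cannot call anything safe, so it counts as risky.
    All thresholds here are hypothetical.
    """
    if total < min_reports:
        return "risky"
    rate = trouble_free / total
    if rate >= 0.99:
        return "safe"
    if rate >= 0.95:
        return "mostly_safe"
    return "risky"


def plan_upgrade(reports: dict) -> list:
    """Return (package, tier) pairs, safest first, alphabetical within a tier."""
    tiers = {name: classify(ok, total) for name, (ok, total) in reports.items()}
    return sorted(tiers.items(), key=lambda item: (TIER_ORDER[item[1]], item[0]))


if __name__ == "__main__":
    sample = {
        "coreutils": (1000, 1000),
        "apache2": (960, 1000),
        "libc6": (900, 1000),
        "newpkg": (5, 5),  # perfect record, but far too few reports
    }
    for package, tier in plan_upgrade(sample):
        print(f"{tier:12} {package}")
```

Note that a package with a perfect record but only a handful of reports lands in the risky tier. That is deliberate: the whole point is that confidence comes from lots of reports, not from a lack of complaints.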
There is a risk of people gaming the database, though in non-commercial environments that is hopefully small. It may be necessary to have reporters use IDs that get reputations. For privacy reasons, however, you want to anonymize data after verifying it.
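One simple way to read "verify, then anonymize" is sketched below. It assumes reports arrive as dictionaries carrying a `reporter_id`, and that the database keeps a separate table of reporter reputations; both of those, along with the `min_reputation` cutoff, are hypothetical. Dropping the ID outright is the bluntest option and gives up the ability to spot duplicate reports from the same person later.

```python
from typing import Optional


def verify_and_anonymize(report: dict, reputations: dict,
                         min_reputation: int = 1) -> Optional[dict]:
    """Accept a report only from a reporter in good standing, then strip the ID.

    Returns the anonymized report, or None if the reporter is unknown or
    has too low a reputation. Field names are hypothetical.
    """
    reporter = report.get("reporter_id")
    if reporter is None or reputations.get(reporter, 0) < min_reputation:
        return None
    # Copy everything except the identifying field; the input is not modified.
    return {key: value for key, value in report.items() if key != "reporter_id"}


if __name__ == "__main__":
    reputations = {"alice": 5}
    report = {"reporter_id": "alice", "package": "apache2",
              "from_version": "2.2", "to_version": "2.4", "penalty": 11}
    print(verify_and_anonymize(report, reputations))
```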