On the refutation of Metcalfe's law

Recently IEEE Spectrum published a paper refuting Metcalfe's law, an observation (not really a law) by Bob Metcalfe that the "value" of a network increases with the square of the number of people/nodes on it. I was asked to be a referee for this paper, and while they addressed some of my comments, I don't think they addressed the principal one, so I am posting my comments here now.

My main contention is that in many cases the value of a network actually starts declining after a while and becomes inversely proportional to the number of people on it. That's because noise (such as flamage and spam) and unmanageable signal (too many serious messages) rise with the size of the network and eventually reach a level where the cost of dealing with them surpasses the value of the network. I'm thinking of mailing lists in particular here.
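
To make that concrete, here is a toy sketch in Python. The numbers and the cost model are my own illustrative assumptions, not anything from the Spectrum article: each member of a mailing list posts at a fixed rate, each member can usefully read only so many messages a day, and every arriving message costs a little attention to triage even if it is discarded. Under those assumptions, total value climbs at first, plateaus as attention saturates, and eventually collapses once triage cost exceeds the value of what actually gets read.

    # Toy mailing-list model (illustrative assumptions only).
    def member_value(n, posts_per_member=0.5, read_capacity=30,
                     value_per_read=1.0, triage_cost=0.02):
        volume = posts_per_member * n              # messages arriving per day
        useful_reads = min(volume, read_capacity)  # attention is bounded
        return useful_reads * value_per_read - volume * triage_cost

    for n in (10, 100, 500, 2000, 5000):
        print(n, round(n * member_value(n)))       # total value across n members

With these (made-up) parameters the total rises through a few thousand members and then goes sharply negative, which is the "cost surpasses value" point described above.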

You can read my referee's comments on Metcalfe's law, though note that these comments were written about the original article, before some corrections were made.

Comments

I can't really agree with you, as spam and nuisance tend to be controlled. I'd rather approach a refutation of Metcalfe's law by considering how many people someone can handle on a network (i.e., by introducing time constraints), but the question is quite daunting: I've been working exclusively on this question for eight months now for my PhD, and my work is still very much in progress.

As I noted, large networks break down not just because of noise, but also because of too much signal. To make them useful, you must discard signal, sometimes arbitrarily. If you do, they stop being networks and become more like a central publishing channel, where some engine (even if it differs slightly for each user) decides what tiny subset of the messages in the network you will see.

Eventually you're like the letters-to-the-editor column in a magazine, which at least does not drop in value as the magazine grows, but isn't growing a great deal in value either.

The internet as a whole has grown in value as more people are on it, because more people are innovating on it and finding audiences for their innovations. But it's no n log(n). And nobody can consume all the innovations on the internet in any event; you can only use a small subset and hope you have a way to find the best subset.

In fact, I believe the real tools of value are the ones that show you less of the network, not more. My reading time is already overfull. A good tool can't help by finding me more inputs.

Would you say there is too much signal in a phone network with strong regulation against spam (this exists outside the USA)? In a mail system (same remark)?

Interesting question. Are there stronger regulations against phone solicitors outside the USA? The ones in the USA are pretty strong and effective, so I am curious which ones you are referring to. Junk mail is still the majority of residential mail in the USA.

But anyway, no, the too much signal problem does not occur in all networks. I said it occurs in some networks.

However, I do believe that with the advent of cell phones, we are starting to see a too-much-signal problem. People are recognizing that there are times they want to turn the phone off. In public places, people are starting to revolt against the constant ringing of phones. There are solutions to this, but we are touching on the too-much-signal problem.

I first heard the n^2 from Prof. Staelin, also at MIT, with a much clearer definition of "value". This was some time ago, and in the context of the business case for SBS (Satellite Business Systems). The definition of "value" was the probability that one network subscriber would be able to connect to another organization (subscriber or not), times the number of network subscribers. If you assume that the likelihood of attempted relationships is independent of subscriber status, then the probability of success is S/M, where S is the number of subscribers and M the number of potential subscribers. There being S subscribers, the aggregate value becomes S*(S/M), hence proportional to S^2.
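
In code form (my rendering of the definition above; the function and variable names are mine), Staelin's metric is just the success probability S/M summed over the S subscribers:

    # Staelin's "value" metric as described above: S subscribers out of M
    # potential subscribers, with attempted contacts assumed to be random.
    def staelin_value(S, M):
        p_success = S / M      # chance a random attempted contact is reachable
        return S * p_success   # summed over S subscribers -> S^2 / M

    # With M fixed, doubling the subscriber base quadruples the metric:
    print(staelin_value(1000, 100000))   # 10.0
    print(staelin_value(2000, 100000))   # 40.0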

But even then the flaws in this were promptly pointed out. First, the random connectivity assumption is highly flawed. Connectivity is related to relationships, relationships cluster, and the clustered relationships influence subscription probability. So the S^2 was already recognized as a very crude approximation.

What you are pointing out is that the ability to connect is not your primary metric for value. This is true for most people. There are plenty of times when you wish to not be available, and there is great value in being able to schedule availability. The answering machine, call forwarding, caller identification, etc. are highly valuable because they control availability. Your point on the value of the signal once connected is another important one.

The Metcalfe quote leaves the definition of value completely vague. (I don't know whether the very original was the Staelin definition, but the quote has since lost that specificity.) The problem with the IEEE article is that their definition of "value" is still fuzzy. Similarly, your definition of "value" is rather fuzzy. So far only Staelin's definition is clear, and he was equally clear about the inadequacies of his metric as a general value metric. It was relevant as a metric for the purposes of designing SBS and similar telephone system alternatives.

I wish IEEE had paid more attention to your review, Brad.

The real issue is what's measured on the X-axis and in what type of environment. If you're indeed measuring "compatibly communicating devices" on a smallish network, then O(n^2) can make sense. As the network grows a bit, O(n log n) may come into play. For very large networks, there definitely will be friction introduced by high costs of discovery, identity management, trust establishment, spam, etc. However, some of that can be mitigated by the addition of appropriate infrastructure.
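
For a rough sense of how differently those growth laws behave (purely illustrative; no claim here about which one is right for a given network):

    # How the candidate growth laws diverge as the network scales.
    import math

    for n in (100, 10_000, 1_000_000):
        print(f"n={n:>9}   n^2={n*n:>14}   n*log(n)={n*math.log(n):>14.0f}")

At a million nodes, n^2 and n log(n) differ by nearly five orders of magnitude, which is why the choice of law matters so much in valuation arguments.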

If you're measuring users as opposed to "compatibly communicating devices" then all bets are off. More at my blog.
