
Posted in complex networks, network analysis

***

Let me begin by discussing a paper of mine that just got accepted at SMC 2009. The title of the paper is ‘Classes of Optimal Network Topologies under Multiple Efficiency and Robustness Constraints’. In this paper, we have looked at the problem of designing optimal network topologies under efficiency, robustness and cost trade-offs. Efficiency, robustness and cost are defined in terms of structural properties: diameter, average path length, closeness centrality (efficiency); degree centrality, node betweenness, connectivity (robustness); number of edges (cost). A slider $\alpha$ is used to indicate the emphasis on robustness (on a scale of [0, 1]), and thus decides the trade-off between efficiency and robustness. Another parameter $\beta$ (in [0, 1]) decides how cheap or expensive edges are. For example, if $\beta$ is 0, we might make use of the full budget allocated, whereas if $\beta$ is 1, we want to “squeeze out” the most cost-effective topology by removing superfluous edges. Finally, there is also a constraint on the maximum permissible degree (indegree and outdegree, in case of digraphs) on a node in the resulting graph. Maximum degree can be thought of as a constraint on the “bookkeeping” (for example, size of DHT finger tables) a node has to do, or as constraints on maximum inflow and outflow through a node.

Different efficiency and robustness metrics are useful in different application contexts. For example, in traffic flow networks, congestion is a major challenge; hence node betweenness is a useful robustness metric, along with diameter as the efficiency metric, which relates to upper bounds on the communication delay. In a Network Centric Warfare (NCW) scenario, it might be important to have alternate communication paths in case targeted attacks occur on nodes or links, so connectivity is a useful robustness metric there. And so on. Thus, what we do next is an exploration of the parameter space involving the efficiency and robustness metrics, $\alpha$, $\beta$, cost and maximum degree. We conduct genetic-algorithm-based experiments to evolve the fittest (optimal) topologies under the given trade-offs.
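To make the trade-off concrete, here is a purely illustrative sketch, not the paper's actual fitness function, of how $\alpha$ and $\beta$ might combine structural metrics into a single score. The proxies are deliberately crude: inverse diameter for efficiency, normalised minimum degree for robustness, and edge density for cost.

```python
from collections import deque

def bfs_dists(adj, s):
    """Hop distances from s via BFS; adj maps node -> set of neighbours."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def fitness(adj, alpha, beta):
    """Toy trade-off score: alpha weights robustness over efficiency,
    beta penalises the number of edges used."""
    n = len(adj)
    diam = max(max(bfs_dists(adj, s).values()) for s in adj)
    efficiency = 1.0 / diam                                     # shorter paths score higher
    robustness = min(len(nb) for nb in adj.values()) / (n - 1)  # crude min-degree proxy
    cost = sum(len(nb) for nb in adj.values()) / 2 / (n * (n - 1) / 2)  # edge density
    return alpha * robustness + (1 - alpha) * efficiency - beta * cost

# Star on 5 nodes vs. a 5-cycle with one chord (a tiny ring with a "skip")
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
ring = {0: {1, 2, 4}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {0, 3}}

# Low robustness emphasis + expensive edges: the star wins
print(fitness(star, alpha=0.1, beta=0.9) > fitness(ring, alpha=0.1, beta=0.9))  # True
# High robustness emphasis + cheap edges: the ring wins
print(fitness(star, alpha=0.9, beta=0.1) < fitness(ring, alpha=0.9, beta=0.1))  # True
```

Even this toy score reproduces the qualitative picture described below: stars win only under low robustness emphasis and tight cost, denser ring-like topologies win otherwise.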

The main results in the paper are the following:

- Two prominent classes of topologies emerge as optimal: (1) star-like or scale-free networks, which have small diameters, low cost and high resilience to random failures, but very low resilience to targeted attacks; (2) circular skip lists (CSL), which are highly resilient to both random failures and targeted attacks, and have small diameters at moderate cost.
- Further, we observe that star-like networks are optimal only when both of the following design requirements hold: (1) very low emphasis on robustness (or, equivalently, very high emphasis on efficiency) and (2) severe restrictions on cost. In all other cases, CSL are the optimal topologies in terms of balancing efficiency, robustness and cost. Thus, we observe a sharp transition from star-like networks to CSL as the emphasis on robustness ($\alpha$) increases or the cost restrictions become less severe.
- Circular skip lists are highly homogeneous with respect to many structural properties, such as degree, betweenness, closeness, pairwise connectivity and pairwise path lengths. Further, they have structural features such as Hamiltonicity, (near-)optimal connectivity and low diameter, which are desirable under varied application requirements. We argue that CSLs are a potential underpinning for optimal network design in terms of balancing efficiency, robustness and cost.

Posted in complex networks, distributed systems, graph theory, network design

The powers of graphs are interesting structures. The p-th power G^p of a graph G is obtained by adding edges between nodes of G that are separated by a path length greater than 1 and at most p. Specifically, the square of a graph is obtained by joining all pairs of nodes separated by a path length of 2. The cube of a graph is obtained by short-circuiting all paths of length 2 and 3 in the graph.

An important consequence of raising a graph to a power is diameter reduction: if G has diameter d, then G^p has diameter $\lceil d/p \rceil$. There are several other interesting properties of graph powers that we can observe, which are important in the design of optimal topologies. I’d be interested in knowing your ideas.
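The diameter-reduction property is easy to check concretely. Below is a small sketch, assuming a simple, connected, undirected graph stored as adjacency sets (the function names are my own):

```python
from collections import deque
from math import ceil

def bfs_dists(adj, s):
    """Hop distances from s to every node reachable from it."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def graph_power(adj, p):
    """G^p: join every pair of nodes at distance between 1 and p in G."""
    return {u: {v for v, d in bfs_dists(adj, u).items() if 1 <= d <= p}
            for u in adj}

def diameter(adj):
    return max(max(bfs_dists(adj, u).values()) for u in adj)

# A path on 7 nodes has diameter 6; G^p should have diameter ceil(6/p)
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
for p in (2, 3, 6):
    assert diameter(graph_power(path, p)) == ceil(diameter(path) / p)
```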

***

One of the results there is: if *G* is an *s-connected* graph with at least 3 vertices and an independence number of at most *s*, then *G* has a Hamiltonian circuit.

It is an interesting result connecting robustness and Hamiltonicity. A graph is *s-connected* if there are *s* vertex-independent (internally disjoint) paths between any two nodes in the graph. Equivalently, by Menger’s theorem, it is the size of the minimum vertex cut, i.e. the minimum number of vertices whose deletion increases the number of components in the graph.

The *maximum independence number* of a graph is the size of the biggest *independent set*. An independent set is a set of vertices of the graph such that no two vertices in the set are adjacent. In other words, there is no edge between any pair of vertices in the independent set. The bigger the independence number, the easier it is to fragment the network.

As you can see, both these concepts can be used to measure the robustness of a graph. Also, it is not unnatural that connectivity and Hamiltonicity are related. Intuitively, the more independent paths there are, the greater the chance of a graph being Hamiltonian. And a graph containing a Hamiltonian circuit is at least 2-connected: a cycle is the simplest Hamiltonian graph, and it is exactly 2-connected.
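On small graphs, the quantities in this result can be checked by brute force. A sketch follows (exponential time, toy examples only; connectivity here is vertex connectivity, i.e. the minimum number of vertices whose removal disconnects the graph):

```python
from itertools import combinations, permutations

def independence_number(adj):
    """Size of the largest set of pairwise non-adjacent vertices."""
    nodes = list(adj)
    for k in range(len(nodes), 0, -1):
        for S in combinations(nodes, k):
            if all(v not in adj[u] for u, v in combinations(S, 2)):
                return k
    return 0

def is_connected(adj, removed=frozenset()):
    nodes = [u for u in adj if u not in removed]
    if not nodes:
        return False
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)

def vertex_connectivity(adj):
    """Smallest number of vertices whose removal disconnects the graph."""
    n = len(adj)
    for k in range(n - 1):
        for cut in combinations(adj, k):
            if not is_connected(adj, frozenset(cut)):
                return k
    return n - 1  # complete graph

def is_hamiltonian(adj):
    """Brute-force search for a Hamiltonian circuit."""
    nodes = list(adj)
    first, rest = nodes[0], nodes[1:]
    for perm in permutations(rest):
        cycle = [first, *perm]
        if all(cycle[(i + 1) % len(cycle)] in adj[cycle[i]]
               for i in range(len(cycle))):
            return True
    return False

# The 5-cycle: 2-connected, independence number 2, so the theorem applies
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
assert vertex_connectivity(c5) >= independence_number(c5)
assert is_hamiltonian(c5)
```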

***

Well, but the point of this post is something else really. The paper I referred to in the beginning is just a 3-page note. And it has an interesting footnote on the 1st page. The footnote says: *This note was written in Professor Richard K. Guy’s car on the way from Pullman to Spokane, Wash. The authors wish to express their gratitude to Mrs. Guy for smooth driving.*

***

How can this sum start reducing and converge to 0? One has to look more carefully to see what is happening. Let us start with a smaller number, say 3.

Iteration 1 (hereditary base-2):

(1) 3 = 2 + 1

(2) 3 + 1

(3) 3 + 1 – 1 = 3

Iteration 2 (hereditary base-3):

(1) 3 = 3

(2) 4

(3) 4 – 1 = 3

Iteration 3 (hereditary base-4)

(1) 3 = 3

(2) 3

(3) 3 – 1 = 2

See what’s happening? Firstly, note that in step (2) of iteration 1, the last term, which is 1, does not get increased, because it is less than the current base. Say the base is n. Then increasing the base leaves the numbers 1 through (n – 1) unchanged. For example, 2 in base 4 is 4^{0} + 4^{0}, and 2 in base 5 is 5^{0} + 5^{0}. Correct? So, in this case, the last term is not increased, and there is a subtraction following that.

What happens is, even when you are increasing the bases of the terms, the subtraction by 1 operation is slowly “eating” into the last term of the sum. This process may be arbitrarily slow, but eventually, it so happens that, the power of the last term in the sum “falls off”, and the last term falls into a lower base than the ongoing one, hence ceasing to increase. So then, the subtraction can now eat away the last term easily, before it starts attacking the next term (which will now be the last term).

So you can now see that the sum does in fact start decreasing, albeit after numerous steps, even for small numbers. In fact, you can check this by hand only for the numbers 1, 2 and 3. You can probably write a program to check it for bigger numbers, though it might take extremely long.
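Such a program is a short recursion: write the number in the current base, bump every occurrence of the base (including inside the exponents), subtract 1, and repeat. A sketch (`bump` and `goodstein` are my own names):

```python
def bump(n, base):
    """Write n in hereditary base-`base` notation, then replace every
    occurrence of `base` by `base + 1`, including inside exponents."""
    if n < base:
        return n  # numbers below the base are unchanged
    result, power = 0, 0
    while n:
        digit = n % base
        if digit:
            result += digit * (base + 1) ** bump(power, base)
        n //= base
        power += 1
    return result

def goodstein(n, steps):
    """First `steps` terms of the Goodstein sequence starting at n."""
    seq, base = [n], 2
    for _ in range(steps - 1):
        if n == 0:
            break
        n = bump(n, base) - 1
        base += 1
        seq.append(n)
    return seq

print(goodstein(3, 10))  # [3, 3, 3, 2, 1, 0] (terminates at base 6)
print(goodstein(4, 6))   # [4, 26, 41, 60, 83, 109]
```

For 3 the sequence dies quickly, exactly as traced above; for 4 it already takes an astronomically long time to reach 0, even though the first terms look tame.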

***

Now, another important thing about Goodstein’s theorem is that it is a Gödel-type result. The truth of Goodstein’s theorem cannot be proved from the axioms of first-order arithmetic (Peano arithmetic). There is a proof of this independence, as well as a proof of Goodstein’s theorem itself, using techniques from outside Peano arithmetic.

***

The sequence needs some explanation of a notation called *hereditary base-n notation*. Let me explain this with an example. Let us write the number 26 in its hereditary base-2 notation. First we start by writing 26 as the sum of powers of 2.

26 = 2^{4} + 2^{3} + 2^{1}

Next, the powers are written as sums of powers of 2 as well.

So, 26 = 2^{2^{2}} + 2^{2 + 1} + 2^{1}

Similarly, the hereditary base-3 notation of 1000 is the following.

1000 = 3^{6} + 3^{5} + 3^{3} + 1 = 3^{3 + 3} + 3^{3 + 2} + 3^{3} + 1

Note that the bases and the powers cannot be bigger than *n*. Also, we can write the terms as a product of a power of the base *n* and a number smaller than *n*. For example, 26 can be written in hereditary base-3 notation as 2·3^{2} + 2·3 + 2.
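For illustration, here is one way to render hereditary base-n notation programmatically (the function name and the output format are my own; `^` denotes exponentiation and `*` multiplication):

```python
def hereditary(n, b):
    """Render n in hereditary base-b notation as a string,
    recursively expanding the exponents as well."""
    if n < b:
        return str(n)
    terms = []
    power = 0
    while n:
        d = n % b
        if d:
            exp = hereditary(power, b)  # exponents are themselves hereditary
            if power == 0:
                terms.append(str(d))
            elif d == 1:
                terms.append(f"{b}^({exp})")
            else:
                terms.append(f"{d}*{b}^({exp})")
        n //= b
        power += 1
    return " + ".join(reversed(terms))

print(hereditary(26, 2))  # 2^(2^(2^(1))) + 2^(2^(1) + 1) + 2^(1)
print(hereditary(26, 3))  # 2*3^(2) + 2*3^(1) + 2
```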

Now, let us take a number, say 26, and do the following:

1. Take the number. Start with base n = 2.
2. Express the number in hereditary base-n notation.
3. Form the next number by changing all the n’s to (n + 1)’s; that is, “increase” the base by 1. [n = n + 1]
4. Subtract 1 from the resulting number and go to step 2. [number = number – 1]

Let’s take an example. Let’s start with 4.

4 = 2^{2} (step 2)

3^{3} (step 3)

3^{3} – 1 = 26 (step 4) — Iteration 1

26 = 2·3^{2} + 2·3 + 2 (step 2)

2·4^{2} + 2·4 + 2 (step 3)

2·4^{2} + 2·4 + 2 – 1 = 41 (step 4) — Iteration 2

41 = 2·4^{2} + 2·4 + 1 (step 2)

2·5^{2} + 2·5 + 1 (step 3)

2·5^{2} + 2·5 + 1 – 1 = 60 (step 4) — Iteration 3

60 = 2·5^{2} + 2·5 (step 2)

2·6^{2} + 2·6 (step 3)

2·6^{2} + 2·6 – 1 = 83 (step 4) — Iteration 4

83 = 2·6^{2} + 6 + 5 (step 2)

2·7^{2} + 7 + 5 (step 3)

2·7^{2} + 7 + 5 – 1 = 109 (step 4) — Iteration 5

And so on. Now the question is, does this sequence converge (or terminate)? If it converges, what does it converge to? Or if you think it does not converge, explain why.

As always, people who already know this result may please defer commenting.

***

After numerous false starts, I have finally learnt (just) enough Perl to do the essential Unix-like file processing operations like head, tail, grep etc. on Windows. I am in my “Windows usage phase”, and there is absolutely nothing you can do through the command line. (And Windows Vista sucks even otherwise, which means soon I will be back to my “Ubuntu phase”.) I do use cygwin, but Perl is fun.

***

Now, it is evident that the adjacency matrix A also represents all the paths of length 1. Each entry indicates whether or not there is a 1-length path between the corresponding nodes, and also how many 1-length paths there are between the two nodes. (Of course, it is either 0 or 1.)

Interesting things happen when we multiply the adjacency matrix by itself. Let’s take some examples to see what happens.

We take the graph on the left and multiply its adjacency matrix by itself. The results are on the right. (Sorry about the bad formatting; could not figure out an easy way to align the figures properly.) The matrix ‘mat2’ is the matrix A^{2}. The entries a_{ij} show the number of 2-length paths between the nodes i and j. Why this happens is easy to see: if there is an edge ij and an edge jk, then there is a 2-length path from i to k through j. The diagonal entries a_{ii} are the degrees of the nodes i.

What happens if we compute A^{3}? Let’s hold it for now and see an example directed graph.

Here, again, the entries in mat2 show the number of 2-length paths. The diagonal entries are 0’s, unlike the case of undirected graphs, where they show the degrees. Next, if we continue this process, the next set of entries shows the number of 3-length paths. This generalises: by repeated multiplication we can count all paths of length up to n – 1. If, in the case of digraphs, some non-diagonal entry never takes a value greater than 0 in the entire process, the graph is not strongly connected.
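The path-counting claims are easy to verify directly on a small example (since the original figures are not reproduced here, the graph below is my own: a triangle with a pendant vertex):

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Undirected graph: triangle 0-1-2 with a pendant vertex 3 attached to 2
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
A2 = matmul(A, A)

# Diagonal of A^2 gives the degrees; off-diagonal entries count 2-length paths
assert [A2[i][i] for i in range(4)] == [2, 2, 3, 1]  # degrees
assert A2[0][1] == 1  # one 2-length path 0-2-1
assert A2[0][3] == 1  # one 2-length path 0-2-3
```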

***

Now consider mat3 in either of the above cases, which is the matrix A^{3}. The trace of this matrix shows an important structural property. The trace of a matrix is the sum of the diagonal entries. Trace = sum(a_{ii}). The trace of A^{3} has a relationship with the number of triangles in the graph.

In the case of undirected graphs, Trace = 6 × (number of triangles in the graph) = 6 × (number of K_{3}’s): each triangle is counted once for each of its three vertices, traversed in each of two directions.

In the case of directed graphs, Trace = 3 × (number of directed triangles, i.e. 3-cycles): each cycle is counted once per vertex, but has only one direction.

Below are two more examples to illustrate the above point.
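Since the example figures are not reproduced here, the trace relationships can instead be checked numerically on small graphs of my own choosing:

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Undirected: triangle 0-1-2 with a pendant vertex 3 (exactly one triangle)
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
A3 = matmul(matmul(A, A), A)
assert sum(A3[i][i] for i in range(4)) == 6 * 1  # trace = 6 * (1 triangle)

# Directed 3-cycle: 0 -> 1 -> 2 -> 0 (exactly one directed triangle)
D = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
D3 = matmul(matmul(D, D), D)
assert sum(D3[i][i] for i in range(3)) == 3 * 1  # trace = 3 * (1 triangle)
```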

***

We can also note that the above procedure can be used to find the diameter of a graph: find the minimum number of times the adjacency matrix has to be multiplied by itself so that each entry has taken a value greater than 0 at least once. The maximum is, of course, n – 1. The complexity of this procedure is O(n · n^{3}) = O(n^{4}), an order bigger than finding the diameter by first computing all-pairs shortest paths. However, in the average case, the former fares better. Also, fast matrix multiplication methods would further improve the complexity.
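A sketch of this diameter procedure, using plain O(n^3) multiplications (no fast matrix multiplication tricks; the function name is my own):

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diameter_by_powers(A):
    """Smallest p such that every node pair has a path of length <= p,
    found by repeatedly multiplying the adjacency matrix. Returns None
    if the graph is disconnected."""
    n = len(A)
    reached = [[A[i][j] > 0 or i == j for j in range(n)] for i in range(n)]
    P, p = A, 1
    while not all(all(row) for row in reached):
        P = matmul(P, A)
        p += 1
        for i in range(n):
            for j in range(n):
                if P[i][j] > 0:
                    reached[i][j] = True
        if p > n:  # no new pairs can appear beyond length n - 1
            return None
    return p

# Path on 4 nodes has diameter 3; the complete graph K3 has diameter 1
path4 = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
k3 = [[0, 1, 1],
      [1, 0, 1],
      [1, 1, 0]]
assert diameter_by_powers(path4) == 3
assert diameter_by_powers(k3) == 1
```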

***

Are there more interesting properties of adjacency matrices? I think so. It would be a good exercise to explore.
