Reboot and a paper

May 25, 2009

Over the last few days I received several spam comments on this blog (written using Cyrillic alphabet, incidentally). That made me visit this blog for the first time in several months, to turn off comments on old posts. Inadvertently, that also made me consider rebooting this. (Spam can be useful!) At the least, I can use this as a space where I put down comments about interesting papers I read. I might also write about some of the things I am working on.
***

Let me begin by discussing a paper of mine that just got accepted at SMC 2009. The title of the paper is ‘Classes of Optimal Network Topologies under Multiple Efficiency and Robustness Constraints’. In this paper, we have looked at the problem of designing optimal network topologies under efficiency, robustness and cost trade-offs. Efficiency, robustness and cost are defined in terms of structural properties: diameter, average path length, closeness centrality (efficiency); degree centrality, node betweenness, connectivity (robustness); number of edges (cost). A slider $\alpha$ is used to indicate the emphasis on robustness (on a scale of [0, 1]), and thus decides the trade-off between efficiency and robustness. Another parameter $\beta$ (in [0, 1]) decides how cheap or expensive edges are. For example, if $\beta$ is 0, we might make use of the full budget allocated, whereas if $\beta$ is 1, we want to “squeeze out” the most cost-effective topology by removing superfluous edges. Finally, there is also a constraint on the maximum permissible degree (indegree and outdegree, in case of digraphs) on a node in the resulting graph. Maximum degree can be thought of as a constraint on the “bookkeeping” (for example, size of DHT finger tables) a node has to do, or as constraints on maximum inflow and outflow through a node.

Different efficiency and robustness metrics are useful in different application contexts. For example, in case of traffic flow networks, congestion is a major challenge. Hence, node betweenness is a useful robustness metric, along with diameter as the efficiency metric which relates to upper bounds on the communication delay. In a Network Centric Warfare (NCW) scenario, it might be important to have alternate communication paths in case targeted attacks occur on nodes or links. Connectivity is a useful robustness metric there. And so on. Thus, what we do next is an ‘exploration of the parameter space’ involving efficiency and robustness metrics, $\alpha$, $\beta$, cost and maximum degree. We conduct genetic algorithm based experiments to evolve the “fittest” or the “most optimal” topologies under given trade-offs.

The main results in the paper are the following:

  • Two prominent classes of topologies that emerge as optimal: (1) star-like or scale-free networks with small diameters, low cost, high resilience to random failures and very low resilience to targeted attacks (2) Circular skip lists (CSL) which are highly resilient in the face of both random failures and targeted attacks, have small diameters under moderate costs.
  • Further, we observe that star-like networks are optimal only when both the following design requirements are satisfied: (1) very low emphasis on robustness (or very high emphasis on efficiency) and (2) severe restrictions on cost. In all other cases, CSL are optimal topologies in terms of balancing efficiency, robustness and cost.
    Thus, we observe a sharp transition from star-like networks to CSL as emphasis on robustness ($\alpha$) increases or cost restrictions become less severe.
  • Circular skip lists are highly homogeneous wrt many structural properties such as degree, betweenness, closeness, pairwise connectivity, pairwise path lengths. Further, they have structural motifs such as Hamiltonicity, (near) optimal connectivity, low diameter, which are optimal under varied application requirements. We argue that CSLs are potential underpinnings for optimal network design in terms of balancing efficiency, robustness and cost.

Graph Powers

August 7, 2008

Earlier I had talked about multiplying the adjacency matrix of a graph by itself and how it leads to some interesting results. Multiplying the adjacency matrix by itself essentially corresponds to raising the graph by some integer exponent while maintaining the same number of nodes.

The powers of graphs are interesting structures. The p^th power, G^p of a graph G is obtained by adding edges between nodes in G that are separated by a pathlength (greater than 1 and) less than or equal to p. Specifically, the square of a graph is the graph resulting from joining all nodes that are separated by a pathlength 2. A cube of a graph is obtained by short-circuiting all 2 length and 3 length paths in the graph.

An important consequence of raising the graph to a power is diameter reduction. If G has diameter d, then graph G^p has diameter \ceil{d/p}. There are several other interesting properties of graph powers that we can observe, which are important in the design of optimal topologies. I’d be interested in knowing your ideas.


A note on Hamiltonian circuits (and Uncle Paul)

June 22, 2008

Sometime back I read a short paper by Chvatal and Erdos that presents a couple of sufficiency conditions for hamiltonicity.

One of the results there is – If G is an s-connected graph with at least 3 vertices, and has a maximum independence number of s, then G has a hamiltonian circuit.

It is an interesting result connecting robustness and hamiltonicity. A graph is s-connected if there are s edge independent paths between any two nodes in the graph. In other words, it is the size of the minimum edge cut i.e. the minimum number of edges whose deletion increases the number of components in the graph.

The maximum independence number of a graph is the size of the biggest independent set. An independent set is a set of vertices of the graph such that no two vertices in the set are adjacent. In other words, there is no edge between any pair of vertices in the independent set. The bigger the independence number, the easier it is to fragment the network.

As you can see both these concepts can be used to measure the robustness of a graph. Also, it is not unnatural that connectivity and hamiltonicity are related. Intuitively, the more independent paths, the greater the chances of a graph being hamiltonian. And, a hamiltonian circuit is at least 2-connected: a circle is the simplest hamiltonian circuit and it is 2-connected.

***

Well, but the point of this post is something else really. The paper I referred to in the beginning is just a 3-page note. And it has an interesting footnote on the 1st page. The footnote says: This note was written in Professor Richard K. Guy’s car on the way from Pullman to Spokane, Wash. The authors wish to express their gratitude to Mrs. Guy for smooth driving.


Seminar on Complex Networks

May 14, 2008

A series of seminars on Complex Systems has been scheduled through the summer over here. I had been asked to give the first seminar. I gave a talk yesterday. I gave an overview of the philosophy, design and analyses of  complex networks. Among other things, I talked about how several graph theoretic measures can be used in both design and analyses of complex networks. There was a lot of lively discussion. The total seminar spanned about 2.5 hours, which probably means the seminar went off quite well. In case you are interested, here are the slides I had used.


Graph invariants, “identifiers” and reconstruction

March 25, 2008

Graph invariants are properties associated with a graph that do not change across all possible isomorphisms of the graph. In other words, if a set of graphs are isomorphic, then they all have the same values for certain properties which are called invariants. Examples of graph invariants are the following:

  • no of nodes
  • no of edges
  • the degree distribution (or degree sequence)
  • vertex (edge) chromatic number: the minimum no. of colours needed to colour all vertices (edges) such that adjacent vertices (edges) do not have the same colour
  • vertex (edge) covering number: the minimum no. of vertices (edges) needed to cover all edges (vertices)

As you can see, the invariants are numbers. A graph invariant describes a graph in terms of a simple number. Given a graph, it has a unique number of nodes, number of edges, degree sequence etc.. They find application in chemistry where large chemical compounds are indexed based on these numbers.

It should be noted that the incidence and adjacency matrices are not graph invariants. This is because, when isomorphism is computed, if for every pair of adjacent nodes in graph-1, you can find a pair of adjacent nodes in graph-2, and vice versa, the two graphs are isomorphic; labels are not important.

Also, for a given graph, there are unique invariants. However, the converse need not hold. For example, consider the invariant degree sequence. Shown below are two graphs that have the same degree sequence, but are non-isomorphic.

***

What I am interested in finding what can be called graph “identifiers”. Graph identifiers are properties or metrics that uniquely identify a graph. Examples of identifiers are adjacency and incidence matrices. Given an adjacency matrix, there is only one graph corresponding to it. Similarly, are there more identifiers? It need not be one property; it can be a combination of a set of properties like the no. of edges, degree sequence, diameter, centrality sequence and so on. What is perhaps also interesting is to measure the “uniqueness” of sets of properties. Uniqueness indicates in percentage terms “the extent” to which a property or a set of properties identify a graph. For instance, the adjacency matrix has a uniqueness value of 1. Again, this can be measured in terms of, say edit distances.

I am not sure there is much work in this direction. Do you think this is an interesting problem?

A consequence of this is graph reconstruction. If we have a set of properties that identify a graph, it is possible to construct an isomorphic graph using those properties. It can be used to recover networks that are subject to disruptions. (My usage of reconstruction is similar to that in literature, but slightly different.)


More on Regularity Preservation

March 16, 2008

As sri rightly pointed out in the comments section of the previous post, clique insertion is just one way to extend an r-regular graph into a bigger r-regular graph. We should be able to extend the graph by inserting any (r – 2)-regular graph (provided the insertion is possible — refer sri’s comment). In fact, clique insertion is the minimal case: the smallest connected (r-2)-regular graph has r -1 nodes.

There are interesting questions here. Can we insert any (r – 2)-regular graph? What is the procedure to do so? Are there some (r – 2)-regular graphs that cannot be inserted?

Let’s use some notations for ease of reference. Let Ro be the r-regular graph we start with, Ri be the (r – 2)-regular graph we want to insert and Rn be the graph resultant of the insertion.

A simple but important thing to be noted here is that, pairs of nodes that are adjacent in Ri, cannot be inserted in a single edge of Ro because that will result in an overlapping edge (we don’t allow multiedges between nodes); adjacent nodes in Ri have to be inserted in different edges. Even for small values of r, this becomes pretty complicated to visualize. I tried to insertions for small graphs, and it was possible to do. But I don’t know if we can always avoid creating a multigraph.

But then in the above discussion we have assumed Ri to be a connected graph. What if Ri is disconnected? If Ri is disconnected we can easily prove that there is no bound on the size of Ri that can be inserted. The simplest way to show this is by saying that our Ri has an arbitrarily large number of (r – 2)-cliques as disconnected components (disconnected edges, disconnected cycles, disconnected cubic graphs etc.). Now, you can either imagine this as adding an arbitrary number of cliques (like in the previous post) or as adding one large disconnected graph.

Of course, cliques are just one way to insert. We can simply insert nodes and add edges between the new nodes in such a way that all the new nodes have a degree of r.

Note that the resulting graph Rn is, of course, a connected graph.

***

Now, since we are talking about p2p networks, what is more important is the minimum number of nodes and edges that we need to add when nodes join or leave. Or more generally, the minimum change in topology that is required. When a new node comes in, let’s say we need to accommodate it without disrupting symmetry; so we may need to add dummy nodes and edges. The thing is – in a p2p network, most of the time, unrelated nodes join or leave unrelated parts of the network. So, essentially, we are dealing with disconnected components.


Regular Graphs: “Clique Insertion”

March 12, 2008

After talking about the basics of regular graphs, let me now introduce something more interesting. When we make changes to a regular graph topology, like addition or removal of edges or nodes, the resulting topology might not remain regular. An interesting question then is the following: are there ways to modify a given regular graph topology, while maintaining its regularity?

Here I introduce a simple operation called “clique insertion” that “extends” a graph (or makes a graph bigger by adding more nodes) while preserving the regularity of the graph. The basic idea of this is due to sri. Later on I developed it a bit more and now it looks like it is a useful result.

Given an r-regular graph of n nodes, the graph can be extended to an r-regular graph of n + (r – 1) nodes by a process of “node insertions” (or “clique insertion”).

Explanation: Let us start with the minimally connected non-trivial regular graph: a circle with n nodes. We can see that by “inserting” a node between any two nodes, we can extend the graph into a bigger graph while preserving the regularity. In other words, cut an arbitrary edge into two, and make their loose ends incident on a newly added node. We can go on inserting any number of nodes in this fashion. We just get bigger circles, all of which have a regularity of 2.

 

It is pretty simple to see in a circle. Now, let’s go to a slightly more complicated topology. Consider a 3-regular graph. We can get a bigger 3-regular graph by inserting 2 nodes (as explained above), and adding another edge between the 2 nodes.

 

It is equivalent to inserting a clique of exactly r – 1 (where r is the regularity) nodes in the existing graph (refer the figure below). For an arbitrary n and arbitrary r, we can construct an n + (r – 1) node graph by inserting exactly r – 1 nodes, and adding edges among them to form a r – 1 node clique.

 

                                                               

In the figure, we have a 4-regular graph (incidentally, it is the complete graph K5). We now do a “clique insertion” operation on the graph to get a new 4-regular graph of 8 nodes. The nodes marked in brown form the clique.

 

The above result can be extended thus: given an r-regular graph of n nodes, we can construct a graph of n + m * (r – 1), where m is a positive integer. So, we can easily construct arbitrarily large regular graphs of a given regularity r by inserting any number of cliques of size exactly r – 1.

 

 

 

Usefulness: This has an important connotation in distributed systems, where nodes joining and failures are frequent. Several distributed hash tables (DHTs) require a symmetric degree distribution. They also need to accommodate for node joining and leaving.

  • For one, this result can be developed to be a measure the amount of disruption in symmetry that can be expected when changes in topology occur.
  • Also, it can be used to design topology changes so as to preserve symmetry. Perhaps, by using dummy nodes.

Question: Is there a similar way to make a graph smaller while preserving the regularity?

That was one example of preserving a structural property of a graph. It is an interesting to find out if we can have a whole set of such operators. We can look at several other properties as well: diameter, average path length, centrality distributions, connectivity etc.. I shall define this problem more formally in another post later on.