Theory and Practice: Strongly connected components, part 1

[edit: This post is garbage. See if you can find the mistake.]

Problem: Find the strongly connected comonents (SCCs) of a given digraph. An SCC is a maximal subgraph inside which all nodes are reachable from any node.

Idea:

[sort] Sort the nodes such that if m is not reachable from n but n is reachable from m then n comes before m.
[extract] Pick the first unseen node m and mark all unseen nodes n reachable from m as seen and belonging to the component identified by m. If there are unseen nodes left, then repeat this step.

Proof: In the extract step we make sure that n is reachable from m. But is m reachable from n? Let's suppose it's not. Then the "if" of step sort is satisfied and n must have been processed in a previous iteration of extract which meens that it is marked as seen and therefore we won't process it now.

Details: The only requirement for extract is that all reachable nodes are visited and therefore either depth first search (DFS) or breadth first search (BFS) will work. The sort step is a little trickier. In DFS, when an unseen node is first visited it gets activated, then all the yet unseen nodes reachable from it are explored recursively, and then the node is made stable. The DFS defines two orders on nodes: the activation order and the stabilization order. Is any of these good? I claim that the stabilization order is what we want. Let's say we activate m and then look recursively at nodes n that it can reach. If n is unseen then it will be stabilized before m; if n is active then it can reach back to m; if n is stable, well, it comes before m in the stabilization order.

Code: Here is how this looks in Python, the latest pseudocodish language.

def scc(g):

  def extract(m, c, component):

    component[m] = c

    for n in g[m]:

      if n not in component:

        extract(n, c, component)

  def sort(m, seen, component):

    seen.add(m)

    for n in g[m]:

      if n not in seen:

        sort(n, seen, component)

    if m not in component:

      extract(m, m, component)

  seen = set()

  component = dict()

  for m in g.keys():

    if m not in seen:

      sort(m, seen, component)

  return component

The graph is assumed to be represented as a dictionary from nodes to list of successor nodes. A node in each SCC is chosen as the representative. The result is a dictionary that maps each node to the proper representative. (As a side note, please excuse the long parameter lists: I'm using Python 2.5.)

Exercise: Implemention using two breadth first traversals.

Next episode: How to do it with only one traversal.

Theory and Practice

28 November 2008

Strongly connected components, part 1

No comments:

Post a Comment