cascade results from the CPAN

Net-OnlineCode
Graphically, the example rule above could be illustrated as follows:

  message           auxiliary           check
 
     M <--------------------------------- C
                                        /
                        A <------------/
 

This diagram could equally have been written with the check blocks on
the left and the message blocks on the right, or turned 90
degrees. It's merely a matter of convention, similar to the two ways
of writing out the equation as either:

 C <- M xor A

or 

 M xor A -> C

For the remainder of the document, I'll go with the convention of
saying that auxiliary blocks are to the right of the message blocks
and check blocks are to the right of both of them. (My code uses a
different convention again and talks about check nodes being higher
than auxiliary and message nodes).

Besides information about edges, the graph also stores a status bit
for each node to indicate whether that node is known (solved) or
unknown. Check nodes are always taken to be solved since the encoder
sends the value of that block, whereas message and auxiliary nodes are
all initially unknown/unsolved.

In the explanation of the algebraic interpretation, I talked about
finding an equation that had just a single unknown and rearranging it
so that the single unknown value moves to one side and all the other
knowns move to the other side. There is an analoguous operation on the
graph, and this is named the "propagation rule".

The propagation rule involves finding a known node which has exactly
one unsolved neighbour on the left. In the above example, if we are
considering whether to propagate from node C (which is known) both M
and A are unknown, so the rule does not match. If, on the other hand,
one of M or A are known, the rule does match.

When the propagation rule matches, the solution for the newly-solved
node on the left becomes the XOR of the node on the right plus all the
other nodes emanating from that (right) node. When a node is solved in
this way, all edges from the node on the right are removed from the
graph.

In my code, the propagation rule is handled in a routine called
resolve().

=head2 Cascades

When matched, the propagation rule solves one extra node somewhere to
the left of the starting node. In the algebraic interpretation, I
talked about substituting a newly-solved variable into all other
equations where the variable appeared. There is an analogous procedure
in the graph-based implementation, which is implemenented in the
cascade() routine.

For the sake of discussion, let's assume that the message block M was
solved by the propagation rule and that it had the solution:

 M <- A xor C

To simulate substituting M into all other equations where it appears,
we need to work backwards (from left to right) from node M and see if
any of those nodes now match the propagation rule. Since there will be
one rightward edge in the graph from that node for each equation the
left node appears in, this effectively reaches all equations that
could could become solvable.

In the case where the left node which has become solved is an
auxiliary block, the cascade() routine also queues up the auxiliary
node itself for checking the propagation rule.

=head2 Special handling for auxiliary nodes

Although in theory the propagation rule could be applied to unsolved
auxiliary nodes, in practice this has proved troublesome, so I have
not implemented it. Instead I have implemented a special "auxiliary
rule" that gives comparable results.

Recall that the propagation rule works with a single known node on the
right and a single unknown node on the left. It is also possible to
devise a rule where there is a message node on the left and an
unsolved aux rule on the right. If the auxiliary node has only one
unsolved left neighbour (ie the message node) and that message node
becomes solved, then the auxiliary block can be solved too.

Initially, each auxiliary block will be composed of some number of
message blocks:

 aux   = msg   xor  msg   xor ...
    x       i          j

When the last unsolved message block on the right becomes solved then
this equation has no more unknowns apart from the aux block
itself. Therefore, it can be marked as solved (with the above
solution) and we can cascade up from that aux block to see if it
solves any more equations.

=head2 Optimising by tracking counts of unsolved left nodes

When aux or check nodes are created, the number of unknown/unsolved
edges that they are comprised of is calculated. Whenever a node
becomes solved, each of the nodes that include that node in its list
of edges has their unsolved count decremented.

This improves performance by avoiding having to scan the node's full
list of leftward edges when it is considered for resolving.

=head2 "Bones" ("Bundles of Node Elements")

In a previous version of the program, edges in the graph were stored
by keeping track of the left (bottom) end of the edge in a hash, while
the right (top) end was stored in a list. I also had a separate array
for storing the solutions of each node. The "Bones" structure
essentially combines the top part of the edge and the solution into a
single fixed-size array. This was done to improve performance by
eliminating lots of list copying as the graph was processed.

The Bones structure is a fixed-sized array with three elements:

=over

=item * count of unsolved left (down) edges

=item * node ids of unsolved left (down) edges

=item * list of known node ids

=back

Bones can also be viewed as encapsulating an algebraic equation of the
form:

 [unknown nodes] <- [known nodes]

At the start of decoding, each auxiliary node has a Bone object
created for it:

 [aux node, message nodes] <- []

That is, the aux node and all its constituent message nodes are all
marked as unknown/unsolved (there are no knowns in the equation).

The "Bone" is attached to the auxiliary node and reciprocal links (the
other end of the edges) are created in each of the component message
nodes. All the nodes in the left hand side except the aux node itself
are considered to be the top parts of edges.

When a check node is created, its Bone is of the form:

 [unsolved msg/aux nodes] <- [ check node, solved msg/aux nodes ]

The check node is placed on the right along with other known nodes
because the decoder knows the value of all received check nodes.
Reciprocal links are only created for unsolved nodes.
( run in 1.242 second using v1.01-cache-2.11-cpan-9581c071862 )