Difference between revisions of "Granule Acceptance"

From DocDataFlow

Revision as of 22:44, 29 December 2013

An important mechanism in Crawler is the idea of 'granule acceptance' by adapters.

When a granule is presented to any adapter for processing, the adapter can accept or reject the granule based on a number of criteria.

Some of these criteria are part of the default infrastructure of Crawler, and are checked automatically. However, these automatic criteria can always be overridden by a particular type of adapter or adapter network.

The default criteria are only there for convenience: they will be 'the right thing' in most cases. They can be adjusted for the more uncommon cases where the acceptance criteria need to be different.

Visit Counting

The first default criterion: granules are not normally accepted twice by the same adapter; only one 'visit' is allowed.

In some of the more complex personalities, you might see 'adapter loops': networks of adapters where the output of an adapter further down the data flow feeds back into the input of an adapter earlier in the data flow. These loops will often rely on the 'don't accept twice' mechanism to avoid getting caught in endless loops.
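To make the 'don't accept twice' idea concrete, here is a minimal sketch in Python. It is not Crawler's actual API: the class names `Granule` and `Adapter`, the methods `accepts` and `present`, and the two-adapter loop are all assumptions made purely for illustration.

```python
class Granule:
    """Hypothetical stand-in for a Crawler granule."""

    def __init__(self, name):
        self.name = name


class Adapter:
    """Hypothetical adapter that accepts each granule at most once."""

    def __init__(self, name):
        self.name = name
        self.next_adapter = None   # where accepted granules are sent on
        self._visited = set()      # granules this adapter already accepted
        self.processed = []        # names of accepted granules, in order

    def accepts(self, granule):
        # Default criterion (assumed): reject a granule seen before.
        return granule not in self._visited

    def present(self, granule):
        """Offer a granule to this adapter; return True if it was accepted."""
        if not self.accepts(granule):
            return False
        self._visited.add(granule)
        self.processed.append(granule.name)
        if self.next_adapter is not None:
            self.next_adapter.present(granule)
        return True


# Build a two-adapter loop: the output of 'second' feeds back into 'first'.
first = Adapter("first")
second = Adapter("second")
first.next_adapter = second
second.next_adapter = first

granule = Granule("paragraph")
first.present(granule)
```

In this sketch the granule travels from `first` to `second` and back to `first`, where it is rejected because it has already visited. That rejection is what ends the loop.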

For example, here is a schematic representation of the network used for document conversion in Crawler:

[Figure: Sampleexporter.png]

Note that the ViewAssembler sits at the core of a number of 'adapter loops'. The 'ViewAssembler' and 'Selector' adapters in this network have been modified to not count visits: they both allow granules to 'pass through' more than once.

However, the individual sub-adapters of the Selector do use the default visit counting: they only allow one visit.
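The combination described above can be sketched as follows. This is an illustrative model only, not Crawler's implementation: the `Adapter` class, the `count_visits` flag, the `present` method, and the wiring between the two adapters are hypothetical and far simpler than the real document conversion network.

```python
class Adapter:
    """Hypothetical adapter with optional visit counting."""

    def __init__(self, name, count_visits=True):
        self.name = name
        self.count_visits = count_visits  # assumed switch; True is the default
        self.outputs = []                 # adapters fed by this adapter
        self._visited = set()             # granules accepted so far
        self.accepted = 0                 # number of acceptances

    def accepts(self, granule):
        if not self.count_visits:
            return True                   # 'pass through' any number of times
        return granule not in self._visited

    def present(self, granule):
        """Offer a granule to this adapter; return True if it was accepted."""
        if not self.accepts(granule):
            return False
        self._visited.add(granule)
        self.accepted += 1
        for adapter in self.outputs:
            adapter.present(granule)
        return True


# A non-counting adapter in a loop with a default (counting) sub-adapter.
selector = Adapter("Selector", count_visits=False)
sub_adapter = Adapter("sub-adapter")
selector.outputs.append(sub_adapter)
sub_adapter.outputs.append(selector)      # feedback into the Selector

granule = object()                        # any hashable object will do here
selector.present(granule)
```

Here the non-counting adapter accepts the same granule twice, while the sub-adapter accepts it only once. The loop still terminates because the second offer to the sub-adapter is rejected.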