Difference between revisions of "Assembler"


Revision as of 18:58, 29 December 2013

An assembler is an atomic adapter.

Assemblers accept granules via their input connection.

They then use these input granules to construct larger granules.

Typically, assemblers rely on the presence of certain 'trigger granules' in the input stream to decide when they have all the data needed to finish a constructed granule. When the constructed granule is ready, it is released via the assembler's output.

Assemblers will often drop the smaller granules they used from the data flow, and only emit the newly constructed granules.

For example, an assembler could collect 'word' granules and string them together into a new 'word group' granule.

As granules arrive, the assembler needs to know when the 'word group' under construction is complete. The presence in the input stream of some other type of granule (e.g. a 'text frame' granule) will typically be the trigger to release the newly constructed 'word group' granule and get ready to construct the next one.
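
The collect-and-release behavior described above can be sketched in a few lines of Python. This is only an illustration, not the Crawler API: the `Granule` class, the `assemble_word_groups` function and the `trigger_kind` parameter are all hypothetical names. The sketch also assumes that the trigger granule itself is not forwarded, which the text above does not specify.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class Granule:
    """Hypothetical stand-in for a granule: a kind, some text, optional parts."""
    kind: str
    text: str = ""
    parts: List["Granule"] = field(default_factory=list)


def assemble_word_groups(
    granules: Iterable[Granule], trigger_kind: str = "text frame"
) -> Iterator[Granule]:
    """Collect 'word' granules and emit one 'word group' per trigger granule."""
    pending: List[Granule] = []
    for granule in granules:
        if granule.kind == "word":
            # The word is kept for the group under construction
            # and is not passed on to the output.
            pending.append(granule)
        elif granule.kind == trigger_kind:
            # The trigger marks the group as complete: release it and reset.
            if pending:
                text = " ".join(word.text for word in pending)
                yield Granule("word group", text, pending)
                pending = []
        # Any other kind of granule is ignored in this sketch.


stream = [
    Granule("word", "hello"),
    Granule("word", "world"),
    Granule("text frame"),
    Granule("word", "again"),
    Granule("text frame"),
]
word_groups = list(assemble_word_groups(stream))
```

Here `word_groups` holds two 'word group' granules, one per 'text frame' trigger. Words that arrive after the last trigger are never released, which mirrors the assembler's reliance on trigger granules.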

In a typical Crawler workflow, a disassembler will only add to the data flow. It won't take granules away. In other words: the larger granules that are broken apart by disassemblers are not stripped away and remain part of the data flow.

For example, when a disassembler breaks apart a 'paragraph' granule into a series of 'word' granules, the output of the disassembler will typically consist of a stream of word granules, followed by the original paragraph granule from which the word granules were extracted.
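
As a rough sketch of that ordering, the following Python generator emits the extracted word granules first and then the original paragraph granule. The `Granule` class and `disassemble_paragraphs` function are hypothetical names chosen for illustration, and splitting on whitespace is an assumed stand-in for whatever word-breaking logic a real disassembler would use.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Granule:
    """Hypothetical stand-in for a granule: a kind plus some text."""
    kind: str
    text: str = ""


def disassemble_paragraphs(granules: Iterable[Granule]) -> Iterator[Granule]:
    """Add 'word' granules to the flow without removing anything from it."""
    for granule in granules:
        if granule.kind == "paragraph":
            # Emit the smaller granules first...
            for word in granule.text.split():
                yield Granule("word", word)
        # ...then pass the original granule along unchanged.
        yield granule


output = list(disassemble_paragraphs([Granule("paragraph", "the quick fox")]))
output_kinds = [g.kind for g in output]
```

With this input, `output_kinds` is three `"word"` entries followed by one `"paragraph"` entry, so the paragraph granule remains available to adapters further down the data flow.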

An assembler further down the data flow will often largely ignore the contents of such a paragraph granule. Instead, it will collect the word granules and wait for the paragraph granule solely as a terminating trigger that signals the series of word granules is complete.
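
To make the interplay concrete, here is a minimal sketch that chains a disassembler and an assembler, with the paragraph granule acting purely as the terminating trigger. All names (`Granule`, `disassemble`, `assemble`, `trigger_kind`) are hypothetical and do not reflect the actual Crawler API. Dropping the trigger granule after use is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Granule:
    """Hypothetical stand-in for a granule: a kind plus some text."""
    kind: str
    text: str = ""


def disassemble(granules: Iterable[Granule]) -> Iterator[Granule]:
    """Emit the words of each paragraph, followed by the paragraph itself."""
    for granule in granules:
        if granule.kind == "paragraph":
            for word in granule.text.split():
                yield Granule("word", word)
        yield granule


def assemble(
    granules: Iterable[Granule], trigger_kind: str = "paragraph"
) -> Iterator[Granule]:
    """Collect words; use the trigger granule only as an end-of-series marker."""
    words: List[str] = []
    for granule in granules:
        if granule.kind == "word":
            words.append(granule.text)
        elif granule.kind == trigger_kind:
            # The trigger's own contents are never read; only its arrival matters.
            if words:
                yield Granule("word group", " ".join(words))
                words = []


source = [Granule("paragraph", "one two"), Granule("paragraph", "three")]
groups = [g.text for g in assemble(disassemble(source))]
```

Each paragraph in `source` ends up as one 'word group' in `groups`, even though the assembler never inspects the paragraph text.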