Assembler

From DocDataFlow
Revision as of 02:47, 28 December 2013 by Kris (Talk | contribs)

An assembler is an atomic adapter.

Assemblers accept granules via their input and use them to construct larger granules. Typically, an assembler relies on the presence of certain 'trigger granules' in the input stream to decide when a granule under construction is complete and ready to be released via the assembler's output.

For example, an assembler could collect 'word' granules and string them together into a new 'word group' granule.

As granules arrive, the assembler needs to know when the 'word group' under construction is complete. The presence in the input stream of some other type of granule (e.g. a 'text frame' granule) will typically be the trigger to release the newly constructed 'word group' granule and get ready to construct the next one.
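The trigger mechanism described above can be sketched in a few lines of Python. This is only an illustration and not the Crawler API: the `Granule` class, the `assemble_word_groups` function, and the granule kind strings are hypothetical names chosen for this example. The sketch also assumes that granules other than words and triggers are dropped, and that a trigger arriving with no collected words releases nothing; the page does not specify either behaviour.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Granule:
    """Hypothetical granule: a kind label plus an optional payload."""
    kind: str              # e.g. "word", "text frame", "word group"
    value: object = None


def assemble_word_groups(granules: Iterable[Granule]) -> Iterator[Granule]:
    """Collect 'word' granules; release a 'word group' on each 'text frame'."""
    pending: List[object] = []
    for granule in granules:
        if granule.kind == "word":
            # Keep building the word group under construction.
            pending.append(granule.value)
        elif granule.kind == "text frame":
            # Trigger granule: release the finished group, start a new one.
            if pending:
                yield Granule("word group", pending)
                pending = []
        # Assumption: any other granule kind is ignored in this sketch.


stream = [
    Granule("word", "hello"),
    Granule("word", "world"),
    Granule("text frame"),
    Granule("word", "again"),
    Granule("text frame"),
]
groups = [g.value for g in assemble_word_groups(stream)]
```

With the sample stream above, `groups` holds two word groups, one per 'text frame' trigger.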

In a typical Crawler workflow, the larger granules that are broken apart by disassemblers are not stripped away and remain part of the data flow. For example, when a disassembler breaks apart a 'paragraph' granule into a series of 'word' granules, the output of the disassembler will typically consist of a stream of word granules, followed by the original paragraph granule from which the word granules were extracted.

An assembler further down the track will often ignore the contents of such a paragraph granule. Instead, it collects the word granules and waits for the paragraph granule solely as a terminating trigger that signals the series of word granules is complete.
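The interplay between a disassembler and a downstream assembler might look roughly like the following Python sketch. The function names, the `(kind, payload)` tuple representation, and the 'word list' granule kind are assumptions made for illustration; they are not taken from Crawler itself. Splitting on whitespace stands in for whatever word extraction a real disassembler performs.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical granule representation: (kind, payload).
Granule = Tuple[str, object]


def disassemble_paragraphs(granules: Iterable[Granule]) -> Iterator[Granule]:
    """Emit the words of each paragraph, then the original paragraph itself."""
    for kind, payload in granules:
        if kind == "paragraph":
            for word in str(payload).split():
                yield ("word", word)
        # The original granule is not stripped away; it follows its words.
        yield (kind, payload)


def assemble_word_lists(granules: Iterable[Granule]) -> Iterator[Granule]:
    """Collect words; use the trailing paragraph purely as a terminator."""
    words: List[object] = []
    for kind, payload in granules:
        if kind == "word":
            words.append(payload)
        elif kind == "paragraph":
            # The paragraph's contents are ignored; only its arrival matters.
            yield ("word list", words)
            words = []


source = [("paragraph", "the quick fox"), ("paragraph", "jumps")]
flow = list(disassemble_paragraphs(source))
result = list(assemble_word_lists(disassemble_paragraphs(source)))
```

Here `flow` shows each paragraph granule arriving after its own word granules, and `result` contains one 'word list' granule per paragraph.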