Difference between revisions of "Configuration File"

From DocDataFlow
= Configuration File =

Crawler personalities are often configured by means of configuration files.

There is no hard rule that dictates what format these configuration files should use. They could be text files, [[INI file|''INI files'']], XML-based files, and so on.

Because [[INI file|''INI files'']] are easy for end users to understand, most of the pre-made personalities use INI-based configuration files.

For the more popular personalities, GUI-driven configuration tools might be provided, but in many cases GUI development lags behind the functionality. In that case, the next easiest way to (re)configure a Crawler personality is to edit one or more configuration files.

== INI file ==

Basic INI files are a loosely defined de facto standard; more information can be found [http://en.wikipedia.org/wiki/INI_file ''here''].

== Basic properties ==

The Crawler INI files have the following properties:
* Section and entry names are case-insensitive by default (Crawler also has built-in support for case-sensitive INI files, should the need arise).
* Comment lines are supported. Prefixing a line with a '#' or a ';' makes it a comment line. In-line comments are not supported: a line is either entirely a comment or entirely data. For example:
<pre>
# This is a comment line
entry = test # test
</pre>
This sets ''entry'' to ''"test # test"''. The trailing ''# test'' is not treated as a comment.

* Blank lines are allowed (and ignored).
* If an INI file has entries that are not in any section, those entries are assumed to be in a default section ''[main]''.
* Duplicate names are allowed, and provide an 'override' mechanism. If an entry appears twice, the second appearance 'wins'.
* Entry values can be enclosed in double quotes ("), in which case backslashes are used as an escape character as defined in JavaScript. If no double quotes are present, backslashes are not interpreted as escapes, and leading and trailing spaces are removed. The following entries are all equivalent:
<pre>
data =    my data
data = "my data"
data=my data
data="my\x20data"
data="my\u0020data"
</pre>
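
The properties listed above can be illustrated with a small reader. The following Python code is only a sketch written for this page, not Crawler's actual parser: the names ''parse_ini'' and ''_unescape'' are hypothetical, only a subset of the JavaScript escape sequences is handled (no surrogate pairs, no ''\u{...}'' form), and two behaviours are assumptions that the text above does not confirm, namely that leading whitespace before a '#' or ';' is ignored and that lines without an '=' are skipped.

```python
import re

# Single-character escapes that JavaScript string literals define.
_SIMPLE = {"n": "\n", "t": "\t", "r": "\r", "b": "\b", "f": "\f", "v": "\v", "0": "\0"}
# A backslash followed by \xHH, \uHHHH, or any single character.
_ESCAPE = re.compile(r"\\(x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|.)", re.DOTALL)


def _unescape(text):
    """Decode a subset of JavaScript-style backslash escapes."""
    def repl(match):
        body = match.group(1)
        if body[0] in "xu" and len(body) > 1:
            return chr(int(body[1:], 16))   # \x20 or \u0020 style
        return _SIMPLE.get(body, body)      # \n, \t, ... or the char itself
    return _ESCAPE.sub(repl, text)


def parse_ini(text, case_sensitive=False):
    """Parse INI text into a dict of {section: {entry: value}}."""
    norm = (lambda s: s) if case_sensitive else str.lower
    sections = {}
    current = norm("main")                  # default section for loose entries
    for raw in text.splitlines():
        line = raw.strip()                  # assumption: leading spaces ignored
        if not line or line[0] in "#;":
            continue                        # blank line or comment line
        if line.startswith("[") and line.endswith("]"):
            current = norm(line[1:-1].strip())
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue                        # assumption: skip lines without '='
        value = value.strip()               # a trailing "# ..." stays in the value
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = _unescape(value[1:-1])  # escapes only apply inside quotes
        # A later duplicate simply overwrites the earlier entry.
        sections.setdefault(current, {})[norm(name.strip())] = value
    return sections


if __name__ == "__main__":
    sample = r'''
# This is a comment line
entry = test # test
data = first
data = "my\x20data"
'''
    print(parse_ini(sample))
```

With this sketch, all five ''data'' lines from the example above produce the value ''my data'', and ''entry = test # test'' keeps the trailing ''# test'' as part of the value.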

== Enhancements ==

Crawler INI files have a few Crawler-specific enhancements.

* Parent-child files. In a number of Crawler personalities, INI files are arranged in a parent-child relationship.
** It is possible to derive a new personality from an existing personality.
** Some personalities have nested folder structures where INI files in the 'inner' folders use the INI files in the outer folders as parent files.
When two INI files have a parent-child relationship, the child file 'inherits' all the contents of the parent INI file. The child INI file can then either override certain entries in the parent INI file (by repeating the same section and entry, and providing a different value), or perform string concatenation.
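
The override half of this inheritance can be sketched as a merge of two already-parsed files. The Python code below is a minimal illustration, not Crawler's implementation: the functions ''inherit'' and ''resolve_chain'' are hypothetical names, each file is assumed to be represented as a dict of {section: {entry: value}}, and string concatenation is left out because its syntax is not described here.

```python
from functools import reduce


def inherit(parent, child):
    """Return the child's view: parent entries plus the child's overrides."""
    # Copy the parent so that neither input is modified.
    merged = {section: dict(entries) for section, entries in parent.items()}
    for section, entries in child.items():
        # Same section and entry name: the child's value replaces the parent's.
        merged.setdefault(section, {}).update(entries)
    return merged


def resolve_chain(files):
    """Merge a list of parsed files ordered from outermost to innermost."""
    return reduce(inherit, files, {})


if __name__ == "__main__":
    # Hypothetical section and entry names, for illustration only.
    outer = {"main": {"depth": "3", "name": "base"}}
    inner = {"main": {"depth": "5"}, "extra": {"flag": "1"}}
    print(resolve_chain([outer, inner]))
```

For nested folders, the list passed to ''resolve_chain'' would start with the file in the outermost folder and end with the file in the innermost one, so that inner files win.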

Latest revision as of 00:22, 28 December 2013
