
Crawler.ID2MD is a Crawler-based software product which provides InDesign to Markdown export. Crawler.ID2MD is a temporary name; we will eventually come up with some better name.

The Markdown syntax definitions can be found here:

A .zip file with some sample documents and the Markdown conversions can be downloaded here:


An early product preview can be made available upon request. Email [email protected] if you want to test it out.

To install, first decompress the .zip archive. This gives you a folder called Crawler.ID2MD, which contains a file called Export.jsxbin.

Export.jsxbin is the script you'll need to run to activate Crawler.ID2MD.


Open an InDesign document, and double-click Export.jsxbin on the Scripts panel.

A new file with the same name as the original document should appear. The file name extension of the new file is .md by default. If your InDesign document is called MyDocument.indd, you should see MyDocument.md appear next to it.

If the InDesign document has graphics in it, these will be exported into a separate folder called images (by default).


Crawler.ID2MD can be configured through a number of configuration files.


There are two files named config.ini. One resides next to Export.jsxbin and configures the Crawler system; you will rarely, if ever, need to change this file.

The second config.ini resides in the Personalities/Markdown subfolder and configures the Markdown features; this file will be where most configuration will be done.

These config.ini files can be opened with a standard text editor (e.g. TextWrangler on Macintosh or Notepad++ on Windows).

One trick to quickly navigate to the config.ini file is to disclose it on the Scripts panel, right-click or <Control>-click it, and select 'Reveal in Finder' or 'Reveal in Explorer'.


Once you see the file, use a text editor to open it.

The most relevant configurations are described further down.


The folder Personalities/Markdown/markdownSnippets contains a number of template files that are used to generate the Markdown output.

These files can all be opened with a standard text editor.

For example, the top-level snippet contains:

[//]: # (Document markdown file $$MARKDOWN_FILENAME$$ generated by $$SOFTWARE$$ $$VERSION$$)

Any text between two $$ is a placeholder which will be replaced by some calculated strings.
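As a rough illustration of the placeholder mechanism, the following sketch replaces every $$NAME$$ in a template with a value from a lookup table. The function name fillPlaceholders, the sample values and the choice to leave unknown placeholders untouched are all assumptions made for this example; Crawler's own substitution code is not shown on this page and may behave differently.

```javascript
// Hypothetical sketch: replace $$NAME$$ placeholders with values from a table.
// Unknown placeholders are left untouched (an assumption, not documented behavior).
function fillPlaceholders(template, values) {
    return template.replace(/\$\$([A-Z0-9_]+)\$\$/g, function (match, name) {
        return values.hasOwnProperty(name) ? String(values[name]) : match;
    });
}

var template = "[//]: # (Document markdown file $$MARKDOWN_FILENAME$$ generated by $$SOFTWARE$$ $$VERSION$$)";
var filled = fillPlaceholders(template, {
    MARKDOWN_FILENAME: "MyDocument.md",
    SOFTWARE: "Crawler.ID2MD",
    VERSION: "0.0.1"   // made-up version number
});
```

Here filled holds the comment line with all three placeholders substituted.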

The various snippets represent different concentric layers of complexity. At the innermost level is the snippet that formats individual style runs from the original document.

The next snippet out takes the collated (concatenated) output of the style-run snippet.

The output of each type of snippet is concatenated and 'handed up' to the next level of snippet until, eventually, the output is passed through the top-level snippet, where it gets its final shape.
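The layering can be pictured with a small sketch in which each level collates the output of the level below it and hands the result up. The function names, the data shapes and the fixed strings are made up for illustration; the real snippets are template files, not functions.

```javascript
// Hypothetical sketch of the layering. Data shapes and fixed strings are invented.
function renderRun(run) {
    return run.prefix + run.text + run.suffix;            // innermost level: one style run
}

function renderParagraph(runs) {
    var collated = "";
    for (var i = 0; i < runs.length; i++) {
        collated += renderRun(runs[i]);                   // collate the style runs
    }
    return collated + "\n\n";                             // handed up to the frame level
}

function renderTextFrame(paragraphs) {
    var collated = "";
    for (var i = 0; i < paragraphs.length; i++) {
        collated += renderParagraph(paragraphs[i]);       // collate the paragraphs
    }
    return collated;                                      // handed up to the top level
}

function renderDocument(frames) {
    var collated = "";
    for (var i = 0; i < frames.length; i++) {
        collated += renderTextFrame(frames[i]);           // collate the frames
    }
    return "[//]: # (header comment)\n\n" + collated;     // top level adds the final shape
}

var markdown = renderDocument([
    [[{ prefix: "**", text: "Hello", suffix: "**" }, { prefix: "", text: " world", suffix: "" }]]
]);
```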


The expressions $$TEXT_RUN_STYLE_FORMAT_PREFIX$$ and $$TEXT_RUN_STYLE_FORMAT_SUFFIX$$ are normally replaced with nothing, except when the style run is bold, italic or bold italic.

Bold style runs are prefixed and suffixed with **, italic style runs are prefixed and suffixed with _, and bold italic style runs are prefixed with _** and suffixed with **_.
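A minimal sketch of the prefix and suffix choice, assuming the text has already been escaped. The function formatRun and its boolean parameters are hypothetical names used only for this example.

```javascript
// Hypothetical sketch: wrap an (already escaped) style run in Markdown emphasis markers.
function formatRun(escapedText, isBold, isItalic) {
    var prefix = "";
    var suffix = "";
    if (isBold && isItalic) {
        prefix = "_**";      // bold italic: underscore outside, asterisks inside
        suffix = "**_";
    } else if (isBold) {
        prefix = "**";
        suffix = "**";
    } else if (isItalic) {
        prefix = "_";
        suffix = "_";
    }
    return prefix + escapedText + suffix;
}

var plain = formatRun("text", false, false);
var bold = formatRun("text", true, false);
var italic = formatRun("text", false, true);
var boldItalic = formatRun("text", true, true);
```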

$$INPUT_TEXT$$ is replaced by the text of the style run extracted from the original document. This text will also have some special characters escaped, e.g. ! is converted to \!, # is converted to \#, and so on, so that these characters are not mistaken for Markdown syntax.


This snippet is 'one level up' from the previous one. $$LINE_BROKEN_INPUT_TEXT$$ is replaced by the collated text of all the style runs in the paragraph, after which 'soft returns' are prefixed with two extra spaces, so Markdown preserves soft line breaks.

$$LINE_BROKEN_INPUT_TEXT$$ is calculated by a formula based on $$INPUT_TEXT$$, where $$INPUT_TEXT$$ is simply the concatenated/collated text of all text runs in the paragraph. The formula can be found in Personalities/Markdown/formulas/paragraph.jsx.snippet:

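// ****************

    var retVal = null;
    do {
        var inputText = $$INPUT_TEXT$$;
        if (! inputText || inputText == "") {
            break;
        }

        // Strip leading whitespace from the paragraph
        retVal = inputText.replace(/^\s+/,"");
        // Strip whitespace that follows a line break
        retVal = retVal.replace(/([\n\r])\s+/g,"$1");
        // Put two spaces before each line break (Markdown forced line break)
        retVal = retVal.replace(/\s*([\n\r])/g,"  $1");
    }
    while (false);

    return retVal;

// ****************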
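To see what the three replacements do, here is the same logic as a plain JavaScript function, with the $$INPUT_TEXT$$ placeholder turned into an ordinary parameter. The function name lineBreakParagraph and the sample text are made up for this example.

```javascript
// Sketch: the paragraph formula as a plain function (the name is hypothetical).
function lineBreakParagraph(inputText) {
    if (!inputText) {
        return null;                                      // nothing to format
    }
    var retVal = inputText.replace(/^\s+/, "");           // strip leading whitespace
    retVal = retVal.replace(/([\n\r])\s+/g, "$1");        // strip whitespace after a line break
    retVal = retVal.replace(/\s*([\n\r])/g, "  $1");      // exactly two spaces before a line break
    return retVal;
}

var sample = lineBreakParagraph("  First line \n   second line");
```

In this sample, the stray whitespace around the soft return is reduced to exactly two spaces before the line break, which Markdown renders as a forced line break.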


This snippet is again 'one level out' from the previous snippet. $$INPUT_TEXT$$ is replaced by the collated text of all the paragraphs in the text frame. This particular snippet boils down to a 'do nothing': simply pass on the data received by collating the paragraphs.


This snippet is on the same level - where the previous snippet is used for text frames, this one is used for graphical frames.

$$FRAME_PREFIX$$ and $$FRAME_SUFFIX$$ are replaced by nothing for anchored and inline graphics, so the graphic is displayed in-line in Markdown.

For 'floating' frames, these are instead replaced by a few extra newlines, to separate the graphic from the text.

$$FRAME_IMAGE_NAME$$ is replaced by the name of the image, and $$FRAME_IMAGE_PATH$$ becomes the relative path of the exported graphic.
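The content of the graphic frame snippet is not reproduced on this page. Assuming it uses the standard Markdown image syntax, the combination of the four placeholders could look roughly like the sketch below. The function name, the image syntax and the exact newlines used for floating frames are all assumptions.

```javascript
// Hypothetical sketch: the image syntax and the "\n\n" wrapper are assumptions.
function formatGraphicFrame(imageName, imagePath, isInlineOrAnchored) {
    // Inline and anchored graphics get no wrapper; floating frames get extra newlines.
    var wrapper = isInlineOrAnchored ? "" : "\n\n";
    return wrapper + "![" + imageName + "](" + imagePath + ")" + wrapper;
}

var inline = formatGraphicFrame("logo", "images/logo.png", true);
var floating = formatGraphicFrame("logo", "images/logo.png", false);
```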

This is the 'top level' snippet.

[//]: # (Document markdown file $$MARKDOWN_FILENAME$$ generated by $$SOFTWARE$$ $$VERSION$$)

$$INPUT_TEXT$$ is replaced by the collation of all lower-level snippets.

The first line in this snippet is a Markdown comment line with some meta-info about the document.

Constants and Formulas

In the snippets, there are a lot of references to placeholders between two $$.

Many of these placeholders are calculated automatically by Crawler, but it is also possible to define custom placeholders, either as configuration constants or as calculated formulas.


Any entry made in the [appContextData] section of config.ini becomes a placeholder. For example, the entry MARKDOWN_FILENAME_EXTENSION causes a placeholder $$MARKDOWN_FILENAME_EXTENSION$$ to become available for use in the snippets.

If you were to add a new line, for example

 AUTHOR_NAME = "John Doe"

you could start using a placeholder $$AUTHOR_NAME$$ in the snippets.
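The idea of turning config entries into placeholders can be sketched as follows. This is not Crawler's actual config reader: the function readSection is hypothetical, and details such as stripping the surrounding quotes from the value are assumptions.

```javascript
// Hypothetical sketch: collect the key = value entries of one INI section.
function readSection(iniText, sectionName) {
    var result = {};
    var lines = iniText.split(/\r?\n/);
    var inSection = false;
    for (var i = 0; i < lines.length; i++) {
        var line = lines[i].replace(/^\s+|\s+$/g, "");          // trim the line
        var header = /^\[(.+)\]$/.exec(line);
        if (header) {
            inSection = (header[1] === sectionName);            // entering a new section
            continue;
        }
        var entry = /^([^=;#]+?)\s*=\s*(.*)$/.exec(line);
        if (inSection && entry) {
            // Assumption: surrounding double quotes are not part of the value.
            result[entry[1]] = entry[2].replace(/^"(.*)"$/, "$1");
        }
    }
    return result;
}

var iniText = '[other]\nX = 1\n[appContextData]\nAUTHOR_NAME = "John Doe"\n';
var constants = readSection(iniText, "appContextData");
```

Each key in constants would then be available as a placeholder of the same name, e.g. $$AUTHOR_NAME$$.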


Formulas are small bits of ExtendScript (JavaScript)-like code which express how a certain placeholder is calculated.

The code is not 'pure' ExtendScript - there is some pre-processing to handle placeholders in the scripting code.

These files are stored in Personalities/Markdown/formulas.

For example, the $$FRAME_PREFIX$$ placeholder in the graphic frame snippet is calculated by a formula in graphicframe.jsx.snippet:

// ****************

    var retVal = null;
    do {
        var granule = $$RAW_GRANULE$$;
        if (! (granule instanceof G.FrameGranule)) {
            break;
        }

        var frame = granule.getData();
        if (frame.parent instanceof Character) {
            // Inline or anchored frame: no prefix
            retVal = "";
        }
        else {
            // Floating frame: forced line break
            retVal = "  \n";
        }
    }
    while (false);
    return retVal;

I won't go into too much detail, but essentially this expresses that if the graphic frame has a Character as its 'parent' in InDesign (which means it is inline or anchored in text), then the return value is "" (nothing). In all other cases, the return value is "  \n" (two spaces and a newline, which is a forced line break in Markdown).

Another useful example can be found in files.jsx.snippet:

// ****************

    return new Date().toString();

// ****************

This defines a placeholder $$TODAYS_DATE$$ which can be used to insert the current date. For example, you could change the top-level snippet from

[//]: # (Document markdown file $$MARKDOWN_FILENAME$$ generated by $$SOFTWARE$$ $$VERSION$$)

to

[//]: # (Document markdown file $$MARKDOWN_FILENAME$$ generated on $$TODAYS_DATE$$ by $$SOFTWARE$$ $$VERSION$$)

and then the Markdown files would contain the conversion date in a comment at the beginning.

Handling the raw text

One formula in particular is of interest. In the run.jsx.snippet you'll find the following formula:

// ****************

$$RAW_TEXT$$ =
    var retVal = undefined;
    do {
        retVal = $$RAW_TEXT$$;
        if (! retVal || retVal == "") {
            break;
        }

        // Escape characters that Markdown would otherwise treat as syntax
        retVal = retVal.replace(/#/g,"\\#");
        retVal = retVal.replace(/\*/g,"\\*");
        retVal = retVal.replace(/_/g,"\\_");
        retVal = retVal.replace(/!/g,"\\!");
    }
    while (false);

    return retVal;

// ****************

This formula is responsible for escaping special characters in the input. If additional characters need to be escaped, this is the place to do it.
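As an example of extending the escaping, the sketch below handles the four characters from the formula plus a few more in a single pass. The extra characters (backslash, backtick and square brackets) are an illustrative choice, not a recommendation from Crawler, and the function name escapeRawText is hypothetical.

```javascript
// Hypothetical sketch: escape the original four characters (# * _ !) plus
// backslash, backtick and square brackets, which are extras chosen for illustration.
function escapeRawText(rawText) {
    if (!rawText) {
        return rawText;                                   // empty input is passed through
    }
    // One pass over the text, so the backslashes that get inserted are not escaped again.
    return rawText.replace(/([\\`\[\]#*_!])/g, "\\$1");
}

var escaped = escapeRawText("# Title [x]!");
```

Doing all replacements in one regular expression matters once the backslash itself is on the list: a chain of separate replace calls would have to escape the backslash first, or it would escape its own output.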

Useful config.ini settings


This setting sets the file name extension to use for the output files (default: md).


This is the name of the subfolder to use for storing the exported graphic frames into (default: images).


headingStylesLevel<n>, where <n> is 1 up to 6. This setting is a comma-separated list of paragraph style names that must be converted to a heading of level <n> (defaults: headingStylesLevel1 is Heading, Title; headingStylesLevel2 to headingStylesLevel6 are empty).
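A small sketch of how such comma-separated lists could be matched against a paragraph style name. The function name and the exact, case-sensitive comparison are assumptions; Crawler's real matching rules are not described here.

```javascript
// Hypothetical sketch: headingStyles[0] holds the headingStylesLevel1 setting, and so on.
function headingLevelForStyle(styleName, headingStyles) {
    for (var level = 1; level <= headingStyles.length; level++) {
        var names = headingStyles[level - 1].split(",");
        for (var i = 0; i < names.length; i++) {
            var name = names[i].replace(/^\s+|\s+$/g, "");   // trim around the commas
            if (name !== "" && name === styleName) {
                return level;
            }
        }
    }
    return 0;   // 0 means "not a heading style" in this sketch
}

var defaults = ["Heading, Title", "", "", "", "", ""];
var titleLevel = headingLevelForStyle("Title", defaults);
var bodyLevel = headingLevelForStyle("Body", defaults);
```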


minPointSizeLevel<n>, where <n> is 1 up to 6. This is the point size from which text will be considered to be of the corresponding heading level (defaults: minPointSizeLevel1 is 18; minPointSizeLevel2 to minPointSizeLevel6 are empty).

For example, if minPointSizeLevel1 is 18, then any paragraph that starts with a glyph that is 18pt or larger will be considered to be a first-level heading.

These can be left empty. When defining multiple levels, it should be true that minPointSizeLevel1 > minPointSizeLevel2 > ... > minPointSizeLevel6.

In other words, if any, minPointSizeLevel6 has to be the smallest value, and minPointSizeLevel1 has to be the largest value.
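The ordering requirement is easier to see in code. In the sketch below the levels are checked from 1 to 6 and the first threshold that is met wins, which is why the values must decrease as the level goes up. The function name, the use of null for empty settings and the 14pt value for level 2 are made up for this example.

```javascript
// Hypothetical sketch: thresholds[0] is minPointSizeLevel1, ..., thresholds[5] is minPointSizeLevel6.
// Empty settings are represented as null and skipped.
function headingLevelForPointSize(pointSize, thresholds) {
    for (var level = 1; level <= thresholds.length; level++) {
        var min = thresholds[level - 1];
        if (min !== null && pointSize >= min) {
            return level;                                 // first threshold that is met wins
        }
    }
    return 0;                                             // 0 means "not a heading" in this sketch
}

var thresholds = [18, 14, null, null, null, null];        // 14 is a made-up example value
var big = headingLevelForPointSize(24, thresholds);
var medium = headingLevelForPointSize(14, thresholds);
var small = headingLevelForPointSize(12, thresholds);
```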


The names of any paragraph styles that will be converted to blockquotes (i.e. prefixed with '> ') (default: nothing).


The image resolution to use for exporting any graphic frames (default: 72).


This is either PNG or JPEG. It tells Crawler what image format to use for the graphic frames (default: PNG).

PNG is only supported in CS6 and above.


This setting is a list of font style names separated with | characters. The listed font style names will be considered to be bold (default: bold|heavy|black).

This translates to prefixing and suffixing the text with two asterisks: **.


This setting is a list of font style names separated with | characters. The listed font style names will be considered to be italic (default: italic|oblique|slanted).

This translates to prefixing and suffixing the text with an underscore: _.
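The | separator suggests that these lists may be used as regular expression alternations. Under that assumption, which is not confirmed on this page, a font style name could be classified as in the sketch below. The function name and the case-insensitive substring match are hypothetical.

```javascript
// Hypothetical sketch: treat the | separated setting as a case-insensitive
// regular expression alternation and test it against the font style name.
function matchesStyleList(fontStyleName, setting) {
    if (!setting) {
        return false;                                     // an empty setting matches nothing
    }
    return new RegExp("(" + setting + ")", "i").test(fontStyleName);
}

var boldSetting = "bold|heavy|black";
var italicSetting = "italic|oblique|slanted";
var isBold = matchesStyleList("Bold Italic", boldSetting);
var isItalic = matchesStyleList("Bold Italic", italicSetting);
```

With a substring match like this one, a font style called Semibold would also count as bold; whether Crawler behaves that way is not stated here.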


This setting is a word you can assign to individual frames in the document via the Script Label panel (default: nothing).

By entering this word onto the Script Label panel you can force selected text frames to be exported as bitmaps instead of as text.


This setting is a word you can assign to individual frames in the document via the Script Label panel (default: ignore).

By entering this word onto the Script Label panel you can force the selected page item to be omitted from the output file.


This setting is either 0 or 1 (default: 0).

Setting it to 1 will force Crawler to omit any frames that are on invisible layers.


This setting is a comma-separated list of layer names. It can be left empty (default: nothing).

Any items on a layer named here will be suppressed from the output.
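The three ways of omitting items described above (the 'ignore' script label, invisible layers and suppressed layer names) can be combined into a single check. The sketch below is purely illustrative: the item fields and the setting names ignoreLabel, omitInvisibleLayers and suppressedLayers are invented for this example and are not Crawler's real data model or config keys.

```javascript
// Hypothetical sketch. The item fields and the setting names are invented for illustration.
function shouldOmit(item, settings) {
    if (settings.ignoreLabel && item.scriptLabel === settings.ignoreLabel) {
        return true;                                      // labeled with the 'ignore' word
    }
    if (settings.omitInvisibleLayers === 1 && !item.layerVisible) {
        return true;                                      // item sits on an invisible layer
    }
    var names = settings.suppressedLayers ? settings.suppressedLayers.split(",") : [];
    for (var i = 0; i < names.length; i++) {
        if (names[i].replace(/^\s+|\s+$/g, "") === item.layerName) {
            return true;                                  // layer is listed as suppressed
        }
    }
    return false;
}

var settings = { ignoreLabel: "ignore", omitInvisibleLayers: 1, suppressedLayers: "Notes, Guides" };
var kept = shouldOmit({ scriptLabel: "", layerVisible: true, layerName: "Layer 1" }, settings);
var dropped = shouldOmit({ scriptLabel: "", layerVisible: true, layerName: "Guides" }, settings);
```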


This setting is either 0 or 1 (default: 0). Setting it to 1 will force Crawler to include master page items in the export.


This setting is either 0 or 1 (default: 0). If it is 0, the export will be done on a story-by-story basis. Setting it to 1 forces Crawler to export on a textframe-by-textframe basis.


To install Crawler.ID2MD, you first need to launch InDesign, and find the Scripts panel. If it is not visible, you can make it appear by means of the Window - Utilities - Scripts menu item.

Once the Scripts panel is visible, right-click or <Control>-click the User entry and select 'Reveal in Finder' (on Mac) or 'Reveal in Explorer' (on Windows).


A window on a folder called Scripts should open.

Inside there should be a folder called Scripts Panel. Double-click its icon to enter it.


Once you're inside the Scripts Panel folder, you can drag the Crawler.ID2MD folder into it.


Now switch back to InDesign, and verify that the Crawler.ID2MD folder has appeared under the User folder on the Scripts panel:


Click the disclosure triangle - you should now see Export.jsxbin.