Writing Extensions for Python-Markdown
======================================

Overview
--------

Python-Markdown includes an API for extension writers to plug their own
custom functionality and/or syntax into the parser. There are preprocessors
which allow you to alter the source before it is passed to the parser,
inline patterns which allow you to add, remove or override the syntax of
any inline elements, and postprocessors which allow munging of the
output of the parser before it is returned. If you really want to dive in,
there are also blockprocessors which are part of the core BlockParser.

As the parser builds an [ElementTree][] object which is later rendered
as Unicode text, there are also some helpers provided to ease manipulation of
the tree. Each part of the API is discussed in its respective section below.
Additionally, reading the source of some [[Available Extensions]] may be helpful.
For example, the [[Footnotes]] extension uses most of the features documented
here.

* [Preprocessors][]
* [InlinePatterns][]
* [Treeprocessors][]
* [Postprocessors][]
* [BlockParser][]
* [Working with the ElementTree][]
* [Integrating your code into Markdown][]
    * [extendMarkdown][]
    * [OrderedDict][]
    * [registerExtension][]
    * [Config Settings][]
    * [makeExtension][]

<h3 id="preprocessors">Preprocessors</h3>

Preprocessors munge the source text before it is passed into the Markdown
core. This is an excellent place to clean up bad syntax, extract things the
parser may otherwise choke on and perhaps even store them for later retrieval.

Preprocessors should inherit from ``markdown.preprocessors.Preprocessor`` and
implement a ``run`` method with one argument ``lines``. The ``run`` method of
each Preprocessor will be passed the entire source text as a list of Unicode
strings. Each string will contain one line of text. The ``run`` method should
return either that list, or an altered list of Unicode strings.

A pseudo example:

    class MyPreprocessor(markdown.preprocessors.Preprocessor):
        def run(self, lines):
            new_lines = []
            for line in lines:
                m = MYREGEX.match(line)
                if m:
                    # do stuff with the match (this line is not kept)
                    pass
                else:
                    new_lines.append(line)
            return new_lines

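As a more concrete sketch of the ``run`` contract described above, the
following class drops every line that starts with a ``%%`` comment marker.
The marker, the regular expression and the class name are all hypothetical,
chosen only for illustration. So that the sketch runs without Markdown
installed, the class stands alone; in a real extension it would inherit from
``markdown.preprocessors.Preprocessor``.

```python
import re

# Hypothetical marker: lines beginning with "%%" are treated as comments.
COMMENT_RE = re.compile(r'^%%')


class CommentStripper(object):
    """Stand-in for a Preprocessor subclass; only ``run`` matters here."""

    def run(self, lines):
        new_lines = []
        for line in lines:
            if COMMENT_RE.match(line):
                # Drop the comment line from the source.
                continue
            new_lines.append(line)
        return new_lines


source = ["# Title", "%% note to self", "Some *text*."]
cleaned = CommentStripper().run(source)
```

Note that ``run`` receives and returns a list of lines, not a single string.
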
<h3 id="inlinepatterns">Inline Patterns</h3>

Inline Patterns implement the inline HTML element syntax for Markdown such as
``*emphasis*`` or ``[links](http://example.com)``. Pattern objects should be
instances of classes that inherit from ``markdown.inlinepatterns.Pattern`` or
one of its children. Each pattern object uses a single regular expression and
must have the following methods:

* **``getCompiledRegExp()``**:

    Returns a compiled regular expression.

* **``handleMatch(m)``**:

    Accepts a match object and returns an ElementTree element or a plain
    Unicode string.

Note that any regular expression returned by ``getCompiledRegExp`` must capture
the whole block. Therefore, they should all start with ``r'^(.*?)'`` and end
with ``r'(.*?)$'``. When using the default ``getCompiledRegExp()`` method
provided in the ``Pattern`` class, you can pass in a regular expression without
that wrapping and ``getCompiledRegExp`` will wrap your expression for you. This
means that the first group of your match will be ``m.group(2)``, as
``m.group(1)`` will match everything before the pattern.

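The group numbering is easier to see with the wrapping applied by hand. The
sketch below assumes the wrapper is exactly ``^(.*?)``, then your pattern, then
``(.*?)$``, as described above. The expression the library actually compiles
may differ in detail, so treat this as an illustration rather than the
library's own code.

```python
import re

# An oversimplified emphasis pattern containing a single group.
MYPATTERN = r'\*([^*]+)\*'

# Wrap the pattern by hand so that it captures the whole block.
wrapped = re.compile(r'^(.*?)%s(.*?)$' % MYPATTERN, re.DOTALL)

m = wrapped.match('some *emphasized* text')
before = m.group(1)   # everything before the pattern
inner = m.group(2)    # the first group of MYPATTERN itself
after = m.group(3)    # everything after the pattern
```

Every group in your own pattern is therefore shifted up by one when it is
read from the match object.
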
For an example, consider this simplified emphasis pattern:

    class EmphasisPattern(markdown.inlinepatterns.Pattern):
        def handleMatch(self, m):
            el = markdown.etree.Element('em')
            # group(1) is the text before the pattern, so the first
            # group of the pattern itself is group(2)
            el.text = m.group(2)
            return el

As discussed in [Integrating Your Code Into Markdown][], an instance of this
class will need to be provided to Markdown. That instance would be created
like so:

    # an oversimplified regex
    MYPATTERN = r'\*([^*]+)\*'
    # pass in pattern and create instance
    emphasis = EmphasisPattern(MYPATTERN)

Actually it would not be necessary to create that pattern (and not just because
a more sophisticated emphasis pattern already exists in Markdown). The fact is,
that example pattern is not very DRY. A pattern for `**strong**` text would
be almost identical, with the exception that it would create a 'strong' element.
Therefore, Markdown provides a number of generic pattern classes that can
provide some common functionality. For example, both emphasis and strong are
implemented with separate instances of the ``SimpleTagPattern`` listed below.
Feel free to use or extend any of these Pattern classes.

**Generic Pattern Classes**

* **``SimpleTextPattern(pattern)``**:

    Returns simple text of ``group(2)`` of a ``pattern``.

* **``SimpleTagPattern(pattern, tag)``**:

    Returns an element of type "`tag`" with a text attribute of ``group(3)``
    of a ``pattern``. ``tag`` should be a string naming an HTML element
    (e.g. 'em').

* **``SubstituteTagPattern(pattern, tag)``**:

    Returns an element of type "`tag`" with no children or text (e.g. 'br').

There may be other Pattern classes in the Markdown source that you could extend
or use as well. Read through the source and see if there is anything you can
use. You might even get a few ideas for different approaches to your specific
situation.

<h3 id="treeprocessors">Treeprocessors</h3>

Treeprocessors manipulate an ElementTree object after it has passed through the
core BlockParser. This is where additional manipulation of the tree takes
place. Additionally, the InlineProcessor is a Treeprocessor which steps through
the tree and runs the InlinePatterns on the text of each Element in the tree.

A Treeprocessor should inherit from ``markdown.treeprocessors.Treeprocessor``
and override the ``run`` method, which takes one argument ``root`` (an
ElementTree object) and returns either that root element or a modified root
element.

A pseudo example:

    class MyTreeprocessor(markdown.treeprocessors.Treeprocessor):
        def run(self, root):
            # do stuff to root (or build a replacement root element)
            return root

For specifics on manipulating the ElementTree, see
[Working with the ElementTree][] below.

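To make the ``run`` contract concrete, here is a small sketch that gives every
``h1`` element an ``id`` attribute derived from its text. The class name and
the slug rule are hypothetical, and the standard library's
``xml.etree.ElementTree`` is used in place of ``markdown.etree`` so that the
sketch runs on its own. In a real extension the class would inherit from
``markdown.treeprocessors.Treeprocessor``.

```python
import re
import xml.etree.ElementTree as etree


class HeaderIdTreeprocessor(object):
    """Stand-in for a Treeprocessor subclass; only ``run`` matters here."""

    def run(self, root):
        # iter() walks the whole tree, so nested headers are found too.
        for el in root.iter('h1'):
            text = ''.join(el.itertext())
            # Lowercase the text and collapse anything else into hyphens.
            slug = re.sub(r'[^a-z0-9]+', '-', text.lower()).strip('-')
            el.set('id', slug)
        return root


root = etree.fromstring('<div><h1>Hello World!</h1><p>Body</p></div>')
root = HeaderIdTreeprocessor().run(root)
h1_id = root.find('h1').get('id')
```

The tree is modified in place here, but ``run`` still returns the root element
as the interface requires.
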
<h3 id="postprocessors">Postprocessors</h3>

Postprocessors manipulate the document after the ElementTree has been
serialized into a string. Postprocessors should be used to work with the
text just before output.

A Postprocessor should inherit from ``markdown.postprocessors.Postprocessor``
and override the ``run`` method, which takes one argument ``text`` and returns
a Unicode string.

Because Postprocessors run on the serialized Unicode text, this may be an
appropriate place to add a table of contents to a document:

    class TocPostprocessor(markdown.postprocessors.Postprocessor):
        def run(self, text):
            # MYMARKERRE is a compiled regex matching the marker and
            # MyToc is the replacement (both defined elsewhere)
            return MYMARKERRE.sub(MyToc, text)

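A runnable variation on that idea is sketched below. The ``[TOC]`` marker, the
restriction to ``<h2>`` headers and both regular expressions are assumptions
made for the example, not part of Markdown itself, and the class stands alone
instead of inheriting from ``markdown.postprocessors.Postprocessor`` so that
it can be run without Markdown installed.

```python
import re

# Hypothetical marker that the author places where the TOC should go.
TOC_MARKER_RE = re.compile(r'<p>\[TOC\]</p>')
HEADER_RE = re.compile(r'<h2>(.*?)</h2>')


class TocPostprocessor(object):
    """Stand-in for a Postprocessor subclass; only ``run`` matters here."""

    def run(self, text):
        # Build a flat list from the text of every <h2> in the output.
        items = ''.join('<li>%s</li>' % t for t in HEADER_RE.findall(text))
        toc = '<ul>%s</ul>' % items
        # A function replacement avoids backslash processing of toc.
        return TOC_MARKER_RE.sub(lambda m: toc, text)


html = '<p>[TOC]</p>\n<h2>One</h2>\n<h2>Two</h2>'
result = TocPostprocessor().run(html)
```

Working on the serialized string is fragile compared to working on the tree,
so a Treeprocessor may be the better home for anything more elaborate.
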
<h3 id="blockparser">BlockParser</h3>

Sometimes, pre/tree/postprocessors and Inline Patterns aren't going to do what
you need. Perhaps you want a new type of block that needs to be integrated
into the core parsing. In such a situation, you can add/change/remove
functionality of the core ``BlockParser``. The BlockParser is composed of a
number of Blockprocessors. The BlockParser steps through each block of text
(split by blank lines) and passes each block to the appropriate Blockprocessor.
That Blockprocessor parses the block and adds it to the ElementTree. The
[[Definition Lists]] extension would be a good example of an extension that
adds/modifies Blockprocessors.

A Blockprocessor should inherit from ``markdown.blockprocessors.BlockProcessor``
and implement both the ``test`` and ``run`` methods.

The ``test`` method is used by BlockParser to identify the type of block.
Therefore the ``test`` method must return a boolean value. If the test returns
``True``, then the BlockParser will call that Blockprocessor's ``run`` method.
If it returns ``False``, the BlockParser will move on to the next
BlockProcessor.

The **``test``** method takes two arguments:

* **``parent``**: The parent etree Element of the block. This can be useful as
  the block may need to be treated differently if it is inside a list, for
  example.

* **``block``**: A string of the current block of text. The test may be a
  simple string method (such as ``block.startswith(some_text)``) or a complex
  regular expression.

The **``run``** method takes two arguments:

* **``parent``**: A pointer to the parent etree Element of the block. The run
  method will most likely attach additional nodes to this parent. Note that
  nothing is returned by the method. The ElementTree object is altered in place.

* **``blocks``**: A list of all remaining blocks of the document. Your run
  method must remove (pop) the first block from the list (which is altered in
  place, not returned) and parse that block. You may find that a block of text
  legitimately contains multiple block types. Therefore, after processing the
  first type, your processor can insert the remaining text into the beginning
  of the ``blocks`` list for future parsing.

Please be aware that a single block can span multiple text blocks. For example,
the official Markdown syntax rules state that a blank line does not end a
Code Block. If the next block of text is also indented, then it is part of
the previous block. The BlockParser was specifically designed to address
these types of situations. If you look at the ``CodeBlockProcessor`` in the
core, you will note that it checks the last child of the ``parent``.
If the last child is a code block (``<pre><code>...</code></pre>``), then it
appends that block to the previous code block rather than creating a new
code block.

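The ``test``/``run`` protocol described above can be sketched as follows. The
``!!! `` prefix, the class name and the generated ``<div class="note">`` are
hypothetical, and the class stands alone (using the standard library's
``xml.etree.ElementTree``) so that it runs without Markdown installed. A real
processor would inherit from ``markdown.blockprocessors.BlockProcessor``.

```python
import xml.etree.ElementTree as etree


class NoteBlockProcessor(object):
    """Stand-in for a BlockProcessor subclass (hypothetical "!!!" syntax)."""

    PREFIX = '!!! '

    def test(self, parent, block):
        # Must return a boolean: does this block belong to us?
        return block.startswith(self.PREFIX)

    def run(self, parent, blocks):
        # Remove the first block from the shared list; nothing is returned.
        block = blocks.pop(0)
        first, _, rest = block.partition('\n')
        div = etree.SubElement(parent, 'div')
        div.set('class', 'note')
        div.text = first[len(self.PREFIX):]
        if rest:
            # Hand the remaining lines back for other processors to parse.
            blocks.insert(0, rest)


parent = etree.Element('div')
blocks = ['!!! Careful\nplain paragraph', 'next block']
proc = NoteBlockProcessor()
if proc.test(parent, blocks[0]):
    proc.run(parent, blocks)
```

After the call, ``parent`` has gained one child and the unparsed remainder sits
at the front of ``blocks``, ready for the next processor.
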
Each BlockProcessor has the following utility methods available:

* **``lastChild(parent)``**:

    Returns the last child of the given etree Element or ``None`` if it had no
    children.

* **``detab(text)``**:

    Removes one level of indent (four spaces by default) from the front of each
    line of the given text string.

* **``looseDetab(text, level)``**:

    Removes "level" levels of indent (defaults to 1) from the front of each line
    of the given text string. However, this method allows secondary lines to
    not be indented, as do some parts of the Markdown syntax.

Each BlockProcessor also has a pointer to the containing BlockParser instance at
``self.parser``, which can be used to check or alter the state of the parser.
The BlockParser tracks its state in a stack at ``parser.state``. The state
stack is an instance of the ``State`` class.

**``State``** is a subclass of ``list`` and has the additional methods:

* **``set(state)``**:

    Set a new state to string ``state``. The new state is appended to the end
    of the stack.

* **``reset()``**:

    Step back one step in the stack. The last state at the end is removed from
    the stack.

* **``isstate(state)``**:

    Test that the top (current) level of the stack is of the given string
    ``state``.

Note that to ensure that the state stack doesn't become corrupted, each time a
state is set for a block, that state *must* be reset when the parser finishes
parsing that block.

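The set/reset discipline is easier to follow with a small model of the stack.
The class below is a sketch written from the description above, not a copy of
the library's ``State`` class, and the ``'list'`` state name is only an
example.

```python
class State(list):
    """A list used as a stack of parser states (sketch only)."""

    def set(self, state):
        # Push the new state onto the end of the stack.
        self.append(state)

    def reset(self):
        # Step back one level by removing the last state.
        self.pop()

    def isstate(self, state):
        # An empty stack is in no state at all.
        return bool(self) and self[-1] == state


state = State()
state.set('list')
try:
    # A processor would parse its nested block here.
    inside_list = state.isstate('list')
finally:
    # Always undo what was set, even if parsing raised an error.
    state.reset()
```

Pairing every ``set`` with a ``reset`` in a ``try``/``finally`` block is one
way to keep the stack balanced.
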
An instance of the **``BlockParser``** is found at ``Markdown.parser``.
``BlockParser`` has the following methods:

* **``parseDocument(lines)``**:

    Given a list of lines, an ElementTree object is returned. This should be
    passed an entire document and is the only method the ``Markdown`` class
    calls directly.

* **``parseChunk(parent, text)``**:

    Parses a chunk of markdown text composed of multiple blocks and attaches
    those blocks to the ``parent`` Element. The ``parent`` is altered in place
    and nothing is returned. Extensions would most likely use this method for
    block parsing.

* **``parseBlocks(parent, blocks)``**:

    Parses a list of blocks of text and attaches those blocks to the ``parent``
    Element. The ``parent`` is altered in place and nothing is returned. This
    method will generally only be used internally to recursively parse nested
    blocks of text.

While it is not recommended, an extension could subclass or completely replace
the ``BlockParser``. The new class would have to provide the same public API.
However, be aware that other extensions may expect the core parser provided
and will not work with such a drastically different parser.

<h3 id="working_with_et">Working with the ElementTree</h3>

As mentioned, the Markdown parser converts a source document to an
[ElementTree][] object before serializing that back to Unicode text.
Markdown has provided some helpers to ease that manipulation within the context
of the Markdown module.

First, to get access to the ElementTree module, import ElementTree from
``markdown`` rather than importing it directly. This will ensure you are using
the same version of ElementTree as markdown. The module is named ``etree``
within Markdown.

    from markdown import etree

``markdown.etree`` tries to import ElementTree from any known location, first
as a standard library module (from ``xml.etree`` in Python 2.5), then as a third
party package (``elementtree``). In each instance, ``cElementTree`` is tried
first, then ``ElementTree`` if the faster C implementation is not available on
your system.

Sometimes you may want text inserted into an element to be parsed by
[InlinePatterns][]. In such a situation, simply insert the text as you normally
would and the text will be automatically run through the InlinePatterns.
However, if you do *not* want some text to be parsed by InlinePatterns,
then insert the text as an ``AtomicString``.

    some_element.text = markdown.AtomicString(some_text)

Here's a basic example which creates an HTML table (note that the contents of
the second cell (``td2``) will be run through InlinePatterns later):

    import markdown
    from markdown import etree

    table = etree.Element("table")
    table.set("cellpadding", "2")                      # Set cellpadding to 2
    tr = etree.SubElement(table, "tr")                 # Add child tr to table
    td1 = etree.SubElement(tr, "td")                   # Add child td1 to tr
    td1.text = markdown.AtomicString("Cell content")   # Add plain text content
    td2 = etree.SubElement(tr, "td")                   # Add second td to tr
    td2.text = "*text* with **inline** formatting."    # Add markup text
    table.tail = "Text after table"                    # Add text after table

You can also manipulate an existing tree. Consider the following example which
adds a ``class`` attribute to ``<a>`` elements:

    def set_link_class(element):
        for child in element:
            if child.tag == "a":
                child.set("class", "myclass")  # set the class attribute
            set_link_class(child)              # run recursively on children

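To try that function out without a Markdown instance, the sketch below applies
a standalone version of it to a small hand-written tree. It uses the standard
library's ``xml.etree.ElementTree`` as a stand-in for ``markdown.etree``, and
the sample markup is made up for the example.

```python
import xml.etree.ElementTree as etree


def set_link_class(element):
    for child in element:
        if child.tag == "a":
            child.set("class", "myclass")  # set the class attribute
        set_link_class(child)              # run recursively on children


root = etree.fromstring(
    '<div><p>See <a href="#x">this</a> and '
    '<em><a href="#y">that</a></em>.</p></div>'
)
set_link_class(root)
classes = [a.get("class") for a in root.iter("a")]
```

Both links receive the attribute, including the one nested inside ``<em>``,
because the function recurses into every child.
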
For more information about working with ElementTree see the ElementTree
[Documentation](http://effbot.org/zone/element-index.htm)
([Python Docs](http://docs.python.org/lib/module-xml.etree.ElementTree.html)).

<h3 id="integrating_into_markdown">Integrating Your Code Into Markdown</h3>

Once you have the various pieces of your extension built, you need to tell
Markdown about them and ensure that they are run in the proper sequence.
Markdown accepts an ``Extension`` instance for each extension. Therefore, you
will need to define a class that extends ``markdown.Extension`` and overrides
the ``extendMarkdown`` method. Within this class you will manage configuration
options for your extension and attach the various processors and patterns to
the Markdown instance.

It is important to note that the order of the various processors and patterns
matters. For example, if we replace ``http://...`` links with ``<a>`` elements,
and *then* try to deal with inline html, we will end up with a mess. Therefore,
the various types of processors and patterns are stored within an instance of
the Markdown class in [OrderedDict][]s. Your ``Extension`` class will need to
manipulate those OrderedDicts appropriately. You may insert instances of your
processors and patterns into the appropriate location in an OrderedDict, remove
a built-in instance, or replace a built-in instance with your own.

<h4 id="extendmarkdown">extendMarkdown</h4>

The ``extendMarkdown`` method of a ``markdown.Extension`` class accepts two
arguments:

* **``md``**:

    A pointer to the instance of the Markdown class. You should use this to
    access the [OrderedDict][]s of processors and patterns. They are found
    under the following attributes:

    * ``md.preprocessors``
    * ``md.inlinePatterns``
    * ``md.parser.blockprocessors``
    * ``md.treeprocessors``
    * ``md.postprocessors``

    Some other things you may want to access in the markdown instance are:

    * ``md.htmlStash``
    * ``md.output_formats``
    * ``md.set_output_format()``
    * ``md.registerExtension()``

* **``md_globals``**:

    Contains all the various global variables within the markdown module.

Of course, with access to those items, theoretically you have the option of
changing anything through various [monkey_patching][] techniques. However, you
should be aware that the various undocumented or private parts of markdown
may change without notice and your monkey_patches may break with a new release.
Therefore, what you really should be doing is inserting processors and patterns
into the markdown pipeline. Consider yourself warned.

[monkey_patching]: http://en.wikipedia.org/wiki/Monkey_patch

A simple example:

    class MyExtension(markdown.Extension):
        def extendMarkdown(self, md, md_globals):
            # Insert instance of 'mypattern' before 'references' pattern
            md.inlinePatterns.add('mypattern', MyPattern(md), '<references')

<h4 id="ordereddict">OrderedDict</h4>

An OrderedDict is a dictionary-like object that retains the order of its
items. The items are ordered in the order in which they were appended to
the OrderedDict. However, an item can also be inserted into the OrderedDict
in a specific location in relation to the existing items.

Think of OrderedDict as a combination of a list and a dictionary as it has
methods common to both. For example, you can get and set items using the
``od[key] = value`` syntax and the methods ``keys()``, ``values()``, and
``items()`` work as expected with the keys, values and items returned in the
proper order. At the same time, you can use ``insert()``, ``append()``, and
``index()`` as you would with a list.

Generally speaking, within Markdown extensions you will be using the special
helper method ``add()`` to add additional items to an existing OrderedDict.

The ``add()`` method accepts three arguments:

* **``key``**: A string. The key is used for later reference to the item.

* **``value``**: The object instance stored in this item.

* **``location``**: Optional. The item's location in relation to other items.

    Note that the location can consist of a few different values:

    * The special strings ``"_begin"`` and ``"_end"`` insert that item at the
      beginning or end of the OrderedDict respectively.

    * A less-than sign (``<``) followed by an existing key (e.g.
      ``"<somekey"``) inserts that item before the existing key.

    * A greater-than sign (``>``) followed by an existing key (e.g.
      ``">somekey"``) inserts that item after the existing key.

Consider the following example:

    >>> import markdown
    >>> od = markdown.OrderedDict()
    >>> od['one'] = 1      # The same as: od.add('one', 1, '_begin')
    >>> od['three'] = 3    # The same as: od.add('three', 3, '>one')
    >>> od['four'] = 4     # The same as: od.add('four', 4, '_end')
    >>> od.items()
    [('one', 1), ('three', 3), ('four', 4)]

Note that when building an OrderedDict in order, the extra features of the
``add`` method offer no real value and are not necessary. However, when
manipulating an existing OrderedDict, ``add`` can be very helpful. So let's
insert another item into the OrderedDict.

    >>> od.add('two', 2, '>one')          # Insert after 'one'
    >>> od.values()
    [1, 2, 3, 4]

Now let's insert another item.

    >>> od.add('twohalf', 2.5, '<three')  # Insert before 'three'
    >>> od.keys()
    ['one', 'two', 'twohalf', 'three', 'four']

Note that we also could have set the location of "twohalf" to be 'after two'
(i.e. ``'>two'``). However, it's unlikely that you will have control over the
order in which extensions will be loaded, and this could affect the final
sorted order of an OrderedDict. For example, suppose an extension adding
'twohalf' in the above examples was loaded before a separate extension which
adds 'two'. You may need to take this into consideration when adding your
extension components to the various markdown OrderedDicts.

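For readers who want to experiment with the location rules without Markdown
installed, the toy class below models only the ordering behaviour of ``add``.
It is a hypothetical stand-in written from the description above, not the
library's implementation. The ``'_end'`` default and the lack of handling for
keys that already exist are simplifying assumptions.

```python
class MiniOrderedDict(object):
    """Toy ordered mapping with an ``add`` method (illustration only)."""

    def __init__(self):
        self._keys = []
        self._data = {}

    def __setitem__(self, key, value):
        if key not in self._data:
            self._keys.append(key)      # new keys go to the end
        self._data[key] = value

    def keys(self):
        return list(self._keys)

    def add(self, key, value, location='_end'):
        if location == '_begin':
            index = 0
        elif location == '_end':
            index = len(self._keys)
        elif location.startswith('<'):
            index = self._keys.index(location[1:])       # before that key
        elif location.startswith('>'):
            index = self._keys.index(location[1:]) + 1   # after that key
        else:
            raise ValueError('unknown location: %r' % location)
        self._keys.insert(index, key)
        self._data[key] = value


od = MiniOrderedDict()
od['one'] = 1
od['three'] = 3
od['four'] = 4
od.add('two', 2, '>one')
od.add('twohalf', 2.5, '<three')
order = od.keys()
```

In this model, a location that names a missing key raises ``ValueError`` from
``list.index``. This models the load-order hazard described above: an
extension that positions itself relative to another extension's key depends
on that key already being present.
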
Once an OrderedDict is created, the items are available via key:

    MyNode = od['somekey']

Therefore, to delete an existing item:

    del od['somekey']

To change the value of an existing item (leaving location unchanged):

    od['somekey'] = MyNewObject()

To change the location of an existing item:

    od.link('somekey', '<otherkey')

<h4 id="registerextension">registerExtension</h4>

Some extensions may need to have their state reset between multiple runs of the
Markdown class. For example, consider the following use of the [[Footnotes]]
extension:

    md = markdown.Markdown(extensions=['footnotes'])
    html1 = md.convert(text_with_footnote)
    md.reset()
    html2 = md.convert(text_without_footnote)

Without calling ``reset``, the footnote definitions from the first document will
be inserted into the second document as they are still stored within the class
instance. Therefore the ``Extension`` class needs to define a ``reset`` method
that will reset the state of the extension (e.g. ``self.footnotes = {}``).
However, as many extensions do not have a need for ``reset``, ``reset`` is only
called on extensions that are registered.

To register an extension, call ``md.registerExtension`` from within your
``extendMarkdown`` method:

    def extendMarkdown(self, md, md_globals):
        md.registerExtension(self)
        # insert processors and patterns here

Then, each time ``reset`` is called on the Markdown instance, the ``reset``
method of each registered extension will be called as well. You should also
note that ``reset`` will be called on each registered extension after it is
initialized the first time. Keep that in mind when overriding the extension's
``reset`` method.

<h4 id="configsettings">Config Settings</h4>

If an extension uses any parameters that the user may want to change,
those parameters should be stored in ``self.config`` of your
``markdown.Extension`` class in the following format:

    self.config = {parameter_1_name : [value1, description1],
                   parameter_2_name : [value2, description2] }

When stored this way, the config parameters can be overridden from the
command line or at the time Markdown is instantiated:

    markdown.py -x myextension(SOME_PARAM=2) inputfile.txt > output.txt

Note that parameters should always be assumed to be set to string
values, and should be converted at run time. For example:

    i = int(self.getConfig("SOME_PARAM"))

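The reason for converting at run time can be seen in the following sketch. The
``MiniExtension`` class is a hypothetical stand-in for ``markdown.Extension``
reduced to config handling. Passing overrides as a dictionary and the exact
behaviour of ``getConfig`` and ``setConfig`` are assumptions made for the
example, so check the ``Extension`` class in your version of Markdown before
relying on them.

```python
class MiniExtension(object):
    """Stand-in for markdown.Extension, reduced to config handling."""

    def __init__(self, configs=None):
        # name: [default value, description]
        self.config = {'SOME_PARAM': [1, 'An example integer parameter']}
        for key, value in (configs or {}).items():
            self.setConfig(key, value)

    def getConfig(self, key):
        # Return only the value, not the description.
        return self.config[key][0]

    def setConfig(self, key, value):
        self.config[key][0] = value


# Overrides parsed from a command line arrive as strings.
ext = MiniExtension(configs={'SOME_PARAM': '2'})
raw = ext.getConfig('SOME_PARAM')
i = int(raw)
```

The default is an integer but the override is a string, so code that reads the
setting has to cope with both, which ``int()`` does.
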
<h4 id="makeextension">makeExtension</h4>

Each extension should ideally be placed in its own module starting
with the ``mdx_`` prefix (e.g. ``mdx_footnotes.py``). The module must
provide a module-level function called ``makeExtension`` that takes
an optional parameter consisting of a dictionary of configuration overrides
and returns an instance of the extension. An example from the footnote
extension:

    def makeExtension(configs=None):
        return FootnoteExtension(configs=configs)

By following the above example, when Markdown is passed the name of your
extension as a string (e.g. ``'footnotes'``), it will automatically import
the module and call the ``makeExtension`` function to instantiate your
extension.

You may have noted that the extensions packaged with Python-Markdown do not
use the ``mdx_`` prefix in their module names. This is because they are all
part of the ``markdown.extensions`` package. Markdown will first try to import
from ``markdown.extensions.extname`` and upon failure, ``mdx_extname``. If both
fail, Markdown will continue without the extension.

However, Markdown will also accept an already existing instance of an extension.
For example:

    import markdown
    import myextension
    configs = {...}
    myext = myextension.MyExtension(configs=configs)
    md = markdown.Markdown(extensions=[myext])

This is useful if you need to implement a large number of extensions with more
than one residing in a module.

[Preprocessors]: #preprocessors
[InlinePatterns]: #inlinepatterns
[Treeprocessors]: #treeprocessors
[Postprocessors]: #postprocessors
[BlockParser]: #blockparser
[Working with the ElementTree]: #working_with_et
[Integrating your code into Markdown]: #integrating_into_markdown
[extendMarkdown]: #extendmarkdown
[OrderedDict]: #ordereddict
[registerExtension]: #registerextension
[Config Settings]: #configsettings
[makeExtension]: #makeextension
[ElementTree]: http://effbot.org/zone/element-index.htm