
This page demonstrates how to use FSharp.Formatting.Markdown
to parse a Markdown document, process the resulting document
representation, and turn the document into nicely formatted HTML.
First, we need to load the assembly and open the necessary namespaces:
open FSharp.Formatting.Markdown
open FSharp.Formatting.Common
The F# Markdown parser recognizes the standard Markdown syntax
and it is not the aim of this tutorial to fully document it.
The following snippet creates a simple string containing a document
with several elements and then parses it using the Markdown.Parse method:
let document =
    """
# F# Hello world
Hello world in [F#](http://fsharp.net) looks like this:

    printfn "Hello world!"

For more see [fsharp.org][fsorg].

  [fsorg]: http://fsharp.org "The F# organization." """
let parsed = Markdown.Parse(document)
The sample document consists of a first-level heading (written using
one of the two alternative styles) followed by a paragraph with a
direct link, a code snippet, and one more paragraph that includes an
indirect link. The URLs of indirect links are defined in a separate
block, as demonstrated on the last line, so the same definition can be
reused from multiple places in the document.
The F# Markdown processor does not turn the document directly into HTML.
Instead, it builds a nice F# data structure that we can use to analyze,
transform and process the document. First of all, the MarkdownDocument.DefinedLinks property
returns all indirect link definitions:
parsed.DefinedLinks
val it : IDictionary<string,(string * string option)> =
  dict [("fsorg", ("http://fsharp.org", Some "The F# organization."))]
The document content can be accessed using the MarkdownDocument.Paragraphs property that returns
a sequence of paragraphs or other first-level elements (headings, quotes, code snippets, etc.).
The following snippet prints the heading of the document:
// Iterate over all the paragraph elements
for par in parsed.Paragraphs do
    match par with
    | Heading (size = 1; body = [ Literal (text = text) ]) ->
        // Recognize a heading that has simple content
        // containing just a literal (no other formatting)
        printfn "%s" text
    | _ -> ()
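The same pattern extends to headings of any level by binding the size instead of fixing it to 1. The following is a minimal sketch rather than part of the library documentation: it assumes the FSharp.Formatting NuGet package can be referenced from an F# script, and the sample text and the name outline are made up for illustration. Like the snippet above, it only recognizes headings whose body is a single literal.

```fsharp
// Assumption: the library is referenced from NuGet in an F# script (.fsx)
#r "nuget: FSharp.Formatting"

open FSharp.Formatting.Markdown

// Hypothetical sample document with headings on two levels
let sample =
    """
# Guide

Intro text.

## Setup

More text.
"""

// Collect (level, text) pairs for every heading with a plain-text body
let outline =
    [ for par in Markdown.Parse(sample).Paragraphs do
          match par with
          | Heading (size = size; body = [ Literal (text = text) ]) -> yield size, text
          | _ -> () ]

// Print the outline, indenting each heading by its level
for (size, text) in outline do
    printfn "%s%s" (String.replicate (2 * (size - 1)) " ") text
```

Headings that contain inline formatting (for example emphasis) are skipped here, because their body is not a single Literal.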
You can find more detailed information about the document structure and how to process it
in the book F# Deep Dives.
The library provides active patterns that can be used to easily process the Markdown
document recursively. The example in this section shows how to extract all links from the
document. To do that, we need to write two recursive functions: one that processes
all paragraph-style elements and one that processes all inline formatting (inside
paragraphs, headings, etc.).
To avoid pattern matching on every single kind of span and every single kind of
paragraph, we can use active patterns from the MarkdownPatterns module. These can be used
to recognize any paragraph or span that can contain child elements:
/// Returns all links in a specified span node
let rec collectSpanLinks span =
    seq {
        match span with
        | DirectLink (link = url) -> yield url
        | IndirectLink (key = key) -> yield fst (parsed.DefinedLinks.[key])
        | MarkdownPatterns.SpanLeaf _ -> ()
        | MarkdownPatterns.SpanNode (_, spans) ->
            for s in spans do
                yield! collectSpanLinks s
    }

/// Returns all links in the specified paragraph node
let rec collectParLinks par =
    seq {
        match par with
        | MarkdownPatterns.ParagraphLeaf _ -> ()
        | MarkdownPatterns.ParagraphNested (_, pars) ->
            for ps in pars do
                for p in ps do
                    yield! collectParLinks p
        | MarkdownPatterns.ParagraphSpans (_, spans) ->
            for s in spans do
                yield! collectSpanLinks s
    }

// Collect links in the entire document
Seq.collect collectParLinks parsed.Paragraphs
val it : seq<string> =
  seq ["http://fsharp.net"; "http://fsharp.org"]
The collectSpanLinks function works on individual span elements that contain inline
formatting (emphasis, strong) and also links. The DirectLink node from MarkdownSpan
represents an inline link like the one pointing to http://fsharp.net, while IndirectLink
represents a link that uses one of the link definitions. The function simply returns the
URL associated with the link.
Some span nodes (like emphasis) can contain other formatting, so we need to recursively
process their children. This is done by matching against MarkdownPatterns.SpanNode,
which is an active pattern that recognizes any node with children. The library also
provides a function with the same name, MarkdownPatterns.SpanNode, that can be used to
reconstruct the same node (when you want to transform the document). This is similar to
how the ExprShape module for working with F# quotations works.
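To illustrate the reconstruction side, here is a minimal sketch of a transformation that upper-cases every literal and rebuilds the surrounding nodes with the SpanNode and ParagraphSpans functions. It is an illustration under assumptions, not the library's recommended recipe: the names upperSpan and transformed and the sample text are hypothetical, and the script assumes the FSharp.Formatting NuGet package can be referenced as shown.

```fsharp
// Assumption: the library is referenced from NuGet in an F# script (.fsx)
#r "nuget: FSharp.Formatting"

open FSharp.Formatting.Markdown

/// Upper-cases every literal and rebuilds parent nodes around the new children
let rec upperSpan (span: MarkdownSpan) : MarkdownSpan =
    match span with
    | MarkdownSpan.Literal (text, range) -> MarkdownSpan.Literal(text.ToUpperInvariant(), range)
    | MarkdownPatterns.SpanNode (info, spans) ->
        // The SpanNode function reconstructs the same kind of node
        MarkdownPatterns.SpanNode(info, List.map upperSpan spans)
    | MarkdownPatterns.SpanLeaf _ -> span

// Apply the transformation to every paragraph that directly contains spans
let transformed =
    Markdown.Parse("Hello *brave* world").Paragraphs
    |> List.map (fun par ->
        match par with
        | MarkdownPatterns.ParagraphSpans (info, spans) ->
            MarkdownPatterns.ParagraphSpans(info, List.map upperSpan spans)
        | other -> other)

printfn "%A" transformed
```

Nested paragraphs (such as quotations) are left untouched in this sketch; handling them would need a recursive function over paragraphs, in the same way collectParLinks recurses in the link-collection example.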
The function collectParLinks processes paragraphs. A paragraph cannot directly be a
link, so we just need to process all spans. This time, there are three options.
ParagraphLeaf represents a case where the paragraph does not contain any spans
(a code block or, for example, a <hr> line); the ParagraphNested case is used for
paragraphs that contain other paragraphs (such as a quotation); and ParagraphSpans is
used for all other paragraphs that contain normal text. For those, we call
collectSpanLinks on all nested spans.
Finally, the Markdown type also includes a method Markdown.ToHtml that can be used
to generate an HTML document from the Markdown input. The following example shows how to call it:
let html = Markdown.ToHtml(parsed)
There are also methods to generate .fsx, .ipynb, .md and .tex.
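As a brief sketch of how these output methods might be called, the script below renders one small document to HTML, LaTeX and Markdown. The sample text is made up for illustration, and the script assumes the FSharp.Formatting NuGet package can be referenced as shown. Markdown.ToHtml is called here with a plain string, while the other methods take a parsed MarkdownDocument.

```fsharp
// Assumption: the library is referenced from NuGet in an F# script (.fsx)
#r "nuget: FSharp.Formatting"

open FSharp.Formatting.Markdown

// Hypothetical sample input
let source = "# Title\n\nSome *emphasized* text."

// ToHtml also accepts the Markdown text directly
let htmlFromText = Markdown.ToHtml(source)

// The other formats are generated from a parsed document
let doc = Markdown.Parse(source)
let latex = Markdown.ToLatex(doc)
let md = Markdown.ToMd(doc)

printfn "%s" htmlFromText
printfn "%s" latex
printfn "%s" md
```

All optional parameters (newline, substitutions, resolvers) are left at their defaults in this sketch.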