
Markup Languages

Much of the information here is drawn from Tony Ibbs's talk on the History of Markup Languages and his accompanying slides.

Introduction: Why We Need Markup Languages

  • Presentational use: Define how text should look.
  • Semantic use: Define the meaning of the text.

Early Markup Languages

RUNOFF (1964)

RUNOFF could handle pagination, alignment, abbreviation, and more. It was used on IBM machines for documents and machine-to-machine messages.

Example (source: Wikipedia):

When you're ready to order,
call us at our toll free number:
.BR
.CENTER
1-800-555-xxxx
.BR
Your order will be processed
within two working days and shipped

Render:

   When you're ready to order, call us at our toll free number:

                             1-800-555-xxxx

   Your order will be processed within two working days and shipped
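
The dot-command idea above can be sketched in a few lines of Python. This is a hypothetical toy, not RUNOFF itself: the function name `render_runoff`, the 70-column width, and the decision to handle only `.BR` and `.CENTER` are assumptions made for illustration.

```python
import textwrap


def render_runoff(source: str, width: int = 70) -> str:
    """Toy renderer for two RUNOFF-style commands (.BR and .CENTER)."""
    output = []          # finished output lines
    pending = []         # words waiting to be filled into lines
    center_next = False  # set by .CENTER, applies to the next text line

    def flush():
        # Fill the accumulated words into lines no wider than `width`.
        if pending:
            output.extend(textwrap.wrap(" ".join(pending), width))
            pending.clear()

    for line in source.splitlines():
        command = line.strip().upper()
        if command == ".BR":
            flush()                       # break: end the current filled line
        elif command == ".CENTER":
            flush()
            center_next = True
        elif center_next:
            output.append(line.strip().center(width).rstrip())
            center_next = False
        else:
            pending.extend(line.split())  # ordinary text: collect for filling
    flush()
    return "\n".join(output)


sample = """When you're ready to order,
call us at our toll free number:
.BR
.CENTER
1-800-555-xxxx
.BR
Your order will be processed
within two working days and shipped"""

print(render_runoff(sample))
```

The sketch shows the key property of this style of markup: commands sit on their own lines, and everything else is treated as text to be filled.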

roff, nroff, troff, groff (1970s)

These evolved from RUNOFF. The variants target terminals, typesetters, and other devices, and they support macros.

TeX (1978) and LaTeX (1983)

Donald Knuth, a computer scientist at Stanford, was dissatisfied with the systems then available for typesetting the equations and paragraphs in his articles and books, so he created his own system, TeX.

Example (source: Wikipedia):

The quadratic formula is $-b \pm \sqrt{b^2 - 4ac} \over 2a$
\bye

Render: [image of the typeset quadratic formula]

LaTeX, a macro package built on top of TeX and developed by Leslie Lamport, is still used today in academia for typesetting scientific documents.

The Introduction of Tags

(Standard) Generalized Markup Language (GML 1969, SGML 1986)

GML was also developed at IBM, and SGML was later standardized from it. In SGML, a Document Type Definition (DTD) declares which tags exist and what each tag requires; you then use those tags to structure the content. Closing tags were not always required: whether one could be omitted depended on the DTD.

Example (source: Wikipedia):

<!ELEMENT chapter - - (title, section+)>
<!ELEMENT title o o (#PCDATA)>
<!ELEMENT section - - (title, subsection+)>

<chapter><title>Introduction to SGML</title>
<section><title>The SGML Declaration</title>
<subsection>
...
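
The idea of a DTD constraining which children an element may contain can be sketched as follows. This is a hand-rolled Python toy, not an SGML parser: the `CONTENT_MODELS` table, the regex encoding of the content models, and the `validate` function are all assumptions for illustration. The sample documents are written with explicit closing tags so that Python's XML parser can read them, and only the `chapter` and `section` rules are checked.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the DTD above, re-expressed as regexes over
# space-terminated child tag names (an encoding chosen for this sketch).
CONTENT_MODELS = {
    "chapter": r"title (section )+",
    "section": r"title (subsection )+",
}


def validate(element: ET.Element) -> list[str]:
    """Return a list of violations found in `element` and its descendants."""
    errors = []
    model = CONTENT_MODELS.get(element.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in element)
        if not re.fullmatch(model, children):
            errors.append(
                f"<{element.tag}> has invalid children: {children.strip()}"
            )
    for child in element:
        errors.extend(validate(child))
    return errors


good = ET.fromstring(
    "<chapter><title>Introduction to SGML</title>"
    "<section><title>The SGML Declaration</title><subsection/></section>"
    "</chapter>"
)
bad = ET.fromstring(
    "<chapter><section><title>No chapter title</title></section></chapter>"
)

print(validate(good))  # no violations expected
print(validate(bad))   # chapter lacks a title; section lacks a subsection
```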

TEI (1987)

A purely semantic language from the Text Encoding Initiative, used largely for marking up prose, poetry, and other literary texts.

Example (source: Wikipedia):

<div type="sonnet">
 <lg type="quatrain">
  <l>Les amoureux fervents et les savants austères</l>
  <l> Aiment également, dans leur mûre saison,</l>
  <l> Les chats puissants et doux, orgueil de la maison,</l>
  <l> Qui comme eux sont frileux et comme eux sédentaires.</l>
 </lg>
 ...
</div>

HyperText Markup Language (HTML, 1991)

HTML evolved from SGML. Tim Berners-Lee invented it at CERN while developing the World Wide Web. Instead of having the user define a DTD, HTML comes with its own specification of tags; HTML5 is the version in use on the web today.

Example (source: Wikipedia):

<!DOCTYPE html>
<html>
  <head>
    <title>This is a title</title>
  </head>
  <body>
    <div>
      <p>Hello world!</p>
    </div>
  </body>
</html>

DocBook (1991)

Another child of SGML. Originally used by the military and other technical institutions, it is still in use today because of its portability to other document types. The latest version uses XML (see below) as its foundation.

Example (source: Wikipedia):

 <?xml version="1.0" encoding="UTF-8"?>
 <book xml:id="simple_book" xmlns="http://docbook.org/ns/docbook" version="5.0">
   <title>Very simple book</title>
   <chapter xml:id="chapter_1">
     <title>Chapter 1</title>
     <para>Hello world!</para>
     <para>I hope that your day is proceeding <emphasis>splendidly</emphasis>!</para>
   </chapter>
   <chapter xml:id="chapter_2">
     <title>Chapter 2</title>
     <para>Hello again, world!</para>
   </chapter>
 </book>

Extensible Markup Language (XML, 1996)

“The main purpose of XML is serialization, i.e. storing, transmitting, and reconstructing arbitrary data. For two disparate systems to exchange information, they need to agree upon a file format. XML standardizes this process. XML is analogous to a lingua franca for representing information.” (Wikipedia)

XML has largely replaced SGML. It can be used to define a specific markup syntax for any user interface or document. As the quote above mentions, it can also be used to structure data, much as JSON does, and to send that data over HTTP (or other protocols) from application to application. There are many XML-based languages (like DocBook), since XML supports DTDs for defining markup. Here are a few examples of documents and user interfaces that use XML-based formats:

  • Android and iOS Apps
  • Windows Platform Apps (XAML)
  • Microsoft Office
  • Google Suite
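
As a rough illustration of the serialization use mentioned above, the following Python sketch round-trips a small record through XML using the standard library's `xml.etree.ElementTree`. The `order` record, its element names, and the helper functions `to_xml` and `from_xml` are made up for this example.

```python
import xml.etree.ElementTree as ET


def to_xml(record: dict) -> str:
    """Serialize a flat dict into an <order> element (hypothetical format)."""
    root = ET.Element("order")
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> dict:
    """Rebuild the dict; every value comes back as a string."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


record = {"id": 42, "customer": "Ada", "shipped": False}
payload = to_xml(record)

print(payload)
print(from_xml(payload))
```

Note that the round trip loses type information: the integer and the boolean come back as strings. This is why the two systems exchanging XML still need to agree on what each element means.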

The Anti-Tag Movement

setext (1991)

A markup language originally for newsletters and Macintosh programs, then also for emails and other communications.

Example (source: Wikipedia):

Title
=====
  Body text is indented 2 spaces

  **Bold Words**
  _Underlined_words_
  `Backquoted words` are comments

  * Bulleted list

WikiWikiWeb (1995)

The first wiki. Its markup was very simple, supporting a limited set of features such as bold, italic, and lists.

Example (source: Wikipedia):

* List Item
** Nested List Item

This text is '''bold'''.

reStructuredText and AsciiDoc (2002)

These are similar languages built with the intent to be easily readable without getting bogged down in tags. reStructuredText was invented to document Python, and AsciiDoc was invented as a terse analog to DocBook.

Markdown (2004)

Markdown is John Gruber’s attempt at a way to write HTML more easily. It has many flavors/dialects (e.g. GitHub-flavored Markdown for documenting GitHub repositories, the Jekyll Markdown used to write this page, and the variants used by Discord and Reddit). Many flavors support HTML tags alongside Markdown syntax. Markdown is processed into HTML, and that HTML can then be rendered.

Example (source: Wikipedia):

Heading
=======

Sub-heading
-----------

# Alternative heading #

Paragraphs are separated 
by a blank line.

Two spaces at the end of a line  
produce a line break.  

Text attributes _italic_, **bold**, `monospace`.

Horizontal rule:

---

Bullet lists nested within numbered list:

  1. fruits
     * apple
     * banana
  2. vegetables
     - carrot
     - broccoli
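
To illustrate the "Markdown is processed into HTML" step, here is a minimal Python sketch that converts a few of the constructs above (`#` headings, `**bold**`, `_italic_`, and backquoted code) into HTML. It is a toy under strong assumptions, not Gruber's implementation or any real flavor's parser, and the function names `inline_to_html` and `markdown_to_html` are hypothetical.

```python
import html
import re


def inline_to_html(text: str) -> str:
    """Convert a small subset of inline Markdown to HTML."""
    text = html.escape(text, quote=False)  # escape <, > and &
    text = re.sub(r"`([^`]+)`", r"<code>\1</code>", text)
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\b_(.+?)_\b", r"<em>\1</em>", text)
    return text


def markdown_to_html(source: str) -> str:
    """Turn blank-line-separated blocks into heading or paragraph elements."""
    parts = []
    for block in re.split(r"\n\s*\n", source.strip()):
        # A single line starting with 1-6 '#' characters is a heading;
        # trailing '#' characters are optional and ignored.
        heading = re.match(r"(#{1,6})\s+(.*?)\s*#*$", block)
        if heading:
            level = len(heading.group(1))
            body = inline_to_html(heading.group(2))
            parts.append(f"<h{level}>{body}</h{level}>")
        else:
            # Anything else is a paragraph; its lines are joined with spaces.
            text = " ".join(line.strip() for line in block.splitlines())
            parts.append(f"<p>{inline_to_html(text)}</p>")
    return "\n".join(parts)


sample = (
    "# Alternative heading #\n"
    "\n"
    "Text attributes _italic_, **bold**, `monospace`."
)

print(markdown_to_html(sample))
```

Lists, underlined headings, horizontal rules, and line breaks are deliberately left out; real processors handle many more cases and edge conditions than this sketch does.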

Conclusion

As a developer, you’ll probably only need to focus on these three:

  • HTML for web-based interfaces
  • XML for other non-web interfaces or data serialization
  • Markdown for writing documentation

Good luck!