# The Document Object Model

{{quote {author: "Friedrich Nietzsche", title: "Beyond Good and Evil", chapter: true}

Too bad! Same old story! Once you've finished building your house you notice you've accidentally learned something that you really should have known, before you started.

quote}}

{{figure {url: "img/chapter_picture_14.jpg", alt: "Illustration showing a tree with letters, pictures, and gears hanging on its branches", chapter: "framed"}}}

{{index drawing, parsing}}

When you open a web page, your browser retrieves the page's ((HTML)) text and parses it, much like our parser from [Chapter ?](language#parsing) parsed programs. The browser builds up a model of the document's ((structure)) and uses this model to draw the page on the screen.

{{index "live data structure"}}

This representation of the ((document)) is one of the toys that a JavaScript program has available in its ((sandbox)). It is a ((data structure)) that you can read or modify. It acts as a _live_ data structure: when it's modified, the page on the screen is updated to reflect the changes.

## Document structure

{{index [HTML, structure]}}

You can imagine an HTML document as a nested set of ((box))es. Tags such as `<body>` and `<p>` enclose other ((tag))s, which in turn contain other tags or ((text)). Here's the example document from the [previous chapter](browser):

```{lang: html, sandbox: "homepage"}
<!doctype html>
<html>
  <head>
    <title>My home page</title>
  </head>
  <body>
    <h1>My home page</h1>
    <p>Hello, I am Marijn and this is my home page.</p>
    <p>I also wrote a book! Read it
      <a href="http://eloquentjavascript.net">here</a>.</p>
  </body>
</html>
```

This page has the following structure:

{{figure {url: "img/html-boxes.svg", alt: "Diagram showing an HTML document as a set of nested boxes. The outer box is labeled 'html' and contains two boxes labeled 'head' and 'body'. Inside those are further boxes, with some of the innermost boxes containing the document's text.", width: "7cm"}}}

{{indexsee "Document Object Model", DOM}}

The data structure the browser uses to represent the document follows this shape. For each box, there is an object, which we can interact with to find out things such as what HTML tag it represents and which boxes and text it contains. This representation is called the _Document Object Model_, or _((DOM))_ for short.

{{index "documentElement property", "head property", "body property", "html (HTML tag)", "body (HTML tag)", "head (HTML tag)"}}

The global binding `document` gives us access to these objects. Its `documentElement` property refers to the object representing the `<html>` tag. Since every HTML document has a head and a body, it also has `head` and `body` properties pointing at those elements.

## Trees

{{index [nesting, "of objects"]}}

Think back to the ((syntax tree))s from [Chapter ?](language#parsing) for a moment. Their structures are strikingly similar to the structure of a browser's document. Each _((node))_ may refer to other nodes, _children_, which in turn may have their own children. This shape is typical of nested structures, where elements can contain subelements that are similar to themselves.

{{index "documentElement property", [DOM, tree]}}

We call a data structure a _((tree))_ when it has a branching structure, no ((cycle))s (a node may not contain itself, directly or indirectly), and a single, well-defined _((root))_. In the case of the DOM, `document.documentElement` serves as the root.

{{index sorting, ["data structure", "tree"], "syntax tree"}}

Trees come up a lot in computer science.
In addition to representing recursive structures such as HTML documents or programs, they are often used to maintain sorted ((set))s of data because elements can usually be found or inserted more efficiently in a tree than in a flat array.

{{index "leaf node", "Egg language"}}

A typical tree has different kinds of ((node))s. The syntax tree for [the Egg language](language) had identifiers, values, and application nodes. Application nodes may have children, whereas identifiers and values are _leaves_, or nodes without children.

{{index "body property", [HTML, structure]}}

The same goes for the DOM. Nodes for _((element))s_, which represent HTML tags, determine the structure of the document. These can have ((child node))s. An example of such a node is `document.body`. Some of these children can be ((leaf node))s, such as pieces of ((text)) or ((comment)) nodes.

{{index "text node", element, "ELEMENT_NODE code", "COMMENT_NODE code", "TEXT_NODE code", "nodeType property"}}

Each DOM node object has a `nodeType` property, which contains a code (number) that identifies the type of node. Elements have code 1, which is also defined as the constant property `Node.ELEMENT_NODE`. Text nodes, representing a section of text in the document, get code 3 (`Node.TEXT_NODE`). Comments have code 8 (`Node.COMMENT_NODE`).

Another way to visualize our document ((tree)) is as follows:

{{figure {url: "img/html-tree.svg", alt: "Diagram showing the HTML document as a tree, with arrows from parent nodes to child nodes", width: "8cm"}}}

The leaves are text nodes, and the arrows indicate parent-child relationships between nodes.

{{id standard}}

## The standard

{{index "programming language", [interface, design], [DOM, interface]}}

Using cryptic numeric codes to represent node types is not a very JavaScript-like thing to do. Later in this chapter, we'll see that other parts of the DOM interface also feel cumbersome and alien.
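
To make the numeric codes concrete, here is a minimal sketch that runs outside the browser. It assumes plain objects standing in for DOM nodes; the `NODE_TYPE_NAMES` table and the `nodeTypeName` helper are hypothetical names chosen for illustration and are not part of the DOM. In a browser, the same codes are available as `Node.ELEMENT_NODE`, `Node.TEXT_NODE`, and `Node.COMMENT_NODE`.

```javascript
// Node type codes as described in the text: 1 = element, 3 = text, 8 = comment.
const NODE_TYPE_NAMES = {1: "element", 3: "text", 8: "comment"};

// Hypothetical helper: turn a node's numeric type code into a readable name.
function nodeTypeName(node) {
  return NODE_TYPE_NAMES[node.nodeType] ?? "other";
}

// Plain objects standing in for DOM nodes (an assumption for this sketch).
const fakeNodes = [
  {nodeType: 1, nodeName: "P"},
  {nodeType: 3, nodeValue: "Hello"},
  {nodeType: 8, nodeValue: " a comment "}
];

console.log(fakeNodes.map(nodeTypeName));
// → ["element", "text", "comment"]
```

A small lookup table like this is one way to hide the numeric codes behind readable names in your own code.
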
The DOM feels this way because its interface wasn't designed for JavaScript alone. Rather, it tries to be a language-neutral interface that can be used in other systems as well: not just for HTML but also for ((XML)), which is a generic ((data format)) with an HTML-like syntax.

{{index consistency, integration}}

This is unfortunate. Standards are often useful. But in this case, the advantage (cross-language consistency) isn't all that compelling. Having an interface that is properly integrated with the language you're using will save you more time than having a familiar interface across languages.

{{index "array-like object", "NodeList type"}}

As an example of this poor integration, consider the `childNodes` property that element nodes in the DOM have. This property holds an array-like object with a `length` property and properties labeled by numbers to access the child nodes. But it is an instance of the `NodeList` type, not a real array, so it does not have methods such as `slice` and `map`.

{{index [interface, design], [DOM, construction], "side effect"}}

Then there are issues that are simply caused by poor design. For example, there is no way to create a new node and immediately add children or ((attribute))s to it. Instead, you have to first create it and then add the children and attributes one by one, using side effects. Code that interacts heavily with the DOM tends to get long, repetitive, and ugly.

{{index library}}

But these flaws aren't fatal. Since JavaScript allows us to create our own ((abstraction))s, it is possible to design improved ways to express the operations we are performing. Many libraries intended for browser programming come with such tools.

## Moving through the tree

{{index pointer}}

DOM nodes contain a wealth of ((link))s to other nearby nodes. The following diagram illustrates these:

{{figure {url: "img/html-links.svg", alt: "Diagram that shows the links between DOM nodes. The 'body' node is shown as a box, with a 'firstChild' arrow pointing at the 'h1' node at its start, a 'lastChild' arrow pointing at the last paragraph node, and 'childNodes' arrow pointing at an array of links to all its children. The middle paragraph has a 'previousSibling' arrow pointing at the node before it, a 'nextSibling' arrow to the node after it, and a 'parentNode' arrow pointing at the 'body' node.", width: "6cm"}}}

{{index "child node", "parentNode property", "childNodes property"}}

Although the diagram shows only one link of each type, every node has a `parentNode` property that points to the node it is part of, if any. Likewise, every element node (node type 1) has a `childNodes` property that points to an ((array-like object)) holding its children.

{{index "firstChild property", "lastChild property", "previousSibling property", "nextSibling property"}}

In theory, you could move anywhere in the tree using just these parent and child links. But JavaScript also gives you access to a number of additional convenience links. The `firstChild` and `lastChild` properties point to the first and last child nodes, or have the value `null` for nodes without children. Similarly, `previousSibling` and `nextSibling` point to adjacent nodes, which are nodes with the same parent that appear immediately before or after the node itself. For a first child, `previousSibling` will be `null`, and for a last child, `nextSibling` will be `null`.

{{index "children property", "text node", element}}

There's also the `children` property, which is like `childNodes` but contains only element (type 1) children, not other types of child nodes. This can be useful when you aren't interested in text nodes.

{{index "talksAbout function", recursion, [nesting, "of objects"]}}

When dealing with a nested data structure like this one, recursive functions are often useful.
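
As a small illustration of recursion over these links, the sketch below walks a tree by following `firstChild` and `nextSibling` instead of indexing into `childNodes`. It assumes plain objects standing in for DOM nodes so that it can run outside a browser; `makeNode` and `countNodes` are hypothetical helper names, not DOM functions.

```javascript
// Hypothetical helper: build a plain-object stand-in for a DOM node and wire
// up the parent, child, and sibling links described in the text.
function makeNode(name, children = []) {
  const node = {
    nodeName: name,
    parentNode: null,
    firstChild: children[0] ?? null,
    lastChild: children[children.length - 1] ?? null,
    previousSibling: null,
    nextSibling: null
  };
  children.forEach((child, i) => {
    child.parentNode = node;
    child.previousSibling = children[i - 1] ?? null;
    child.nextSibling = children[i + 1] ?? null;
  });
  return node;
}

// Count a node and all of its descendants using only firstChild/nextSibling.
function countNodes(node) {
  let count = 1;
  for (let child = node.firstChild; child; child = child.nextSibling) {
    count += countNodes(child);
  }
  return count;
}

const body = makeNode("body", [
  makeNode("h1", [makeNode("#text")]),
  makeNode("p", [makeNode("#text")]),
  makeNode("p", [makeNode("#text"), makeNode("a", [makeNode("#text")]), makeNode("#text")])
]);

console.log(countNodes(body));
// → 10
```

The same loop shape should work on real DOM nodes, since they expose `firstChild` and `nextSibling` under the same names.
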
The following function scans a document for ((text node))s containing a given string and returns `true` when it has found one:

{{id talksAbout}}

```{sandbox: "homepage"}
function talksAbout(node, string) {
  if (node.nodeType == Node.ELEMENT_NODE) {
    for (let child of node.childNodes) {
      if (talksAbout(child, string)) {
        return true;
      }
    }
    return false;
  } else if (node.nodeType == Node.TEXT_NODE) {
    return node.nodeValue.indexOf(string) > -1;
  }
}

console.log(talksAbout(document.body, "book"));
// → true
```

{{index "nodeValue property"}}

The `nodeValue` property of a text node holds the string of text that it represents.

## Finding elements

{{index [DOM, querying], "body property", "hard-coding", [whitespace, "in HTML"]}}

Navigating these ((link))s among parents, children, and siblings is often useful. But if we want to find a specific node in the document, reaching it by starting at `document.body` and following a fixed path of properties is a bad idea. Doing so bakes assumptions into our program about the precise structure of the document, a structure you might want to change later. Another complicating factor is that text nodes are created even for the whitespace between nodes. The example document's `<body>` tag has not just three children (`<h1>` and two `<p>` elements), but seven: those three, plus the spaces before, after, and between them.

{{index "search problem", "href attribute", "getElementsByTagName method"}}

If we want to get the `href` attribute of the link in that document, we don't want to say something like "Get the second child of the sixth child of the document body". It'd be better if we could say "Get the first link in the document". And we can.

```{sandbox: "homepage"}
let link = document.body.getElementsByTagName("a")[0];
console.log(link.href);
```

{{index "child node"}}

All element nodes have a `getElementsByTagName` method, which collects all elements with the given tag name that are descendants (direct or indirect children) of that node and returns them as an ((array-like object)).

{{index "id attribute", "getElementById method"}}

To find a specific _single_ node, you can give it an `id` attribute and use `document.getElementById` instead.

```{lang: html}

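<p>My ostrich Gertrude:</p>
<p><img id="gertrude" src="img/ostrich.png"></p>

<script>
  let ostrich = document.getElementById("gertrude");
  console.log(ostrich.src);
</script>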

```

{{index "getElementsByClassName method", "class attribute"}}

A third, similar method is `getElementsByClassName`, which, like `getElementsByTagName`, searches through the contents of an element node and retrieves all elements that have the given string in their `class` attribute.

## Changing the document

{{index "side effect", "removeChild method", "appendChild method", "insertBefore method", [DOM, construction], [DOM, modification]}}

Almost everything about the DOM data structure can be changed. The shape of the document tree can be modified by changing parent-child relationships. Nodes have a `remove` method to remove them from their current parent node. To add a child node to an element node, we can use `appendChild`, which puts it at the end of the list of children, or `insertBefore`, which inserts the node given as the first argument before the node given as the second argument.

```{lang: html}

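<p>One</p>
<p>Two</p>
<p>Three</p>

<script>
  let paragraphs = document.body.getElementsByTagName("p");
  // Move the third paragraph in front of the first one.
  document.body.insertBefore(paragraphs[2], paragraphs[0]);
</script>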

```

A node can exist in the document in only one place. Thus, inserting paragraph _Three_ in front of paragraph _One_ will first remove it from the end of the document and then insert it at the front, resulting in _Three_/_One_/_Two_. All operations that insert a node somewhere will, as a ((side effect)), cause it to be removed from its current position (if it has one).

{{index "insertBefore method", "replaceChild method"}}

The `replaceChild` method is used to replace a child node with another one. It takes as arguments two nodes: a new node and the node to be replaced. The replaced node must be a child of the element the method is called on. Note that both `replaceChild` and `insertBefore` expect the _new_ node as their first argument.

## Creating nodes

{{index "alt attribute", "img (HTML tag)", "createTextNode method"}}

Say we want to write a script that replaces all ((image))s (`<img>` tags) in the document with the text held in their `alt` attributes, which specify an alternative textual representation of the image. This involves not only removing the images but also adding a new text node to replace them.

```{lang: html}

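<p>The <img src="img/cat.png" alt="Cat"> in the
  <img src="img/hat.png" alt="Hat">.</p>

<script>
  // Walk the live collection backward so that replacing an image
  // does not shift the images that still have to be visited.
  let images = document.body.getElementsByTagName("img");
  for (let i = images.length - 1; i >= 0; i--) {
    let image = images[i];
    if (image.alt) {
      let text = document.createTextNode(image.alt);
      image.parentNode.replaceChild(text, image);
    }
  }
</script>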

```

{{index "text node"}}

Given a string, `createTextNode` gives us a text node that we can insert into the document to make it show up on the screen.

{{index "live data structure", "getElementsByTagName method", "childNodes property"}}

The loop that goes over the images starts at the end of the list. This is necessary because the node list returned by a method like `getElementsByTagName` (or a property like `childNodes`) is _live_. That is, it is updated as the document changes. If we started from the front, removing the first image would cause the list to lose its first element so that the second time the loop repeats, where `i` is 1, it would stop because the length of the collection is now also 1.

{{index "slice method"}}

If you want a _solid_ collection of nodes, as opposed to a live one, you can convert the collection to a real array by calling `Array.from`.

```
let arrayish = {0: "one", 1: "two", length: 2};
let array = Array.from(arrayish);
console.log(array.map(s => s.toUpperCase()));
// → ["ONE", "TWO"]
```

{{index "createElement method"}}

To create ((element)) nodes, you can use the `document.createElement` method. This method takes a tag name and returns a new empty node of the given type.

{{index "Popper, Karl", [DOM, construction], "elt function"}}

{{id elt}}

The following example defines a utility `elt`, which creates an element node and treats the rest of its arguments as children to that node. This function is then used to add an attribution to a quote.

```{lang: html}
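<blockquote id="quote">
  No book can ever be finished. While working on it we learn
  just enough to find it immature the moment we turn away
  from it.
</blockquote>

<script>
  // Create an element of the given type; string arguments become text nodes.
  function elt(type, ...children) {
    let node = document.createElement(type);
    for (let child of children) {
      if (typeof child != "string") node.appendChild(child);
      else node.appendChild(document.createTextNode(child));
    }
    return node;
  }

  document.getElementById("quote").appendChild(
    elt("footer",
        elt("strong", "Karl Popper"),
        ", preface to the second edition of ",
        elt("em", "The Open Society and Its Enemies"),
        ", 1950"));
</script>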
```

{{if book

This is what the resulting document looks like:

{{figure {url: "img/blockquote.png", alt: "Rendered picture of the blockquote with attribution", width: "8cm"}}}

if}}

## Attributes

{{index "href attribute", [DOM, attributes]}}

Some element ((attribute))s, such as `href` for links, can be accessed through a property of the same name on the element's ((DOM)) object. This is the case for most commonly used standard attributes.

{{index "data attribute", "getAttribute method", "setAttribute method", attribute}}

HTML allows you to set any attribute you want on nodes. This can be useful because it allows you to store extra information in a document. To read or change custom attributes, which aren't available as regular object properties, you have to use the `getAttribute` and `setAttribute` methods.

```{lang: html}

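<p data-classified="secret">The launch code is 00000000.</p>
<p data-classified="unclassified">I have two feet.</p>

<script>
  let paras = document.body.getElementsByTagName("p");
  // Copy the live collection first, since removing nodes changes it.
  for (let para of Array.from(paras)) {
    if (para.getAttribute("data-classified") == "secret") {
      para.remove();
    }
  }
</script>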

```

It is recommended to prefix the names of such made-up attributes with `data-` to ensure they do not conflict with any other attributes.

{{index "getAttribute method", "setAttribute method", "className property", "class attribute"}}

There is a commonly used attribute, `class`, which is a ((keyword)) in the JavaScript language. For historical reasons (some old JavaScript implementations could not handle property names that matched keywords), the property used to access this attribute is called `className`. You can also access it under its real name, `"class"`, with the `getAttribute` and `setAttribute` methods.

## Layout

{{index layout, "block element", "inline element", "p (HTML tag)", "h1 (HTML tag)", "a (HTML tag)", "strong (HTML tag)"}}

You may have noticed that different types of elements are laid out differently. Some, such as paragraphs (`<p>`) or headings (`<h1>`), take up the whole width of the document and are rendered on separate lines. These are called _block_ elements. Others, such as links (`<a>`) or the `<strong>` element, are rendered on the same line with their surrounding text. Such elements are called _inline_ elements.

{{index drawing}}

For any given document, browsers are able to compute a layout, which gives each element a size and position based on its type and content. This layout is then used to actually draw the document.

{{index "border (CSS)", "offsetWidth property", "offsetHeight property", "clientWidth property", "clientHeight property", dimensions}}

The size and position of an element can be accessed from JavaScript. The `offsetWidth` and `offsetHeight` properties give you the space the element takes up in _((pixel))s_. A pixel is the basic unit of measurement in the browser. It traditionally corresponds to the smallest dot that the screen can draw, but on modern displays, which can draw _very_ small dots, that may no longer be the case, and a browser pixel may span multiple display dots.

Similarly, `clientWidth` and `clientHeight` give you the size of the space _inside_ the element, ignoring border width.

```{lang: html}

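<p style="border: 3px solid red">
  I'm boxed in
</p>

<script>
  let para = document.body.getElementsByTagName("p")[0];
  // clientHeight excludes the border; offsetHeight includes it.
  console.log("clientHeight:", para.clientHeight);
  console.log("offsetHeight:", para.offsetHeight);
</script>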

```

{{if book

Giving a paragraph a border causes a rectangle to be drawn around it.

{{figure {url: "img/boxed-in.png", alt: "Rendered picture of a paragraph with a border", width: "8cm"}}}

if}}

{{index "getBoundingClientRect method", position, "pageXOffset property", "pageYOffset property"}}

{{id boundingRect}}

The most effective way to find the precise position of an element on the screen is the `getBoundingClientRect` method. It returns an object with `top`, `bottom`, `left`, and `right` properties, indicating the pixel positions of the sides of the element relative to the upper left of the screen. If you want pixel positions relative to the whole document, you must add the current scroll position, which you can find in the `pageXOffset` and `pageYOffset` bindings.

{{index "offsetHeight property", "getBoundingClientRect method", drawing, laziness, performance, efficiency}}

Laying out a document can be quite a lot of work. In the interest of speed, browser engines do not immediately re-layout a document every time you change it but wait as long as they can before doing so. When a JavaScript program that changed the document finishes running, the browser will have to compute a new layout to draw the changed document to the screen. When a program _asks_ for the position or size of something by reading properties such as `offsetHeight` or calling `getBoundingClientRect`, providing that information also requires computing a ((layout)).

{{index "side effect", optimization, benchmark}}

A program that repeatedly alternates between reading DOM layout information and changing the DOM forces a lot of layout computations to happen and will consequently run very slowly. The following code is an example of this. It contains two different programs that build up a line of _X_ characters 2,000 pixels wide and measure the time each one takes.

```{lang: html, test: nonumbers}
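<!-- A sketch of the comparison described above. The element ids and the
     helper name `time` are illustrative assumptions. Timings depend on the
     browser and machine, so no expected output is shown. -->
<p><span id="one"></span></p>
<p><span id="two"></span></p>

<script>
  // Run an action and report roughly how long it took.
  function time(name, action) {
    let start = Date.now(); // Current time in milliseconds
    action();
    console.log(name, "took", Date.now() - start, "ms");
  }

  // Naive: reads offsetWidth after every change, forcing a layout each time.
  time("naive", () => {
    let target = document.getElementById("one");
    while (target.offsetWidth < 2000) {
      target.appendChild(document.createTextNode("X"));
    }
  });

  // Clever: measures once, then writes the whole string in a single change.
  time("clever", () => {
    let target = document.getElementById("two");
    target.appendChild(document.createTextNode("XXXXX"));
    let total = Math.ceil(2000 / (target.offsetWidth / 5));
    target.firstChild.nodeValue = "X".repeat(total);
  });
</script>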

```

## Styling

{{index "block element", "inline element", style, "strong (HTML tag)", "a (HTML tag)", underline}}

We have seen that different HTML elements are drawn differently. Some are displayed as blocks, others inline. Some add styling: `<strong>` makes its content ((bold)), and `<a>` makes it blue and underlines it.

{{index "img (HTML tag)", "default behavior", "style attribute"}}

The way an `<img>` tag shows an image or an `<a>` tag causes a link to be followed when it is clicked is strongly tied to the element type. But we can change the styling associated with an element, such as the text color or underline. Here is an example that uses the `style` property:

```{lang: html}

<p><a href=".">Normal link</a></p>
<p><a href="." style="color: green">Green link</a></p>

```

{{if book

The second link will be green instead of the default link color:

{{figure {url: "img/colored-links.png", alt: "Rendered picture of a normal blue link and a styled green link", width: "2.2cm"}}}

if}}

{{index "border (CSS)", "color (CSS)", CSS, "colon character"}}

A style attribute may contain one or more _((declaration))s_, each of which is a property (such as `color`) followed by a colon and a value (such as `green`). When there is more than one declaration, they must be separated by ((semicolon))s, as in `"color: red; border: none"`.

{{index "display (CSS)", layout}}

A lot of aspects of the document can be influenced by styling. For example, the `display` property controls whether an element is displayed as a block or an inline element.

```{lang: html}
This text is displayed <strong>inline</strong>,
<strong style="display: block">as a block</strong>, and
<strong style="display: none">not at all</strong>.
```

{{index "hidden element"}}

The `block` tag will end up on its own line, since ((block element))s are not displayed inline with the text around them. The last tag is not displayed at all: `display: none` prevents an element from showing up on the screen. This is a way to hide elements. It is often preferable to removing them from the document entirely because it makes it easy to reveal them again later.

{{if book

{{figure {url: "img/display.png", alt: "Different display styles", width: "4cm"}}}

if}}

{{index "color (CSS)", "style attribute"}}

JavaScript code can directly manipulate the style of an element through the element's `style` property. This property holds an object that has properties for all possible style properties. The values of these properties are strings, which we can write to in order to change a particular aspect of the element's style.

```{lang: html}

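<p id="para" style="color: purple">
  Nice text
</p>

<script>
  let para = document.getElementById("para");
  console.log(para.style.color);
  para.style.color = "magenta";
</script>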

```

{{index "camel case", capitalization, "hyphen character", "font-family (CSS)"}}

Some style property names contain hyphens, such as `font-family`. Because such property names are awkward to work with in JavaScript (you'd have to say `style["font-family"]`), the property names in the `style` object for such properties have their hyphens removed and the letters after them capitalized (`style.fontFamily`).

## Cascading styles

{{index "rule (CSS)", "style (HTML tag)"}}

{{indexsee "Cascading Style Sheets", CSS}}
{{indexsee "style sheet", CSS}}

The styling system for HTML is called _((CSS))_, for _Cascading Style Sheets_. A _style sheet_ is a set of rules for how to style elements in a document. It can be given inside a `<style>` tag.

```{lang: html}
<style>
  strong {
    font-style: italic;
    color: gray;
  }
</style>
<p>Now <strong>strong text</strong> is italic and gray.</p>
```

{{index "rule (CSS)", "font-weight (CSS)", overlay}}

The _((cascading))_ in the name refers to the fact that multiple such rules are combined to produce the final style for an element. In the example, the default styling for `<strong>` tags, which gives them `font-weight: bold`, is overlaid by the rule in the `<style>` tag, which adds `font-style` and `color`.

{{hint

`Math.cos` and `Math.sin` measure angles in radians, where a full circle is 2π. For a given angle, you can get the opposite angle by adding half of this, which is `Math.PI`. This can be useful for putting the hat on the opposite side of the orbit.

hint}}
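
The opposite-angle idea from the hint can be sketched in plain JavaScript, without any DOM. The `orbitPosition` helper, the center coordinates, and the radius below are assumptions chosen for illustration; the exercise itself may use different names and values.

```javascript
// Hypothetical helper: position on a circle of the given radius around
// (centerX, centerY), for an angle measured in radians.
function orbitPosition(angle, radius, centerX, centerY) {
  return {
    x: centerX + Math.cos(angle) * radius,
    y: centerY + Math.sin(angle) * radius
  };
}

const angle = Math.PI / 2;
const first = orbitPosition(angle, 100, 200, 200);
// Adding Math.PI (half of a full circle) gives the opposite side of the orbit.
const second = orbitPosition(angle + Math.PI, 100, 200, 200);

// The midpoint of two opposite points is the center of the orbit.
console.log(Math.round((first.x + second.x) / 2),
            Math.round((first.y + second.y) / 2));
// → 200 200
```

Because the two points always sit on opposite ends of a diameter, increasing `angle` over time would keep them facing each other as they move.
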