Can you explain what semantics means in HTML?
I feel like my entire life people have summed it up as "Using HTML tags correctly", "helping with accessibility and SEO", but none of this fully answers the question, does it?
I asked on Twitter how people would explain what semantics is, and apart from a few people with strong experience in accessibility, the answers were mostly similar to what I have heard throughout my career.
Tweet saying: I understand semantic HTML as using tags that bring a meaning beyond the generic div, for example. For whom? Apparently for robots. Because for humans I find it of little use.
Semantics and meaning
What is meaning for us? What is meaning for robots?
What we understand as meaning in semantics is built from three pieces of information, which I will summarize as Name, Role, and State to simplify.
Name
This is the naming data of the element. Several properties can give an element a name, such as title, name, content, and ARIA properties. The name is computed from these sources in a predefined order of priority.
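The priority order can be pictured with a small model. This is a minimal sketch, not the browser's algorithm: the NameSources shape and the computeName function are hypothetical, and the order used here (aria-labelledby, then aria-label, then the element's content, then title) is a simplification of what applies to a button.

```typescript
// Hypothetical shape: candidate name sources for a button, already read from the markup.
interface NameSources {
  labelledByText?: string; // text of the element(s) referenced by aria-labelledby
  ariaLabel?: string;      // value of the aria-label attribute
  content?: string;        // visible text content of the element
  title?: string;          // title attribute, used as a last resort
}

// Returns the first non-empty source, checked in priority order.
function computeName(sources: NameSources): string {
  const ordered = [
    sources.labelledByText,
    sources.ariaLabel,
    sources.content,
    sources.title,
  ];
  for (const candidate of ordered) {
    const trimmed = candidate ? candidate.trim() : "";
    if (trimmed !== "") return trimmed;
  }
  return ""; // no name: there is nothing to announce for this element
}

// The visible content wins over the title tooltip.
console.log(computeName({ content: "Cancel Order", title: "Cancels the current order" }));
```

In this model an aria-label would override the visible text, which is why a mismatched aria-label can make a button read differently from what is on screen.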
👤 For humans
Name and description are part of the content verbalized by screen readers. In the case of the button below, it would say:
"<role> Button. <name> Cancel Order"
Button labeled "Cancel Order" on the ClickBus website
🤖 For machines
Without a name, the element is treated as having no palpable content, and it can be ignored when the page is read.
Role
It describes the expected behavior of the element for the user, crawlers, and assistive technologies. Roles not only provide context, but also enable APIs that let all types of users interact with the defined functionality in an equivalent way.
The role="alert" attribute does not just add "context"; it:
- Creates a live-region - a mutation observer of the child element that emits an event to the user-agent
- When there is an error (e.g. form error), the message is dynamically injected into the element, emitting an event
- The event is captured by assistive technologies and announced to the user
All of this without JavaScript! (Only the part that dynamically injects the message uses JS.) This role enables a native API that ensures the user has access to the error in a way that goes beyond the purely visual (red letters with this symbol ❌).
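The flow above (content changes, an event is emitted, assistive technology announces it) can be pictured with a toy model. This is only a sketch of the idea, not how a browser implements live regions: the AlertRegion class and its onChange and setMessage methods are hypothetical names invented for illustration. On a real page, the only code you would write is the injection step, for example setting the textContent of the element that carries role="alert".

```typescript
type Announcer = (message: string) => void;

// Toy stand-in for an element with role="alert" (hypothetical, for illustration only).
class AlertRegion {
  private text = "";
  private listeners: Announcer[] = [];

  // Stands in for assistive technology listening for changes in the region.
  onChange(listener: Announcer): void {
    this.listeners.push(listener);
  }

  // Stands in for injecting the message into the element.
  setMessage(message: string): void {
    if (message === this.text) return; // no change, so nothing is emitted
    this.text = message;
    if (message !== "") {
      for (const listener of this.listeners) listener(message);
    }
  }
}

const announced: string[] = [];
const region = new AlertRegion();
region.onChange((message) => announced.push(message));

// The only step the page author performs: injecting the error message.
region.setMessage("Email is required");
console.log(announced);
```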
👤 For humans
Demonstrates the semantic description of the element.
🤖 For machines
Enables part of the API required to provide the expected experience.
State
It refers to the DOM API and the methods, getters, setters, and defaults of the element.
The disabled attribute not only changes the appearance of the element, but also changes a series of properties on the element's object:
- The interactive element is removed from keyboard navigation
- It is no longer clickable
- It does not emit click events
- It is exposed to assistive technologies as unavailable and skipped in focus navigation
And this state is propagated recursively to the descendants of the element: for example, a disabled fieldset disables the form controls inside it.
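The propagation can be pictured with a small tree model. This is a sketch under stated assumptions: ControlNode and disabledControls are hypothetical names, and the model mirrors how a disabled fieldset disables the controls nested in it while ignoring the exception the HTML specification makes for controls inside the fieldset's first legend.

```typescript
// Hypothetical, simplified view of a form's element tree.
interface ControlNode {
  id: string;
  disabled?: boolean;
  children?: ControlNode[];
}

// Returns the ids of every node that ends up disabled,
// either through its own attribute or through a disabled ancestor.
function disabledControls(node: ControlNode, ancestorDisabled: boolean = false): string[] {
  const disabled = ancestorDisabled || node.disabled === true;
  const result: string[] = disabled ? [node.id] : [];
  for (const child of node.children || []) {
    result.push(...disabledControls(child, disabled));
  }
  return result;
}

const form: ControlNode = {
  id: "checkout",
  children: [
    {
      id: "shipping-fieldset",
      disabled: true, // only this node carries the attribute
      children: [{ id: "street" }, { id: "city" }],
    },
    { id: "submit" },
  ],
};

console.log(disabledControls(form));
```

Only the fieldset carries the attribute, yet the two inputs inside it are reported as disabled too, while the submit button outside it is not.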
In the following button example, the same element gets different attributes, methods, and behaviors just by changing its context:
Once in the context of a form, methods related to form submission, and even the creation of a FormData, are enabled by the addition of the implicit type="submit".
Code extracted from Chrome's rendering engine, Blink (link to the source code).
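The implicit type can be sketched in a few lines. This is not the Blink code referenced above; it is a simplified model with hypothetical function names (resolveButtonType, activationEffect), based on the rule in the HTML specification that a missing or unrecognized type on a button falls back to submit. It ignores other button features and only looks at whether the button belongs to a form.

```typescript
type ButtonType = "submit" | "reset" | "button";

// A missing or unrecognized type attribute falls back to "submit".
// The comparison is case-insensitive, as it is for the real attribute.
function resolveButtonType(typeAttr: string | null): ButtonType {
  const value = (typeAttr || "").toLowerCase();
  if (value === "reset") return "reset";
  if (value === "button") return "button";
  return "submit";
}

// What activating the button does, depending on its context (simplified).
function activationEffect(typeAttr: string | null, insideForm: boolean): string {
  if (!insideForm) return "nothing";
  const type = resolveButtonType(typeAttr);
  if (type === "submit") return "submit form";
  if (type === "reset") return "reset form";
  return "nothing";
}

// Same markup, different behavior: only the context changes.
console.log(activationEffect(null, false));
console.log(activationEffect(null, true));
```

This is also why a plain button placed inside a form submits it unless type="button" is written explicitly.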
👤 For humans
Provides the expected behavior for various types of user interaction.
🤖 For machines
Provides various forms of interaction for different user agents and assistive technologies.
What about SEO and content?
Does semantics help with that too?
Tweet saying: Semantics means helping the machine interpret the content, directing context, information priority, etc. Semantic HTML would be if a robot could read your web page knowing where the content is, what the content is about, what is not important... etc.
Semantics and content
If we think of semantics as attaching meaning to something, then the order, priority, and relationship of content dramatically change what that content means.
Creating this hierarchy is the role of another API: the document outline.
Outlines divide content into sections, like the table of contents of a book or a college paper.
The numbering that defines the hierarchy of headings (h1 to h6) works like the numerals in a table of contents that mark headers, titles, and subtitles.
A summary of a paper in MLA standard with headers, titles, and subtitles.
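The parallel between heading levels and table-of-contents numerals can be made concrete. The sketch below is an illustration, not a browser API: the Heading shape and the numberHeadings function are hypothetical, the sample levels assigned to this article's headings are an assumption, and the code assumes heading levels are not skipped (no h3 directly under an h1).

```typescript
// Hypothetical input: headings in document order, with their level (1 for h1 ... 6 for h6).
interface Heading {
  level: number;
  text: string;
}

interface OutlineEntry {
  number: string; // table-of-contents style numeral, e.g. "1.2"
  text: string;
}

function numberHeadings(headings: Heading[]): OutlineEntry[] {
  const levels: number[] = [];   // heading levels of the currently open sections
  const counters: number[] = []; // running count for each open section depth
  const entries: OutlineEntry[] = [];

  for (const heading of headings) {
    // Close every open section that is deeper than the new heading.
    while (levels.length > 0 && levels[levels.length - 1] > heading.level) {
      levels.pop();
      counters.pop();
    }
    if (levels.length > 0 && levels[levels.length - 1] === heading.level) {
      counters[counters.length - 1] += 1; // sibling section at the same depth
    } else {
      levels.push(heading.level);         // first subsection at a new depth
      counters.push(1);
    }
    entries.push({ number: counters.join("."), text: heading.text });
  }
  return entries;
}

const outline = numberHeadings([
  { level: 1, text: "Semantics and meaning" },
  { level: 2, text: "Name" },
  { level: 2, text: "Role" },
  { level: 2, text: "State" },
  { level: 1, text: "Semantics and content" },
]);

for (const entry of outline) console.log(entry.number + " " + entry.text);
```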
And it's not just headings that have this role: <section>, <aside>, <article>, and <nav> are sectioning content elements and create a type of outline. These elements can have <header> and <footer> elements whose content will be associated with their section.
Do you understand now how this is so relevant to accessibility? Accessibility is not a favor, nor something detached from HTML, CSS, and JS, but rather using the APIs that these technologies offer to provide an equivalent experience to all users.
In the words of Sandyara Peres, an accessibility expert, semantics are:
In the tweet: In my classes/lectures, I say that: it's the identification of the purpose of elements, influencing their behavior, providing a better experience in terms of: Accessibility; Maintainability & Compatibility.
Semantics are not just "using the right tags", as tags alone do not cover all types of components and use cases that the web can offer.
Semantics are a collection of states, attributes, and methods that enable various ways to access and understand content.
To wrap up
No one has ever taught me semantics in this way; it always seemed merely moralistic, like "writing HTML correctly", "using the right tags".
Perhaps now we can see it as:
"Enabling the appropriate tools for content interpretation."
HTML as a markup language should not be seen solely as a vehicle for implementing design (CSS) and functionality (JS).
Design and functionality are just stars that orbit around the content. Valuing the content is valuing the users.