Gutenberg in Headless WordPress: Render Blocks as HTML

9 minute read
Kellen Mace
Sr. Staff Developer Advocate

The Gutenberg block editor is one of the key features driving WordPress forward. It provides a rich content authoring and editing experience, and that experience remains available when WordPress is used as a headless CMS.

Rendering Gutenberg content in a headless WordPress site isn’t without its share of challenges, however. Jason Bahl’s excellent Gutenberg and Decoupled Applications blog post gives an exhaustive overview of those challenges.

As Jason explains, there is currently no complete server-side registry of blocks. The block.json file for each core block provides some data, but not all. Gutenberg blocks are registered in JavaScript, which is not executed or understood by the WordPress server. As a result, it is not possible to query WordPress to get a list of all possible blocks and their data, and add that to the REST API / WPGraphQL schema. For a headless site, this means that you can’t fire off a REST API or WPGraphQL query to get all the blocks for a given post.

Generally speaking, developers want to be able to execute a query to get the Gutenberg blocks for a given post and get structured data back in JSON format. They can then loop over those blocks in their frontend application and render React/Vue/Svelte/other components. Without a complete server-side registry of blocks, however, we don’t currently have that luxury. So where does that leave us? What are our options?

In this two-part blog post series, I outline what I consider to be the two primary, viable options for rendering Gutenberg blocks in headless WordPress sites:

  1. Render Gutenberg Blocks as HTML
  2. Use WPGraphQL Gutenberg

We’ll discuss option #1 in this post.

Before diving in, be sure to review the pros and cons of this approach listed in the Gutenberg and Decoupled Applications post to determine if it’s right for your project.

An example Next.js app that uses the method described below to render Gutenberg blocks content is available here:

https://github.com/kellenmace/gutenberg-demo-blocks-as-html

Render Gutenberg Blocks as HTML

When core Gutenberg blocks are saved to the database, the post content looks something like this:

<!-- wp:paragraph {"dropCap":true,"style":{"elements":{"link":{"color":{"text":"var:preset|color|orange"}}},"typography":{"lineHeight":"1.8"}},"backgroundColor":"gray","textColor":"blue","fontSize":"small"} -->

<p class="has-drop-cap has-blue-color has-gray-background-color has-text-color has-background has-link-color has-small-font-size" style="line-height:1.8">Here is some text.</p>

<!-- /wp:paragraph -->


<!-- wp:image {"id":12,"width":225,"height":150,"sizeSlug":"medium","linkDestination":"none","className":"is-style-rounded path-in-woods"} -->

<figure class="wp-block-image size-medium is-resized is-style-rounded path-in-woods" id="path-in-woods"><img src="http://gutenbergdemo.local/wp-content/uploads/2021/08/e1d8a0cd-c00c-3b96-b50e-18f5bd502fb0-300x200.jpg" alt="Photo of a path through the woods" class="wp-image-12" width="225" height="150" title="Here is a title"/></figure>

<!-- /wp:image -->

You can see here that we have two blocks: Paragraph and Image. The markup is there for each, as well as additional JSON data about the block stored inside of HTML comments.
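
To make the stored format more concrete, here is a minimal JavaScript sketch that pulls the block names and JSON attributes out of those comment delimiters. The function name listBlockDelimiters and the regular expression are assumptions made for illustration; this is not WordPress’ own parser (on the server, WordPress exposes parse_blocks() in PHP for that job), and it only reads opening delimiters without tracking nesting or inner HTML.

```javascript
// Sketch only: a simplified scan for block opening comments such as
// <!-- wp:paragraph {"dropCap":true} -->. This is not WordPress' real parser.
function listBlockDelimiters(postContent) {
  // "<!-- wp:name", then optional JSON attributes, then "-->" or "/-->".
  // Closing comments ("<!-- /wp:name -->") are intentionally not matched.
  const pattern =
    /<!--\s+wp:([a-z][a-z0-9_-]*(?:\/[a-z][a-z0-9_-]*)?)\s+(\{[\s\S]*?\}\s+)?\/?-->/g;

  const blocks = [];
  for (const match of postContent.matchAll(pattern)) {
    // Core blocks are saved without a namespace, so add "core/" for clarity.
    const name = match[1].includes("/") ? match[1] : `core/${match[1]}`;
    const attrs = match[2] ? JSON.parse(match[2]) : {};
    blocks.push({ name, attrs });
  }
  return blocks;
}

// Usage with a shortened version of the saved content shown above.
const savedContent = [
  '<!-- wp:paragraph {"dropCap":true,"fontSize":"small"} -->',
  "<p>Here is some text.</p>",
  "<!-- /wp:paragraph -->",
  '<!-- wp:image {"id":12,"sizeSlug":"medium"} -->',
  "<figure></figure>",
  "<!-- /wp:image -->",
].join("\n");

console.log(listBlockDelimiters(savedContent));
```

A scan like this is only useful for inspection or debugging. For rendering, the approach in this post relies on the HTML that WordPress has already generated.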

When you request a post’s content via the REST API or WPGraphQL, WordPress gets that data from the database and runs it through a few filters that parse the blocks and generate the fully rendered HTML that looks like this:

<p class="wp-elements-612777c487d3f has-drop-cap has-blue-color has-gray-background-color has-text-color
  has-background has-link-color has-small-font-size" style="line-height:1.8">Here is some text.</p>
<style>
  .wp-elements-612777c487d3f a {
    color: var(--wp--preset--color--orange) !important;
  }
</style>

<figure class="wp-block-image size-medium is-resized is-style-rounded path-in-woods"
  id="path-in-woods"><img loading="lazy"
    src="http://gutenbergdemo.local/wp-content/uploads/2021/08/e1d8a0cd-c00c-3b96-b50e-18f5bd502fb0-300x200.jpg"
    alt="Photo of a path through the woods" class="wp-image-12" width="225" height="150" title="Here is a
    title"
    srcset="http://gutenbergdemo.local/wp-content/uploads/2021/08/e1d8a0cd-c00c-3b96-b50e-18f5bd502fb0-300x200.jpg
    300w, http://gutenbergdemo.local/wp-content/uploads/2021/08/e1d8a0cd-c00c-3b96-b50e-18f5bd502fb0-1024x682.jpg 1024w,
    http://gutenbergdemo.local/wp-content/uploads/2021/08/e1d8a0cd-c00c-3b96-b50e-18f5bd502fb0-768x512.jpg 768w,
    http://gutenbergdemo.local/wp-content/uploads/2021/08/e1d8a0cd-c00c-3b96-b50e-18f5bd502fb0.jpg 1357w"
    sizes="(max-width: 225px) 100vw, 225px" />
</figure>

So the most straightforward approach is just to render this HTML to the page in your decoupled frontend application. In a React app for example, you can do so using dangerouslySetInnerHTML, like this:

export default function SinglePost({ post }) {
  const { title, content } = post;

  return (
    <article className="blog-post">
      <h1>{title}</h1>
      <div dangerouslySetInnerHTML={{ __html: content }} />
    </article>
  );
}

Fix Internal Links for Headless Gutenberg

Using this method, you’ll notice that internal links still point to the domain where your WordPress backend lives. You can fix that with a PHP filter function like this:

/**
 * Modify internal link URLs to point to the decoupled frontend app.
 *
 * @param string $content Post content.
 *
 * @return string Post content, with internal link URLs replaced.
 */
function replace_headless_content_link_urls(string $content): string
{
    if (!is_graphql_request() && !defined('REST_REQUEST')) {
        return $content;
    }

    // TODO: Get this value from an environment variable or the database.
    $frontend_app_url = 'http://localhost:3000';
    $site_url         = site_url();

    return str_replace('href="' . $site_url, 'href="' . $frontend_app_url, $content);
}
add_filter('the_content', 'replace_headless_content_link_urls');

A WordPress plugin that contains this code is available here.

The if (!is_graphql_request() && !defined('REST_REQUEST')) guard clause ensures that this URL replacement code only runs during WPGraphQL or REST API requests. This way, if you have any traditional, server-rendered pages, their post content will be unaffected.

You can see that $frontend_app_url = 'http://localhost:3000'; is hardcoded here. Change this so that the frontend app URL is pulled either from an environment variable or from a value in the database. Then you’ll be able to have it set to a different URL depending on the environment (development/staging/production).

With this function in place, an internal link pointing to https://my-wp-backend.local/blog/hello-world in the post content will be rewritten to http://localhost:3000/blog/hello-world, for example. So make sure that your frontend app’s routing is set up properly to accommodate that.

An alternative approach you could take here would be to remove the domain, turning the links into relative URLs, such as /blog/hello-world. If you go that route, be careful to account for all possible URL permutations: those that contain anchor links or query string parameters, those that point to the homepage (/), and so on.
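
As a rough illustration of the relative-URL alternative, the sketch below handles those permutations on the frontend with JavaScript’s built-in URL class. The function name toRelativeUrl and its parameters are hypothetical, and the sketch assumes WordPress is served from the root of its domain rather than from a subdirectory.

```javascript
// Sketch only: convert an absolute internal URL to a relative one.
// `toRelativeUrl`, `href` and `siteUrl` are hypothetical names for illustration.
function toRelativeUrl(href, siteUrl) {
  let link;
  let site;
  try {
    link = new URL(href);
    site = new URL(siteUrl);
  } catch {
    // Not an absolute URL (already relative, or malformed): leave it untouched.
    return href;
  }

  // Links to other origins (external sites, mailto:, etc.) keep their full URL.
  if (link.origin !== site.origin) {
    return href;
  }

  // pathname is always at least "/", so a link to the homepage becomes "/".
  // search and hash carry any query string parameters and anchor.
  return link.pathname + link.search + link.hash;
}

// Usage
const backend = "https://my-wp-backend.local";
console.log(toRelativeUrl("https://my-wp-backend.local/blog/hello-world", backend));
console.log(toRelativeUrl("https://my-wp-backend.local/blog/?page=2#comments", backend));
console.log(toRelativeUrl("https://my-wp-backend.local", backend));
```

Parsing the URL instead of replacing strings means anchors, query strings and the bare homepage URL are all handled by the same code path.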

Style Blocks

Just like on traditional, monolithic WordPress sites, you need to write styles for every type of block. This way, regardless of which blocks are present for a given post, you have your bases covered.

WordPress has an npm package that includes the base styles it uses to style core blocks. You can install it with npm install @wordpress/block-library, then import the stylesheets into your decoupled frontend app like this:

import "@wordpress/block-library/build-style/common.css"
import "@wordpress/block-library/build-style/style.css"
import "@wordpress/block-library/build-style/theme.css"

Once those base styles have been applied to the blocks, you can customize the look and feel of the blocks further by writing custom styles. To see an example, you can reference the custom block styles that WordPress’ default “twentytwentyone” theme includes, here: https://github.com/WordPress/twentytwentyone/tree/trunk/assets/sass/05-blocks

Use a Parser to Convert Some Blocks to Components

Taking the HTML for many types of blocks (Paragraph, List, etc.) and rendering it directly to the page works great. For others, you may need to render an actual React/Vue/Svelte component instead for various reasons. Thankfully, a solution to that exists in the form of a parser.

Example: Convert Anchor tags to Link Components

In this section, we’ll look at an example of when and how using a parser to replace anchor tags (<a>) with your JS framework’s Link component may be advantageous.

In the “Fix Internal Links” section above, I showed how to filter the post content to replace the domain of internal links with the domain of your decoupled JavaScript app. For single-page app (SPA) frameworks, however, that isn’t quite enough. Although the internal links now point to the correct URL, they’re still just plain ol’ anchor tags (<a>). That means that when the user clicks one, a full-page reload will be triggered rather than a route change using the SPA framework’s router. Let’s see how we can fix that in a Next.js app and turn them into Link components instead.

First, we’ll modify the PHP function we saw earlier like this:

/**
 * Modify internal link URLs to point to the decoupled frontend app.
 *
 * @param string $content Post content.
 *
 * @return string Post content, with internal link URLs replaced.
 */
function replace_headless_content_link_urls(string $content): string
{
    if (!is_graphql_request() && !defined('REST_REQUEST')) {
        return $content;
    }

    // TODO: Get this value from an environment variable or the database.
    $frontend_app_url = 'http://localhost:3000';
    $site_url         = site_url();

    return str_replace('href="' . $site_url, 'data-internal-link="true" href="' . $frontend_app_url, $content);
}
add_filter('the_content', 'replace_headless_content_link_urls');

This is identical to the code snippet in the “Fix Internal Links” section above, except that we’re adding a data-internal-link data attribute to our internal links with this bit of code:

data-internal-link="true"

This snippet makes it easy for us to identify internal links in our frontend JS app when the markup is parsed.

You may be wondering why we can’t simply look for anchor tags whose href starts with “http://localhost:3000” to identify which are internal links. Remember, though, that the URL of the frontend app will change, or be difficult to access, depending on the environment (development/staging/production or server/client). So by using this data attribute approach, we can rest assured that we’ll be able to easily identify internal links across all environments.

With that in place, we can implement the parser.

First, we’ll run npm install html-react-parser to install the html-react-parser library we’ll use.

In the Next.js app, we’ll create this /lib/parser.js file:

import parse, { domToReact } from "html-react-parser";
import Link from "next/link";

export default function parseHtml(html) {
  const options = {
    replace: ({ name, attribs, children }) => {
      // Convert internal links to Next.js Link components.
      const isInternalLink =
        name === "a" && attribs["data-internal-link"] === "true";

      if (isInternalLink) {
        return (
          <Link href={attribs.href}>
            <a {...attribs}>{domToReact(children, options)}</a>
          </Link>
        );
      }
    },
  };

  return parse(html, options);
}

The parse() function that html-react-parser provides will parse the string of HTML into nodes. By passing in the options object with a replace() callback function inside, we tell the parser that when it encounters an anchor tag with a data-internal-link data attribute of true (an internal link), it should replace it with a Next.js Link component.

Now we can make use of this new parseHtml() function in our single blog post React component, like this:

import parseHtml from "../lib/parser";

export default function SinglePost({ post }) {
  const { title, content } = post;

  return (
    <article className="blog-post">
      <h1>{title}</h1>
      <div>{parseHtml(content)}</div>
    </article>
  );
}

As a result of this work, when site visitors click on an internal link in a blog post, they will experience an instantaneous route change via Next.js’ router, with no more full-page reload.

Other Uses

Converting internal anchor tags to Link components, as shown above, is just one use case for a parser. You may have many others.
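
One possible way to organize several replacements is to keep the decision about which nodes to swap in a small, framework-free helper that a replace() callback could consult. The sketch below is only an illustration: pickReplacement and the labels it returns are hypothetical, and the image rule simply assumes the wp-image-* class seen in the rendered markup earlier in this post.

```javascript
// Sketch only: decide which component (if any) should replace a parsed node.
// Nodes are assumed to have the { name, attribs } shape that the replace()
// callback receives; the returned labels are hypothetical.
function pickReplacement({ name, attribs = {} }) {
  const classes = (attribs.class || "").split(/\s+/);

  // Internal links flagged by the PHP filter.
  if (name === "a" && attribs["data-internal-link"] === "true") {
    return "Link";
  }

  // Core image blocks give the <img> a class such as "wp-image-12".
  if (name === "img" && classes.some((c) => c.startsWith("wp-image-"))) {
    return "Image";
  }

  // Anything else is left as plain HTML.
  return null;
}

// Usage
console.log(pickReplacement({ name: "img", attribs: { class: "wp-image-12" } }));
console.log(pickReplacement({ name: "p", attribs: {} }));
```

Inside replace(), each label would then map to whichever component your framework provides for that purpose.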

Trade-off: Control vs. Ease of Implementation

I view this approach as the easiest to implement. Working with already-rendered HTML and swapping out some of the nodes with custom components using a parser saves a lot of development time. You don’t have to query for and write components and styles from scratch for each individual type of block. If this approach suits the needs of your project, I would seriously consider it.

This ease of implementation comes at a cost, however: less control. If you need to be able to query for the individual attributes for every type of block, then render them using custom components and custom markup/JSX, consider the WPGraphQL Gutenberg approach discussed in post #2 of this series instead.

As I mentioned at the top of this post, be sure to also review the pros and cons of this approach listed in the excellent Gutenberg and Decoupled Applications post on the WPGraphQL blog to determine if it’s right for your project.

Wrapping Up

I hope this post gave you a good sense of what rendering Gutenberg blocks as HTML looks like in practice. I also hope it’s a helpful piece to reference if you choose this approach in your own Gutenberg headless WordPress projects.

Do you have any questions about this method of rendering Gutenberg blocks content? Please reach out to let us know!