blog.karenying.com

Riding the Struggle Bus that was the Table of Contents

All aboard 🚌 + life update

Feb 16, 2021•7 min read

Life update

Hi hello, I’m alive!

At the beginning of January, I started working for Gather Town. On 1/21, I returned to campus after being away for 10 months. After a week of quarantine, I started my last semester at Princeton, while still working part time for Gather. Read: been vvv busy and this blog has been put on the back burner. But I finally made time to finish a feature that has been 4 months in the making.

If you’re on desktop and you look right ➡️ , you’ll see a shiny new feature: a table of contents that links to headings and dynamically indicates your location as you scroll!!

ToC initial stab

I had wanted to implement a table of contents since the inception of this blog site. On 10/29, I put up a draft PR with this attempt. Looking back, this was definitely not the right way to go about it.

In that PR, I passed in the HTML of the post content as a string into my ToC component. I created an entirely new document element with it, and grabbed all the heading elements… a big performance yike indeed. Well, it worked (hadn’t implemented the scroll indicators yet) but I knew I had to find a way to generate ToC at build time. Defeated, I gave up on ToC until mid January.

ToC take two

On 1/17, my friend and former COS 426: Computer Graphics TA, Reilly Bova, showed me his newly revamped COS 426 website. He had implemented the exact ToC I wanted, so I poked around the source code and found the key to generating the ToC at build time. You can query the headings inside the post-template markdown GraphQL query:

markdownRemark(fields: { slug: { eq: $slug } }) {
  html
  ...
  headings {
    depth
    value
  }
}

depth is the heading level and value is the heading’s title text. And now I can prop-drill the headings data into my TableOfContents component.

With this out of the way, I ran into a new problem: how do I generate the relative link to the heading? All I have is the heading title, not the slug. To answer this, I looked at the source code for gatsby-remark-autolink-headers. Most Gatsby sites (including mine) use this plugin to generate the heading links inside content. Thus it made sense to use their exact same algo to generate the slugs so that my ToC links would match.

It turns out they use a package called github-slugger, which takes in text and returns a kebab-case string (the same way GitHub generates anchor links for Markdown headings).

Armed with these two implementation details, I could finally code the ToC. Along the way I faced many challenges / bugs. There were A LOT:

Too deep headings

When I write blog posts, I never use level 1 headings (#) because I think the text looks too big and the title of the post serves this purpose. So I use headings 2 - 4. But for the ToC I only wanted headings of level 2 and 3. The solution was obvious yet I had some gross code that checked heading depth.

Solution: use a JS filter function

<TableOfContents headings={headings.filter((h) => h.depth <= 3)} />
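To make the filter concrete, here’s a tiny sketch with made-up heading data in the same shape the GraphQL query returns. The titles are hypothetical, not from a real post:

```javascript
// Hypothetical headings, shaped like the `headings` field from the query
const headings = [
  { depth: 2, value: 'Life update' },
  { depth: 3, value: 'ToC take two' },
  { depth: 4, value: 'Some deeply nested detail' },
];

// Keep only level 2 and 3 headings for the ToC
const tocHeadings = headings.filter((h) => h.depth <= 3);

console.log(tocHeadings.map((h) => h.value));
// [ 'Life update', 'ToC take two' ]
```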

Never resetting github-slugger

The slugger keeps track of previous values, appending -1, -2, and so on each time the same value is slugged again:

const GithubSlugger = require('github-slugger');
const slugger = new GithubSlugger();

slugger.slug('foo');
// returns 'foo'

slugger.slug('foo');
// returns 'foo-1'

It keeps track until slugger.reset() is called.

I didn’t know about this behavior. Thus every time hot-reloading kicked in while developing, the generated slugs would be suffixed with high af numbers. It took me a while to figure out why I had #2-create-a-new-private-repo-21.

Solution: call slugger.reset()!

Resetting github-slugger too much

Then I had the opposite problem. Turns out keeping track of prev vals is important when you have headings of the same value in the same post. My Blogmas post had “Overall thoughts” multiple times as a heading. Thus, the slugger needed to generate overall-thoughts, overall-thoughts-1 etc.

Solution: Only call slugger.reset() before all the slugs for a post are generated
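Here’s a rough sketch of the “reset once per post” idea. To keep it runnable without installing anything, it uses TinySlugger, a simplified stand-in I made up for illustration. It is not github-slugger’s real code and only imitates the slug / reset behavior described above. The heading titles are hypothetical too:

```javascript
// Simplified stand-in for github-slugger (NOT the library's real code):
// lowercases, hyphenates spaces, and suffixes repeats with -1, -2, ...
class TinySlugger {
  constructor() {
    this.seen = new Map();
  }

  slug(text) {
    const base = text
      .trim()
      .toLowerCase()
      .replace(/[^a-z0-9 -]/g, '')
      .replace(/ /g, '-');
    const count = this.seen.get(base) || 0;
    this.seen.set(base, count + 1);
    return count === 0 ? base : `${base}-${count}`;
  }

  reset() {
    this.seen.clear();
  }
}

const slugger = new TinySlugger();

// Reset once per post, then slug every heading in document order
const slugsForPost = (titles) => {
  slugger.reset();
  return titles.map((title) => slugger.slug(title));
};

// Hypothetical post with a repeated heading
const post = ['Day 1', 'Overall thoughts', 'Day 2', 'Overall thoughts'];

console.log(slugsForPost(post));
// [ 'day-1', 'overall-thoughts', 'day-2', 'overall-thoughts-1' ]

// A second pass (think hot reload) starts fresh, so suffixes don't keep climbing
console.log(slugsForPost(post));
// [ 'day-1', 'overall-thoughts', 'day-2', 'overall-thoughts-1' ]
```

Resetting inside the loop instead would give the duplicate heading the same slug twice, and never resetting would make the suffixes grow on every pass.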

Code snippets in heading titles

If you include a formatted code snippet in your heading, the GraphQL query sends back the HTML:

[Image: code snippet in heading]

I was too lazy to manipulate the string sooo

Workaround: Don’t use formatted code snippets in headings lol
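If you’re less lazy than me, one option would be to strip the tags before rendering the title. A rough sketch, assuming the heading value comes back with simple inline tags like <code>. stripTags is a hypothetical helper, not part of the final component, and a regex like this is not a real HTML parser:

```javascript
// Hypothetical helper: drop anything that looks like an HTML tag
const stripTags = (html) => html.replace(/<[^>]*>/g, '');

console.log(stripTags('Calling <code>slugger.reset()</code> too much'));
// Calling slugger.reset() too much
```

Escaped entities (like &lt;) would still need decoding, and you’d want to double check that the slug generated from the cleaned title matches the id on the actual heading.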

Over optimizing for scroll direction

My first attempt at a scroll algorithm was overly complicated and resulted in so many rerenders.

In order to know which heading to highlight as active in the ToC, we keep track of all the vertical positions of the headings and choose the “closest” one.

I thought it made sense to calculate if the user was scrolling up or down. And depending on the direction, only check headings that are before or after the current active heading.

This resulted in gross code that was actually very inefficient. I couldn’t figure out how to debug the rerendering situation that was going on.

Solution: Don’t optimize 🙂 I axed this approach for the naive solution — just loop through all the headings, regardless of direction. The “optimization” based on scroll direction was prob a negligible improvement anyways.

Checking url to determine active heading

This is probably the dumbest mistake yet. I thought this was a good idea:

// returns the node index based on window.location.href
const getUrlPos = () => {
  slugs.reset();
  const headingSlugs = headings.map((h) => toSlug(h.value));

  return headingSlugs.findIndex((h) => window.location.href.includes(h));
};

// LOL WHY
useEffect(() => {
  setCurrNode(getUrlPos());
}, [window.location.href]);

Since clicking a heading link changes the URL, I was checking the URL in order to know which heading is active…

Solution: Adding onClick={() => setCurrNode(i)} to the heading link in ToC

“window” is not available during server side rendering

This is a common Gatsby build problem. Browser globals like window and document are undefined at build time, because Gatsby renders the pages in Node rather than in a browser.

Solution: Add a check like typeof window !== 'undefined'
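A minimal sketch of that guard. isBrowser and getHash are hypothetical helper names for illustration, not something Gatsby provides:

```javascript
// True only in a browser; false when the code runs in Node (e.g. during a build)
const isBrowser = () => typeof window !== 'undefined';

// Hypothetical helper: read the URL hash without touching window on the server
const getHash = () => (isBrowser() ? window.location.hash : '');

// Empty string in Node; in a browser, whatever hash is in the URL
console.log(getHash());
```

The typeof check matters here: referencing an undeclared window directly throws a ReferenceError, while typeof on it safely returns 'undefined'.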

Too much top buffer

When looking for the closest heading, I would check if the current position falls between the heading position and a top buffer of 100px. Turns out 100px was too much and I would get this weird behavior where I’d click a heading in the ToC, but the one below it would be active. This was because the space between the two headings was less than 100px.

Solution: Use 50px
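To see why the smaller buffer helps, here’s a sketch that pulls the check from my scroll handler out into a pure function. The layout and the 1px overshoot after clicking are assumptions I picked for illustration:

```javascript
// Same check the scroll handler uses: the first heading whose
// (offset - buffer, offset] window contains the scroll position wins
const findActive = (offsets, pos, buffer) =>
  offsets.findIndex((offset) => pos > offset - buffer && pos <= offset);

// Hypothetical layout: two headings only 80px apart
const offsets = [1000, 1080];

// Assumption: after clicking the first ToC link, the page lands 1px past the heading
const pos = 1001;

console.log(findActive(offsets, pos, 100)); // 1, the heading below steals the highlight
console.log(findActive(offsets, pos, 50)); // -1, no match, so the clicked heading stays active
```

With a 100px buffer, the second heading’s window starts at 980 and swallows the position right after the first heading. With 50px it starts at 1030, so nothing matches and the active heading set by the click is left alone.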

Final / current implementation

Normally, I’d walk through every step of the implementation. But I’m very much done with this feature so here’s a commented version of TableOfContents.jsx glhf:

import React, { useEffect, useState, useRef } from 'react';
const slugs = require(`github-slugger`)();

import styles from './TableOfContents.module.scss';

const TOP_BUFFER = 50; // Buffer before every heading
const MIN_SCREEN_SIZE = 1100;

// returns the slug of the passed in value
const toSlug = (value) => {
  return slugs.slug(value, false);
};

const TableOfContents = ({ headings }) => {
  // returns the node index based on window.location.href
  const getUrlPos = () => {
    slugs.reset();
    const headingSlugs = headings.map((h) => toSlug(h.value));

    return headingSlugs.findIndex((h) => window.location.href.includes(h));
  };

  const [currNode, setCurrNode] = useState(
    typeof window !== 'undefined' ? getUrlPos() : -1
  );
  const headerOffetsRef = useRef(); // Array of offsets of headings

  useEffect(() => {
    if (window.screen.width < MIN_SCREEN_SIZE) {
      return;
    }

    slugs.reset();

    // Calculate all the offsets of the headings
    headerOffetsRef.current = headings.map(({ value }) => {
      const element = document.getElementById(toSlug(value));

      return (element && element.offsetTop) || 0;
    });

    const onScroll = () => {
      const currPos = window.pageYOffset;
      const len = headerOffetsRef.current.length;
      const firstHeader = headerOffetsRef.current[0];

      // Iterate through all the heading positions to find the closest one
      for (let i = 0; i < len; i++) {
        // If less than the first heading, then don't set any as active
        if (currPos < firstHeader - TOP_BUFFER) {
          setCurrNode(-1);
          break;
        }

        const currHeader = headerOffetsRef.current[i];

        if (currPos > currHeader - TOP_BUFFER && currPos <= currHeader) {
          setCurrNode(i);
          break; // Return early if found
        }
      }
    };

    window.addEventListener('scroll', onScroll);

    return () => {
      window.removeEventListener('scroll', onScroll);
    };
  }, []);

  // Actually render the headings
  const renderHeadings = () => {
    slugs.reset();

    return headings.map((heading, i) => {
      const { depth, value } = heading;
      const slug = toSlug(value);
      const active = i === currNode ? styles['toc__content-active'] : '';

      return (
        <a
          className={`${styles[`toc__content-h${depth}`]} ${active}`}
          href={`#${slug}`}
          key={slug}
          onClick={() => setCurrNode(i)}
        >
          {value}
        </a>
      );
    });
  };

  return (
    <div className={styles['toc']}>
      <div className={styles['toc__title']}>contents</div>
      <div className={styles['toc__content']}>
        <div className={styles['toc__content-overlay-top']} />
        {renderHeadings()}
        <div className={styles['toc__content-overlay-bottom']} />
      </div>
    </div>
  );
};

export default TableOfContents;

Conclusion

Good job @me. Disclaimer: the ToC might be slightly buggy but it’s usable enough IMO. Leave me a GitHub Issue if it sucks.

I know I promised to post every month of 2021… yeah about that… I set those goals at a time when blogging was literally the most interesting part of my life. Now that I’m back on campus, between staying on track to graduate (passing my classes), work, and seeing my friendos, I’m stretched pretty thin. I can only hope that I put up a post next month. Until then (or whenever I post next), byeeeeee 👋🏼