Astro Content Collections: The Best Way to Manage Blog and Portfolio Content

Every content-driven site eventually hits the same problem: content management becomes a mess. Markdown files with inconsistent frontmatter. Missing fields that break the build. Dates in three different formats. Images that may or may not exist. And no way to know something is wrong until a user reports a broken page.

Astro Content Collections solve this. They give you typed, validated, queryable content with the simplicity of local Markdown files. No CMS, no database, no API calls. Just files on disk with a schema enforcing correctness at build time.

We use Content Collections for the Threshline website you are reading right now — both the blog and the portfolio. Every post, every project page, every piece of content goes through a schema before it hits the browser. Here is how we set it up and why it works so well.

What Content Collections actually are

A Content Collection in Astro is a directory of content files (Markdown, MDX, JSON, YAML) that share a common schema. Think of it like a strongly-typed folder. Every file in the folder must conform to the schema you define, and Astro validates this at build time.

The structure looks like this:

src/content/
  blog/
    building-a-saas-platform.md
    astro-vs-nextjs-2026.md
    color-system-product-design.md
  portfolio/
    mindhyv.md
    trackelio.md
    vincelio.md
    lancerspace.md
  config.ts

The config.ts file is where you define schemas for each collection. The blog/ and portfolio/ directories are the collections themselves. Every .md file in blog/ is validated against the blog schema. Every .md file in portfolio/ is validated against the portfolio schema.

Defining schemas with Zod

Astro uses Zod for schema validation, which means you get full TypeScript type inference for free. Here is the schema we use for this blog:

// src/content/config.ts
import { defineCollection, z } from 'astro:content';

const blog = defineCollection({
  type: 'content',
  schema: z.object({
    title: z.string().max(100),
    description: z.string().max(160),
    pubDate: z.coerce.date(),
    updatedDate: z.coerce.date().optional(),
    author: z.string().default('Threshline'),
    tags: z.array(z.string()).min(1).max(5),
    draft: z.boolean().default(false),
    image: z.string().optional(),
  }),
});

const portfolio = defineCollection({
  type: 'content',
  schema: z.object({
    title: z.string(),
    description: z.string().max(200),
    client: z.string().optional(),
    url: z.string().url().optional(),
    stack: z.array(z.string()),
    category: z.enum(['saas', 'marketplace', 'directory', 'tool', 'platform']),
    featured: z.boolean().default(false),
    order: z.number().default(0),
    image: z.string(),
  }),
});

export const collections = { blog, portfolio };

A few things worth highlighting:

z.coerce.date() handles date strings in frontmatter. Whether you write 2026-05-18 or 2026-05-18T00:00:00Z, it gets coerced to a JavaScript Date object. No more inconsistent date formats across posts.

.default() values mean you can omit optional fields. Every blog post defaults author to “Threshline” and draft to false. This reduces frontmatter boilerplate without losing type safety.

.max(160) on descriptions enforces an SEO guideline at the schema level. If someone writes a meta description longer than 160 characters, the build fails. Unlike a lint warning, a failed build cannot be skimmed past or silenced with an inline comment.

z.enum() for categories means portfolio entries cannot have arbitrary category values. The match is exact and case-sensitive: “saas” passes, while “SaaS” or an unlisted value like “webapp” fails with a clear error. This is especially useful when categories drive navigation or filtering.
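
To make the date and enum behavior concrete, here is a minimal sketch in plain TypeScript that mimics what z.coerce.date() and z.enum() check. It deliberately does not use Zod, and the helper names coerceDate and parseCategory are ours, not part of any library.

```typescript
// Sketch only: mimics the checks behind z.coerce.date() and z.enum()
// without importing Zod. Helper names are hypothetical.
const CATEGORIES = ['saas', 'marketplace', 'directory', 'tool', 'platform'] as const;
type Category = (typeof CATEGORIES)[number];

// Date coercion hands the raw value to `new Date(...)` and rejects the
// result if it is not a valid date.
function coerceDate(input: string | number | Date): Date {
  const date = new Date(input);
  if (Number.isNaN(date.getTime())) {
    throw new Error('Invalid date');
  }
  return date;
}

// Enum validation accepts only exact, case-sensitive matches.
function parseCategory(input: string): Category {
  if (!(CATEGORIES as readonly string[]).includes(input)) {
    throw new Error(`Invalid category: ${input}`);
  }
  return input as Category;
}

// Both spellings of the same day produce the same timestamp, because a
// date-only ISO string is interpreted as midnight UTC.
const shortForm = coerceDate('2026-05-18');
const longForm = coerceDate('2026-05-18T00:00:00Z');
const sameInstant = shortForm.getTime() === longForm.getTime(); // true
```

Calling parseCategory('SaaS') or coerceDate('not-a-date') throws, which is the plain-TypeScript equivalent of the build failing.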

Querying collections

Astro provides getCollection() to fetch and filter content. It returns typed data that matches your schema exactly.

---
// src/pages/blog/index.astro
import { getCollection } from 'astro:content';

const posts = await getCollection('blog', ({ data }) => {
  // Filter out drafts in production
  return import.meta.env.PROD ? !data.draft : true;
});

// Sort by publication date, newest first
const sortedPosts = posts.sort(
  (a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf()
);
---

<ul>
  {sortedPosts.map((post) => (
    <li>
      <a href={`/blog/${post.slug}`}>
        <h2>{post.data.title}</h2>
        <time datetime={post.data.pubDate.toISOString()}>
          {post.data.pubDate.toLocaleDateString('en-US', {
            year: 'numeric',
            month: 'long',
            day: 'numeric',
          })}
        </time>
        <p>{post.data.description}</p>
        <div>
          {post.data.tags.map((tag) => (
            <span>{tag}</span>
          ))}
        </div>
      </a>
    </li>
  ))}
</ul>

Everything is typed. post.data.title is a string. post.data.pubDate is a Date. post.data.tags is string[]. If you try to access post.data.somethingThatDoesNotExist, TypeScript catches it before the build even runs.

The filter callback in getCollection() is the right place to handle drafts, feature flags, or any conditional content. We filter drafts based on the environment — draft posts are visible in development, hidden in production. Simple and effective.
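
As one example of "any conditional content", the draft check can be pulled into a small predicate that also hides future-dated posts. This is a sketch under our own assumptions: the Publishable interface and the isVisible name are hypothetical, and the scheduling rule is an illustration rather than something our schema requires.

```typescript
// Hypothetical shape: only the frontmatter fields the predicate needs.
interface Publishable {
  draft: boolean;
  pubDate: Date;
}

// Returns true when an entry should be built.
// Development shows everything; production hides drafts and posts whose
// pubDate is still in the future.
function isVisible(
  data: Publishable,
  isProduction: boolean,
  now: Date = new Date(),
): boolean {
  if (!isProduction) return true;
  return !data.draft && data.pubDate.getTime() <= now.getTime();
}
```

In a page this would be passed as the filter, for example getCollection('blog', ({ data }) => isVisible(data, import.meta.env.PROD)). On a static site "now" is the build time, so a scheduled post only appears on the first build after its date.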

Dynamic routes for individual posts

Each content entry needs its own page. Astro handles this with dynamic routes and the getStaticPaths pattern:

---
// src/pages/blog/[slug].astro
import { getCollection, render } from 'astro:content';
import BlogLayout from '../../layouts/BlogLayout.astro';

export async function getStaticPaths() {
  const posts = await getCollection('blog', ({ data }) => {
    return import.meta.env.PROD ? !data.draft : true;
  });

  return posts.map((post) => ({
    params: { slug: post.slug },
    props: { post },
  }));
}

const { post } = Astro.props;
const { Content, headings } = await render(post);
---

<BlogLayout
  title={post.data.title}
  description={post.data.description}
  pubDate={post.data.pubDate}
  tags={post.data.tags}
>
  <Content />
</BlogLayout>

The render() function returns the rendered Markdown content as an Astro component, plus extracted headings you can use for a table of contents. This is the same render API whether you are using .md or .mdx files.
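
The headings array is flat: each item carries a depth, a slug, and the heading text. A table of contents usually wants nesting, so here is a minimal sketch that groups h3 headings under the preceding h2. The buildToc and TocItem names are ours, and ignoring every other depth is a simplifying assumption.

```typescript
// Shape of each item in the `headings` array.
interface MarkdownHeading {
  depth: number;
  slug: string;
  text: string;
}

// Hypothetical nested shape for rendering a table of contents.
interface TocItem extends MarkdownHeading {
  children: TocItem[];
}

// Nests h3 headings under the most recent h2. Other depths, and any h3
// that appears before the first h2, are skipped.
function buildToc(headings: MarkdownHeading[]): TocItem[] {
  const toc: TocItem[] = [];
  for (const heading of headings) {
    const item: TocItem = { ...heading, children: [] };
    const parent = toc[toc.length - 1];
    if (heading.depth === 2) {
      toc.push(item);
    } else if (heading.depth === 3 && parent !== undefined) {
      parent.children.push(item);
    }
  }
  return toc;
}
```

The result can be handed to the layout as a prop (a hypothetical toc prop, in our case) and rendered as nested lists with links to each slug.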

For the portfolio, the pattern is identical but with a different layout:

---
// src/pages/portfolio/[slug].astro
import { getCollection, render } from 'astro:content';
import PortfolioLayout from '../../layouts/PortfolioLayout.astro';

export async function getStaticPaths() {
  const projects = await getCollection('portfolio');

  return projects.map((project) => ({
    params: { slug: project.slug },
    props: { project },
  }));
}

const { project } = Astro.props;
const { Content } = await render(project);
---

<PortfolioLayout
  title={project.data.title}
  description={project.data.description}
  stack={project.data.stack}
  url={project.data.url}
>
  <Content />
</PortfolioLayout>

Tags, categories, and filtered views

Content Collections make tag-based navigation straightforward. Here is how we generate tag pages for the blog:

---
// src/pages/blog/tag/[tag].astro
import { getCollection } from 'astro:content';

export async function getStaticPaths() {
  const posts = await getCollection('blog', ({ data }) => {
    return import.meta.env.PROD ? !data.draft : true;
  });

  // Extract all unique tags
  const tags = [...new Set(posts.flatMap((post) => post.data.tags))];

  return tags.map((tag) => ({
    params: { tag },
    props: {
      tag,
      posts: posts
        .filter((post) => post.data.tags.includes(tag))
        .sort((a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf()),
    },
  }));
}

const { tag, posts } = Astro.props;
---

<h1>Posts tagged "{tag}"</h1>
<ul>
  {posts.map((post) => (
    <li>
      <a href={`/blog/${post.slug}`}>{post.data.title}</a>
    </li>
  ))}
</ul>

Because getStaticPaths runs at build time, every tag page is pre-rendered as static HTML. No server, no client-side filtering, no loading states. The page exists or it does not.
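
The tag page above filters the full post list once per tag. For larger collections the same result can be built in a single pass. The sketch below is an alternative we are illustrating, not what the page above does, and the groupByTag name is hypothetical.

```typescript
// Minimal shape the helper needs; real entries carry more fields.
interface Tagged {
  data: { tags: string[] };
}

// Groups posts by tag in one pass. A post that lists the same tag twice
// is only added to that tag's bucket once.
function groupByTag<T extends Tagged>(posts: T[]): Map<string, T[]> {
  const byTag = new Map<string, T[]>();
  for (const post of posts) {
    for (const tag of new Set(post.data.tags)) {
      const bucket = byTag.get(tag) ?? [];
      bucket.push(post);
      byTag.set(tag, bucket);
    }
  }
  return byTag;
}
```

Inside getStaticPaths, each [tag, posts] pair from the map becomes one entry with params: { tag } and props: { tag, posts }.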

For the portfolio, we use the category enum for the same pattern. The enum in the schema guarantees only valid categories exist, so we never generate an empty category page from a typo.

Relating content across collections

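One thing we do on the Threshline site is surface related content: related posts under each article and featured portfolio projects on the homepage. Astro's reference() helper covers explicit links from one entry to another, but there is no built-in query for looser relationships such as "posts that share tags", so we compose those ourselves: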

// src/lib/content.ts
import { getCollection } from 'astro:content';

export async function getRelatedPosts(tags: string[], currentSlug: string) {
  const posts = await getCollection('blog', ({ data }) => {
    return import.meta.env.PROD ? !data.draft : true;
  });

  return posts
    .filter((post) => post.slug !== currentSlug)
    .map((post) => ({
      post,
      relevance: post.data.tags.filter((tag) => tags.includes(tag)).length,
    }))
    .filter(({ relevance }) => relevance > 0)
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, 3)
    .map(({ post }) => post);
}

export async function getFeaturedProjects() {
  const projects = await getCollection('portfolio');
  return projects
    .filter((p) => p.data.featured)
    .sort((a, b) => a.data.order - b.data.order);
}

These helper functions give us related posts for the “Read more” section at the bottom of each blog post, and featured projects for the homepage. The logic is simple, testable, and runs at build time.
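
To see how the relevance score behaves, here is the same ranking logic run against in-memory sample data. The slugs come from our blog directory, but the tags are invented for illustration, and rankRelated is a hypothetical standalone version of getRelatedPosts with the collection passed in as an argument.

```typescript
// Minimal entry shape for the example.
interface PostLike {
  slug: string;
  data: { tags: string[] };
}

// Same scoring as getRelatedPosts: count shared tags, drop zero-overlap
// posts, sort by overlap, keep the top `limit`.
function rankRelated<T extends PostLike>(
  posts: T[],
  tags: string[],
  currentSlug: string,
  limit = 3,
): T[] {
  return posts
    .filter((post) => post.slug !== currentSlug)
    .map((post) => ({
      post,
      relevance: post.data.tags.filter((tag) => tags.includes(tag)).length,
    }))
    .filter(({ relevance }) => relevance > 0)
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, limit)
    .map(({ post }) => post);
}

// Sample data: the tags here are made up for the example.
const samplePosts: PostLike[] = [
  { slug: 'building-a-saas-platform', data: { tags: ['saas', 'architecture'] } },
  { slug: 'astro-vs-nextjs-2026', data: { tags: ['astro', 'nextjs'] } },
  { slug: 'color-system-product-design', data: { tags: ['design'] } },
  { slug: 'content-collections', data: { tags: ['astro', 'content'] } },
];

const related = rankRelated(
  samplePosts,
  ['astro', 'saas', 'nextjs'],
  'content-collections',
);
const relatedSlugs = related.map((post) => post.slug);
// → ['astro-vs-nextjs-2026', 'building-a-saas-platform']
```

The Astro comparison post shares two tags, the SaaS post shares one, and the design post shares none, so it is dropped. Posts with equal scores keep their original collection order; adding pubDate as a tiebreaker is a reasonable refinement.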

Content validation errors that actually help

The best part of Content Collections is the error experience. When a schema validation fails, Astro tells you exactly what went wrong, in which file, for which field.

If you write a blog post with a missing description:

[ERROR] blog > "my-new-post.md" frontmatter does not match collection schema.
description: Required

If you put an invalid date format:

[ERROR] blog > "my-new-post.md" frontmatter does not match collection schema.
pubDate: Invalid date

If you add a sixth tag when the schema says max(5):

[ERROR] blog > "my-new-post.md" frontmatter does not match collection schema.
tags: Array must contain at most 5 element(s)

These errors appear as soon as the dev server or the build processes the file. As long as every deploy goes through a build, invalid content cannot ship. Compare this to a headless CMS where content validation is configured in a separate admin panel, or a raw Markdown setup where nothing is validated and you only discover problems from a broken production page.
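
For contrast, this is roughly what per-field reporting looks like when written by hand. It is a hypothetical sketch covering two of the blog schema's rules, not Astro's or Zod's implementation, and the message wording is our own.

```typescript
// Hypothetical raw frontmatter, before any validation has run.
interface RawFrontmatter {
  description?: unknown;
  tags?: unknown;
}

interface Issue {
  field: string;
  message: string;
}

// Checks two rules by hand and reports at most one issue per field.
function checkFrontmatter(data: RawFrontmatter): Issue[] {
  const issues: Issue[] = [];
  const { description, tags } = data;

  if (typeof description !== 'string') {
    issues.push({ field: 'description', message: 'Required' });
  } else if (description.length > 160) {
    issues.push({ field: 'description', message: 'Must be at most 160 characters' });
  }

  if (!Array.isArray(tags) || tags.length < 1) {
    issues.push({ field: 'tags', message: 'Must contain at least 1 tag' });
  } else if (tags.length > 5) {
    issues.push({ field: 'tags', message: 'Must contain at most 5 tags' });
  }

  return issues;
}

// Renders issues as "field: message" lines under a file header.
function formatIssues(file: string, issues: Issue[]): string {
  const lines = issues.map((issue) => `${issue.field}: ${issue.message}`);
  return [`${file} does not match the schema.`, ...lines].join('\n');
}
```

Every rule added this way is more code to write and test. The schema gives you the same file-and-field precision from a one-line declaration per field.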

When to use Content Collections vs a headless CMS

Content Collections are not a replacement for a headless CMS in every situation. Here is our decision framework:

Use Content Collections when:

  • Content is managed by developers or technical writers comfortable with Git
  • Content updates ship alongside code changes
  • You want full type safety and build-time validation
  • You need zero runtime dependencies (no API calls, no database)
  • Content volume is manageable in a Git repo (under a few thousand files)

Use a headless CMS when:

  • Non-technical users need to create and edit content
  • Content updates need to happen independently of deployments
  • You need real-time content changes without rebuilding
  • Content volume is very large or media-heavy

For the Threshline website, Content Collections are the obvious choice. We are developers. We write in Markdown. We want type safety. And our content ships alongside our code. For a client project like MindHyv where non-technical users manage business listings, a CMS or database-backed approach makes more sense.

For our comparison of CMS options that pair well with Astro, see our post on choosing a headless CMS for Astro.

Tips from running this in production

After running Content Collections across multiple sites, here are a few things we have learned:

Keep schemas strict. It is tempting to make everything optional for flexibility. Do not. Every required field is a bug you will never have. The stricter the schema, the more reliable the content.

Use z.coerce.date(), not z.date(). Unquoted YAML dates are typically parsed into Date objects, but quoted dates and dates in JSON files arrive as strings. The coerce variant converts both to a Date. With plain z.date(), any date that arrives as a string fails validation.

Create a content template. We keep a _template.md file in each collection directory (prefixed with underscore so Astro ignores it). New posts start by copying the template, which ensures correct frontmatter structure.

Build a content helper library. Centralize your getCollection calls with filtering and sorting logic in a src/lib/content.ts file. This avoids duplicating filter logic across multiple pages and makes the sorting behavior consistent.
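
A minimal sketch of what such a helper can look like, written as pure functions so it can be unit-tested without Astro. The names sortNewestFirst and publishedNewestFirst are hypothetical, and passing the entries in as an argument is our design choice for the example.

```typescript
// Minimal shape the helpers need; real entries carry more fields.
interface DatedEntry {
  data: { pubDate: Date; draft: boolean };
}

// Returns a sorted copy, newest first. Array.prototype.sort mutates the
// array it is called on, so copy first to leave the input untouched.
function sortNewestFirst<T extends DatedEntry>(entries: T[]): T[] {
  return [...entries].sort(
    (a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf(),
  );
}

// One place for the "hide drafts in production, newest first" rule.
function publishedNewestFirst<T extends DatedEntry>(
  entries: T[],
  isProduction: boolean,
): T[] {
  const visible = isProduction
    ? entries.filter((entry) => !entry.data.draft)
    : entries;
  return sortNewestFirst(visible);
}
```

A page would then call publishedNewestFirst(await getCollection('blog'), import.meta.env.PROD) instead of repeating the filter and sort inline.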

Run the build in CI. Content validation only happens at build time. If someone merges a post with invalid frontmatter and you do not build in CI, it goes to production broken. Our GitHub Actions workflow builds the full site on every PR. For more on our CI setup, see our post on GitHub Actions for small teams.

Content Collections are one of the features that make Astro our default choice for content-driven sites. They combine the simplicity of Markdown files with the rigor of typed schemas, and the developer experience is excellent.

If you are building a content-heavy site and want it done right, reach out at hello@threshline.com. We have shipped this pattern enough to know the shortcuts and the gotchas.