Why Compiled Skill Servers Outperform Flat File Distribution for Enterprise AI Workflow Deployments
This post describes an architectural pattern we developed and use internally. It is a starting point for discussion — not a product specification. The approach is novel and the trade-offs described here are specific to our deployment context.
Executive Summary
Enterprise AI workflows depend on two kinds of knowledge: tools (what the model can do) and skills (how the model should behave in a specific domain). Tools are well-served by MCP servers. Skills have typically been distributed as Markdown files on each user's local filesystem — an approach that loads the entire content into the context window at the start of every conversation, whether it is needed or not.
This paper describes an architectural pattern we developed to address this: embedding procedural skills directly inside an MCP server, serving them on demand via a tiered loading pattern. The result eliminates manual file distribution, synchronises skill updates with tool releases, reduces context consumption for conversations that do not invoke the skill, and provides a practical degree of intellectual property protection. These benefits come with real trade-offs that this paper addresses directly.
1. The Problem: Flat File Skill Distribution
The dominant pattern for distributing workflow instructions to AI assistants is straightforward: create a Markdown file describing the procedure and place it where your AI assistant will read it at startup. In Claude.ai, that typically means /mnt/skills/user/. In Claude Code, it means CLAUDE.md. The file loads into context at the start of every conversation.
For a simple one-page workflow, this is acceptable. For a complex enterprise procedure covering contract parsing, tax mapping, destination-split rules, seven structured item codes, and amendment edge cases — such a file easily exceeds 2,500 tokens. Those tokens are consumed in every conversation, including ones about unrelated tasks. Context compaction occurs earlier. Long conversations lose critical information.
There is a second problem: distribution. If your workflow encodes institutional knowledge — the specific nuances your team has refined over years of operating in a domain — distributing that knowledge to every employee as a readable Markdown file on a local filesystem is neither efficient nor secure.
2. The Approach: Skills as MCP Tools
The Model Context Protocol exposes three primitives: Tools (callable functions), Resources (readable data), and Prompts (reusable prompt templates). Our approach uses the Tools primitive to implement a skill server with three operations:
- Discovery: returns an inventory of available skill names and descriptions. Minimal load (a few dozen tokens); called first to learn what is available.
- Summary (get_skill): returns a compact summary (~500 tokens) containing core rules, workflow steps, and key parameters, designed to cover the majority of use cases without loading the full reference.
- Detail: returns the full detailed reference, optionally filtered to a named section; called only when the compact summary is insufficient for the task at hand.
The skill content is stored as TypeScript string constants inside the MCP server repository — organised into a compact summary and named detail sections. The server compiles to JavaScript and distributes like any other npm package.
The governing principle is tiered loading: load the minimum first, and load more only if needed. Most conversations invoking a complex workflow need only the compact summary. Full detail (code examples, exhaustive reference tables, edge case rules) is retrieved only when a specific ambiguity demands it.
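The three operations above can be sketched as plain functions over string constants. This is a simplified illustration, not the actual server: the skill names, section names, and content strings are hypothetical, and the real implementation would wire these handlers into the MCP SDK rather than expose them directly.

```typescript
// Hypothetical skill content, stored as string constants in the server repo.
const SKILL_SUMMARY =
  "Core rules: map each contract line to one of seven item codes; " +
  "apply destination-split rules before tax mapping.";

const SKILL_SECTIONS: Record<string, string> = {
  "tax-mapping": "Full tax mapping reference for all seven item codes.",
  "amendments": "Amendment edge cases: void and re-issue on contract change.",
};

// Tier 1: inventory of available skills (a few dozen tokens).
function listSkills(): { name: string; description: string }[] {
  return [{ name: "xero-workflow", description: "Contract-to-invoice procedure" }];
}

// Tier 2: compact summary (~500 tokens in the real server).
function getSkill(name: string): string {
  if (name !== "xero-workflow") throw new Error(`unknown skill: ${name}`);
  return SKILL_SUMMARY;
}

// Tier 3: full detail, optionally filtered to one named section.
function getSkillDetail(name: string, section?: string): string {
  if (name !== "xero-workflow") throw new Error(`unknown skill: ${name}`);
  if (section === undefined) return Object.values(SKILL_SECTIONS).join("\n\n");
  const text = SKILL_SECTIONS[section];
  if (text === undefined) throw new Error(`unknown section: ${section}`);
  return text;
}
```

A conversation typically stops at tier 2; tier 3 is reached only when the summary leaves a specific rule ambiguous.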
3. Benefits Analysis
Context Efficiency
In Claude.ai, skill files in /mnt/skills/user/ are injected into the system prompt at conversation start, so a 2,500-token file consumes 2,500 tokens of the window in every session regardless of the task. With the MCP server, nothing loads upfront: the model calls get_skill only when a domain task is invoked, loading ~500 tokens instead of 2,500.
Zero Distribution Overhead
Every employee who installs the MCP server npm package automatically receives the skills — no separate file copy, no directory to manage, no onboarding process that includes Markdown file installation instructions. The skill and the tools it references travel together in the same package.
Version Synchronisation
Skill content and tool implementation share a version number. When a new Xero tool is added and the workflow skill is updated to reference it, they publish together. Employees running npm update receive both changes atomically. There is no possible state where a skill references a tool that does not yet exist in the installed version.
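One way to make this coupling concrete is to pin the skill text to the same version string the package ships with. The sketch below is illustrative: the version, tool names, and summary text are hypothetical, and a real server would read the version from package.json at build time rather than hard-code it.

```typescript
// Single source of truth for the package version (read from package.json
// at build time in a real server; hard-coded here for illustration).
const PACKAGE_VERSION = "2.4.0";

// Hypothetical tools shipped in this version of the package.
const TOOLS = ["create_invoice", "apply_destination_split"];

function skillSummary(): string {
  // The skill may safely reference any tool in TOOLS: both the skill text
  // and the tool implementations are pinned to PACKAGE_VERSION and
  // publish together in the same npm release.
  return `[v${PACKAGE_VERSION}] Use ${TOOLS.join(", then ")}.`;
}
```

Because the skill string and the tool registry compile into the same artifact, a skill referencing a tool absent from the installed version cannot occur by construction.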
Intellectual Property Protection
A Markdown file on an employee's filesystem can be read in any text editor. Compiled JavaScript distributed via npm is substantially less readable. The institutional knowledge encoded in the skill logic — the domain-specific nuances your team has developed — is obscured from end users while remaining fully readable by the AI being served.
4. Trade-offs and Weaknesses
This approach carries real disadvantages that any practitioner should evaluate before adopting it.
| Dimension | Flat File (Markdown) | MCP Skill Server |
|---|---|---|
| Content update | Edit the file directly, no build step | Edit .ts, build, publish |
| Employee-readable | Fully readable in any editor | Compiled JS — obscured but not encrypted |
| Efficiency in Claude Code | Loaded via CLAUDE.md or @ reference | Tool call loads same content — identical footprint |
| Efficiency in Claude.ai | Injected into system prompt at startup | Loaded only on demand, tiered loading |
| Learning curve | Markdown — everyone can write it | TypeScript, MCP SDK, npm tooling required |
| Distribution | Manual copy per employee | Automatic via npm install |
| Version sync | None — tools and skills are decoupled | Atomic — same package version |
In Claude Code sessions, calling get_skill via the MCP tool loads content into the context window exactly as reading a local file would. The efficiency benefit applies primarily to Claude.ai desktop conversations, where skills were previously pre-loaded into the system prompt from the local filesystem. If your workflow is primarily Claude Code-driven, the token advantage is minimal.
5. The Alternative: The MCP Prompts Primitive
The Model Context Protocol defines three first-class primitives. We implemented our skill server using the Tools primitive, but the Prompts primitive is actually more idiomatic for this use case. MCP Prompts allow servers to expose named, reusable prompt templates that integrate natively into AI assistant interfaces.
In Claude.ai, MCP Prompts surface as slash commands — /dnd-xero-workflow — invocable directly from the chat interface without the model needing to call a discovery tool. The Prompts approach is cleaner from a protocol perspective, but it offers less control over tiered loading. The Tools primitive we use enables the summary/section-detail model that Prompts-based loading makes harder to implement.
Tools Primitive: Flexible Tiered Loading
Full control over granularity: compact summary first, specific sections on demand. Requires explicit tool calls for discovery and loading. Works in any MCP-compatible client.
Prompts Primitive: Native Interface Integration
Skills surface as slash commands in Claude.ai. Discovery and invocation without a separate discovery tool. Less flexibility in loading granularity — the skill typically loads in full.
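The structural difference is visible in the Prompts primitive's wire shapes. The interfaces below follow the MCP specification in simplified form; the prompt name and placeholder text are illustrative.

```typescript
// Simplified shapes of the MCP Prompts primitive. A prompts/list response
// advertises named templates; prompts/get returns expanded messages.
interface PromptInfo {
  name: string; // surfaces in the client UI, e.g. as a slash command
  description?: string;
  arguments?: { name: string; description?: string; required?: boolean }[];
}

interface PromptMessage {
  role: "user" | "assistant";
  content: { type: "text"; text: string };
}

// Contrast with the Tools approach: there is one retrieval step, so the
// whole skill typically loads in full, with no summary/detail tiering.
function getPrompt(name: string): { description?: string; messages: PromptMessage[] } {
  return {
    description: `Workflow skill '${name}', loaded in one shot`,
    messages: [{ role: "user", content: { type: "text", text: "(full skill text)" } }],
  };
}
```

A single prompts/get call returning the complete skill is exactly what makes the Prompts route simpler for clients and harder to tier.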
6. Practical Implications
This pattern is most valuable in cases that combine several conditions. If your organisation meets most of these criteria, the implementation investment is likely worthwhile:
- Complex, high-volume workflows. One-page skills in CLAUDE.md are over-engineering. Multi-thousand-token procedures that apply to some sessions but not others are the target use case.
- Distributed team. One employee. One filesystem. One Markdown file. That is manageable. Ten employees with unsynchronised skill updates is the problem this pattern solves.
- Proprietary institutional knowledge. If the workflow logic represents intellectual property you do not want exposed in plain text on employee machines, compiled JavaScript is preferable to a readable Markdown file.
- Strong tool-skill coupling. When skills reference specific tools by name — and those tools change with versions — atomic delivery prevents version drift.
- Primarily Claude.ai (not Claude Code) usage. The context efficiency benefit only applies to environments where skills were previously pre-loaded from the filesystem.
Conclusion
Flat file distribution works until it doesn't. For single-user deployments with simple skills, a Markdown file remains the correct approach — less complexity, direct editing, no build cycle. For team deployments with complex, proprietary workflows and Claude.ai desktop environments, the MCP skill server approach solves several problems simultaneously.
This is not a widely adopted pattern. We found no direct prior art in public MCP server literature — most existing skill-serving implementations serve facts (memory skills) or reference documentation (resource skills), not procedural workflow knowledge. The tiered loading pattern in particular — compact summary first, section on demand second — is our own contribution to this problem.
As with any architectural choice, the right decision depends on context. This paper was written to help practitioners evaluate that context honestly.
A skill in a flat file is knowledge you distribute. A skill in an MCP server is knowledge you deploy. The difference — version control, atomic delivery, IP protection, on-demand loading — only matters when distribution becomes a problem. Build accordingly.