Introducing the Open Knowledge Format: Why It Matters for AI-Ready Businesses

Google Cloud's new open spec, OKF, formalizes the "LLM-wiki" pattern into a portable, vendor-neutral standard for the knowledge AI agents actually need. Here's what it is, why it's a milestone, and what it means for any business that wants to be readable by machines.

1. The short version

On June 12, 2026, Google Cloud introduced the Open Knowledge Format (OKF), an open specification for representing the metadata, context, and curated knowledge that modern AI systems and agents need to do their jobs. The headline idea is deliberately unglamorous, and that's its strength: OKF v0.1 represents knowledge as a directory of markdown files with YAML frontmatter, plus a small set of agreed-upon conventions so that knowledge written by one team or tool can be read by a different team's agents without any translation layer. No new runtime, no required SDK, no proprietary account, just markdown, files, and a little structured frontmatter.

If you've used Obsidian, Notion, Hugo, or the wave of AGENTS.md / CLAUDE.md convention files that emerged over the past year, the shape will feel familiar. What OKF adds is interoperability: a common answer to "what fields should every knowledge document carry, and what do the filenames mean?" so these patterns can finally cooperate instead of each being bespoke. For any business thinking seriously about being usable by AI, this is a meaningful moment, because it points at where machine-readable knowledge is heading, and it's a direction that rewards the businesses already structuring their content for machines to consume.

2. The problem OKF solves: a fragmented context landscape

Foundation models are powerful but context-starved. As Google Cloud's announcement frames it, the lack of relevant context often limits what models can do, especially as they're used to build agentic systems. They can write code, summarize a document, or analyze a dataset, but only if they have the right information in front of them. And in most organizations, the information that matters is internal knowledge: the schema of a table, what a metric actually means to your business, the runbook for an incident, the join paths between two systems, the deprecation notice for an old API.

The trouble is where that knowledge lives. Today it's scattered across mutually incompatible surfaces: metadata catalogs with their own APIs, wikis and shared drives, code comments and docstrings and notebook cells, and, candidly, the heads of a few senior people. When an AI agent needs to answer something like "How do I compute weekly active users from our event stream?", it has to assemble the answer from those scattered, incompatible sources. Every agent builder ends up solving the same context-assembly problem from scratch, every catalog vendor reinvents the same data models, and the knowledge itself stays locked behind whatever tool created it. That fragmentation, not a shortage of knowledge, is the bottleneck OKF targets.

3. What the Open Knowledge Format actually is

OKF is a way of writing knowledge down so that both humans and machines can read it, and so that it survives moving between systems. It formalizes what the AI researcher Andrej Karpathy crisply described as the LLM-wiki pattern: the idea that a living library of markdown notes is a natural home for the facts an AI system reasons over. Karpathy's observation is that the bookkeeping humans abandon (keeping a personal wiki updated, cross-references current, files in sync) is exactly what language models are good at: they don't get bored, don't forget to update a cross-reference, and can touch fifteen files in one pass.

That pattern kept reappearing under different names: Obsidian vaults wired to coding agents, the AGENTS.md / CLAUDE.md convention files, repos full of index.md and log.md artifacts that agents read before doing real work, and "metadata as code" inside data teams. But each instance was bespoke, and none was designed to cooperate. OKF is the small set of conventions that makes them interoperable. As published, it's described by its authors as three things at once: just markdown (readable in any editor, renderable on GitHub, indexable by any search tool), just files (shippable as a tarball, hostable in any git repo, mountable on any filesystem), and just YAML frontmatter for the handful of structured fields that need to be queryable: type, title, description, resource, tags, and timestamp. That's the entire surface. The full v0.1 spec fits on a single page.

4. How an OKF bundle is structured

An OKF bundle is simply a directory of markdown files, where each file represents a concept, anything you want to capture: a table, a dataset, a metric, a playbook, a runbook, an API. One concept per file, and the file path is the concept's identity. A sales bundle might hold an index.md at the root, then folders like datasets/, tables/, and metrics/, each with its own files (orders.md, customers.md, weekly_active_users.md) and index.
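To make the layout concrete, here is a minimal Python sketch that builds a throwaway copy of the sales bundle described above and lists each concept by its path. The directory and file names come from the example, but which file sits in which folder is a guess. The helper names (`build_sample_bundle`, `list_concepts`) and the decision to treat index.md and log.md as navigation files rather than concepts are assumptions for illustration, not confirmed spec behavior.

```python
import tempfile
from pathlib import Path

# Assumption: index.md and log.md are navigation/history files, not concepts.
RESERVED_NAMES = {"index.md", "log.md"}


def list_concepts(bundle_root: Path) -> list[str]:
    """Return concept identities as POSIX-style paths relative to the bundle root."""
    concepts = []
    for path in bundle_root.rglob("*.md"):
        if path.name in RESERVED_NAMES:
            continue
        concepts.append(path.relative_to(bundle_root).as_posix())
    return sorted(concepts)


def build_sample_bundle(root: Path) -> None:
    """Create the sales layout from the example with placeholder contents."""
    files = [
        "index.md",
        "datasets/index.md",
        "tables/index.md",
        "tables/orders.md",
        "tables/customers.md",
        "metrics/index.md",
        "metrics/weekly_active_users.md",
    ]
    for rel in files:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        # Placeholder frontmatter only; real concepts would carry a body too.
        target.write_text("---\ntype: placeholder\n---\n", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_sample_bundle(root)
    sample_concepts = list_concepts(root)
    for concept in sample_concepts:
        print(concept)
```

Because the path is the identity, the listing needs no registry or database: walking the directory is the catalog.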

Each concept document carries a small block of YAML frontmatter for the structured, queryable fields (type, title, description, resource link, tags, timestamp), followed by a markdown body for everything else: the schema table, the description, the join paths, whatever the concept needs. Concepts link to one another with ordinary markdown links, which turns the directory into a graph of relationships richer than the simple parent/child nesting the folders imply. Bundles can optionally include index.md files (so an agent can progressively disclose detail as it navigates the hierarchy) and log.md files (a chronological history of changes). The elegance is that none of this requires special tooling to create or read; it's the same files a human edits in a text editor and an agent parses directly.
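As an illustration of how little tooling that takes, the following Python sketch reads one hypothetical concept document, separates the frontmatter from the body, and turns the markdown links into graph edges. The sample document, its field values, and the example.com URL are invented for this sketch. The frontmatter reader is deliberately simplified (flat `key: value` pairs and inline lists only), so a real consumer would more likely use a full YAML parser. Resolving links relative to the linking file is also an assumption, borrowed from how markdown renderers usually behave.

```python
import posixpath
import re

SAMPLE_PATH = "tables/orders.md"  # hypothetical concept path
SAMPLE_TEXT = """\
---
type: table
title: Orders
description: One row per customer order.
resource: https://example.com/warehouse/sales/orders
tags: [sales, orders]
timestamp: 2026-06-12T00:00:00Z
---
# Orders

Joins to [customers](customers.md) on `customer_id`.
Feeds the [weekly active users](../metrics/weekly_active_users.md) metric.
"""

MARKDOWN_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def split_frontmatter(text):
    """Split a document into (fields, body). Handles only flat keys and inline lists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text
    fields = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, _, value = line.partition(":")  # split on the first colon only
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            fields[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            fields[key.strip()] = value
    return fields, "\n".join(lines[end + 1:])


def linked_concepts(concept_path, body):
    """Resolve links to other .md files relative to the linking concept."""
    targets = []
    for _label, target in MARKDOWN_LINK.findall(body):
        if "://" in target or not target.endswith(".md"):
            continue  # skip external links and non-concept files
        base = posixpath.dirname(concept_path)
        targets.append(posixpath.normpath(posixpath.join(base, target)))
    return targets


fields, body = split_frontmatter(SAMPLE_TEXT)
edges = [(SAMPLE_PATH, target) for target in linked_concepts(SAMPLE_PATH, body)]
print(fields["type"], fields["tags"])
for source, target in edges:
    print(source, "->", target)
```

The edges cross folder boundaries (a table pointing at a metric), which is the sense in which links carry more structure than the directory nesting alone.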

Figure: Standardizing metadata and folder hierarchies: how OKF structures concepts, indices, and logs.

5. The three principles behind the design

Three design choices explain why OKF is built the way it is, and each carries a lesson for anyone structuring knowledge for AI.

1. Minimally opinionated: OKF requires exactly one thing of every concept: a type field. Everything else—what types exist, what other fields to include, what sections the body contains—is left to the producer. The spec defines the interoperability surface, not your content model.
2. Producer/consumer independence: OKF cleanly separates who writes the knowledge from who reads it. A bundle hand-authored by a human can be consumed by an AI agent; a bundle generated by an export pipeline can be browsed in a visualizer; a bundle written by one LLM can be queried by another. The format is the contract.
3. Format, not platform: OKF isn't tied to any cloud, database, model provider, or agent framework, and by design it will never require a proprietary account or SDK to read, write, or serve. Google's stated reasoning is telling: the value of a knowledge format comes from how many parties speak it, not from who owns it.
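The first principle is easy to express as a check. This is a minimal Python sketch of a validator that enforces only the one requirement named above, a type field, and accepts any other fields a producer chooses to add. The function name `find_problems` and the rule that type must be a non-empty string are assumptions for illustration; the spec may define the requirement more precisely.

```python
# The only field the text above describes as required.
REQUIRED_FIELDS = ("type",)


def find_problems(frontmatter: dict) -> list[str]:
    """Return a list of problems; an empty list means the concept passes."""
    problems = []
    for name in REQUIRED_FIELDS:
        value = frontmatter.get(name)
        # Assumption: the value must be a non-empty string.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing required field: {name}")
    return problems


# Extra, producer-defined fields such as "owner" are accepted without complaint.
valid = find_problems({"type": "metric", "owner": "growth-team"})
invalid = find_problems({"title": "Orders"})
print(valid)
print(invalid)
```

Keeping the validator this small mirrors the design: the check covers the interoperability surface and stays out of the producer's content model.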

6. Why "a format, not another service" is the whole point

The most important sentence in Google's announcement may be this: the answer to fragmented knowledge isn't another knowledge service, it's a format. The distinction matters enormously. A service locks knowledge behind an API, an account, and a vendor relationship; a format lets knowledge be produced by anyone without an SDK, consumed by anyone without an integration, moved between systems and organizations intact, version-controlled alongside the code it describes, and read by both humans and agents from the same file with no translation step.

This is the same insight that made earlier open formats (HTML, Markdown, CSV, JSON) so durable: they won not because they were sophisticated but because they were portable and universal. OKF is a bet that AI knowledge needs the same treatment, and that the winning representation will be the one the most tools and organizations can speak, not the one with the best proprietary features. To make it concrete, Google shipped reference implementations at both ends: an enrichment agent that walks a BigQuery dataset and drafts an OKF document for every table, and a self-contained static HTML visualizer that turns any bundle into an interactive graph with no backend. It also published three ready-to-browse sample bundles. But the authors are explicit that the tools are proofs of concept; the format itself is the contribution.
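To give a feel for the producer side, here is a rough Python sketch of what drafting a concept document from a table schema might look like. It is not Google's enrichment agent and makes no BigQuery calls: the schema is a hand-written, hypothetical dictionary, and the function name `render_table_concept`, the "generated" tag, and the example.com URL are all assumptions. Values are written with `json.dumps` because JSON strings and arrays are also valid YAML, which avoids hand-rolled escaping.

```python
import json


def render_table_concept(name, description, resource, timestamp, columns):
    """Render one concept document (frontmatter plus schema table) as text."""
    frontmatter = {
        "type": "table",
        "title": name,
        "description": description,
        "resource": resource,
        "tags": ["generated"],  # hypothetical tag marking machine-drafted docs
        "timestamp": timestamp,
    }
    lines = ["---"]
    for key, value in frontmatter.items():
        # JSON scalars and arrays are valid YAML, so this quoting is safe.
        lines.append(f"{key}: {json.dumps(value)}")
    lines += [
        "---",
        "",
        f"# {name}",
        "",
        description,
        "",
        "| Column | Type | Description |",
        "| --- | --- | --- |",
    ]
    for column in columns:
        lines.append(f"| {column['name']} | {column['type']} | {column['description']} |")
    return "\n".join(lines) + "\n"


# Hypothetical schema standing in for what a warehouse export would return.
orders_doc = render_table_concept(
    name="orders",
    description="One row per customer order.",
    resource="https://example.com/warehouse/sales/orders",
    timestamp="2026-06-12T00:00:00Z",
    columns=[
        {"name": "order_id", "type": "INT64", "description": "Unique order identifier."},
        {"name": "customer_id", "type": "INT64", "description": "Customer who placed the order."},
    ],
)
print(orders_doc)
```

The output is an ordinary text file, so it can be committed to a repository and reviewed like any other change.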

7. What this signals for AI-ready businesses

Step back from the data-engineering specifics and OKF is a signal about the direction of the whole AI ecosystem, one every business should read. The signal is this: the knowledge AI systems rely on is moving toward open, structured, machine-readable, portable representations, and the organizations whose knowledge is already in that shape will be the ones AI agents can actually use. As AI agents increasingly mediate how customers discover, evaluate, and interact with businesses, being legible to those agents stops being a nicety and becomes table stakes.

The practical reading for a business isn't "go implement OKF tomorrow"; most companies aren't shipping BigQuery metadata bundles. It's that the principles OKF encodes are exactly the principles that make a business visible and usable in an AI-mediated world: structure your knowledge so machines can parse it, keep it in clean and portable formats rather than locked in proprietary silos, make relationships between concepts explicit, and treat your knowledge like code that's curated and versioned rather than scattered across tools and people's heads. A business whose product information, documentation, and expertise live in clean, structured, linkable form is one an AI agent, or an AI search engine, can read, trust, and surface. Organizations that align their content with the data sources LLMs crawl to verify B2B company information stay legible to these agents. A business whose knowledge is trapped in PDFs, screenshots, and tribal memory is invisible to exactly the systems that increasingly drive discovery.

8. OKF and GEO: the same thesis, one layer deeper

For anyone following generative engine optimization (GEO), the practice of structuring content so AI engines cite and surface it, OKF will feel like a familiar thesis taken one layer deeper. GEO is about making your public content legible and trustworthy to the LLMs behind AI search; OKF is about making any knowledge, public or internal, legible to the agents that consume it. Both rest on the same foundation: machines reward structure, clarity, explicit relationships, and portable formats, and they penalize fragmentation and opacity.

The connection is direct. The same discipline that gets a brand cited in an AI answer—clean structure, clear entities, explicit links between concepts, and machine-parseable evidence—is the discipline OKF formalizes for knowledge bundles. A business that has done the work to be AI-discoverable on its public surfaces has already internalized the mindset OKF encodes; a business that hasn't will find both equally foreign. OKF is, in effect, the data-layer expression of a truth GEO practitioners have been acting on for two years: in an AI-mediated world, how your knowledge is structured determines whether machines can use it, and that increasingly determines whether you're discovered at all.

9. How to start thinking in OKF terms

You don't need to adopt the specification to benefit from its lessons. A few practical moves any business can make, in rough order of accessibility:

Get knowledge out of silos: Favor markdown, structured text, and open formats over PDFs, screenshots, and proprietary tools for the knowledge that describes your business, products, and processes.
Make structure explicit: Use clear headings, consistent fields (what something is, what it relates to, when it was updated), and explicit links between related concepts.
Treat knowledge like code: Curate it, version it, keep it current, and store it where it can be maintained deliberately rather than scattered across drives and chat threads.
Add the metadata machines query: The fields OKF standardizes (type, title, description, resource link, tags, timestamp) are exactly the fields that make any knowledge findable and trustworthy to a machine.
For technical teams: Read the spec and try the reference tools. The OKF spec and sample bundles are on GitHub.
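As a small illustration of why consistent metadata pays off, the Python sketch below filters an in-memory catalog by type and tag. The catalog entries, the `path` key, and the function name `find_concepts` are hypothetical; in practice the dictionaries would come from parsing each document's frontmatter.

```python
# Hypothetical catalog: one dictionary of frontmatter fields per concept.
CATALOG = [
    {"path": "tables/orders.md", "type": "table", "title": "Orders",
     "tags": ["sales", "orders"]},
    {"path": "tables/customers.md", "type": "table", "title": "Customers",
     "tags": ["sales", "crm"]},
    {"path": "metrics/weekly_active_users.md", "type": "metric",
     "title": "Weekly active users", "tags": ["engagement"]},
]


def find_concepts(catalog, concept_type=None, tag=None):
    """Return the paths of entries matching every filter that was supplied."""
    matches = []
    for entry in catalog:
        if concept_type is not None and entry.get("type") != concept_type:
            continue
        if tag is not None and tag not in entry.get("tags", []):
            continue
        matches.append(entry["path"])
    return matches


sales_tables = find_concepts(CATALOG, concept_type="table", tag="sales")
metrics = find_concepts(CATALOG, concept_type="metric")
print(sales_tables)
print(metrics)
```

The same two questions ("what kind of thing is this?" and "what is it about?") are hard to answer for a PDF or a screenshot, which is the practical argument for adding the fields.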

10. How Gobiya helps you become machine-readable

OKF validates the core of what Gobiya builds for clients: businesses that are structured to be read, trusted, and surfaced by machines, not just by people. Our SEO and AI-engine discoverability work is built on exactly the principles OKF formalizes—clean structure, explicit entities and relationships, machine-parseable metadata, and portable, open formats—applied to the public surfaces where AI search engines discover and cite businesses.

We build sites on fast, clean, crawlable infrastructure so the structure is legible to agents and crawlers rather than buried in heavy, opaque pages, and we wire everything into native CRM and pipeline attribution so the visibility gains connect to real inquiries, not vanity metrics. The outcome pattern shows up in work like SmileCenter Dentistry's growth in search impressions. If you want your business structured to be legible to AI agents and search engines as that shift plays out, book a strategy call and ask about an AI-discoverability assessment.

11. The right call on knowledge formats

So what is the Open Knowledge Format, and why does it matter? It's Google Cloud's open, vendor-neutral specification for representing AI-relevant knowledge as a directory of markdown files with a little YAML frontmatter, formalizing the LLM-wiki pattern into something portable and interoperable. It matters because it's a clear signal of where the AI ecosystem is heading: toward open, structured, machine-readable knowledge, and toward rewarding the organizations whose knowledge already lives in that shape.

Two takeaways matter most. First: the value is in the format, not the platform, and the same logic applies to your own knowledge: the more it lives in clean, portable, structured form, the more usable it is to every AI system. Second: whether or not you implement the spec, the principles behind it (structure, explicit relationships, machine-readable metadata, knowledge treated like code) are the principles that determine whether your business is legible to AI agents and search engines. Getting your knowledge into that shape is the move.

12. Frequently Asked Questions

What is the Open Knowledge Format (OKF)?

OKF is an open specification introduced by Google Cloud on June 12, 2026, for representing the metadata, context, and curated knowledge that AI systems and agents need. OKF v0.1 represents knowledge as a directory of markdown files with YAML frontmatter and a small set of shared conventions, so knowledge written by one producer can be consumed by a different agent or tool without any translation layer.

What problem does OKF solve?

The fragmentation of internal knowledge. The context AI models need (table schemas, definitions, API notes) typically lives scattered across incompatible systems: metadata catalogs, wikis, code comments, and people's heads. OKF gives that knowledge a common, portable format so it stops being locked behind whichever tool created it.

Do I need to use Google Cloud to use OKF?

No. OKF is an open, vendor-neutral standard that works independently of any cloud, database, model provider, or agent framework. The spec and sample bundles are openly available on GitHub.

How does OKF relate to GEO (generative engine optimization)?

They share a thesis. GEO is about structuring your public content so AI search engines cite and surface it; OKF is about structuring knowledge, public or internal, so AI agents can consume it. Both rest on the same foundation: machines reward structure, clarity, explicit relationships, and portable formats.

13. Sources & further reading

Introducing the Open Knowledge Format — Google Cloud Blog — official announcement (June 12, 2026).
OKF specification and reference code — GitHub.
Andrej Karpathy — the LLM Wiki gist.
