Typed Markdown Collections Specification
Version: 0.1.0
Status: Draft
Last Updated: 2026-01-30
Abstract
This specification defines the behaviour of tools that treat folders of markdown files as typed, queryable data collections. It covers schema definition, field types, validation, querying, and CRUD operations.
Motivation
Markdown files with YAML frontmatter are a common way to store structured content. The pattern appears in static site generators, knowledge management tools like Obsidian, documentation systems, and increasingly in AI agent frameworks that use markdown for persistent state.
Each of these ecosystems has developed its own conventions for frontmatter structure, querying, and validation. This specification defines one coherent set of behaviours so that:
- A CLI tool and an editor plugin can operate on the same files with consistent semantics
- An AI agent can read and write markdown files that a human can also inspect and edit
- Tool authors have a behaviour contract to implement against rather than inventing new conventions
Intended implementers
CLI tools for querying, validating, and manipulating markdown collections from the command line.
Editor plugins (for Obsidian, VS Code, etc.) that provide validation, autocomplete, and query interfaces. The expression syntax is designed for compatibility with Obsidian Bases.
Libraries in various languages that other applications can use to work with typed markdown.
AI agent frameworks that need structured, human-readable persistent storage.
What a conforming tool does
A tool implementing this specification:
- Recognises collections by the presence of an `mdbase.yaml` config file
- Loads type definitions from markdown files in a designated folder
- Matches files to types based on explicit declaration or configurable rules
- Validates frontmatter against type schemas, reporting errors at configurable severity levels
- Executes queries using an expression language for filtering and sorting (with optional advanced features like grouping and summaries)
- Performs CRUD operations with validation, default values, and auto-generated fields
- Updates references when files are renamed, keeping links consistent across the collection
The specification defines the expected behaviour for each of these capabilities, along with conformance levels for partial implementations.
Design Principles
Files are the source of truth. Tools read from and write to the filesystem. Indexes and caches are derived and disposable.
Human-readable first. Tools should not require proprietary formats. A user with a text editor should be able to read and modify any file.
Progressive strictness. Tools should work on collections with no schema at all. Validation is opt-in and configurable.
Portable. Collections should work with any conforming tool. No vendor lock-in.
Git-friendly. All persistent state is text files suitable for version control.
How It Works
A collection is a folder with an `mdbase.yaml` marker
my-project/
├── mdbase.yaml # Marks this folder as a collection
├── _types/ # Type definitions (schemas)
│ ├── task.md
│ └── person.md
├── tasks/
│ ├── fix-bug.md # A record of type "task"
│ └── write-docs.md
└── people/
    └── alice.md # A record of type "person"
The minimal config just declares the spec version:
# mdbase.yaml
spec_version: "0.1.0"
Types are defined as markdown files
A type is a schema for a category of files. Types live in the _types/ folder and are themselves markdown — the frontmatter defines the schema, the body documents it.
---
name: task
fields:
  title:
    type: string
    required: true
  status:
    type: enum
    values: [open, in_progress, done]
    default: open
  priority:
    type: integer
    min: 1
    max: 5
  assignee:
    type: link
    target: person
---
# Task
A task represents a unit of work. Set `status` to track progress.
Records are markdown files with typed frontmatter
A file declares its type and provides field values in frontmatter. The body is free-form markdown.
---
type: task
title: Fix the login bug
status: in_progress
priority: 4
assignee: "[[alice]]"
tags: [bug, auth]
---
The login form throws a validation error when the email contains a `+` character.
Queries filter and sort records using expressions
Queries are YAML objects with optional clauses for filtering, sorting, and pagination:
query:
  types: [task]
  where:
    and:
      - 'status != "done"'
      - "priority >= 3"
  order_by:
    - field: due_date
      direction: asc
  limit: 20
The expression language supports field access, comparison, boolean logic, string and list methods, date arithmetic, and link traversal:
status == "open" && tags.contains("urgent")
due_date < today() + "7d"
assignee.asFile().team == "engineering"
Validation is progressive
Collections work with no types at all — every file is an untyped record. Types can be added incrementally, and validation severity is configurable per-collection or per-type (off, warn, error). Strictness controls whether unknown fields are allowed, warned, or rejected.
Links connect records across the collection
Records can reference each other using wikilinks ([[alice]]) or markdown links ([Alice](../people/alice.md)). When a file is renamed, conforming tools update all references automatically.
Conformance is levelled
Implementations don't need to support everything. Six conformance levels let tools start with basic CRUD (Level 1) and progressively add matching, querying, links, reference updates, and caching. See §14 for details.
Specification Structure
| Document | Description |
|---|---|
| 01-terminology.md | Definitions of key terms |
| 02-collection-layout.md | How tools identify and scan collections |
| 03-frontmatter.md | Frontmatter parsing, null semantics, serialization |
| 04-configuration.md | The mdbase.yaml configuration file |
| 05-types.md | Type definitions as markdown files |
| 06-matching.md | How tools match files to types |
| 07-field-types.md | Primitive and composite field types |
| 08-links.md | Link syntax, parsing, resolution |
| 09-validation.md | Validation levels and error reporting |
| 10-querying.md | Query model, filters, sorting |
| 11-expressions.md | Expression language for filters and formulas |
| 12-operations.md | Create, Read, Update, Delete, Rename |
| 13-caching.md | Optional caching and indexing |
| 14-conformance.md | Conformance levels and testing |
| 15-watching.md | Watch mode and change events |
| Appendix A | Complete examples |
| Appendix B | Formal expression grammar |
| Appendix C | Standard error codes |
| Appendix D | Compatibility with existing tools |
Versioning
This specification uses semantic versioning. The current version is 0.1.0, indicating a draft in active development. Breaking changes may occur before 1.0.0.
Tools should declare which specification version they implement and must reject configuration files with unsupported spec_version values (see §4.4).
License
This specification is released under CC BY 4.0.
1. Terminology
This section defines the key terms used throughout the specification. Understanding these definitions is essential for correctly interpreting the requirements.
Core Concepts
Collection
A directory (and optionally its subfolders) containing markdown files managed as a unit. A collection is identified by the presence of an mdbase.yaml configuration file at its root. The collection is the fundamental unit of organization—all operations, queries, and validations occur within the scope of a single collection.
Collection Root
The directory containing the mdbase.yaml configuration file. All paths in the specification are relative to this directory unless otherwise stated.
File (or Record)
A markdown file within a collection. Files have the extension .md (or optionally .mdx or .markdown if configured). Each file represents a single record in the collection—analogous to a row in a database table, but richer: it has structured frontmatter, unstructured body content, and file system metadata.
Frontmatter
YAML metadata at the beginning of a file, delimited by --- markers. The frontmatter is a YAML mapping (object) containing the structured fields of the record.
---
title: My Document
status: draft
tags: [important, review]
---
The body content begins here.
Body
The markdown content following the frontmatter. The body is treated as opaque content by default—this specification primarily concerns itself with frontmatter structure. Implementations MAY support body queries, but this is optional.
Type
A named schema defining the expected frontmatter fields, their types, constraints, and validation rules for a category of files. Types are themselves defined as markdown files in a designated folder, making them versionable and documentable. A file may be associated with zero, one, or multiple types.
Type Definition File
A markdown file in the types folder whose frontmatter defines a type schema. The body of the file can contain documentation, examples, and usage notes for the type.
Untyped File
A file that is not associated with any type. Untyped files are valid members of a collection—they simply have no schema constraints applied to them. This allows for gradual adoption of typing.
Config File
The mdbase.yaml file at the collection root. This file defines global settings, the location of type definitions, and collection-wide behavior. It does not contain type definitions themselves—those live in separate markdown files.
Operations and Queries
Expression
A string that evaluates to a value, used in query filters and computed formulas. Expressions follow the syntax defined in the Expression Language section.
Query
A request to retrieve files matching certain criteria. Queries can filter by type, field values, file metadata, and path patterns, with results sorted and paginated.
Formula
A computed field defined by an expression. Formulas are evaluated at query time and can be used for filtering, sorting, and display.
Validation
The process of checking whether a file's frontmatter conforms to the schemas of its matched types. Validation can report issues, warn, or fail operations depending on configuration.
Links and References
Link
A reference from one file to another, expressed in frontmatter or body content. Links can use wikilink syntax ([[target]]), markdown link syntax ([text](path.md)), or bare paths.
Resolution
The process of determining which file a link refers to. Resolution takes into account relative paths, collection-wide search, and optional type-scoped lookups.
Backlink
An incoming link—a reference to a file from another file. Backlinks require indexing to compute efficiently and are an optional feature.
Implementation Terms
Implementation
Any tool, library, or application that reads, writes, or operates on collections according to this specification.
Conformance Level
A defined subset of the specification that an implementation may claim to support. See Conformance for the defined levels.
Cache
An optional derived data store that accelerates queries. Caches MUST be rebuildable from the source files and MUST NOT be the source of truth.
RFC 2119 Keywords
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
In brief:
- MUST / REQUIRED / SHALL: Absolute requirement
- MUST NOT / SHALL NOT: Absolute prohibition
- SHOULD / RECOMMENDED: There may be valid reasons to ignore, but implications must be understood
- SHOULD NOT / NOT RECOMMENDED: There may be valid reasons to do it, but implications must be understood
- MAY / OPTIONAL: Truly optional; implementations may or may not include the feature
2. Collection Layout
This section defines how collections are identified, how files are discovered, and the overall structure of a compliant collection.
2.1 Identifying a Collection
A directory is recognized as a collection if and only if it contains a file named mdbase.yaml at its root. This file is the collection root marker. The directory containing this file is the collection root, and all paths in the specification are relative to this directory.
my-project/
├── mdbase.yaml # Collection root marker
├── _types/ # Type definition files (configurable location)
│ ├── task.md
│ └── note.md
├── tasks/
│ ├── fix-bug.md
│ └── write-docs.md
├── notes/
│ └── meeting-2024-01.md
└── README.md
If a directory does not contain mdbase.yaml, it is not a collection. Implementations MUST NOT treat arbitrary directories of markdown files as collections without this marker.
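The marker rule can be illustrated with a short, non-normative Python sketch. The function names are hypothetical, and walking upward from a starting path to find the nearest marker is an assumption about how a tool might locate its collection; the specification only defines what counts as a collection root.

```python
from pathlib import Path
from typing import Optional

MARKER = "mdbase.yaml"

def is_collection_root(directory: Path) -> bool:
    """A directory is a collection root only if it directly contains the marker."""
    return (directory / MARKER).is_file()

def find_collection_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor (or `start` itself) that is a collection root.

    Assumption: tools search upward from the working path. Returns None when
    no marker is found, in which case the path is not inside any collection.
    """
    current = (start if start.is_dir() else start.parent).resolve()
    for candidate in (current, *current.parents):
        if is_collection_root(candidate):
            return candidate
    return None
```

A `None` result corresponds to the rule above: without the marker, a directory of markdown files is not treated as a collection.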
2.2 File Discovery
Included Files
Implementations MUST scan the collection root and, by default, all subdirectories (recursively) for markdown files.
A file is considered a markdown file if:
- It has the extension `.md`, OR
- It has an extension listed in `settings.extensions` in the config file (e.g., `.mdx`, `.markdown`)
Implementations MUST treat files with these extensions as collection members (records).
Excluded Paths
Implementations MUST exclude certain paths from scanning:
- The `mdbase.yaml` config file itself (it is not a record)
- Paths listed in `settings.exclude` in the config file
- The types folder (by default `_types/`, configurable via `settings.types_folder`)
- The cache folder if present (by default `.mdbase/`)
Default exclusions (applied unless overridden):
- `.git`
- `node_modules`
- `.mdbase`
Subdirectory Scanning
By default, implementations MUST scan subdirectories recursively. This behavior can be disabled by setting settings.include_subfolders: false in the config file.
When subdirectory scanning is disabled, only files directly in the collection root are considered records.
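The discovery rules in this section can be sketched as follows. This is a non-normative illustration: `discover_records` and its parameters are hypothetical names that mirror the config settings, and exclude entries are treated as literal relative paths here (glob handling and nested collections are left out to keep the sketch short).

```python
import os
from pathlib import Path

DEFAULT_EXCLUDES = (".git", "node_modules", ".mdbase")

def discover_records(root, extensions=(), exclude=DEFAULT_EXCLUDES,
                     types_folder="_types", include_subfolders=True):
    """Return sorted collection-relative paths (forward slashes) of record files."""
    root = Path(root)
    # .md is always included; configured extensions may or may not carry a dot.
    suffixes = {".md"} | {"." + ext.lstrip(".") for ext in extensions}
    skipped = set(exclude) | {types_folder}
    records = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = Path(dirpath).relative_to(root)
        if include_subfolders:
            # Prune in place so os.walk never descends into excluded folders.
            dirnames[:] = [d for d in dirnames
                           if (rel_dir / d).as_posix() not in skipped]
        else:
            dirnames[:] = []  # only files directly in the root are records
        for name in filenames:
            rel = (rel_dir / name).as_posix()
            if Path(name).suffix in suffixes and rel not in skipped:
                records.append(rel)
    return sorted(records)
```

The `mdbase.yaml` file never appears in the result because it does not carry a markdown extension.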
2.3 The Types Folder
Type definitions are stored as markdown files in a designated folder. By default, this folder is _types/ at the collection root, but it can be configured via settings.types_folder.
# mdbase.yaml
settings:
  types_folder: "_schemas" # Use _schemas/ instead of _types/
The types folder:
- MUST be excluded from the regular file scan (type files are not records)
- MUST be scanned separately to load type definitions
- MAY contain subdirectories for organization (all `.md` files are scanned)
See Types for the format of type definition files.
2.4 Path Conventions
All paths in the specification and in queries use forward slashes (/) regardless of operating system. Implementations on Windows MUST normalize backslashes to forward slashes.
Paths are relative to the collection root unless explicitly stated otherwise.
Examples:
- `tasks/fix-bug.md` refers to a file in the `tasks` subdirectory
- `./sibling.md` in a link is relative to the containing file's directory
- `../other/file.md` in a link navigates up one directory
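A minimal, non-normative sketch of these conventions in Python follows. The function names are hypothetical, and the sketch does not guard against links that climb above the collection root.

```python
import posixpath

def normalize_path(path: str) -> str:
    """Convert backslashes to forward slashes, as required on Windows."""
    return path.replace("\\", "/")

def resolve_relative_link(containing_file: str, target: str) -> str:
    """Resolve a ./ or ../ link target against the containing file's directory.

    Both arguments and the result are collection-relative paths.
    """
    base = posixpath.dirname(normalize_path(containing_file))
    return posixpath.normpath(posixpath.join(base, normalize_path(target)))
```

For a link inside `tasks/fix-bug.md`, `./sibling.md` resolves to `tasks/sibling.md` and `../other/file.md` resolves to `other/file.md`.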
2.5 Reserved Names
The following names have special meaning and SHOULD NOT be used as regular record filenames:
| Name | Purpose |
|---|---|
| `mdbase.yaml` | Collection configuration |
| `_types/` | Default types folder (configurable) |
| `.mdbase/` | Default cache folder |
Implementations SHOULD warn if a user attempts to create a record with a reserved name.
2.6 Minimal Collection Example
The simplest valid collection consists of a config file and zero or more markdown files:
minimal/
├── mdbase.yaml
└── hello.md
mdbase.yaml:
spec_version: "0.1.0"
hello.md:
---
title: Hello World
---
This is a minimal collection with one untyped file.
This collection has no types defined. The single file is untyped but still a valid record. As the collection grows, types can be added incrementally.
2.7 Recommended Collection Structure
While the specification allows flexibility, the following structure is recommended for clarity:
my-collection/
├── mdbase.yaml # Required: collection configuration
├── _types/ # Type definitions
│ ├── task.md
│ ├── note.md
│ └── person.md
├── .mdbase/ # Cache (gitignored)
│ └── index.sqlite
├── tasks/ # Records organized by type or purpose
│ ├── task-001.md
│ └── task-002.md
├── notes/
│ └── weekly-review.md
├── people/
│ └── alice.md
└── README.md # Documentation (also a record unless excluded)
Note that README.md is a valid record in this structure. If you want to exclude documentation files from the collection, add them to settings.exclude.
2.8 Nested Collections
If a subdirectory within a collection also contains an mdbase.yaml file, it defines an independent nested collection:
- The parent collection's scan MUST NOT descend into the nested collection.
- The nested collection's files are NOT records of the parent collection.
- The nested `mdbase.yaml` acts as a boundary marker, similar to exclude patterns.
- Implementations SHOULD automatically exclude directories containing `mdbase.yaml`.
my-collection/
├── mdbase.yaml # Parent collection
├── tasks/
│ └── task-001.md # Record in parent
└── sub-project/
    ├── mdbase.yaml # Nested collection (independent)
    └── docs/
        └── readme.md # Record in sub-project, NOT in parent
This behavior ensures that collections remain self-contained and do not interfere with each other.
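As a non-normative illustration, the boundary rule can be implemented by pruning the directory walk whenever a child directory contains its own marker. The function name `walk_without_nested` is hypothetical, and the sketch considers only `.md` files.

```python
import os
from pathlib import Path

def walk_without_nested(root):
    """Yield collection-relative .md paths, stopping at nested collection roots."""
    root = Path(root)
    for dirpath, dirnames, filenames in os.walk(root):
        here = Path(dirpath)
        # A child directory with its own mdbase.yaml is an independent
        # collection, so the parent scan must not descend into it.
        dirnames[:] = [d for d in dirnames
                       if not (here / d / "mdbase.yaml").is_file()]
        for name in sorted(filenames):
            if name.endswith(".md"):
                yield (here / name).relative_to(root).as_posix()
```

Applied to the tree above, the walk yields `tasks/task-001.md` and never visits `sub-project/`.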
2.9 Non-Markdown Files
Collections may contain non-markdown files such as images, PDFs, or other binary assets. These files are NOT records — they have no frontmatter and are not returned by queries.
Status in the Collection
- Non-markdown files are valid link targets: `[[photo.png]]` and markdown-style links to the same file resolve normally
- They appear in `file.links` when referenced via non-embed link syntax (e.g., `[[photo.png]]`)
- They appear in `file.embeds` when referenced via embed syntax (`![[...]]` or the markdown image form)
- Non-markdown files are not assigned types and cannot be validated against type schemas
File Discovery
- File discovery MUST skip non-markdown files when building the record set
- File discovery MUST NOT skip non-markdown files during link resolution — links to images, PDFs, and other assets MUST resolve by path
- Implementations MUST resolve links to non-markdown files by path only (no `id_field` lookup, since non-markdown files are not records and have no frontmatter)
Example
my-collection/
├── mdbase.yaml
├── tasks/
│ └── task-001.md # Record (included in queries)
├── images/
│ ├── diagram.png # Not a record, but valid link target
│ └── screenshot.jpg # Not a record, but valid link target
└── attachments/
    └── report.pdf # Not a record, but valid link target
In task-001.md:
---
type: task
title: Fix the layout
---
See the [[images/diagram.png]] for reference.
![[images/screenshot.jpg]]
Here, diagram.png appears in file.links because it uses non-embed link syntax; screenshot.jpg appears in file.embeds because it uses embed syntax.
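The split between `file.links` and `file.embeds` can be sketched as below. This is a non-normative illustration limited to wikilink syntax: the regular expression and the function name are assumptions, and aliases, anchors, and markdown-style links are not handled.

```python
import re

# An optional leading "!" marks an embed; the target is the text inside [[...]].
WIKILINK = re.compile(r"(!?)\[\[([^\[\]]+)\]\]")

def classify_wikilinks(body: str):
    """Return (links, embeds) as lists of raw wikilink targets found in `body`."""
    links, embeds = [], []
    for bang, target in WIKILINK.findall(body):
        (embeds if bang else links).append(target)
    return links, embeds
```

Run on the body of `task-001.md`, this places `images/diagram.png` in the links list and `images/screenshot.jpg` in the embeds list.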
3. Frontmatter Parsing and Serialization
This section defines how frontmatter is parsed from files and how it should be written back. Correct handling of YAML edge cases—especially null values and empty fields—is essential for interoperability.
3.1 Frontmatter Delimiters
A file MAY begin with YAML frontmatter. Frontmatter is delimited by two lines consisting of exactly three hyphens (---):
---
title: My Document
status: draft
---
# Heading
Body content begins here.
Rules:
- The opening `---` MUST be the very first line of the file (no leading whitespace or blank lines).
- The closing `---` MUST be on its own line.
- The content between the delimiters MUST be valid YAML.
- If a file does not begin with `---`, it has no frontmatter. The entire file is treated as body content, and the record has an empty frontmatter mapping (`{}`).
Examples of files without frontmatter:
# Just a heading
No frontmatter here.
---
This is not frontmatter because there's a blank line before the dashes.
---
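The delimiter rules can be sketched as a small splitting function. This is a non-normative illustration: `split_frontmatter` is a hypothetical name, YAML parsing is left to the caller, and treating an unterminated opening delimiter as "no frontmatter" is an assumption, since this section does not define that case.

```python
def split_frontmatter(text: str):
    """Return (frontmatter_yaml, body); frontmatter_yaml is None when absent."""
    lines = text.splitlines(keepends=True)
    # The opening delimiter must be the very first line, with nothing before it.
    if not lines or lines[0].rstrip("\r\n") != "---":
        return None, text
    for i in range(1, len(lines)):
        # The closing delimiter must be on its own line.
        if lines[i].rstrip("\r\n") == "---":
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    # Assumption: no closing delimiter means the whole file is body content.
    return None, text
```

A file that starts with a blank line or with leading whitespace before the dashes falls through the first check and is returned unchanged as body content.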
3.2 YAML Parsing Requirements
Top-Level Structure
The frontmatter MUST parse as a YAML mapping (object/dictionary).
If the frontmatter parses as a different YAML type (scalar, list, null), implementations MUST treat it as invalid frontmatter and handle according to the validation level:
- `off`: Treat as empty frontmatter, log warning
- `warn`: Treat as empty frontmatter, emit warning
- `error`: Fail the operation
Invalid example:
---
- item1
- item2
---
This is a YAML list, not a mapping, and is invalid frontmatter.
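The handling of a non-mapping top-level value can be sketched as follows. This is a non-normative illustration: the names are hypothetical, the input is assumed to be the already-parsed YAML value, and the reading that `off` logs internally while `warn` returns the warning to the caller is an interpretation of "log" versus "emit".

```python
import logging

logger = logging.getLogger("mdbase.sketch")

class FrontmatterError(ValueError):
    """Raised at validation level "error" for non-mapping frontmatter."""

def ensure_mapping(parsed, level="warn"):
    """Return (frontmatter, warnings), or raise at level "error"."""
    if isinstance(parsed, dict):
        return parsed, []
    message = f"frontmatter must be a YAML mapping, got {type(parsed).__name__}"
    if level == "error":
        raise FrontmatterError(message)
    if level == "warn":
        return {}, [message]   # treated as empty; warning surfaced to the caller
    if level == "off":
        logger.warning(message)  # treated as empty; warning only logged
        return {}, []
    raise ValueError(f"unknown validation level: {level!r}")
```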
YAML Version
Implementations SHOULD support YAML 1.2. Implementations MAY support YAML 1.1 for compatibility with existing tools, but SHOULD prefer 1.2 semantics where they differ.
Character Encoding
Files MUST be UTF-8 encoded. Implementations MUST reject files with invalid UTF-8 sequences.
3.3 Null and Empty Value Semantics
Correct handling of null and empty values is critical for interoperability. This section defines canonical behavior that all implementations MUST follow.
Reading Null Values
The following YAML patterns MUST all be interpreted as null:
# Explicit null keyword
field1: null
field2: Null
field3: NULL
# Tilde (YAML null alias)
field4: ~
# Empty value (key with no value)
field5:
# Explicit empty (flow style)
field6:
All of the above result in the field having the value null.
Empty String vs Null
An empty string is distinct from null:
# This is null:
empty_null:
# This is an empty string (not null):
empty_string: ""
# This is also an empty string:
empty_quoted: ''
Implementations MUST preserve this distinction. A field with value "" is a present field with an empty string value. A field with value null (or empty) is a present field with no value.
Missing vs Null
A missing field (key not present) is distinct from a null field (key present with null value):
---
present_null: null
present_string: "hello"
# 'missing_field' is not here
---
- `present_null` is present with value null
- `present_string` is present with value "hello"
- `missing_field` is missing (not present at all)
This distinction matters for:
- The `exists()` function in expressions (returns `true` when key is present, even if null)
- The `isEmpty()` method (returns `true` when value is null, empty, or missing)
- The `required` constraint (requires present and non-null)
- Default value application (applies to missing, not to null)
Summary Table
| YAML | Parsed Value | `exists(field)` | Satisfies `required`? (before defaults) |
|---|---|---|---|
| `field: null` | null | true | No |
| `field: ~` | null | true | No |
| `field:` | null | true | No |
| `field: ""` | `""` (empty string) | true | Yes (string value) |
| (key absent) | undefined | false | No |
Presence vs Meaningful Value
- `exists(field)` is true when the key is present in raw persisted frontmatter, even if the value is `null`.
- `field.isEmpty()` is true when the value is `null`, empty, or missing.
- `required: true` requires the key to be present in the effective frontmatter and the value to be non-null (see §9.2.1).
Implementations MUST preserve these distinctions in validation and query evaluation.
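The three notions can be kept apart with a short, non-normative Python sketch over a parsed frontmatter mapping, where YAML null is represented as `None`. The function names are hypothetical, and the set of values counted as "empty" (empty string, list, or mapping) is an assumption; the authoritative definition belongs to the expression language section.

```python
def exists(frontmatter: dict, key: str) -> bool:
    """True when the key is present, even if its value is null."""
    return key in frontmatter

def is_empty(frontmatter: dict, key: str) -> bool:
    """True when the value is missing, null, or an empty string/list/mapping."""
    if key not in frontmatter or frontmatter[key] is None:
        return True
    value = frontmatter[key]
    return isinstance(value, (str, list, dict)) and len(value) == 0

def satisfies_required(frontmatter: dict, key: str) -> bool:
    """True when the key is present and its value is non-null."""
    return frontmatter.get(key) is not None
```

An empty string is therefore both empty and sufficient for `required`, which matches the summary table.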
3.4 Writing Frontmatter
When implementations write or update frontmatter, they MUST follow these rules to ensure consistency and avoid ambiguity.
Never Write Empty-Value Nulls
Implementations MUST NOT write the "empty value" null form:
# ❌ NEVER write this
field:
This form is ambiguous in some contexts and causes issues with YAML tools that normalize whitespace.
Writing Null Values
When a field's value is null and the field should be written, implementations MUST use one of:
Option 1: Explicit null (preferred when preserving the field)
field: null
Option 2: Omit the field entirely (preferred when null means "no value")
# field is simply not present
The choice between these options is controlled by settings.write_nulls:
| `write_nulls` | Behavior |
|---|---|
| `"omit"` (default) | Omit fields with null values |
| `"explicit"` | Write `field: null` |
Writing Empty Strings
Empty strings MUST be written with explicit quotes:
field: ""
Writing Empty Lists
Empty lists can be written as [] or omitted, controlled by settings.write_empty_lists:
| `write_empty_lists` | Behavior |
|---|---|
| `true` (default) | Write `field: []` |
| `false` | Omit fields with empty list values |
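The writing rules above can be sketched as a per-field renderer. This is a non-normative illustration: `render_field` is a hypothetical name, keys that need quoting are not handled, and other values are serialized with `json.dumps`, which yields JSON-style flow syntax and is only a stand-in for a real YAML emitter.

```python
import json

def render_field(key, value, write_nulls="omit", write_empty_lists=True):
    """Return the YAML line for one field, or None when the field is omitted."""
    if value is None:
        # Never emit the bare "field:" form; either omit or write an explicit null.
        return f"{key}: null" if write_nulls == "explicit" else None
    if isinstance(value, str) and value == "":
        return f'{key}: ""'  # empty strings always carry explicit quotes
    if isinstance(value, list) and not value:
        return f"{key}: []" if write_empty_lists else None
    return f"{key}: {json.dumps(value)}"  # assumption: JSON flow style as a stand-in
```

With the default settings, a null field disappears from the output while an empty list is kept as `[]`.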
3.5 Formatting Preservation
When updating a file, implementations SHOULD preserve as much of the original formatting as practical:
SHOULD Preserve
- Field order: Keep fields in their original order when possible
- Blank lines: Preserve blank lines within frontmatter (YAML allows them)
- String style: If a string was written with quotes, keep the quotes
- Comment proximity: Keep comments near their associated fields
MUST Preserve
- Body content: The body MUST NOT be modified by frontmatter updates (except when explicitly changing the body)
- Line endings: Preserve the file's line ending style (LF vs CRLF)
MAY Normalize
Implementations MAY normalize:
- Indentation (2 spaces is conventional)
- Trailing whitespace
- Final newline (files SHOULD end with a newline)
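The two MUST-level guarantees (untouched body, preserved line endings) can be sketched as follows. This is a non-normative illustration with hypothetical names; it takes already-rendered frontmatter lines and does not attempt to preserve comments or field order. Treating a file with any CRLF as a CRLF file is an assumption for mixed-ending inputs.

```python
def detect_newline(text: str) -> str:
    """Return the file's line ending style; CRLF wins if any CRLF is present."""
    return "\r\n" if "\r\n" in text else "\n"

def replace_frontmatter(original: str, new_yaml_lines):
    """Rebuild the file with new frontmatter, leaving the body exactly as it was."""
    newline = detect_newline(original)
    lines = original.splitlines(keepends=True)
    body = original  # no frontmatter: the whole file is body
    if lines and lines[0].rstrip("\r\n") == "---":
        for i in range(1, len(lines)):
            if lines[i].rstrip("\r\n") == "---":
                body = "".join(lines[i + 1:])
                break
    header = newline.join(["---", *new_yaml_lines, "---"]) + newline
    return header + body
```

When reading and writing the file in Python, opening it with `newline=""` keeps the original line endings from being translated.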
3.6 Special Characters in Field Names
Field names containing special characters MUST be quoted in YAML:
"field-with-dashes": value
"field.with.dots": value
"field:with:colons": value
In expressions and queries, such fields are accessed with bracket notation:
note["field-with-dashes"]
Implementations SHOULD avoid requiring special characters in schema-defined field names. User-defined fields may use them.
3.7 Multi-line Strings
YAML supports several multi-line string formats. Implementations MUST support all standard YAML multi-line syntaxes:
Literal block (preserves newlines):
description: |
  This is a multi-line string.
  Line breaks are preserved.
Folded block (newlines become spaces):
description: >
  This is a long line that will be
  folded into a single line.
Quoted strings with escapes:
description: "Line 1\nLine 2"
When writing multi-line values, implementations SHOULD use literal block style (|) for readability.
3.8 Type Coercion
YAML has automatic type inference that can cause surprises. Implementations MUST be aware of these patterns:
| YAML Value | YAML Type | Notes |
|---|---|---|
| `true`, `false` | Boolean | |
| `yes`, `no` | Boolean (YAML 1.1) | Avoid; prefer `true`/`false` |
| `on`, `off` | Boolean (YAML 1.1) | Avoid; prefer `true`/`false` |
| `123` | Integer | |
| `12.5` | Float | |
| `1e10` | Float (scientific) | |
| `0x1A` | Integer (hex) | |
| `.inf`, `-.inf` | Float (infinity) | |
| `.nan` | Float (NaN) | |
| `2024-01-15` | Date (if YAML date extension) | Implementations MAY parse as date |
| `null`, `~` | Null | |
| `"123"` | String | Quoted values are strings |
When a schema specifies a type, implementations MUST coerce compatible values (e.g., reading 123 for a string field as "123"). When coercion is not possible, it is a validation error.
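The one coercion named here, an integer read for a string field, can be sketched as below. This is a non-normative illustration with hypothetical names. Only the integer-to-string case comes from the text; rejecting every other input type, including booleans, is an assumption, and the full compatibility rules belong to the field types section.

```python
class CoercionError(ValueError):
    """Signals a value that cannot be coerced; maps to a validation error."""

def coerce_to_string(value):
    """Coerce a parsed YAML value for a field declared as `type: string`."""
    if isinstance(value, str):
        return value
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)
    raise CoercionError(f"cannot coerce {type(value).__name__} to string")
```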
3.9 Example: Round-Trip Preservation
Given this input file:
---
title: My Task
status: open
tags:
  - important
  - review
due_date: 2024-03-15
notes: |
  This is a longer note.
  It spans multiple lines.
---
# Task Details
The body content here.
After updating status to "done", the output SHOULD be:
---
title: My Task
status: done
tags:
  - important
  - review
due_date: 2024-03-15
notes: |
  This is a longer note.
  It spans multiple lines.
---
# Task Details
The body content here.
Note that:
- Field order is preserved
- Multi-line string style is preserved
- List formatting is preserved
- Body content is unchanged
4. Configuration
This section defines the structure and semantics of the mdbase.yaml configuration file that identifies and configures a collection.
4.0 File Encoding
The mdbase.yaml configuration file and all type definition files MUST be encoded in UTF-8 (consistent with the UTF-8 requirement for markdown files in §3.2).
4.1 File Location and Format
The configuration file MUST be named mdbase.yaml and MUST be located at the collection root. This file:
- Identifies the directory as a collection
- Declares the specification version
- Configures collection behavior
- Points to the types folder
The file MUST be valid YAML and MUST parse as a mapping at the top level.
4.2 Minimal Configuration
The simplest valid configuration declares only the specification version:
spec_version: "0.1.0"
This creates a collection with all default settings and no types (all files are untyped).
4.3 Full Configuration Schema
# =============================================================================
# REQUIRED
# =============================================================================
# Specification version this configuration conforms to
# Implementations MUST reject versions they do not support
spec_version: "0.1.0"
# =============================================================================
# OPTIONAL: Collection Metadata
# =============================================================================
# Human-readable name for the collection
name: "My Project Tasks"
# Description of the collection's purpose
description: "Task and note management for the My Project initiative"
# =============================================================================
# OPTIONAL: Settings
# =============================================================================
settings:
  # ---------------------------------------------------------------------------
  # File Discovery
  # ---------------------------------------------------------------------------
  # Additional file extensions to treat as markdown (beyond .md which is always included)
  # Default: []
  # Common additions: ["mdx", "markdown"]
  # Entries MAY include a leading dot; implementations MUST normalize to no-dot.
  extensions: ["mdx"]
  # Paths to exclude from scanning (relative to collection root)
  # Default: [".git", "node_modules", ".mdbase"]
  # Glob patterns are supported
  exclude:
    - ".git"
    - "node_modules"
    - ".mdbase"
    - "drafts/**"
    - "*.draft.md"
  # Whether to scan subdirectories recursively
  # Default: true
  include_subfolders: true
  # ---------------------------------------------------------------------------
  # Types Configuration
  # ---------------------------------------------------------------------------
  # Folder containing type definition files (relative to collection root)
  # Default: "_types"
  types_folder: "_types"
  # Frontmatter keys that explicitly declare a file's type(s)
  # If a file has any of these keys, its value determines the type(s)
  # Default: ["type", "types"]
  explicit_type_keys: ["type", "types"]
  # ---------------------------------------------------------------------------
  # Validation
  # ---------------------------------------------------------------------------
  # Default validation level for operations
  # "off": No validation
  # "warn": Report issues but don't fail
  # "error": Report issues and fail operations
  # Default: "warn"
  default_validation: "warn"
  # Default strictness for types that don't specify their own
  # false: Extra fields allowed
  # true: Extra fields cause validation failure
  # "warn": Extra fields allowed but emit warning
  # Default: false
  default_strict: false
  # ---------------------------------------------------------------------------
  # Link Resolution
  # ---------------------------------------------------------------------------
  # Field name used as unique identifier for link resolution
  # When a link is a simple name (no path), implementations search for files
  # where this field matches the link target
  # Default: "id"
  id_field: "id"
  # ---------------------------------------------------------------------------
  # Write Behavior
  # ---------------------------------------------------------------------------
  # How to handle null values when writing frontmatter
  # "omit": Don't write fields with null values
  # "explicit": Write as `field: null`
  # Default: "omit"
  write_nulls: "omit"
  # Whether to write empty lists
  # true: Write as `field: []`
  # false: Omit fields with empty list values
  # Default: true
  write_empty_lists: true
  # ---------------------------------------------------------------------------
  # Rename Behavior
  # ---------------------------------------------------------------------------
  # Whether to update references across the collection when a file is renamed
  # Default: true
  rename_update_refs: true
  # ---------------------------------------------------------------------------
  # Caching
  # ---------------------------------------------------------------------------
  # Folder for cache files (relative to collection root)
  # Default: ".mdbase"
  cache_folder: ".mdbase"
4.4 Setting Details
`spec_version` (Required)
The version of this specification the configuration conforms to. Implementations MUST check this value and MUST reject configuration files with versions they do not support.
Valid values: "0.1.0"
Compatibility: Implementations MAY accept "0.1" as an alias for "0.1.0", but SHOULD emit a warning and normalize to "0.1.0" when writing.
4.4.1 Version Compatibility
The spec_version field uses semantic versioning (MAJOR.MINOR.PATCH):
PATCH bumps (e.g., 0.1.0 → 0.1.1): Clarifications and errata only. No behavioral changes. All implementations of X.Y.z are compatible with any X.Y.z′.
MINOR bumps (e.g., 0.1.0 → 0.2.0): Additive changes only — new optional fields, new expression functions, new config keys. Implementations of X.Y MUST ignore unknown config keys introduced in X.Y+1 rather than rejecting them. Collections authored for X.Y work on X.Y+N without modification.
MAJOR bumps (e.g., 0.x → 1.0): Breaking changes. Implementations MUST reject configuration files with a different major version than the one they support.
During the 0.x series: MINOR bumps MAY contain breaking changes. Implementations SHOULD treat 0.x and 0.y (x ≠ y) as potentially incompatible.
Unknown keys: Implementations MUST ignore unknown keys under settings with a warning, to support forward compatibility within a major version. Unknown top-level keys (outside settings) MUST also be ignored with a warning.
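The compatibility rules above can be illustrated with a short, non-normative Python sketch. The function name `check_spec_version` and the use of `ValueError` are assumptions made for illustration, and the sketch takes the stricter reading of the 0.x rule by rejecting a differing minor version outright.

```python
def check_spec_version(declared: str, supported: str = "0.1.0") -> tuple[str, list[str]]:
    """Return (normalized_version, warnings) or raise ValueError if unsupported."""
    warnings: list[str] = []
    if declared == "0.1":  # optional alias for "0.1.0"
        warnings.append('spec_version "0.1" normalized to "0.1.0"')
        declared = "0.1.0"
    parts = declared.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"malformed spec_version: {declared!r}")
    major, minor, _patch = (int(p) for p in parts)
    sup_major, sup_minor, _ = (int(p) for p in supported.split("."))
    if major != sup_major:
        raise ValueError(f"unsupported major version: {declared}")
    if major == 0 and minor != sup_minor:
        # During 0.x, minor bumps may be breaking (assumption: reject them).
        raise ValueError(f"potentially incompatible 0.x version: {declared}")
    if minor > sup_minor:
        warnings.append("newer minor version: unknown keys will be ignored")
    return declared, warnings
```

Patch-level differences pass through unchanged, since they carry no behavioral changes.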
`name` and `description`
Human-readable metadata about the collection. These have no semantic effect but are useful for documentation and tooling that displays collection information.
`settings.extensions`
File extensions to scan. The extension .md is always implicitly included. This setting specifies additional extensions beyond .md:
Default: []
Normalization:
- Implementations MUST treat entries with or without a leading dot as equivalent.
- The `.md` extension is always implicitly included and MUST NOT be required in this list.
- If `md` or `.md` appears in `extensions`, it SHOULD be ignored with a warning.
Example: To include MDX files:
settings:
  extensions: ["mdx"]
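As a non-normative illustration of the normalization rules, the following Python sketch strips leading dots, drops redundant `md` entries with a warning, and always includes `md`. The function name and the warning text are hypothetical.

```python
def normalize_extensions(configured: list[str]) -> tuple[list[str], list[str]]:
    """Return (extensions_to_scan, warnings); "md" is always included."""
    warnings: list[str] = []
    result = ["md"]
    for entry in configured:
        # "mdx" and ".mdx" are equivalent
        ext = entry[1:] if entry.startswith(".") else entry
        if ext == "md":
            warnings.append(f"{entry!r} in extensions is redundant and ignored")
            continue
        if ext not in result:
            result.append(ext)
    return result, warnings
```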
`settings.exclude`
Paths or glob patterns to exclude from file scanning. Paths are relative to the collection root.
Default: [".git", "node_modules", ".mdbase"]
Glob patterns:
- `*` matches any characters except `/`
- `**` matches any characters including `/`
- `?` matches a single character
Example:
settings:
  exclude:
    - ".git"
    - "node_modules"
    - "*.draft.md"   # Exclude all draft files
    - "archive/**"   # Exclude everything in archive/
`settings.types_folder`
The folder containing type definition files. Type files are markdown files whose frontmatter defines a schema.
Default: "_types"
The types folder:
- Is automatically excluded from the regular file scan
- Is scanned separately to load type definitions
- May contain subdirectories (all `.md` files are processed)
`settings.explicit_type_keys`
Frontmatter keys that can explicitly declare a file's type(s). When a file has one of these keys, its value determines the type assignment, overriding any match rules.
Default: ["type", "types"]
Usage:
# Single type
type: task
# Multiple types
types: [task, urgent]
Using type as a normal field: If you want a frontmatter field named type to be treated as ordinary data, remove it from settings.explicit_type_keys and choose different declaration keys (e.g., kind, kinds).
`settings.default_validation`
The default validation level applied when not otherwise specified.
| Value | Behavior |
|---|---|
| `"off"` | No validation performed |
| `"warn"` | Validation issues are reported but operations succeed |
| `"error"` | Validation issues cause operations to fail |
Default: "warn"
`settings.default_strict`
Default strictness mode for types that don't declare their own.
| Value | Behavior |
|---|---|
| `false` | Unknown fields are allowed |
| `"warn"` | Unknown fields are allowed but trigger warnings |
| `true` | Unknown fields cause validation failure |
Default: false
`settings.id_field`
The field name used as a unique identifier for link resolution. When a link is a simple name (not a path), implementations search for files where this field matches.
Uniqueness requirement: Values of the id_field MUST be unique across the collection.
Implementations MUST validate uniqueness and report duplicate_id issues when multiple
files share the same id_field value.
Default: "id"
Example: With id_field: "id", the link [[task-001]] would resolve to a file with id: task-001 in its frontmatter.
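The uniqueness check can be sketched as follows (non-normative). The input shape, a mapping from file path to parsed frontmatter, and the shape of the returned issue records are assumptions; only the `duplicate_id` code comes from this specification.

```python
from collections import defaultdict


def find_duplicate_ids(files: dict[str, dict], id_field: str = "id") -> list[dict]:
    """Report one duplicate_id issue per id value shared by several files."""
    paths_by_id: dict[object, list[str]] = defaultdict(list)
    for path, frontmatter in files.items():
        value = frontmatter.get(id_field)
        if value is not None:  # files without an id cannot collide
            paths_by_id[value].append(path)
    return [
        {"code": "duplicate_id", "value": value, "paths": sorted(paths)}
        for value, paths in paths_by_id.items()
        if len(paths) > 1
    ]
```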
`settings.write_nulls`
Controls how null values are written to frontmatter.
| Value | Behavior |
|---|---|
| `"omit"` | Fields with null values are not written |
| `"explicit"` | Null values are written as `field: null` |
Default: "omit"
`settings.rename_update_refs`
Whether renaming a file automatically updates references to it across the collection.
Default: true
When enabled, implementations MUST update:
- Link fields in frontmatter that resolve to the renamed file
- Link syntax in body content that references the renamed file
See Operations for details.
4.5 Configuration Validation
Implementations MUST validate the configuration file before processing the collection. Validation checks:
- Structure: The file parses as valid YAML with a mapping at the top level
- Required fields: `spec_version` is present
- Type correctness: Each field has the expected type
- Valid values: Enum fields have allowed values
- Path validity: Paths in `exclude`, `types_folder`, etc. are syntactically valid
If validation fails, implementations MUST NOT process the collection and MUST report the error clearly.
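A minimal, non-normative sketch of these checks is shown below. It assumes the YAML has already been parsed into Python objects by a YAML library of the implementer's choice, covers only a subset of settings, and omits the path-validity check. The function name and error messages are hypothetical.

```python
EXPECTED_TYPES = {
    "extensions": list,
    "exclude": list,
    "types_folder": str,
    "explicit_type_keys": list,
    "id_field": str,
}
ALLOWED_VALUES = {
    "default_validation": ("off", "warn", "error"),
    "default_strict": (True, False, "warn"),
    "write_nulls": ("omit", "explicit"),
}


def validate_config(config: object) -> list[str]:
    """Return a list of error messages; an empty list means the config is valid."""
    if not isinstance(config, dict):
        return ["top level of the config must be a mapping"]
    errors: list[str] = []
    if "spec_version" not in config:
        errors.append("spec_version is required")
    settings = config.get("settings", {})
    if not isinstance(settings, dict):
        return errors + ["settings must be a mapping"]
    for key, expected in EXPECTED_TYPES.items():
        if key in settings and not isinstance(settings[key], expected):
            errors.append(f"settings.{key} must be of type {expected.__name__}")
    for key, allowed in ALLOWED_VALUES.items():
        if key in settings and settings[key] not in allowed:
            errors.append(f"settings.{key} must be one of {allowed}")
    return errors
```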
4.6 Configuration Examples
Minimal
spec_version: "0.1.0"
Standard Project
spec_version: "0.1.0"
name: "Project Documentation"
description: "Specs, decisions, and meeting notes"
settings:
  exclude:
    - ".git"
    - "node_modules"
    - "drafts/**"
  default_validation: "error"
Knowledge Base with Custom Types Folder
spec_version: "0.1.0"
name: "Personal Knowledge Base"
settings:
  types_folder: "schemas"
  extensions: ["mdx"]
  default_strict: "warn"
  id_field: "uid"
Strict Validation
spec_version: "0.1.0"
name: "Production Data"
settings:
  default_validation: "error"
  default_strict: true
  write_nulls: "explicit"
4.7 Environment Variables (Optional)
Implementations MAY support environment variable substitution in configuration values using ${VAR} syntax (and MAY also support ${VAR:-default} for default values):
settings:
  cache_folder: "${MDBASE_CACHE:-/tmp/mdbase}"
This feature is OPTIONAL. If not supported, implementations MUST treat ${...} as literal strings. If ${VAR:-default} is not supported, implementations MUST treat the entire string literally (no partial expansion).
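For implementations that do support substitution, a non-normative Python sketch follows. Two behaviours are assumptions rather than requirements of this specification: an unset variable without a default is left as the literal `${VAR}` text, and an empty value falls back to the default (mirroring shell `:-` semantics). The environment is passed in explicitly so that only referenced variables are ever read.

```python
import re
from typing import Mapping

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(value: str, env: Mapping[str, str]) -> str:
    """Expand ${VAR} and ${VAR:-default} references in a config string."""

    def replace(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        current = env.get(name)
        if current:  # set and non-empty
            return current
        if default is not None:
            return default
        return match.group(0)  # assumption: leave unresolved references literal

    return _VAR.sub(replace, value)
```

In practice `env` would typically be `os.environ`.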
4.8 Security Considerations
Regular Expressions
Match rules, field constraints (pattern), and expressions (the matches operator) may contain regular expressions.
Required baseline: Implementations MUST support ECMAScript (ES2018+) regular expression syntax as the baseline flavor. This aligns with JavaScript-based tools (e.g., Obsidian); the required features listed below are also supported, natively or through a library, in most major programming languages.
Required features (MUST support):
| Feature | Syntax | Example |
|---|---|---|
| Character classes | `[abc]`, `[^abc]`, `\d`, `\w`, `\s` | `\d{4}` |
| Quantifiers | `*`, `+`, `?`, `{n}`, `{n,m}` | `\w+` |
| Alternation | `\|` | `cat\|dog` |
| Anchors | `^`, `$` | `^TASK-` |
| Capturing groups | `(...)` | `(\d+)-(\d+)` |
| Non-capturing groups | `(?:...)` | `(?:foo\|bar)` |
| Lookahead | `(?=...)`, `(?!...)` | `\d+(?= items)` |
Optional features (SHOULD support):
| Feature | Syntax | Notes |
|---|---|---|
| Lookbehind | `(?<=...)`, `(?<!...)` | Supported in ES2018 but not in RE2-based engines |
| Named groups | `(?<name>...)` | Supported in ES2018; some other engines accept only the `(?P<name>...)` spelling |
Implementations that do not support optional features MUST reject patterns using those features with a clear error rather than silently ignoring them.
ReDoS mitigations: Implementations SHOULD guard against Regular Expression Denial of Service (ReDoS) by:
- Setting timeouts on regex evaluation
- Rejecting patterns with known dangerous constructs (e.g., nested quantifiers)
- Documenting any regex restrictions
Environment Variables
If an implementation supports environment variable expansion in configuration (e.g., ${VAR}), it MUST:
- Only expand variables explicitly referenced in configuration
- Never log expanded values that may contain secrets
- Document which config fields support expansion
Expression Evaluation
Implementations SHOULD set resource limits on expression evaluation:
- Maximum expression nesting depth
- Maximum number of function calls per evaluation
- Timeout for individual expression evaluations
These limits prevent pathological expressions from consuming unbounded resources.
5. Types
This section defines how types (schemas) are created, structured, and interpreted. In this specification, types are markdown files—they live in a designated folder, have frontmatter that defines the schema, and body content that provides documentation.
5.1 Types as Markdown Files
A type is defined by a markdown file in the types folder (default: _types/). The file's frontmatter contains the schema definition; the body contains documentation for the type.
Example: _types/task.md
---
name: task
description: A task or todo item with status tracking
extends: base
strict: false
fields:
  title:
    type: string
    required: true
    description: Short summary of the task
  status:
    type: enum
    values: [open, in_progress, blocked, done]
    default: open
  priority:
    type: integer
    min: 1
    max: 5
    default: 3
  due_date:
    type: date
  tags:
    type: list
    items:
      type: string
    default: []
  assignee:
    type: link
    target: person
---
# Task
A task represents a discrete unit of work that can be tracked through its lifecycle.
## Status Values
- **open**: Not yet started
- **in_progress**: Currently being worked on
- **blocked**: Cannot proceed due to external dependency
- **done**: Completed
## Usage
Tasks are typically stored in the `tasks/` folder. Example:
```yaml
---
type: task
title: Fix the login bug
status: in_progress
priority: 4
due_date: 2024-03-15
assignee: "[[alice]]"
tags: [bug, auth]
---
The login form throws an error when...
```
## Related Types
This approach has several benefits:
1. **Documentation lives with the schema**: The markdown body explains how to use the type
2. **Version control friendly**: Types are tracked like any other content
3. **Human readable**: Anyone can understand the type by reading the file
4. **Editable anywhere**: No special tooling required to modify schemas
5. **Meta-consistency**: Types use the same format as the content they describe
5.2 Type Definition Schema
The frontmatter of a type file MUST conform to this structure:
```yaml
# =============================================================================
# REQUIRED
# =============================================================================
# The type name (must match filename without extension)
name: task
# =============================================================================
# OPTIONAL: Metadata
# =============================================================================
# Human-readable description
description: "A task or todo item"
# Type to inherit fields from
extends: base
# Strictness mode (overrides settings.default_strict)
# false: Allow unknown fields
# true: Reject unknown fields
# "warn": Allow but warn about unknown fields
strict: false
# =============================================================================
# OPTIONAL: Matching Rules
# =============================================================================
# Rules for automatically associating files with this type
# If not specified, files must explicitly declare their type
match:
  path_glob: "tasks/**/*.md"
  fields_present: [status]
  where:
    # Field predicates using expression operators
    tags:
      contains: "task"
# =============================================================================
# OPTIONAL: Filename Pattern
# =============================================================================
# Pattern for validating/generating filenames
# Variables in {} reference field values
filename_pattern: "{id}.md"
# =============================================================================
# REQUIRED (unless extends provides all fields)
# =============================================================================
# Field definitions
fields:
  field_name:
    type: string
    required: false
    # ... field options
```
5.3 The `name` Field
Every type MUST have a name field that matches the filename (without extension).
_types/task.md → name: task
_types/person.md → name: person
If the name doesn't match the filename, implementations MUST emit a warning and use the name value as the canonical type name.
Names MUST:
- Consist of lowercase letters, numbers, hyphens, and underscores
- Start with a letter
- Not exceed 64 characters
Type names are canonicalized to lowercase. Implementations SHOULD treat
type names case-insensitively when reading frontmatter (type/types)
and SHOULD normalize them to lowercase for matching and output while
emitting a warning for non-canonical casing.
Reserved names (MUST NOT be used):
- Names starting with `_` (reserved for internal use)
- `file`, `formula`, `this` (reserved keywords in expressions)
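The naming rules can be checked with a small, non-normative Python sketch. The function name and the message wording are hypothetical.

```python
import re

# A lowercase letter followed by up to 63 lowercase letters, digits, "-" or "_"
_NAME = re.compile(r"[a-z][a-z0-9_-]{0,63}")
RESERVED_NAMES = {"file", "formula", "this"}


def type_name_problems(name: str) -> list[str]:
    """Return the reasons a type name is invalid (empty list if it is valid)."""
    problems: list[str] = []
    if name.startswith("_"):
        problems.append("names starting with '_' are reserved for internal use")
    elif not _NAME.fullmatch(name):
        problems.append(
            "name must start with a letter, use only lowercase letters, "
            "numbers, hyphens, and underscores, and be at most 64 characters"
        )
    if name in RESERVED_NAMES:
        problems.append(f"{name!r} is a reserved keyword in expressions")
    return problems
```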
5.4 Type Inheritance
Types MAY inherit from another type using the extends field:
# _types/base.md
---
name: base
fields:
  id:
    type: string
    required: true
  created_at:
    type: datetime
    generated: now
  updated_at:
    type: datetime
    generated: now_on_write
---
# _types/task.md
---
name: task
extends: base
fields:
  title:
    type: string
    required: true
  status:
    type: enum
    values: [open, done]
---
The task type inherits id, created_at, and updated_at from base, and adds title and status.
Inheritance Rules
- Single inheritance only: A type can extend at most one parent type
- Chains allowed: `task` extends `base` extends `root` is valid
- Field override: Child fields with the same name override parent fields completely
- Circular inheritance: MUST be detected and rejected with an error
- Missing parent: If the parent type doesn't exist, validation MUST fail
- Strictness: Child inherits parent's `strict` unless explicitly overridden
Field Override Example
# Parent defines priority as 1-3
# _types/base-task.md
fields:
  priority:
    type: integer
    min: 1
    max: 3
# Child redefines priority as 1-5
# _types/task.md
extends: base-task
fields:
  priority:
    type: integer
    min: 1
    max: 5   # Now allows 4 and 5
The child completely replaces the parent's field definition; constraints are not merged.
5.5 Strictness
The strict field controls how unknown fields are handled during validation:
| Value | Behavior |
|---|---|
| `false` | Unknown fields are allowed without warning |
| `"warn"` | Unknown fields are allowed but trigger warnings |
| `true` | Unknown fields cause validation failure |
Default: Inherits from settings.default_strict in the config (which defaults to false).
"Unknown fields" are fields in a file's frontmatter that are not defined in the type's schema (including inherited fields).
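A non-normative sketch of the strictness check is shown below. It assumes that the explicit type declaration keys are not themselves counted as unknown fields; this section does not state that, so the `ignored_keys` parameter is a labeled assumption, as are the function name and the `(severity, field)` result shape.

```python
def check_unknown_fields(
    frontmatter: dict,
    schema_fields: set[str],
    strict: object = False,
    ignored_keys: tuple[str, ...] = ("type", "types"),  # assumption, see lead-in
) -> list[tuple[str, str]]:
    """Return (severity, field_name) pairs for fields not defined in the schema."""
    if strict is False:
        return []  # unknown fields are allowed without warning
    severity = "error" if strict is True else "warning"  # otherwise strict == "warn"
    return [
        (severity, key)
        for key in frontmatter
        if key not in schema_fields and key not in ignored_keys
    ]
```

`schema_fields` is expected to contain inherited field names as well as the type's own.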
5.6 Filename Patterns
The optional filename_pattern defines expected filename structure:
filename_pattern: "{id}-{slug}.md"
Patterns use {} to reference field values. Common placeholders:
- `{id}`: The id field value
- `{slug}`: A URL-safe slug (implementations should slugify automatically)
- `{date}`: A date field formatted as YYYY-MM-DD
Use cases:
- Validation: Check that existing filenames match the pattern
- Generation: When creating new files, derive filename from field values
Slugification rules:
- Lowercase all characters
- Replace spaces and special characters with hyphens
- Remove consecutive hyphens
- Trim hyphens from start and end
- Unicode handling: Implementations MUST use Unicode-aware lowercasing (not locale-dependent). Non-ASCII letters SHOULD be transliterated to their ASCII equivalents where a well-known mapping exists (e.g., `ü` → `u`, `ñ` → `n`). Characters with no ASCII equivalent SHOULD be removed rather than replaced with hyphens.
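One way to realise these rules is sketched below (non-normative). Transliteration here relies on Unicode NFKD decomposition, which covers accented letters such as `ü` and `ñ` but not every well-known mapping (for example, `ß` is simply removed). Treating every ASCII character other than letters and digits, including underscores, as a "special character" is an assumption.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Lowercase, transliterate accents, hyphenate separators, and trim."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    chars: list[str] = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop accents split off by the decomposition
        if ch.isascii():
            chars.append(ch if ch.isalnum() else "-")
        # non-ASCII characters with no ASCII equivalent are removed
    return re.sub(r"-+", "-", "".join(chars)).strip("-")
```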
5.7 Type Loading Order
When loading types from the types folder:
1. Scan all `.md` files in the types folder (including subdirectories)
2. Parse each file's frontmatter
3. Build a dependency graph based on `extends` relationships
4. Detect and reject circular dependencies
5. Load types in dependency order (parents before children)
6. Merge inherited fields into each type's effective schema
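The dependency-ordered loading and merge steps above can be sketched as follows (non-normative). The input is assumed to be a mapping from type name to already-parsed frontmatter; the function name and the use of `ValueError` are hypothetical, and inheritance of `strict` is omitted for brevity.

```python
def effective_fields(raw_types: dict[str, dict]) -> dict[str, dict]:
    """Compute each type's effective field map, parents before children."""
    resolved: dict[str, dict] = {}

    def resolve(name: str, chain: tuple[str, ...]) -> dict:
        if name in resolved:
            return resolved[name]
        if name in chain:
            raise ValueError("circular inheritance: " + " -> ".join(chain + (name,)))
        if name not in raw_types:
            raise ValueError(f"missing parent type: {name}")
        definition = raw_types[name]
        fields: dict = {}
        parent = definition.get("extends")
        if parent is not None:
            fields.update(resolve(parent, chain + (name,)))
        # Child definitions replace parent definitions completely (no merging)
        fields.update(definition.get("fields", {}))
        resolved[name] = fields
        return fields

    for type_name in raw_types:
        resolve(type_name, ())
    return resolved
```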
5.8 Built-in vs User Types
This specification does not define built-in types. All types are user-defined via markdown files.
However, implementations MAY provide starter templates for common types (task, note, person, etc.) that users can copy into their types folder and customize.
5.9 Creating Type Files Programmatically
Implementations MUST provide a way to create type definition files. This is a normal write operation that:
- Validates the type definition schema
- Checks for name conflicts with existing types
- Writes the file to the types folder
- Reloads the types registry
Example CLI interaction:
# Create a new type interactively
mdbase type create
# Create with a template
mdbase type create --from-template task
# Scaffold from an existing file's frontmatter
mdbase type create --infer-from notes/example.md
5.10 Type Documentation (Body Content)
The body of a type file is documentation. It has no semantic effect on the schema but SHOULD explain:
- The purpose of the type
- How to use each field
- Example files
- Relationships with other types
- Best practices
Implementations MAY render this documentation in tooling (e.g., showing field help, type browser).
5.11 Schema Evolution
When a type definition changes, existing files are NOT automatically migrated — files are the source of truth. The following rules define what happens for each kind of schema change:
Field added (optional): Existing files without the field remain valid. The field is undefined (not null) until explicitly set. If a default is specified in the type definition, it applies to the effective value at read/query/validation time.
Field added (required): Existing files without the field fail validation. Implementations MUST report missing_required errors. Users must add the field to affected files manually or via batch update.
Field removed: Existing files with the removed field are treated as having an unknown field. Behavior depends on the type's strict setting (see §5.5). No data is deleted from files.
Field type changed: Existing files with values of the old type fail validation with type_mismatch. No automatic coercion of persisted data is performed.
Field renamed: The specification does not track field renames — this is equivalent to removing one field and adding another. Implementations MAY provide a batch rename tool as a convenience.
Type renamed: Existing files with type: old_name fail type matching. Implementations MUST provide a batch update command to update type declarations across files.
Inheritance changed: The effective schema is recomputed. Fields gained from a new parent apply the same rules as "field added." Fields lost apply the same rules as "field removed."
Materializing defaults: Defaults are not required to be persisted to disk. Implementations MAY provide a flag to write default values on create/update; if a default is written, it MUST equal the declared default at the time of the write.
Migration strategy: Validation is the migration mechanism. Run validation on the collection after schema changes, review reported errors, and fix affected files.
5.12 Computed Fields
Type definitions MAY include fields with a computed property containing an expression:
fields:
  full_name:
    type: string
    computed: "first_name + ' ' + last_name"
  overdue:
    type: boolean
    computed: "due_date < today() && status != 'done'"
Rules
- Computed fields are evaluated at read time and are NOT persisted to the file
- They are available in queries, formulas, and expressions like any other field
- Computed fields MUST NOT be `required` (they are always derived)
- Computed fields MUST NOT have `default` or `generated`; these are mutually exclusive mechanisms
- If a file contains a frontmatter key matching a computed field name, the persisted value is ignored and the computed value takes precedence. Implementations SHOULD emit a warning
- If a type definition has both `computed` and `required: true` on a field, implementations MUST reject the type definition with an `invalid_type_definition` error
Conformance
Computed fields are a Level 3 (Querying) capability. Implementations below Level 3 MUST still load type definitions containing computed fields without error, but MUST ignore the computed property and treat the field as a regular (non-computed) field. This ensures type definitions are portable across conformance levels.
Evaluation Order
Non-computed fields are resolved first, then computed fields in dependency order. Computed fields MAY reference other computed fields, which are resolved via dependency ordering.
Circular computed field dependencies MUST be detected and rejected with a circular_computed error.
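Dependency ordering of computed fields can be sketched with Python's standard `graphlib` module (non-normative). Extracting references by scanning for identifiers with a regular expression is a simplification made for this sketch; a real implementation would take them from the parsed expression. The function name is hypothetical.

```python
import re
from graphlib import CycleError, TopologicalSorter

_STRING_LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"")
_IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def computed_order(fields: dict[str, dict]) -> list[str]:
    """Return computed field names so that dependencies come first."""
    computed = {name: d["computed"] for name, d in fields.items() if "computed" in d}
    graph: dict[str, set[str]] = {}
    for name, expression in computed.items():
        # Ignore words inside string literals such as 'done'
        identifiers = _IDENTIFIER.findall(_STRING_LITERAL.sub("", expression))
        graph[name] = {ref for ref in identifiers if ref in computed}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"circular_computed: {exc.args[1]}") from exc
```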
Inheritance
Computed fields from parent types are inherited and MAY be overridden by child types.
Example
# _types/task.md
---
name: task
fields:
  first_name:
    type: string
  last_name:
    type: string
  full_name:
    type: string
    computed: "first_name + ' ' + last_name"
  due_date:
    type: date
  status:
    type: enum
    values: [open, in_progress, done]
  is_overdue:
    type: boolean
    computed: "due_date < today() && status != 'done'"
---
5.13 Complete Type File Example
---
name: meeting-note
description: Notes from a meeting
extends: base
strict: "warn"
match:
  path_glob: "meetings/**/*.md"
  fields_present: [date, attendees]
filename_pattern: "{date}-{title}.md"
fields:
  title:
    type: string
    required: true
    description: Meeting title or topic
  date:
    type: date
    required: true
    description: Date the meeting occurred
  attendees:
    type: list
    items:
      type: link
      target: person
    min_items: 1
    description: People who attended
  agenda:
    type: list
    items:
      type: string
    default: []
    description: Planned discussion topics
  decisions:
    type: list
    items:
      type: object
      fields:
        topic:
          type: string
          required: true
        decision:
          type: string
          required: true
        owner:
          type: link
          target: person
    default: []
    description: Decisions made during the meeting
  action_items:
    type: list
    items:
      type: link
      target: task
    default: []
    description: Tasks created from this meeting
  next_meeting:
    type: date
    description: Scheduled follow-up date
---
# Meeting Note
Meeting notes capture discussions, decisions, and action items from team meetings.
## Required Fields
- **title**: A short, descriptive title (e.g., "Q1 Planning", "Design Review")
- **date**: When the meeting occurred (YYYY-MM-DD format)
- **attendees**: At least one person must be linked
## Decisions Format
Decisions are structured objects with:
- `topic`: What was being decided
- `decision`: The outcome
- `owner`: Who is responsible for follow-through
```yaml
decisions:
  - topic: API versioning strategy
    decision: Use URL path versioning (/v1/, /v2/)
    owner: "[[alice]]"
```
## Linking to Tasks
Action items should be created as separate task files and linked:
action_items:
  - "[[tasks/update-api-docs]]"
  - "[[tasks/create-v2-endpoints]]"
## Example
---
type: meeting-note
title: Sprint Planning
date: 2024-03-01
attendees:
  - "[[alice]]"
  - "[[bob]]"
  - "[[charlie]]"
agenda:
  - Review last sprint
  - Estimate new stories
  - Assign work
decisions:
  - topic: Sprint length
    decision: Keep 2-week sprints
    owner: "[[alice]]"
action_items:
  - "[[tasks/story-123]]"
next_meeting: 2024-03-15
---
## Discussion
Sprint velocity was 42 points last sprint...
6. Type Matching
This section defines how files are associated with types. Unlike traditional schemas where each record belongs to exactly one table, this specification supports multi-type matching: a file may match zero, one, or multiple types simultaneously.
6.1 Matching Overview
Type matching determines which types apply to a file. This happens:
- When reading a file (to know which schemas to validate against)
- When querying (to filter by type)
- When updating (to apply type-specific validation)
A file's types are determined by:
- Explicit declaration (highest precedence): If the file's frontmatter contains a type key (e.g., `type: task`), only those declared types apply
- Match rules: If no explicit declaration, each type's `match` rules are evaluated; all matching types apply
- Untyped: If nothing matches, the file is untyped
6.2 Explicit Type Declaration
Files can explicitly declare their type(s) using frontmatter keys defined in settings.explicit_type_keys (default: type and types).
Type names SHOULD be treated case-insensitively when read from frontmatter and
normalized to lowercase for matching; non-canonical casing SHOULD emit a warning.
If you want to use a field like type as ordinary data, remove it from
settings.explicit_type_keys and use different keys (e.g., kind, kinds) for
type declarations.
Single Type
---
type: task
title: Fix the bug
---
This file is a task and only task. Match rules are not evaluated.
Multiple Types
---
types: [task, urgent]
title: Fix critical security bug
---
This file is both a task and an urgent record. It must validate against both schemas.
Precedence
If both type and types are present, implementations SHOULD prefer types (the plural form).
---
type: task # Ignored when types is present
types: [task, bug] # This is used
---
6.3 Match Rules
Types can define rules for automatically associating files without explicit declaration. Match rules are specified in the type's match field:
# _types/task.md
---
name: task
match:
  path_glob: "tasks/**/*.md"
  fields_present: [status, due_date]
  where:
    status:
      exists: true
    priority:
      gte: 1
---
All conditions in match are combined with AND logic—all must be true for the type to match.
6.4 Match Conditions
`path_glob`
Matches files by their path relative to the collection root.
match:
  path_glob: "tasks/**/*.md"
Glob syntax:
- `*` matches any characters except `/`
- `**` matches any characters including `/`
- `?` matches a single character
Examples:
| Pattern | Matches |
|---|---|
| `tasks/*.md` | `tasks/foo.md`, not `tasks/sub/foo.md` |
| `tasks/**/*.md` | Any `.md` in `tasks/` or subdirectories |
| `*.task.md` | `foo.task.md`, `bar.task.md` |
| `notes/2024-*.md` | `notes/2024-01.md`, `notes/2024-12.md` |
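The glob semantics in the table can be reproduced with a small translation to a regular expression (non-normative). Two points are assumptions: `?` is taken not to match `/`, and a `**/` segment is allowed to match zero directories so that `tasks/**/*.md` also covers files directly in `tasks/`. The function names are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a path glob into an equivalent compiled regular expression."""
    parts: list[str] = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")  # any characters, including "/"
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # any characters except "/"
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")  # a single character (assumed not "/")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def path_matches(pattern: str, path: str) -> bool:
    """Check a collection-relative path against a glob pattern."""
    return glob_to_regex(pattern).fullmatch(path) is not None
```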
`fields_present`
Matches files that have all specified fields present and non-null.
match:
  fields_present: [status, assignee]
A field is "present" for matching purposes if:
- The key exists in frontmatter, AND
- The value is not `null`
Note: This is stricter than the exists() expression function, which returns true even for null values. The fields_present match condition requires a meaningful (non-null) value.
`where`
Matches files based on field value conditions. This uses a subset of the expression language operators:
match:
  where:
    # Exact equality
    kind: "task"
    # Field exists and is non-null
    status:
      exists: true
    # Comparison operators
    priority:
      gte: 3
    # List contains
    tags:
      contains: "important"
    # String prefix
    title:
      startsWith: "URGENT:"
Available operators in where:
| Operator | Description | Example |
|---|---|---|
| (direct value) | Exact equality | `status: open` |
| `exists` | Field is present (true) or missing (false) | `assignee: { exists: true }` |
| `eq` | Equal to | `priority: { eq: 3 }` |
| `neq` | Not equal to | `status: { neq: "done" }` |
| `gt` | Greater than | `priority: { gt: 2 }` |
| `gte` | Greater than or equal | `priority: { gte: 3 }` |
| `lt` | Less than | `priority: { lt: 4 }` |
| `lte` | Less than or equal | `priority: { lte: 5 }` |
| `contains` | List contains value | `tags: { contains: "bug" }` |
| `containsAll` | List contains all values | `tags: { containsAll: ["bug", "urgent"] }` |
| `containsAny` | List contains any value | `tags: { containsAny: ["bug", "feature"] }` |
| `startsWith` | String starts with | `title: { startsWith: "WIP:" }` |
| `endsWith` | String ends with | `file: { endsWith: ".draft.md" }` |
| `matches` | Regex match (see §4.8 for regex flavor) | `title: { matches: "^TASK-\\d+" }` |
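A non-normative sketch of evaluating a structured `where` block against parsed frontmatter follows. Several behaviours are assumptions made for this sketch: `exists` treats a null value as missing, a missing field fails every other operator, a type mismatch counts as a non-match, and Python's `re` module stands in for the ECMAScript regex flavor. The names `OPERATORS` and `where_matches` are hypothetical.

```python
import re

OPERATORS = {
    "eq": lambda value, arg: value == arg,
    "neq": lambda value, arg: value != arg,
    "gt": lambda value, arg: value > arg,
    "gte": lambda value, arg: value >= arg,
    "lt": lambda value, arg: value < arg,
    "lte": lambda value, arg: value <= arg,
    "contains": lambda value, arg: arg in value,
    "containsAll": lambda value, arg: all(item in value for item in arg),
    "containsAny": lambda value, arg: any(item in value for item in arg),
    "startsWith": lambda value, arg: value.startswith(arg),
    "endsWith": lambda value, arg: value.endswith(arg),
    "matches": lambda value, arg: re.search(arg, value) is not None,
}


def where_matches(where: dict, frontmatter: dict) -> bool:
    """Return True if every field condition in `where` holds (AND logic)."""
    for field, condition in where.items():
        value = frontmatter.get(field)
        if not isinstance(condition, dict):  # direct value: exact equality
            if value != condition:
                return False
            continue
        for operator, argument in condition.items():
            if operator == "exists":
                if (value is not None) != argument:
                    return False
                continue
            try:
                passed = value is not None and OPERATORS[operator](value, argument)
            except (TypeError, AttributeError):
                passed = False  # assumption: type mismatch means no match
            if not passed:
                return False
    return True
```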
Match `where` vs Query `where`
The match rule where clause uses a YAML-structured form with operator keys:
# Match rule where (YAML-structured)
match:
  where:
    status:
      neq: "done"
The query where clause (see Querying §10.3) uses expression strings:
# Query where (expression string)
where: 'status != "done"'
These are two distinct syntaxes. Match rules use the structured form because they are evaluated during type matching (before the expression engine is available). Query where clauses use expression strings for greater flexibility.
6.5 Multi-Type Matching
When a file matches multiple types (whether by explicit declaration or match rules), the file must conform to all matched types.
Validation
The file is validated against each type's schema. All validations must pass:
# File: tasks/urgent-bug.md
---
types: [task, urgent]
title: Fix login
status: open
escalation_contact: alice@example.com
---
This file must:
- Have all required fields from `task`
- Satisfy all constraints from `task`
- Have all required fields from `urgent`
- Satisfy all constraints from `urgent`
Field Conflicts
When two types define the same field differently:
Compatible definitions occur when both types define the same field with the same base type. Constraints are merged by taking the most restrictive intersection:
| Constraint | Merge Rule |
|---|---|
| `required` | `true` if EITHER type requires it |
| `min` / `min_length` / `min_items` | Take the higher minimum |
| `max` / `max_length` / `max_items` | Take the lower maximum |
| `pattern` | Value must match all patterns |
| `values` (enum) | Take the intersection of allowed values |
| `default` | If both define defaults, they MUST be equal; otherwise it is an error |
| `unique` (list) | `true` if EITHER type requires it |
| `unique` (cross-file) | `true` if EITHER type requires it |
# Type A: priority as integer 1-5
# Type B: priority as integer 1-3
# Effective: priority as integer 1-3 (most restrictive)
Composite and advanced constraints:
| Constraint | Merge Rule |
|---|---|
| `list.items` | Item schemas MUST be compatible and are merged recursively using these same rules |
| `object.fields` | Fields are merged by name; overlapping fields are merged recursively |
| `generated` | If both define `generated`, the values MUST be identical; otherwise error |
| `deprecated` | `true` if EITHER type marks the field deprecated |
| `link.target` | If both define `target`, they MUST be identical; otherwise error |
| `link.validate_exists` | `true` if EITHER type sets it to `true` |
Incompatible definitions occur when:
- The base types differ (e.g., `string` vs `integer`)
- Enum intersections produce an empty set
- Merged min exceeds merged max
# Type A: status as string
# Type B: status as enum [open, closed]
# Incompatible: different base types → validation error
When field types are incompatible, implementations MUST report a type_conflict error. The file cannot satisfy both schemas simultaneously.
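The merge rules for a subset of constraints (`required`, the min/max families, `values`, and `default`) can be sketched as follows (non-normative). The `MergeError` class and `merge_field` function are hypothetical names; `pattern`, `unique`, and the composite constraints are left out to keep the sketch short.

```python
class MergeError(ValueError):
    """Raised when two definitions of the same field cannot be combined."""


def merge_field(a: dict, b: dict) -> dict:
    """Merge two definitions of one field into the most restrictive form."""
    if a["type"] != b["type"]:
        raise MergeError("type_conflict: base types differ")
    merged: dict = {
        "type": a["type"],
        "required": bool(a.get("required")) or bool(b.get("required")),
    }
    for suffix in ("", "_length", "_items"):
        low_key, high_key = "min" + suffix, "max" + suffix
        lows = [d[low_key] for d in (a, b) if low_key in d]
        highs = [d[high_key] for d in (a, b) if high_key in d]
        if lows:
            merged[low_key] = max(lows)  # higher minimum wins
        if highs:
            merged[high_key] = min(highs)  # lower maximum wins
        if lows and highs and merged[low_key] > merged[high_key]:
            raise MergeError(f"type_conflict: {low_key} exceeds {high_key}")
    if "values" in a and "values" in b:
        merged["values"] = [v for v in a["values"] if v in b["values"]]
        if not merged["values"]:
            raise MergeError("type_conflict: empty enum intersection")
    elif "values" in a or "values" in b:
        merged["values"] = list(a.get("values", b.get("values")))
    if "default" in a and "default" in b and a["default"] != b["default"]:
        raise MergeError("conflicting defaults")
    if "default" in a or "default" in b:
        merged["default"] = a.get("default", b.get("default"))
    return merged
```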
Querying
A multi-type file appears in queries for ANY of its matched types:
# Query for tasks
query:
  types: [task]
# Returns files that are tasks (including files that are also other types)
# Query for files that are BOTH task AND urgent
query:
  where:
    and:
      - 'types.contains("task")'
      - 'types.contains("urgent")'
6.6 Matching Evaluation Order
1. Check explicit declaration: If `type` or `types` is in frontmatter, use those types exclusively. Stop.
2. Evaluate match rules: For each type with match rules, evaluate all conditions:
   - If all conditions pass, the type matches
   - A type without match rules never matches implicitly
3. Collect matches: The file's types are all types that matched in step 2.
4. Untyped: If no types matched, the file is untyped.
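The evaluation order above can be sketched as a single function (non-normative). Rule evaluation itself is delegated to a caller-supplied `rule_matches` callable. Giving precedence to the last configured explicit key, so that `types` wins over `type` with the default settings, is an assumption about how the precedence rule generalizes to custom keys. All names are hypothetical.

```python
from typing import Callable, Optional


def resolve_file_types(
    frontmatter: dict,
    path: str,
    match_rules: dict[str, Optional[dict]],
    rule_matches: Callable[[dict, str, dict], bool],
    explicit_keys: tuple[str, ...] = ("type", "types"),
) -> list[str]:
    """Return the type names that apply to a file; an empty list means untyped."""
    # Step 1: an explicit declaration wins and stops evaluation
    for key in reversed(explicit_keys):
        declared = frontmatter.get(key)
        if declared is not None:
            names = declared if isinstance(declared, list) else [declared]
            return [str(name).lower() for name in names]
    # Steps 2-4: collect every type whose match rules pass
    return [
        name
        for name, rule in match_rules.items()
        if rule is not None and rule_matches(rule, path, frontmatter)
    ]
```

Types whose entry in `match_rules` is `None` have no match rules and therefore never match implicitly.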
6.7 Match Rule Examples
Path-Based Matching
# _types/task.md
match:
  path_glob: "tasks/**/*.md"
All files in tasks/ are tasks.
Field-Based Matching
# _types/actionable.md
match:
  fields_present: [due_date]
Any file with a due_date field is actionable.
Tag-Based Matching
# _types/urgent.md
match:
  where:
    tags:
      contains: "urgent"
Any file tagged "urgent" matches this type.
Combined Matching
# _types/active-task.md
match:
  path_glob: "tasks/**/*.md"
  fields_present: [status, assignee]
  where:
    status:
      neq: "done"
Files in tasks/ with status and assignee fields, where status is not "done".
6.8 Type-Only Files (No Matching)
A type without match rules will never automatically match files. Files must explicitly declare the type:
# _types/template.md
---
name: template
# No match rules
fields:
  template_name:
    type: string
    required: true
---
This type only applies to files that declare type: template or types: [template, ...].
6.9 The `types` Property in Expressions
In expressions, files have a types property (list of strings) representing their matched types:
# Filter: files that are tasks
filters: 'types.contains("task")'
# Filter: files that are both task and urgent
filters: 'types.contains("task") && types.contains("urgent")'
# Filter: files that have no type
filters: "types.length == 0"
6.10 Debugging Type Matching
Implementations SHOULD provide a way to see why a file matched (or didn't match) specific types. For example:
# Show matching analysis for a file
mdbase debug match tasks/fix-bug.md
# Output:
# tasks/fix-bug.md
# ├── Explicit types: none
# ├── Matched types: [task, urgent]
# │   ├── task: matched via path_glob "tasks/**/*.md"
# │   └── urgent: matched via where.tags.contains("urgent")
# └── Unmatched types:
#     └── done: failed where.status.eq("done")
7. Field Types and Constraints
This section defines the data types that can be used in type definitions, along with their constraints and validation rules.
7.1 Field Definition Structure
Every field in a type's fields section has this structure:
fields:
  field_name:
    # Required: the data type
    type: string
    # Optional: is this field required?
    required: false
    # Optional: default value if field is missing
    default: "untitled"
    # Optional: auto-generation strategy
    generated: now
    # Optional: human-readable description
    description: "A brief summary"
    # Optional: mark as deprecated
    deprecated: false
    # Type-specific constraints (see each type below)
7.2 Common Field Options
These options apply to all field types:
`type` (Required)
The data type. One of: string, integer, number, boolean, date, datetime, time, enum, list, object, link, any.
`required`
Whether the field must be present and non-null.
| Value | Behavior |
|---|---|
| `false` (default) | Field may be missing or null |
| `true` | Field must be present and non-null |
`default`
A default value applied to the effective value when the field is missing. The default is NOT applied when the field is present but null. Defaults are not required to be persisted unless the caller explicitly requests materialization (see §5.11).
status:
  type: enum
  values: [open, done]
  default: open # Applied only if 'status' key is absent
`generated`
Automatic value generation. See 7.15 Generated Fields.
`description`
Human-readable description of the field's purpose. Implementations MAY display this in tooling.
`deprecated`
Mark a field as deprecated. Implementations SHOULD warn when deprecated fields are used.
`unique`
Cross-file uniqueness constraint.
| Value | Behavior |
|---|---|
| `false` (default) | No uniqueness checking |
| `true` | Field value MUST be unique across all files matching the declaring type |
Rules:
- Null/undefined values are exempt from uniqueness checks — multiple files may omit the field
- For multi-type files: uniqueness is checked within each type's file set independently
- Validation of uniqueness requires scanning all files of the type. Implementations SHOULD use caching for performance
- Error code: `duplicate_value`, reported with the field name, conflicting file paths, and the duplicate value
Note: settings.id_field implicitly has unique: true behavior (see §4.4). The unique option makes cross-file uniqueness available for any field.
Example:
fields:
  slug:
    type: string
    unique: true
  email:
    type: string
    unique: true
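A uniqueness check over one type's file set might look like the following Python sketch. The function name `find_duplicate_values` and the input shape (a mapping from file path to parsed frontmatter) are assumptions for illustration, and the sketch assumes field values are hashable scalars.

```python
from collections import defaultdict


def find_duplicate_values(files: dict[str, dict], field: str) -> dict:
    """Map each duplicated value of `field` to the sorted paths that share it.

    `files` is assumed to hold only the files matching the declaring type.
    """
    paths_by_value = defaultdict(list)
    for path, frontmatter in files.items():
        value = frontmatter.get(field)
        if value is None:
            continue  # null or missing values are exempt from uniqueness checks
        paths_by_value[value].append(path)
    return {
        value: sorted(paths)
        for value, paths in paths_by_value.items()
        if len(paths) > 1
    }
```

Each entry in the result carries what a `duplicate_value` report needs: the duplicate value and the conflicting file paths. For a multi-type file, the function would be called once per type with that type's file set.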
7.3 `string`
A text value.
title:
  type: string
  required: true
  min_length: 1
  max_length: 200
  pattern: "^[A-Z].*"
Constraints:
| Constraint | Type | Description |
|---|---|---|
| `min_length` | integer | Minimum string length |
| `max_length` | integer | Maximum string length |
| `pattern` | string | Regex pattern the value must match |
Validation:
- Value must be a string (or coercible to string)
- Length constraints apply to character count (not bytes)
- Pattern uses ECMAScript (ES2018+) regular expression syntax as the required baseline (see §4.8 for the full regex specification)
7.4 `integer`
A whole number.
priority:
  type: integer
  min: 1
  max: 5
  default: 3
Constraints:
| Constraint | Type | Description |
|---|---|---|
| `min` | integer | Minimum value (inclusive) |
| `max` | integer | Maximum value (inclusive) |
Validation:
- Value must be a whole number (no decimal part)
- YAML integers, strings containing integers, and floats with no decimal part MAY be coerced
7.5 `number`
A floating-point number.
rating:
  type: number
  min: 0.0
  max: 5.0
Constraints:
| Constraint | Type | Description |
|---|---|---|
| `min` | number | Minimum value (inclusive) |
| `max` | number | Maximum value (inclusive) |
Validation:
- Value must be numeric (integer or float)
- IEEE 754 special values (NaN, Infinity) are allowed unless explicitly constrained
7.6 `boolean`
A true/false value.
draft:
  type: boolean
  default: false
Validation:
- Accepts YAML boolean values: `true`, `false`
- Implementations SHOULD also accept YAML 1.1 boolean spellings: `yes`, `no`, `on`, `off`
- These should be normalized to `true`/`false` on write
7.7 `date`
A calendar date without time.
due_date:
type: date
Format: ISO 8601 date: YYYY-MM-DD
Examples: 2024-03-15, 2024-12-01
Validation:
- Must be a valid date (no February 30th)
- String format must match ISO 8601
- YAML date scalars MAY be accepted and MUST be normalized to ISO 8601 on write
7.8 `datetime`
A date with time.
created_at:
  type: datetime
Format: ISO 8601 datetime with optional timezone:
- `YYYY-MM-DDTHH:MM:SS`
- `YYYY-MM-DDTHH:MM:SSZ`
- `YYYY-MM-DDTHH:MM:SS+HH:MM`
Examples:
- `2024-03-15T10:30:00`
- `2024-03-15T10:30:00Z`
- `2024-03-15T10:30:00+05:30`
Validation:
- Must be valid datetime
- Implementations MUST preserve timezone information if present
- YAML timestamp scalars MAY be accepted and MUST be normalized to ISO 8601 on write
Timezone Comparison Rules
- Datetime values with explicit offsets are compared as absolute instants (convert to a common epoch before comparison)
- Datetime values WITHOUT offsets (naive) are treated as local time in the implementation's configured timezone
- Comparing an offset-aware datetime with a naive datetime: the naive datetime is interpreted in local time, then both are compared as absolute instants
- `now()` returns an offset-aware datetime in the implementation's local timezone
- `today()` returns a date in the implementation's local timezone
- Date arithmetic preserves offset: `datetime_with_offset + "1d"` keeps the same offset
- Serialization MUST preserve the original offset if present. `2024-03-15T10:00:00+05:30` MUST NOT be normalized to UTC on write
- Implementations MAY provide a configuration option for default timezone (not specified in this version; use the local system timezone)
See also §11.7 for date/time functions in expressions.
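The comparison rules can be illustrated with Python's standard `datetime` module. This is a sketch under stated assumptions: `to_instant` and `compare_datetimes` are hypothetical helper names, and the "configured local timezone" is passed in explicitly as a fixed offset so the example is deterministic.

```python
from datetime import datetime, timedelta, timezone


def to_instant(value: datetime, local_tz: timezone) -> datetime:
    """Convert to a UTC instant for comparison only, never for serialization."""
    if value.tzinfo is None:
        # Naive values are interpreted as local time in the configured timezone.
        value = value.replace(tzinfo=local_tz)
    return value.astimezone(timezone.utc)


def compare_datetimes(a: datetime, b: datetime, local_tz: timezone) -> int:
    """Return -1, 0, or 1 after converting both values to absolute instants."""
    left, right = to_instant(a, local_tz), to_instant(b, local_tz)
    return (left > right) - (left < right)


# Assumed configuration for the example: local timezone is UTC+05:30.
local = timezone(timedelta(hours=5, minutes=30))
aware = datetime.fromisoformat("2024-03-15T10:00:00+05:30")
naive = datetime.fromisoformat("2024-03-15T10:00:00")
utc = datetime.fromisoformat("2024-03-15T04:30:00+00:00")

# Date arithmetic keeps the original offset.
next_day = aware + timedelta(days=1)
```

Under that assumed configuration the three values denote the same instant, and `next_day.isoformat()` still carries the `+05:30` offset, which is what serialization is required to preserve.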
7.9 `time`
A time without date.
meeting_time:
type: time
Format: HH:MM or HH:MM:SS
Examples: 14:30, 09:00:00
7.10 `enum`
A value from a fixed set of options.
status:
  type: enum
  values: [draft, review, published, archived]
  default: draft
Required constraint:
| Constraint | Type | Description |
|---|---|---|
| `values` | list | The allowed values (must be strings) |
Validation:
- Value must exactly match one of the `values` entries
- Comparison is case-sensitive
- Enum values MUST be strings
7.11 `list`
An ordered collection of values.
tags:
  type: list
  items:
    type: string
  min_items: 0
  max_items: 10
  unique: true
Required constraint:
| Constraint | Type | Description |
|---|---|---|
| `items` | field definition | The type of each list element |
Optional constraints:
| Constraint | Type | Description |
|---|---|---|
| `min_items` | integer | Minimum list length |
| `max_items` | integer | Maximum list length |
| `unique` | boolean | If true, no duplicate values allowed |
Validation:
- Value must be a YAML list
- Each element is validated against `items`
- If `unique: true`, duplicates cause validation failure
Nested lists:
matrix:
  type: list
  items:
    type: list
    items:
      type: number
7.12 `object`
A nested structure with its own fields.
author:
  type: object
  fields:
    name:
      type: string
      required: true
    email:
      type: string
    url:
      type: string
Required constraint:
| Constraint | Type | Description |
|---|---|---|
| `fields` | mapping | Field definitions for the nested object |
Validation:
- Value must be a YAML mapping
- Each field is validated according to its definition
- Unknown fields are handled according to type's strictness
7.13 `link`
A reference to another file in the collection.
parent_task:
  type: link
  target: task
  validate_exists: false
related:
  type: list
  items:
    type: link
Optional constraints:
| Constraint | Type | Description |
|---|---|---|
| `target` | string | Type name to constrain resolution scope |
| `validate_exists` | boolean | If true, validate that target file exists |
Accepted formats:
- Wikilinks: `[[target]]`, `[[target|alias]]`, `[[folder/target]]`
- Markdown links: `[text](path.md)`, `[text](./relative.md)`
- Bare paths: `./sibling.md`, `../parent/file.md`
See Links for detailed parsing and resolution rules.
7.14 `any`
Accepts any valid YAML value.
metadata:
type: any
Use cases:
- Migration: Temporarily accept untyped data
- Flexible schemas: When structure varies
- Extension points: Allow arbitrary user data
Validation:
- Any valid YAML value is accepted
- No constraints available
7.15 Generated Fields
Fields can be automatically populated using the generated option:
fields:
  id:
    type: string
    generated: ulid
  created_at:
    type: datetime
    generated: now
  updated_at:
    type: datetime
    generated: now_on_write
  slug:
    type: string
    generated:
      from: title
      transform: slugify
Generation strategies:
| Strategy | Description |
|---|---|
| `ulid` | Generate a ULID (Universally Unique Lexicographically Sortable Identifier) |
| `uuid` | Generate a UUID v4 |
| `now` | Current datetime (on create only) |
| `now_on_write` | Current datetime (on every write) |
| `{from, transform}` | Derive from another field |
Transform functions for derived fields:
| Transform | Description |
|---|---|
| `slugify` | Convert to URL-safe slug |
| `lowercase` | Convert to lowercase |
| `uppercase` | Convert to uppercase |
Important rules:
- Generated values are only applied when the field is missing
- User-provided values are NEVER overwritten by `now` or `ulid`/`uuid`
- `now_on_write` ALWAYS updates the field on every write operation
- Generated fields can still have `required: true` (they'll satisfy the requirement via generation)
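The rules above can be sketched as a small Python function. This is a hedged illustration: `apply_generated` and its input shape are hypothetical, the `slugify` shown is one plausible transform (this section does not define the exact slug algorithm), and `ulid` is omitted because the Python standard library has no ULID generator.

```python
import re
import uuid
from datetime import datetime, timezone


def slugify(text: str) -> str:
    """Assumed slug rule: lowercase, runs of other characters become one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


TRANSFORMS = {"slugify": slugify, "lowercase": str.lower, "uppercase": str.upper}


def apply_generated(frontmatter: dict, generated: dict, now: datetime) -> dict:
    """Return a copy of `frontmatter` with generated fields filled in."""
    result = dict(frontmatter)
    for field, strategy in generated.items():
        if strategy == "now_on_write":
            result[field] = now.isoformat()  # always updated on every write
        elif field in result:
            continue  # user-provided values are never overwritten
        elif strategy == "now":
            result[field] = now.isoformat()
        elif strategy == "uuid":
            result[field] = str(uuid.uuid4())
        elif isinstance(strategy, dict):
            source = result.get(strategy["from"])
            if source is not None:
                result[field] = TRANSFORMS[strategy["transform"]](source)
    return result
```

The sketch treats "field is missing" as "key is absent", and it assumes the source field of a derived value is a string that is already present when the derived field is processed.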
7.16 Type Coercion
When reading values, implementations MUST attempt to coerce compatible types:
| Schema Type | Accepts |
|---|---|
| `string` | Any scalar (converted via toString) |
| `integer` | Integer, float with no decimal, numeric string |
| `number` | Integer, float, numeric string |
| `boolean` | Boolean, "true"/"false" strings, yes/no |
| `date` | ISO date string, YAML date |
| `datetime` | ISO datetime string, YAML timestamp |
When coercion fails, it is a validation error.
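As an illustration of the coercion table, the following Python sketch covers the `integer` and `boolean` rows. The names `coerce_integer`, `coerce_boolean`, and `CoercionError` are hypothetical. Two choices are assumptions rather than requirements of this section: booleans are rejected as integers (Python treats `bool` as a subtype of `int`), and only decimal digit strings such as `"5"` count as numeric strings for `integer`.

```python
import re


class CoercionError(ValueError):
    """Raised when a value cannot be coerced; callers report a validation error."""


_INT_STRING = re.compile(r"[+-]?[0-9]+")
_TRUE_STRINGS = {"true", "yes"}
_FALSE_STRINGS = {"false", "no"}


def coerce_integer(value):
    if isinstance(value, bool):
        raise CoercionError(f"{value!r} is a boolean, not an integer")
    if isinstance(value, int):
        return value
    if isinstance(value, float) and value.is_integer():
        return int(value)  # float with no decimal part
    if isinstance(value, str) and _INT_STRING.fullmatch(value.strip()):
        return int(value.strip())
    raise CoercionError(f"cannot coerce {value!r} to integer")


def coerce_boolean(value):
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        if value in _TRUE_STRINGS:
            return True
        if value in _FALSE_STRINGS:
            return False
    raise CoercionError(f"cannot coerce {value!r} to boolean")
```

With these helpers, `"5"` and `5.0` both coerce to `5`, while `"high"` and `5.5` raise `CoercionError`, which matches the valid and invalid `priority` values used elsewhere in this specification.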
7.17 Summary Table
| Type | YAML | Constraints | Notes |
|---|---|---|---|
| `string` | String | `min_length`, `max_length`, `pattern` | |
| `integer` | Integer | `min`, `max` | Whole numbers only |
| `number` | Float/Int | `min`, `max` | Allows decimals |
| `boolean` | Boolean | - | Normalized to true/false |
| `date` | String | - | ISO 8601 date |
| `datetime` | String | - | ISO 8601 datetime |
| `time` | String | - | HH:MM or HH:MM:SS |
| `enum` | String | `values` (required) | Must match exactly |
| `list` | List | `items` (required), `min_items`, `max_items`, `unique` | |
| `object` | Mapping | `fields` (required) | Nested structure |
| `link` | String | `target`, `validate_exists` | Reference to file |
| `any` | Any | - | No validation |
8. Links
Links are references from one file to another. They are a first-class concept in this specification due to their prevalence in markdown-based knowledge systems. This section defines link syntax, parsing, resolution, and traversal.
8.1 Why Links Matter
Links transform a folder of files into a knowledge graph. They enable:
- Navigation: Jump between related documents
- Backlinks: See what references a document
- Queries: Find documents by their relationships
- Validation: Ensure references point to real files
- Refactoring: Rename files without breaking connections
This specification treats links as typed data with well-defined parsing and resolution semantics.
8.2 Link Formats
The link field type accepts three input formats:
8.2.1 Wikilinks
The format popularized by wikis and knowledge management tools:
[[target]]
[[target|alias]]
[[target#anchor]]
[[target#anchor|alias]]
[[folder/target]]
[[./relative]]
[[../parent/target]]
Components:
- target: The file being linked to (without extension by default)
- alias: Display text (does not affect resolution)
- anchor: A heading or block reference within the target
- path: May be absolute (from collection root) or relative (from current file)
Examples:
# Simple link
parent: "[[task-001]]"
# Link with alias (alias is metadata, not used for resolution)
assignee: "[[alice|Alice Smith]]"
# Link to specific section
reference: "[[api-docs#authentication]]"
# Relative link
related: "[[./sibling-task]]"
# Path from root
spec: "[[docs/specs/api-v2]]"
8.2.2 Markdown Links
Standard markdown link syntax:
[text](path.md)
[text](./relative.md)
[text](../other/file.md)
[text](path.md#anchor)
The text portion is treated as an alias.
Examples:
parent: "[Parent Task](./tasks/parent.md)"
reference: "[API Docs](docs/api.md#auth)"
8.2.3 Bare Paths
A path without link syntax:
./sibling.md
../other/file.md
folder/file.md
Examples:
config: "./config.md"
parent: "../parent-project/overview.md"
Bare paths follow the same resolution rules as markdown links: they are relative to the containing file's directory unless they start with / (root-relative).
8.3 Link Parsing
When a link value is read, implementations MUST parse it into a structured representation:
| Component | Type | Description |
|---|---|---|
| `raw` | string | Original string value exactly as written |
| `target` | string | File path or identifier (without anchor/alias) |
| `alias` | string? | Display text if provided, otherwise null |
| `anchor` | string? | Heading/block reference if provided, otherwise null |
| `format` | enum | One of: `wikilink`, `markdown`, `path` |
| `is_relative` | boolean | Whether path starts with `./` or `../` |
Parsing examples:
| Input | target | alias | anchor | format | is_relative |
|---|---|---|---|---|---|
| `[[task-001]]` | `task-001` | null | null | wikilink | false |
| `[[task-001\|My Task]]` | `task-001` | `My Task` | null | wikilink | false |
| `[[docs/api#auth]]` | `docs/api` | null | `auth` | wikilink | false |
| `[[./sibling]]` | `./sibling` | null | null | wikilink | true |
| `[Link](file.md)` | `file.md` | `Link` | null | markdown | false |
| `./other.md` | `./other.md` | null | null | path | true |
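A parser producing this structured representation could be sketched in Python as follows. The names `ParsedLink` and `parse_link` and the two regular expressions are assumptions for illustration; they handle the formats listed in §8.2 but are not a normative grammar.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedLink:
    raw: str
    target: str
    alias: Optional[str]
    anchor: Optional[str]
    format: str  # "wikilink", "markdown", or "path"
    is_relative: bool


# [[target#anchor|alias]] with optional anchor and alias.
_WIKILINK = re.compile(r"^\[\[([^\]|#]*)(?:#([^\]|]*))?(?:\|([^\]]*))?\]\]$")
# [text](path#anchor) with optional anchor.
_MARKDOWN = re.compile(r"^\[([^\]]*)\]\(([^)#]*)(?:#([^)]*))?\)$")


def parse_link(raw: str) -> ParsedLink:
    wiki = _WIKILINK.match(raw)
    markdown = _MARKDOWN.match(raw)
    if wiki:
        target, anchor, alias = wiki.groups()
        fmt = "wikilink"
    elif markdown:
        alias, target, anchor = markdown.groups()
        fmt = "markdown"
    else:
        # Anything else is treated as a bare path, optionally with an anchor.
        target, _, fragment = raw.partition("#")
        anchor, alias, fmt = (fragment or None), None, "path"
    is_relative = target.startswith(("./", "../"))
    return ParsedLink(raw, target, alias, anchor, fmt, is_relative)
```

Applied to the inputs in the parsing examples table, the sketch yields the listed components; for instance, `parse_link("[[docs/api#auth]]")` has target `docs/api` and anchor `auth`.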
8.4 Link Resolution
Resolution transforms a parsed link into an absolute path (relative to collection root) pointing to the target file.
Resolution Algorithm
Given a link value and the path of the file containing it:
1. Parse the link into components (target, format, is_relative)
2. If format is `markdown` or `path`:
   - If target starts with `/`, resolve from collection root (strip the leading `/`)
   - Otherwise, resolve relative to the containing file's directory (markdown-standard behavior)
   - Example: Link `[Docs](docs/api.md)` in `notes/meeting.md` resolves to `notes/docs/api.md`
3. If format is `wikilink`:
   - If target starts with `./` or `../`, resolve relative to the containing file's directory
   - If target starts with `/`, resolve from collection root (strip the leading `/`)
   - If target contains `/` (and is not relative), resolve from collection root
   - Example: `[[docs/api]]` resolves to `docs/api`
4. If simple name (no `/`, no `./` or `../`, and format is `wikilink`):
   - Define the search scope:
     - If the link field has `target` constraint specifying a type, scope to files matching that type
     - Otherwise, scope to the entire collection
   - ID match pass: Search scoped files for `id_field == name`
     - If exactly one match, resolve to it
     - If multiple matches, resolution MUST fail with `ambiguous_link`
   - Filename match pass: If no `id_field` match exists, search scoped files by filename
   - If multiple filename candidates match, apply tiebreakers in order:
     a. Same directory: Prefer a file in the same directory as the referring file
     b. Shortest path: Prefer the file with the shortest path (closest to collection root)
     c. Alphabetical: Sort candidate paths lexicographically and take the first
   - If multiple candidates remain after all tiebreakers, resolve to `null` and emit an `ambiguous_link` warning
5. Extension handling:
   - If target lacks extension, try configured extensions in order (default: `.md`)
   - Example: `[[readme]]` tries `readme.md`, `readme.mdx`, etc.
6. Path traversal check:
   - After resolution and normalization, if the resolved path would escape the collection root, abort with `path_traversal`
7. Return:
   - The absolute path (relative to collection root) if found
   - `null` if no matching file exists
Resolution Examples
Given collection structure:
/
├── mdbase.yaml
├── tasks/
│   ├── task-001.md
│   └── subtasks/
│       └── task-002.md
├── notes/
│   └── meeting.md
├── people/
│   └── alice.md
└── journal/
    └── 2024/
        └── 01/
            └── 15.md
Link resolution from tasks/subtasks/task-002.md:
| Link Value | Resolved Path | Notes |
|---|---|---|
| `[[task-001]]` | `tasks/task-001.md` | Search by name |
| `[[../task-001]]` | `tasks/task-001.md` | Relative path |
| `[[./task-003]]` | `tasks/subtasks/task-003.md` | Relative (may not exist) |
| `[[notes/meeting]]` | `notes/meeting.md` | Absolute from root |
| `[[meeting]]` | `notes/meeting.md` | Search by name |
| `[[alice]]` | `people/alice.md` | Search by name |
| `[link](../task-001.md)` | `tasks/task-001.md` | Markdown, relative |
| `../task-001.md` | `tasks/task-001.md` | Bare path, relative |
8.5 Link Schema Options
When defining a link field in a type:
fields:
  parent:
    type: link
    target: task # Constrain resolution to 'task' type
    validate_exists: true # Fail if target doesn't exist
    description: "Parent task this subtask belongs to"
  related:
    type: list
    items:
      type: link # List of links (no constraints)
`target` Constraint
Limits resolution scope to files of a specific type:
assignee:
  type: link
  target: person
When resolving [[alice]] for this field:
- Implementation searches only files that match the `person` type
- Matches by the configured `id_field` (default: `id`)
- If no `person` type file has `id: alice`, resolution fails
`validate_exists` Constraint
When true, unresolved links cause validation errors:
parent:
type: link
validate_exists: true
Default is false (links can point to non-existent files).
8.6 Link and Tag Extraction (for `file.*` properties)
To support file.links, file.backlinks, file.embeds, and file.tags, implementations
MUST extract links and tags from both frontmatter and body content using these rules:
Link Extraction
Included:
- Frontmatter fields of type `link` and `list` of `link`
- Body links in wikilink form (`[[target]]`, `[[target|alias]]`, `[[target#anchor]]`)
- Body links in markdown form (`[text](path.md)`), including `#anchor`
- Embeds in wikilink form (`![[target]]`) and markdown form (`![alt](path.md)`)
Excluded:
- Links inside fenced code blocks
- Links inside inline code spans
file.links returns all non-embed links; file.embeds returns only embeds.
Tag Extraction
file.tags includes:
- Raw persisted frontmatter `tags` field if present (string or list of strings)
- Inline tags in body content of the form `#tag`
Inline tags MUST:
- Be preceded by whitespace or appear at the start of a line
- Match the pattern `[A-Za-z0-9_/-]+` after `#` (forward slashes create nested tag hierarchies)
- Be outside fenced code blocks and inline code spans
- Not be preceded by `](` or `](http` patterns (to exclude URL fragments)
Implementations SHOULD ignore # fragments in URLs. A simple heuristic is to skip any # that is preceded by ), ", ', or appears within a markdown link target ([text](url#fragment)).
Nested Tags
Tags MAY contain forward slashes (/) to create hierarchies: #inbox/to-read, #project/alpha/urgent.
The file.hasTag() function performs prefix matching on nested tags: file.hasTag("inbox") matches #inbox, #inbox/to-read, and #inbox/processing. This is consistent with Obsidian's tag behavior.
Nesting has no depth limit. The / character is purely conventional — implementations do not need to build a tree structure.
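The inline tag rules and the prefix matching of `file.hasTag()` can be sketched in Python. This is a simplified illustration: `extract_inline_tags` and `has_tag` are hypothetical names, code is masked with regular expressions rather than a full markdown parser, and only backtick fences are recognised.

```python
import re

FENCE = "`" * 3  # built programmatically so this example can itself sit in a fence
_FENCED = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)
_INLINE_CODE = re.compile(r"`[^`\n]*`")
# A tag must start a line or follow whitespace, then match [A-Za-z0-9_/-]+.
_TAG = re.compile(r"(?:^|(?<=\s))#([A-Za-z0-9_/-]+)", re.MULTILINE)


def extract_inline_tags(body: str) -> list[str]:
    """Return inline tags (without the leading '#') in document order."""
    # Mask code with a non-whitespace placeholder so that text following
    # a code span is not mistaken for the start of a tag.
    text = _FENCED.sub("\x00", body)
    text = _INLINE_CODE.sub("\x00", text)
    return [match.group(1) for match in _TAG.finditer(text)]


def has_tag(tags: list[str], query: str) -> bool:
    """Prefix matching on nested tags: 'inbox' matches 'inbox' and 'inbox/...'."""
    return any(tag == query or tag.startswith(query + "/") for tag in tags)
```

Because a tag must follow whitespace or start a line, a URL fragment such as `http://example.com/#section` is skipped without a separate URL check. Note that `has_tag` compares whole path segments, so `inbox` does not match a tag named `inboxes`.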
8.7 Link Traversal
Links can be traversed to access properties of the linked file using the asFile() method.
The `asFile()` Method
In expressions, link.asFile() resolves a link to its target file object:
# Filter: tasks assigned to someone on the engineering team
filters: 'assignee.asFile().team == "engineering"'
# Formula: get the parent task's status
formulas:
  parent_status: "parent.asFile().status"
If the link cannot be resolved, asFile() returns null. Subsequent property access on null returns null (no error).
Multi-Hop Traversal
asFile() MAY be chained to traverse multiple links:
assignee.asFile().manager.asFile().name
parent.asFile().project.asFile().status
Each asFile() call resolves the link field on the current file and returns the target file object.
Null propagation: If any hop returns null, the entire chain evaluates to null (no error).
Depth limit: Implementations MUST enforce a maximum traversal depth (default: 10 hops). Exceeding this limit MUST produce an expression_depth_exceeded error. Circular traversal (A → B → A) does not cause infinite loops because the depth limit applies.
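Null propagation and the depth limit can be sketched together in Python. The function `traverse`, the `as_file` callback, and the exception class are hypothetical stand-ins; the sketch also assumes the depth check is made against the number of hops in the chain before any link is resolved, which is one possible reading of the rule above.

```python
from typing import Callable, Optional

MAX_DEPTH = 10  # default maximum number of hops


class ExpressionDepthExceeded(Exception):
    """Stands in for the expression_depth_exceeded error."""


def traverse(
    start: dict,
    link_fields: list[str],
    as_file: Callable[[str], Optional[dict]],
    max_depth: int = MAX_DEPTH,
) -> Optional[dict]:
    """Follow a chain of link fields, for example ["assignee", "manager"]."""
    if len(link_fields) > max_depth:
        raise ExpressionDepthExceeded(f"{len(link_fields)} hops exceeds {max_depth}")
    current = start
    for field in link_fields:
        link = current.get(field)
        if link is None:
            return None  # missing link: the whole chain evaluates to null
        current = as_file(link)
        if current is None:
            return None  # unresolved link: the whole chain evaluates to null
    return current
```

A circular chain such as A to B to A terminates because the loop runs at most `max_depth` times, regardless of what the links point to.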
Accessing Linked File Properties
Once resolved, you can access:
Frontmatter fields:
parent.asFile().status
parent.asFile().priority
assignee.asFile().name
File metadata:
parent.asFile().file.name
parent.asFile().file.mtime
parent.asFile().file.path
Performance Considerations
Each hop requires loading and parsing the target file. Implementations SHOULD:
- Cache resolved files during query execution
- Document performance characteristics for multi-hop queries
- Consider lazy resolution (only resolve when accessed)
- Warn users about expensive traversals in large collections
8.8 Link Functions
The following functions operate on links and files in expressions:
| Function | Description | Example |
|---|---|---|
| `link.asFile()` | Resolve link to file object | `assignee.asFile().name` |
| `file.hasLink(target)` | File contains link to target | `file.hasLink(link("tasks/main"))` |
| `file.links` | List of outgoing links | `file.links.length > 5` |
| `file.backlinks` | List of incoming links (requires index) | `file.backlinks.length` |
| `link(path)` | Construct a link from path | `link("people/alice")` |
Backlinks
file.backlinks returns files that link TO the current file. This requires either:
- A full scan of all files (slow without cache)
- A pre-computed reverse index (requires cache)
Implementations SHOULD document whether file.backlinks requires caching for reasonable performance.
Example: Find files linking to current file
filters: "file.hasLink(this.file)"
8.9 Link Storage and Round-Tripping
When writing links to frontmatter, implementations SHOULD preserve the original format when possible:
- If user wrote `[[note]]`, prefer outputting `[[note]]` over `./note.md`
- If user wrote a relative path, preserve relativity when possible
- If user wrote with an alias, preserve the alias
This preserves user intent and keeps files human-readable.
8.10 Links in Body Content
While this specification focuses on frontmatter, links also appear in markdown body content. Implementations SHOULD support:
- Parsing links from body content
- Updating body links during rename operations
- Including body links in `file.links`
Implementations that do NOT support body link parsing MUST document this limitation. See Operations for rename behavior.
8.11 Broken Links
A "broken link" is a link that cannot be resolved to an existing file. Handling options:
| Scenario | Behavior |
|---|---|
| `validate_exists: false` (default) | Broken links are allowed; `asFile()` returns null |
| `validate_exists: true` | Broken links cause validation errors |
| Rename operations | Implementations SHOULD update links to maintain validity |
| Delete operations | Implementations MAY warn about incoming links that will break |
8.12 Link Examples
Simple Task Hierarchy
# tasks/parent.md
---
type: task
id: parent
title: Main Feature
subtasks:
- "[[child-1]]"
- "[[child-2]]"
---
# tasks/child-1.md
---
type: task
id: child-1
title: Subtask One
parent: "[[parent]]"
---
Cross-Type References
# tasks/implement-api.md
---
type: task
assignee: "[[alice]]"
spec: "[[docs/api-spec]]"
related:
- "[[tasks/write-tests]]"
- "[[tasks/update-docs]]"
---
Relative Links
# projects/alpha/tasks/task-001.md
---
type: task
parent: "[[../overview]]" # projects/alpha/overview.md
sibling: "[[./task-002]]" # projects/alpha/tasks/task-002.md
global_reference: "[[people/bob]]" # people/bob.md (from root)
---
8.13 Path Sandboxing
Link resolution MUST NOT resolve to paths outside the collection root.
Rules
- After resolving relative paths (applying `../` segments), the resulting absolute path MUST be within the collection root directory
- If resolution would escape the collection root, the link MUST resolve to `null` and implementations MUST emit a `path_traversal` error
- This applies to all link formats: wikilinks, markdown links, and bare paths
- Implementations MUST normalize paths (resolve `.` and `..` segments) before checking containment
Example
In a collection rooted at /home/user/notes/:
| Link | From File | Result |
|---|---|---|
| `[[../../../etc/passwd]]` | `notes/daily.md` | null + `path_traversal` error |
| `[[../../../secrets/key]]` | `deep/nested/file.md` | null + `path_traversal` error |
| `[[../sibling]]` | `tasks/task-001.md` | Resolves normally (stays within root) |
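The containment check can be sketched with Python's `posixpath` module, working purely on collection-relative paths. The function name `resolve_within_root` and the `(path, error_code)` return shape are assumptions for illustration; extension handling and name search are out of scope here.

```python
import posixpath
from typing import Optional


def resolve_within_root(
    target: str, containing_file: str
) -> tuple[Optional[str], Optional[str]]:
    """Return (resolved_path, None), or (None, "path_traversal") on escape.

    Both arguments and the result are paths relative to the collection root.
    """
    if target.startswith("/"):
        joined = target.lstrip("/")  # root-relative: strip the leading slash
    else:
        joined = posixpath.join(posixpath.dirname(containing_file), target)
    # Normalize "." and ".." segments before checking containment.
    normalized = posixpath.normpath(joined)
    if normalized == ".." or normalized.startswith("../"):
        return None, "path_traversal"
    return normalized, None
```

A normalized relative path that still begins with `..` points above the collection root, which is why a single prefix check is enough after normalization. For example, `../sibling` from `tasks/task-001.md` normalizes to `sibling`, while `../../../etc/passwd` from `notes/daily.md` is rejected.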
9. Validation
Validation ensures that files conform to their type schemas. This section defines what is validated, when validation occurs, and how errors are reported.
9.1 Validation Levels
Implementations MUST support three validation levels:
| Level | Behavior |
|---|---|
| `off` | No validation performed |
| `warn` | Validation runs; issues are reported but operations succeed |
| `error` | Validation runs; issues cause operations to fail |
The default level is configured via settings.default_validation (default: "warn").
Operations MAY override the default level:
# Force error-level validation
mdbase validate --level error
# Create with no validation
mdbase create --no-validate
9.2 What Is Validated
For each typed file, validation checks the following:
9.2.1 Required Fields
Fields marked required: true MUST be:
- Present in the effective frontmatter (defaults applied; computed fields excluded)
- Non-null (value is not `null`)
Note: exists(field) checks for a present key in raw persisted frontmatter even if its value is null. Required fields must be present in the effective frontmatter and non-null.
# Type definition
fields:
title:
type: string
required: true
# Valid
title: "My Document"
# Invalid: missing
# (no title key)
# Invalid: null
title: null
title:
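The interaction between defaults and required fields can be sketched in Python. The helper names and the dictionary shape used for field definitions are assumptions for the example; computed fields are left out.

```python
def effective_frontmatter(raw: dict, fields: dict) -> dict:
    """Apply defaults only where the key is absent from the raw frontmatter."""
    effective = dict(raw)
    for name, spec in fields.items():
        if name not in effective and "default" in spec:
            effective[name] = spec["default"]
    return effective


def missing_required(raw: dict, fields: dict) -> list[str]:
    """Return names of required fields that are absent or null after defaults."""
    effective = effective_frontmatter(raw, fields)
    return [
        name
        for name, spec in fields.items()
        if spec.get("required") and effective.get(name) is None
    ]
```

A required field with a default passes when its key is absent, because the default fills the effective value. It fails when the key is present with a null value, because defaults are not applied over an explicit null.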
9.2.2 Type Correctness
Values MUST match their declared type (or be coercible):
# Type definition
fields:
priority:
type: integer
# Valid
priority: 5
priority: "5" # Coerced to integer
# Invalid
priority: "high"
priority: 5.5
9.2.3 Field Constraints
Type-specific constraints MUST be satisfied:
| Type | Constraints |
|---|---|
| `string` | `min_length`, `max_length`, `pattern` |
| `integer`, `number` | `min`, `max` |
| `list` | `min_items`, `max_items`, `unique` |
| `enum` | `values` |
| `link` | `validate_exists` |
9.2.4 Unknown Fields (Strictness)
When a type has strict: true, unknown fields cause validation failure:
# Type definition: strict: true
fields:
title:
type: string
# Valid
title: "Doc"
# Invalid: unknown field
title: "Doc"
extra_field: "not allowed"
With strict: "warn", unknown fields trigger warnings but pass validation.
Implicit fields: The following frontmatter keys are always implicitly allowed, even in strict mode:
- `type` / `types`: type declaration keys (configurable via `settings.explicit_type_keys`)
- Any keys listed in `settings.explicit_type_keys`
These keys are structural and do not need to be declared in the type's fields definition.
9.2.5 Multi-Type Validation
When a file matches multiple types, it MUST validate against ALL of them:
# File matches both 'task' and 'urgent' types
# Must satisfy:
# - All required fields from 'task'
# - All constraints from 'task'
# - All required fields from 'urgent'
# - All constraints from 'urgent'
9.2.6 Link Existence
For link fields with validate_exists: true, the target file MUST exist:
# Type definition
fields:
parent:
type: link
validate_exists: true
# Valid (if file exists)
parent: "[[existing-task]]"
# Invalid (file doesn't exist)
parent: "[[nonexistent]]"
9.2.7 Filename Patterns
If a type defines filename_pattern, filenames MAY be validated:
# Type definition
filename_pattern: "{id}.md"
# File: task-001.md with id: "task-001" → valid
# File: random-name.md with id: "task-001" → warning (mismatch)
Filename pattern validation is RECOMMENDED but not strictly required.
9.2.8 Unique ID Field
If settings.id_field is configured (default: id), values of that field MUST be
unique across the collection. If duplicates exist, validation MUST emit a
duplicate_id issue for each file that shares the duplicated value.
9.3 Validation Issue Format
Each validation issue MUST include:
| Field | Type | Description |
|---|---|---|
| `path` | string | File path relative to collection root |
| `field` | string | Field path (e.g., `author.email`, `tags[0]`) |
| `code` | string | Error code (see Appendix C) |
| `message` | string | Human-readable error description |
| `severity` | enum | `error` or `warning` |
Optional fields:
| Field | Type | Description |
|---|---|---|
| `expected` | any | Expected value or type |
| `actual` | any | Actual value found |
| `type` | string | Type name that triggered the issue |
| `line` | integer | 1-based line number in the source file |
| `column` | integer | 1-based column number |
| `end_line` | integer | End line of the issue range |
| `end_column` | integer | End column of the issue range |
Implementations SHOULD include line and column fields when source position information is available. These fields enable LSP-style diagnostics and precise issue reporting in CI tooling.
Example Issue
{
"path": "tasks/fix-bug.md",
"field": "priority",
"code": "constraint_violation",
"message": "Value 7 exceeds maximum of 5",
"severity": "error",
"expected": { "max": 5 },
"actual": 7,
"type": "task",
"line": 5,
"column": 11,
"end_line": 5,
"end_column": 12
}
9.4 Validation Timing
Implementations MAY validate at different times:
| When | Description |
|---|---|
| On read | Validate when loading a file |
| On write | Validate before creating or updating |
| On demand | Validate via explicit command |
| Continuous | Watch mode; validate on file changes |
The specification does not mandate when validation occurs, only the behavior when it does.
Recommended Behavior
- Create/Update operations: Validate before writing; fail if `validation: error`
- Read/Query operations: Optionally validate; report issues but don't fail
- Explicit validate command: Full collection validation with detailed report
9.5 Validation Commands
Implementations SHOULD provide explicit validation commands:
# Validate entire collection
mdbase validate
# Validate specific files
mdbase validate tasks/fix-bug.md notes/meeting.md
# Validate files of a specific type
mdbase validate --type task
# Validate with specific level
mdbase validate --level error
# Output validation report as JSON
mdbase validate --format json
9.6 Partial Validation
For large collections, implementations MAY support partial validation:
- Validate only modified files (since last validation)
- Validate only files in specific folders
- Validate only files matching certain types
This is an optimization; full validation MUST remain available.
9.7 Validation Report Example
Human-readable format:
Validation Report
================
Errors: 3
Warnings: 5
tasks/fix-bug.md
ERROR [missing_required] Field 'title' is required but missing
ERROR [type_mismatch] Field 'priority': expected integer, got string "high"
WARNING [unknown_field] Field 'custom' is not defined in type 'task'
notes/meeting.md
ERROR [constraint_violation] Field 'attendees': minimum 1 item required, got 0
WARNING [deprecated_field] Field 'old_field' is deprecated
tasks/subtask.md
WARNING [link_not_found] Field 'parent': target '[[nonexistent]]' not found
JSON format:
{
"summary": {
"files_checked": 42,
"files_valid": 39,
"files_invalid": 3,
"errors": 3,
"warnings": 5
},
"issues": [
{
"path": "tasks/fix-bug.md",
"field": "title",
"code": "missing_required",
"message": "Field 'title' is required but missing",
"severity": "error",
"type": "task"
}
]
}
9.8 Auto-Fix (Optional)
Implementations MAY support automatic fixing of certain issues:
| Issue | Auto-Fix |
|---|---|
| Missing field with default | Apply default value |
| Type coercion possible | Coerce value |
| Missing generated field | Generate value |
Auto-fix MUST NOT:
- Delete user data
- Make changes that could lose information
- Fix issues where the correct resolution is ambiguous
# Preview fixes
mdbase validate --fix --dry-run
# Apply fixes
mdbase validate --fix
9.9 Validation in Multi-Type Context
When a file matches multiple types, validation follows these rules:
- All types validated: The file must pass validation for ALL matched types
- Issues attributed: Each issue includes which type triggered it
- Conflict detection: If types have incompatible field definitions, report as error
Example conflict:
# Type 'a' defines: status as string
# Type 'b' defines: status as enum [open, closed]
# File matches both types
# File has: status: "pending"
# Result:
# - Passes type 'a' validation (valid string)
# - Fails type 'b' validation ("pending" not in enum)
# - Overall: FAIL (must pass all types)
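The "must pass all types" rule can be sketched as follows. The validator functions and the `validate_all_types` helper are hypothetical; a real implementation would derive validators from the type schemas.

```python
from typing import Callable

# A validator takes frontmatter and returns a list of issue messages.
Validator = Callable[[dict], list[str]]


def validate_all_types(frontmatter: dict, validators: dict[str, Validator]) -> dict:
    """Run every matched type's validator; the file passes only if all pass."""
    issues = []
    for type_name, validate in validators.items():
        for message in validate(frontmatter):
            # Attribute each issue to the type that triggered it.
            issues.append({"type": type_name, "message": message})
    return {"valid": not issues, "issues": issues}


# Hypothetical validators mirroring the example: 'a' wants a string,
# 'b' wants one of a fixed set of values.
def validate_a(fm: dict) -> list[str]:
    return [] if isinstance(fm.get("status"), str) else ["status must be a string"]


def validate_b(fm: dict) -> list[str]:
    allowed = ["open", "closed"]
    return [] if fm.get("status") in allowed else [f"status not in enum {allowed}"]


result = validate_all_types({"status": "pending"}, {"a": validate_a, "b": validate_b})
```

Here `result["valid"]` is false and the single issue is attributed to type `b`, matching the outcome described in the example.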
9.10 Skipping Validation
Certain scenarios may warrant skipping validation:
- Migration: Importing data that doesn't yet conform
- Bulk operations: Performance-critical batch updates
- Emergency fixes: Bypassing validation to fix broken state
Implementations SHOULD support:
# Skip validation on create
mdbase create --no-validate task.md
# Skip validation on update
mdbase update --no-validate task.md
Skipping validation SHOULD be logged for audit purposes.
10. Query Model
Queries retrieve files from the collection based on filters, with support for sorting, pagination, and computed fields. This section defines the query structure and semantics.
10.1 Query Overview
A query is a request to retrieve files matching certain criteria. Queries can:
- Filter by type
- Filter by frontmatter field values
- Filter by file metadata
- Filter by path patterns
- Sort results
- Paginate results
- Compute derived fields (formulas)
Queries operate on the collection as a flat list of files. The result is a list of file records matching the criteria.
10.2 Core Query Structure
A query is expressed as a YAML object with optional clauses:
query:
# Filter by type(s) - optional
types: [task]
# Filter by folder prefix - optional
folder: "projects/alpha"
# Filter expressions - optional
where:
and:
- 'status != "done"'
- "priority >= 3"
# Sorting - optional
order_by:
- field: due_date
direction: asc
- field: priority
direction: desc
# Pagination - optional
limit: 20
offset: 0
Core Query checklist: types, folder, where, order_by, limit, offset, include_body
Core vs Query+ Summary
| Clause | Core | Query+ |
|---|---|---|
| `types` | ✅ | — |
| `folder` | ✅ | — |
| `where` | ✅ | — |
| `order_by` | ✅ | — |
| `limit` / `offset` | ✅ | — |
| `include_body` | ✅ | — |
| `formulas` | — | ✅ |
| `groupBy` | — | ✅ |
| `summaries` | — | ✅ |
| `property_summaries` | — | ✅ |
| `properties` | — | ✅ |
10.3 Core Query Clauses
`types`
Filter to files matching specified type(s):
# Single type
types: [task]
# Multiple types (OR)
types: [task, note]
Files must match at least one of the listed types.
`folder`
Filter to files within a folder (and subfolders):
folder: "projects/alpha"
Matches files with paths starting with projects/alpha/.
`where`
Filter by expression conditions. Can be:
A single expression string:
where: 'status == "open"'
A logical combination:
where:
and:
- 'status == "open"'
- "priority >= 3"
- or:
- 'tags.contains("urgent")'
- "due_date < today()"
- not: "draft == true"
Shape rules (Obsidian Bases-compatible):
- A `where` value MAY be a string expression.
- A `where` value MAY be a logical object with one of the keys `and`, `or`, `not`.
- `and`/`or` values are lists of conditions (each condition is either a string expression or another logical object).
- `not` value is a single condition (string expression or logical object).
See Expressions for the full expression language.
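The shape rules lend themselves to a small recursive check. The sketch below assumes the `where` value has already been parsed from YAML into Python strings, dicts and lists, and it reads "one of the keys" as "exactly one key per logical object", which is an interpretation rather than something the rules state outright. The name `check_where_shape` is hypothetical.

```python
def check_where_shape(node) -> None:
    """Raise ValueError if node violates the where shape rules."""
    if isinstance(node, str):
        return  # a string expression is always a valid condition
    if not isinstance(node, dict) or len(node) != 1:
        raise ValueError("condition must be a string or a one-key object")
    (key, value), = node.items()
    if key in ("and", "or"):
        if not isinstance(value, list):
            raise ValueError(f"'{key}' must hold a list of conditions")
        for child in value:
            check_where_shape(child)
    elif key == "not":
        check_where_shape(value)  # a single condition, not a list
    else:
        raise ValueError(f"unknown logical key: {key!r}")
```

The check only covers structure; the expression strings themselves still need to be parsed by the expression language.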
`order_by`
Sort results by one or more fields:
order_by:
- field: due_date
direction: asc # ascending (oldest first)
- field: priority
direction: desc # descending (highest first)
Direction values:
- `asc`: Ascending (A-Z, 1-9, oldest-newest)
- `desc`: Descending (Z-A, 9-1, newest-oldest)
Null handling: Null values sort last in ascending order and first in descending order.
Tie-breakers: If all order_by fields compare equal, implementations MUST
apply a stable tie-breaker by ascending file.path to ensure deterministic output.
Formula sorting:
order_by:
- field: formula.urgency_score
direction: desc
String Collation
Default string ordering uses Unicode code point order (lexicographic comparison of Unicode scalar values):
- Comparison is case-sensitive by default: uppercase letters sort before lowercase (`"A" < "a"`)
- Null values sort LAST in ascending order and FIRST in descending order
- Implementations MAY support locale-aware collation as an `ext`-prefixed extension
- For `enum` fields, sort order follows the `values` list declaration order, not string order
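The ordering rules (multiple sort fields, null placement, and the `file.path` tie-breaker) can be combined using repeated stable sorts. This is a minimal sketch under some assumptions: records are dicts with `path` and `frontmatter` keys, every `order_by` entry carries an explicit `direction`, and `enum` declaration order and formula fields are not handled. The name `sort_records` is hypothetical.

```python
def sort_records(records: list[dict], order_by: list[dict]) -> list[dict]:
    # Start from the tie-breaker so it decides ties left by the later stable sorts.
    result = sorted(records, key=lambda r: r["path"])
    # Apply keys from least to most significant; sorted() is stable,
    # including when reverse=True.
    for clause in reversed(order_by):
        field = clause["field"]
        descending = clause["direction"] == "desc"

        def key(record, field=field):
            value = record["frontmatter"].get(field)
            # Null ranks above every value: last ascending, first descending.
            return (1,) if value is None else (0, value)

        result = sorted(result, key=key, reverse=descending)
    return result
```

Python compares strings by code point, which matches the default collation described above; a locale-aware comparison would need a different key function.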
`limit` and `offset`
Paginate results:
limit: 20 # Return at most 20 results
offset: 40 # Skip the first 40 results
Together these enable pagination: page 3 of 20 items = offset: 40, limit: 20.
10.4 Logical Operators in `where`
The where clause supports nested logical operators:
| Operator | YAML Key | Description |
|---|---|---|
| AND | `and:` | All conditions must be true |
| OR | `or:` | At least one condition must be true |
| NOT | `not:` | Condition must be false |
Examples:
# AND: all must match
where:
and:
- 'status == "open"'
- "priority >= 3"
# OR: any must match
where:
or:
- 'status == "blocked"'
- "due_date < today()"
# NOT: must not match
where:
not: 'status == "done"'
# Nested logic
where:
and:
- 'status != "done"'
- or:
- "priority >= 4"
- 'tags.contains("urgent")'
Alternatively, use expression operators directly:
where: 'status != "done" && (priority >= 4 || tags.contains("urgent"))'
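Evaluating a nested `where` tree reduces to a short recursion once string expressions can be evaluated. In the sketch below, `eval_expr` stands in for the expression evaluator and is supplied by the caller; both it and `evaluate_where` are hypothetical names.

```python
from typing import Callable


def evaluate_where(node, eval_expr: Callable[[str], bool]) -> bool:
    """Evaluate a where tree for one file, delegating strings to eval_expr."""
    if isinstance(node, str):
        return eval_expr(node)
    (key, value), = node.items()
    if key == "and":
        # all() stops at the first false condition.
        return all(evaluate_where(child, eval_expr) for child in value)
    if key == "or":
        # any() stops at the first true condition.
        return any(evaluate_where(child, eval_expr) for child in value)
    if key == "not":
        return not evaluate_where(value, eval_expr)
    raise ValueError(f"unknown logical key: {key!r}")
```

Because `all()` and `any()` consume generators lazily, this form gets short-circuit evaluation without extra code.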
10.5 Property Namespaces
In query expressions, properties are accessed through namespaces:
| Namespace | Description | Example |
|---|---|---|
| (bare) | Effective frontmatter property (defaults applied, computed excluded) | status, priority |
| `note.` | Raw persisted frontmatter (for reserved names) | note.type, note["my-field"] |
| `file.` | File metadata | file.name, file.mtime |
| `formula.` | Computed fields | formula.overdue |
| `this` | Context file (for embedded queries) | this.file.name |
File Properties
| Property | Type | Description |
|---|---|---|
| `file.name` | string | Filename with extension (e.g., "task-001.md") |
| `file.basename` | string | Filename without final extension (e.g., "task-001"; for "file.draft.md" this is "file.draft") |
| `file.path` | string | Full path from collection root |
| `file.folder` | string | Parent folder path |
| `file.ext` | string | File extension without dot (e.g., "md") |
| `file.size` | number | File size in bytes |
| `file.ctime` | datetime | Created time |
| `file.mtime` | datetime | Modified time |
| `file.links` | list | Outgoing links (including links to non-markdown files) |
| `file.backlinks` | list | Incoming links (requires index); MAY be null/empty if backlinks are unsupported |
| `file.tags` | list | All tags (raw frontmatter tags + inline #tags, including nested) |
| `file.properties` | object | Raw persisted frontmatter properties only (no computed fields, no applied defaults). This is equivalent to `note.` |
| `file.embeds` | list | All embed links in the file body |
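The path-derived properties can be computed from the collection-relative path alone. The sketch below uses forward-slash paths and returns an empty string as the folder of a file at the collection root; that root-folder value is an assumption, and `path_properties` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def path_properties(path: str) -> dict:
    """Derive file.name, file.basename, file.path, file.folder and file.ext."""
    p = PurePosixPath(path)
    parent = p.parent.as_posix()
    return {
        "name": p.name,
        "basename": p.stem,  # strips only the final extension
        "path": p.as_posix(),
        "folder": "" if parent == "." else parent,
        "ext": p.suffix.lstrip("."),
    }
```

For `tasks/file.draft.md` this yields the basename `file.draft` and the extension `md`, as in the table.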
Body Content Properties
The file.body property provides access to the raw markdown body content (everything after the frontmatter closing ---):
# Find files that mention a keyword in their body
query:
where: 'file.body.contains("TODO")'
# Case-insensitive body search
query:
where: 'file.body.lower().contains("important")'
# Regex body search
query:
where: 'file.body.matches("\\bAPI\\b")'
Rules:
- `file.body` is a string and supports all string methods from §11.5: `.contains()`, `.matches()`, `.lower()`, `.startsWith()`, etc.
- Body search operates on raw markdown text including syntax characters
- Content inside fenced code blocks IS included in `file.body` (it is the raw text)
- Implementations SHOULD support `file.body` in filters without requiring `include_body: true` in the query; the body is used for filtering, not necessarily returned in results
- Performance note: Body search without caching requires reading every file. Implementations SHOULD use full-text indexes when available
- Note: `file.body` includes content inside code blocks, but `file.links` and `file.tags` exclude links and tags inside code blocks (see §8). This means `file.body.contains("[[foo]]")` may match a link that does not appear in `file.links`
The `this` Context
In embedded queries (queries within a file), this refers to the containing file:
# Find files linking to current file
where: "file.hasLink(this.file)"
# Find tasks assigned to current file's author
where: "assignee == this.author"
10.6 Result Structure
Query results return file objects with this structure. The frontmatter field is the effective frontmatter (defaults applied, computed fields excluded). Raw persisted values are available via `file.properties` or the `note.` namespace.
- path: "tasks/fix-bug.md"
types: [task, urgent]
frontmatter: # Effective frontmatter (defaults applied, computed excluded)
id: "task-001"
title: "Fix the login bug"
status: open
priority: 4
tags: [bug, auth]
formulas:
overdue: true
days_until_due: -3
file:
name: "fix-bug.md"
folder: "tasks"
mtime: "2024-03-15T10:30:00Z"
size: 1234
body: "..." # Optional, if requested
Result Envelope
Query results MUST include metadata alongside the result list:
results:
- path: "tasks/fix-bug.md"
types: [task, urgent]
frontmatter:
id: "task-001"
title: "Fix the login bug"
# ...
meta:
total_count: 142 # Total matching records (before limit/offset)
limit: 20
offset: 0
has_more: true # Whether more results exist beyond this page
Fields:
- `total_count`: The total number of records matching the query filters, ignoring `limit` and `offset`. Implementations MUST compute this accurately
- `has_more`: `true` if `offset + length(results) < total_count`
- When no `limit` is specified, `has_more` is `false` and `total_count` equals the result count
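The envelope fields follow directly from the full match list. The sketch below assumes the matches are already filtered and sorted in memory; `paginate` is a hypothetical helper name.

```python
from typing import Optional


def paginate(matches: list, limit: Optional[int] = None, offset: int = 0) -> dict:
    """Wrap one page of results in the result envelope."""
    total_count = len(matches)  # counted before limit/offset are applied
    end = None if limit is None else offset + limit
    results = matches[offset:end]
    return {
        "results": results,
        "meta": {
            "total_count": total_count,
            "limit": limit,
            "offset": offset,
            "has_more": offset + len(results) < total_count,
        },
    }
```

With 142 matches, `limit: 20` and `offset: 0`, the page holds 20 records and `has_more` is true; without a limit the slice runs to the end and `has_more` is false.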
Including Body Content
By default, body content is not included in results. To include it:
query:
include_body: true
This increases memory usage for large result sets.
10.7 Query+ (Optional Advanced Features)
The following clauses are OPTIONAL and are part of the Query+ profile. Implementations are not required to support Query+ to claim conformance at Level 3.
`formulas`
Define computed fields evaluated for each result:
formulas:
overdue: "due_date < today() && status != 'done'"
days_until_due: "due_date - today()"
display_priority: 'if(priority >= 4, "🔴", if(priority >= 2, "🟡", "🟢"))'
Formulas are accessible via the formula. namespace in subsequent expressions and in results.
`groupBy`
Group results by a property value. Each unique value creates a group:
groupBy:
property: status
direction: ASC # ASC or DESC
- Only one `groupBy` property is supported per query.
- `direction` controls the sort order of groups: `ASC` (default) or `DESC`.
- Results within each group follow the `order_by` sort.
- Ungrouped results (null/missing group value) appear in a separate group.
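Grouping can be sketched as a single pass over results that are already sorted by `order_by`. Several details here are assumptions of the sketch: group values are scalars, the null/missing group is placed last, and the output shape (`key` plus `records`) and the name `group_results` are hypothetical.

```python
def group_results(results: list[dict], prop: str, direction: str = "ASC") -> list[dict]:
    """Group sorted results by a frontmatter property."""
    groups: dict = {}
    for record in results:  # insertion order keeps the order_by sort per group
        key = record["frontmatter"].get(prop)
        groups.setdefault(key, []).append(record)
    keys = sorted((k for k in groups if k is not None), reverse=(direction == "DESC"))
    ordered = [{"key": k, "records": groups[k]} for k in keys]
    if None in groups:
        # Placing the null/missing group last is an assumption of this sketch.
        ordered.append({"key": None, "records": groups[None]})
    return ordered
```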
`summaries`
Define custom summary formulas. In summary expressions, the values keyword represents all values for the associated property across the result set:
summaries:
custom_avg: "values.reduce(acc + value, 0) / values.length"
rounded_mean: "values.reduce(acc + value, 0) / values.length"
Summary semantics:
- `values` is an ordered list matching the result order (or group order when `groupBy` is used).
- Missing properties contribute `null` values to `values`.
- Implementations SHOULD preserve `null` values in `values` for custom summaries.
- Built-in summaries SHOULD ignore `null`/empty values unless otherwise specified (e.g., `Empty`, `Filled`).
See Expressions §11.14 for default summary functions.
`property_summaries`
Assign summary functions to specific properties. These calculate an aggregate value across all records (or per group when groupBy is used):
property_summaries:
priority: Average
estimate_hours: Sum
due_date: Earliest
formula.overdue: Checked
Values reference either default summary names (see Expressions §11.14) or custom summaries defined in the summaries section.
When groupBy is present, property summaries are computed per group.
`properties`
Display configuration for properties. It does not affect query logic and is used only by view renderers:
properties:
status:
displayName: "Current Status"
formula.overdue:
displayName: "Overdue?"
file.ext:
displayName: "Extension"
Display names are not used in filters or formulas.
10.8 Query Examples
Core Examples
All Open Tasks
query:
types: [task]
where: 'status == "open"'
High Priority Tasks Due This Week
query:
types: [task]
where:
and:
- "priority >= 4"
- "due_date <= today() + '7d'"
- 'status != "done"'
Files Modified Today
query:
where: "file.mtime > today()"
Tasks Tagged Urgent or Blocker
query:
types: [task]
where: 'tags.containsAny("urgent", "blocker")'
Tasks Assigned to Engineering Team Members
query:
types: [task]
where: 'assignee.asFile().team == "engineering"'
Notes Linking to a Specific Task
query:
types: [note]
where: 'file.hasLink(link("tasks/task-001"))'
Backlinks to Current File
query:
where: "file.hasLink(this.file)"
Files Matching Multiple Types
query:
where:
and:
- 'types.contains("actionable")'
- 'types.contains("urgent")'
Query+ Examples
Overdue Tasks Sorted by Priority
query:
types: [task]
where:
and:
- "formula.is_overdue == true"
- 'status != "blocked"'
formulas:
is_overdue: "due_date < today() && status != 'done'"
urgency_score: "priority + if(due_date < today() - '7d', 5, 0)"
order_by:
- field: formula.urgency_score
direction: desc
limit: 10
Tasks Grouped by Status (Query+)
query:
types: [task]
where: 'status != "cancelled"'
groupBy:
property: status
direction: ASC
property_summaries:
priority: Average
estimate_hours: Sum
order_by:
- field: priority
direction: desc
Untyped Files
query:
where: "types.length == 0"
Paginated Results
# Page 1
query:
types: [task]
order_by:
- field: created_at
direction: desc
limit: 20
offset: 0
# Page 2
query:
types: [task]
order_by:
- field: created_at
direction: desc
limit: 20
offset: 20
10.9 Query Optimization
Implementations SHOULD optimize queries where possible:
- Index usage: Use indexes for common filters (type, path prefix)
- Short-circuit evaluation: Stop evaluating OR clauses on first match
- Lazy loading: Don't parse body content unless requested
- Caching: Cache query results for repeated queries
Complex queries (link traversal, formulas) may require full scans. Implementations SHOULD document performance characteristics.
10.10 Query API Considerations
Implementations exposing queries via API SHOULD support:
Programmatic access:
const results = await collection.query({
types: ['task'],
where: 'status == "open"',
orderBy: [{ field: 'priority', direction: 'desc' }],
limit: 10
});
CLI access:
mdbase query --type task --where 'status == "open"' --limit 10
The exact API surface is implementation-dependent.
11. Expression Language
Expressions are strings that evaluate to values. They are used in query filters, match conditions, and computed formulas. This section defines the expression syntax and available functions.
11.1 Expression Context
Expressions are evaluated in a context that provides:
- Frontmatter fields: Direct access via bare names (effective values: defaults applied, computed excluded)
- Raw frontmatter: Via the `note.` namespace (equivalent to `file.properties`)
- File metadata: Via `file.` prefix
- Formula values: Via `formula.` prefix
- Context reference: Via `this` (in embedded queries)
- Built-in functions: Date functions, type checks, etc.
11.2 Literals
Strings
"hello world" // Double quotes
'hello world' // Single quotes
"line 1\nline 2" // Escape sequences supported
Numbers
123 // Integer
45.67 // Decimal
-10 // Negative
1e6 // Scientific notation
Booleans
true
false
Null
null
Lists (in expressions)
["a", "b", "c"]
[1, 2, 3]
11.3 Property Access
Frontmatter Fields
status // Direct access
priority // Direct access
author.name // Nested object
tags[0] // List index (0-based)
Bracket Notation
For fields with special characters:
note["field-with-dashes"]
note["field.with.dots"]
File Metadata
file.name // "task-001.md"
file.path // "tasks/task-001.md"
file.folder // "tasks"
file.ext // "md"
file.size // 1234 (bytes)
file.ctime // Created datetime
file.mtime // Modified datetime
file.body // Raw markdown body content (string)
Formulas
formula.overdue
formula.urgency_score
Context (this)
this.file.name // Current file's name
this.author // Current file's author field
11.4 Operators
Comparison Operators
| Operator | Description | Example |
|---|---|---|
| `==` | Equal | status == "open" |
| `!=` | Not equal | status != "done" |
| `>` | Greater than | priority > 3 |
| `<` | Less than | priority < 3 |
| `>=` | Greater or equal | priority >= 3 |
| `<=` | Less or equal | priority <= 3 |
Arithmetic Operators
| Operator | Description | Example |
|---|---|---|
| `+` | Addition | priority + 1 |
| `-` | Subtraction | total - discount |
| `*` | Multiplication | count * 2 |
| `/` | Division | total / count |
| `%` | Modulo | index % 2 |
| `( )` | Grouping | (a + b) * c |
Boolean Operators
| Operator | Description | Example |
|---|---|---|
| `&&` | Logical AND | a && b |
| `\|\|` | Logical OR | a \|\| b |
| `!` | Logical NOT | !done |
Null Coalescing
value ?? default // Returns default if value is null
11.5 String Methods
| Method | Description | Example |
|---|---|---|
| `.length` | String length (field) | title.length |
| `.contains(str)` | Contains substring | title.contains("bug") |
| `.containsAll(...strs)` | Contains all substrings | title.containsAll("bug", "fix") |
| `.containsAny(...strs)` | Contains any substring | title.containsAny("bug", "fix") |
| `.startsWith(str)` | Starts with prefix | title.startsWith("WIP:") |
| `.endsWith(str)` | Ends with suffix | file.name.endsWith(".draft.md") |
| `.isEmpty()` | Empty or absent | title.isEmpty() |
| `.lower()` | Convert to lowercase | status.lower() |
| `.upper()` | Convert to uppercase | status.upper() |
| `.title()` | Title case | name.title() |
| `.trim()` | Remove whitespace | title.trim() |
| `.slice(start, end?)` | Extract substring | id.slice(0, 4) |
| `.split(sep, n?)` | Split to list | tags_str.split(",") |
| `.replace(pattern, repl)` | Replace pattern | title.replace("old", "new") |
| `.repeat(count)` | Repeat string | "-".repeat(3) |
| `.reverse()` | Reverse string | name.reverse() |
| `.matches(regex)` | Regex match (see §4.8 for regex flavor) | title.matches("^TASK-\\d+") |
11.6 List Methods
| Method | Description | Example |
|---|---|---|
| `.length` | List length (field) | tags.length |
| `.contains(value)` | Contains element | tags.contains("urgent") |
| `.containsAll(...values)` | Contains all elements | tags.containsAll("a", "b") |
| `.containsAny(...values)` | Contains any element | tags.containsAny("a", "b") |
| `.isEmpty()` | List is empty | tags.isEmpty() |
| `[index]` | Element at index | tags[0] |
| `.filter(expr)` | Filter elements | items.filter(value > 2) |
| `.map(expr)` | Transform elements | tags.map(value.lower()) |
| `.reduce(expr, init)` | Reduce to single value | nums.reduce(acc + value, 0) |
| `.flat()` | Flatten nested lists | nested.flat() |
| `.reverse()` | Reverse element order | items.reverse() |
| `.slice(start, end?)` | Extract portion | items.slice(0, 3) |
| `.sort()` | Sort ascending | tags.sort() |
| `.unique()` | Remove duplicates | tags.unique() |
| `.join(sep)` | Join to string | tags.join(", ") |
In filter(), map(), and reduce(), the implicit variables value and index refer to the current element and its position. For reduce(), acc is the accumulator.
containsAll() and containsAny() are variadic; passing a list literal counts as a single value and does not auto-expand.
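The variadic rule is easy to get wrong, so a small model may help. The Python functions below are hypothetical stand-ins for the list methods `containsAny()` and `containsAll()`; they show that a list passed as one argument is compared as a single element.

```python
def contains_any(items: list, *values) -> bool:
    """True if at least one of the given values is an element of items."""
    return any(v in items for v in values)


def contains_all(items: list, *values) -> bool:
    """True if every given value is an element of items."""
    return all(v in items for v in values)


tags = ["urgent", "bug"]
spread = contains_any(tags, "urgent", "critical")        # two separate values
as_list = contains_any(tags, ["urgent", "critical"])     # one value: a list
```

`spread` is true because `"urgent"` is an element of `tags`. `as_list` is false because the list literal is a single value and no element of `tags` equals it.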
11.7 Date/Time Functions
Current Date/Time
| Function | Returns | Description |
|---|---|---|
| `now()` | datetime | Current date and time |
| `today()` | date | Current date (no time) |
Timezone semantics:
- `now()` and `today()` use the implementation's local timezone unless otherwise configured.
- Date-only values (`date` type) are interpreted in the local timezone for comparisons.
- Datetime values with explicit offsets MUST be compared in absolute time.
Parsing
| Function | Description | Example |
|---|---|---|
| `date(string)` | Parse date | date("2024-03-15") |
| `datetime(string)` | Parse datetime | datetime("2024-03-15T10:30:00Z") |
Date Components
| Method | Returns | Description |
|---|---|---|
| `.year` | integer | Year component |
| `.month` | integer | Month (1-12) |
| `.day` | integer | Day of month |
| `.hour` | integer | Hour (0-23) |
| `.minute` | integer | Minute (0-59) |
| `.second` | integer | Second (0-59) |
| `.dayOfWeek` | integer | Day of week (0=Sunday) |
| `.date()` | date | Date portion only |
| `.time()` | time | Time portion only |
Date Formatting
due_date.format("YYYY-MM-DD")
created_at.format("MMM D, YYYY")
Common format tokens:
- `YYYY`: 4-digit year
- `MM`: 2-digit month
- `DD`: 2-digit day
- `HH`: 2-digit hour (24h)
- `mm`: 2-digit minute
- `ss`: 2-digit second
11.8 Date Arithmetic
Dates support arithmetic with duration strings:
due_date + "7d" // Add 7 days
now() - "1w" // Subtract 1 week
file.mtime > now() - "24h" // Modified in last 24 hours
Duration units:
| Unit | Aliases |
|---|---|
| `y` | year, years |
| `M` | month, months |
| `w` | week, weeks |
| `d` | day, days |
| `h` | hour, hours |
| `m` | minute, minutes |
| `s` | second, seconds |
Duration string format:
Each duration string contains a single number-unit pair. Whitespace between the number and unit is allowed ("7d" and "7 days" are equivalent). Compound durations in a single string (e.g., "1d12h") are NOT supported — chain additions instead:
date + "1M" + "4h" + "3m" // Add 1 month, 4 hours, 3 minutes
Calendar arithmetic: Adding months or years clamps to the last day of the target month. For example, date("2024-01-31") + "1M" returns 2024-02-29 (2024 is a leap year), not 2024-03-02.
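The duration format and the month-end clamping rule can be sketched together. The code below is an illustration rather than a reference implementation: it accepts only non-negative integer amounts (an assumption, since the number syntax is not detailed here), and the names `parse_duration` and `add_duration` are hypothetical.

```python
import calendar
import re
from datetime import datetime, timedelta

# Unit table from this section; "M" (month) and "m" (minute) differ only by case.
UNITS = {
    "y": "years", "year": "years", "years": "years",
    "M": "months", "month": "months", "months": "months",
    "w": "weeks", "week": "weeks", "weeks": "weeks",
    "d": "days", "day": "days", "days": "days",
    "h": "hours", "hour": "hours", "hours": "hours",
    "m": "minutes", "minute": "minutes", "minutes": "minutes",
    "s": "seconds", "second": "seconds", "seconds": "seconds",
}
# Exactly one number-unit pair, with optional whitespace in between.
DURATION = re.compile(r"^(\d+)\s*([A-Za-z]+)$")


def parse_duration(text: str) -> tuple[int, str]:
    match = DURATION.match(text.strip())
    if not match or match.group(2) not in UNITS:
        raise ValueError(f"invalid duration: {text!r}")
    return int(match.group(1)), UNITS[match.group(2)]


def add_duration(value: datetime, text: str) -> datetime:
    amount, unit = parse_duration(text)
    if unit in ("years", "months"):
        months = amount * 12 if unit == "years" else amount
        index = value.year * 12 + (value.month - 1) + months
        year, month0 = divmod(index, 12)
        # Clamp the day to the last day of the target month.
        last_day = calendar.monthrange(year, month0 + 1)[1]
        return value.replace(year=year, month=month0 + 1, day=min(value.day, last_day))
    return value + timedelta(**{unit: amount})
```

With these definitions, `add_duration(datetime(2024, 1, 31), "1M")` gives 29 February 2024, and a compound string such as `"1d12h"` is rejected by the pattern.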
Examples:
today() + "30d" // 30 days from now
due_date - "2w" // 2 weeks before due date
created_at + "1y" // 1 year after creation
Date comparison:
due_date < today() // Overdue
due_date < today() + "7d" // Due within a week
file.mtime > now() - "1h" // Modified in last hour
Date subtraction:
Subtracting two dates returns the difference in milliseconds:
now() - file.ctime // Milliseconds since creation
(today() - due_date) / 86400000 // Days overdue (negative if not yet due)
(now() + "1d") - now() // Returns 86400000
Duration function:
The duration() function explicitly parses a duration string. This is needed when performing arithmetic on durations themselves:
now() + (duration("1d") * 2) // 2 days from now
duration("5h") * 3 // Duration must be on the left
11.9 Conditional Expression
if(condition, then_value, else_value)
Examples:
if(priority > 3, "high", "normal")
if(status == "done", "✓", "○")
if(due_date < today(), "overdue", if(due_date < today() + "7d", "soon", "ok"))
11.10 Null Handling
Check Existence
exists(field) // true if field key is present (including null values)
field.isEmpty() // true if field is null, empty, or absent
Note: exists() checks for key presence in raw persisted frontmatter. A field with value null exists but is empty. Use isEmpty() to check if a field has a meaningful value.
Provide Default
default(field, value) // Return value if field is null or missing
field ?? value // Null coalescing operator
Examples:
exists(due_date) // Has a due date?
default(priority, 3) // Default priority to 3
assignee ?? "unassigned" // Default to "unassigned"
**Missing vs null:** In expressions, missing properties are treated like `null` for `default()` and `??`. Use `exists(field)` to distinguish missing from present-null.
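The difference between missing, present-null, and empty can be modelled with a plain dictionary standing in for raw persisted frontmatter. The Python helpers below are hypothetical, and the notion of "empty" used here (empty string, list, or object) is an assumption of the sketch.

```python
def exists(raw: dict, field: str) -> bool:
    """True if the key is present, even when its value is null."""
    return field in raw


def is_empty(raw: dict, field: str) -> bool:
    """True if the field is absent, null, or an empty string/list/object."""
    value = raw.get(field)
    return value is None or value == "" or value == [] or value == {}


def default(raw: dict, field: str, fallback):
    """Return fallback when the field is null or missing (like ??)."""
    value = raw.get(field)
    return fallback if value is None else value


raw = {"due_date": None, "priority": 0, "tags": []}
```

Here `due_date` exists but is empty, `assignee` neither exists nor has a value, and `priority` keeps its value `0` under `default()` because zero is not null.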
11.11 Type Checking and Conversion
Type Checking
value.isType("string") // true if value is a string
value.isType("number") // true if value is a number
value.isType("boolean") // true if value is a boolean
value.isType("date") // true if value is a date
value.isType("list") // true if value is a list
value.isType("object") // true if value is an object
Type Conversion
value.toString() // Convert any value to string
number("3.14") // Parse string to number
number(true) // Returns 1 (false returns 0)
number(date_value) // Milliseconds since epoch
value.isTruthy() // Coerce to boolean
list(value) // Wrap in list if not already a list
11.12 Link Functions
| Function | Description | Example |
|---|---|---|
| `link.asFile()` | Resolve link to file | parent.asFile().status |
| `link(path)` | Construct link | link("tasks/task-001") |
| `file.hasLink(target)` | File links to target | file.hasLink(link("api-docs")) |
| `file.hasTag(...tags)` | File has any of the given tags; uses prefix matching for nested tags (see §8) | file.hasTag("important") |
| `file.hasProperty(name)` | Raw persisted frontmatter has the key | file.hasProperty("status") |
| `file.inFolder(path)` | File is in folder (or subfolder) | file.inFolder("archive") |
| `file.asLink(display?)` | Convert file to link | file.asLink("display text") |
11.13 Object Methods
| Method | Description | Example |
|---|---|---|
| `.isEmpty()` | Has no properties | metadata.isEmpty() |
| `.keys()` | List of property names | metadata.keys() |
| `.values()` | List of property values | metadata.values() |
11.14 Summary Functions
Summary functions operate on a collection of values across all matching records. They are used in the summaries section of a query (see Querying).
In summary formulas, the values keyword represents all values for a given property across the result set. The formula MUST return a single value.
Summary value semantics:
- `values` is ordered to match the query result order (or group order when grouped).
- Missing properties contribute `null` values to `values`.
- Custom summaries receive `values` with `null` entries intact.
- Built-in summaries SHOULD ignore `null`/empty values unless the function is explicitly about emptiness (e.g., `Empty`, `Filled`).
values.reduce(acc + value, 0) // Sum
values.reduce(acc + value, 0) / values.length // Average
values.filter(value.isTruthy()).length // Count of truthy values
Default Summary Functions
Implementations SHOULD provide these built-in summary functions:
| Name | Input Type | Description |
|---|---|---|
| `Average` | Number | Mean of all numeric values |
| `Min` | Number | Smallest number |
| `Max` | Number | Largest number |
| `Sum` | Number | Sum of all numbers |
| `Range` | Number | Difference between Max and Min |
| `Median` | Number | Median value |
| `Earliest` | Date | Earliest date |
| `Latest` | Date | Latest date |
| `Checked` | Boolean | Count of true values |
| `Unchecked` | Boolean | Count of false values |
| `Empty` | Any | Count of empty/null values |
| `Filled` | Any | Count of non-empty values |
| `Unique` | Any | Count of unique values |
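A rough model of the built-in summaries is shown below. It treats only `null` and the empty string as empty, returns `None` for numeric and date summaries over a column with no filled values, and assumes values are hashable for `Unique`; all three are assumptions of this sketch, as are the names `filled` and `summarize`.

```python
from statistics import median


def filled(values: list) -> list:
    """Drop null and empty-string entries, as built-in summaries should."""
    return [v for v in values if v is not None and v != ""]


def summarize(name: str, values: list):
    present = filled(values)
    if name == "Empty":
        return len(values) - len(present)
    if name == "Filled":
        return len(present)
    if name == "Unique":
        return len(set(present))  # assumes hashable values
    if name == "Checked":
        return sum(1 for v in present if v is True)
    if name == "Unchecked":
        return sum(1 for v in present if v is False)
    if not present:
        return None  # result for an all-null column is an assumption
    if name == "Average":
        return sum(present) / len(present)
    if name == "Sum":
        return sum(present)
    if name in ("Min", "Earliest"):
        return min(present)
    if name in ("Max", "Latest"):
        return max(present)
    if name == "Range":
        return max(present) - min(present)
    if name == "Median":
        return median(present)
    raise ValueError(f"unknown summary: {name}")
```

For the values `[4, null, 2, 3]`, `Average` ignores the null and divides by three, while `Empty` reports one.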
11.15 Operator Precedence
From highest to lowest:
1. `( )` - Grouping
2. `.` `[]` - Property access
3. `!` `-` (unary) - Negation
4. `*` `/` `%` - Multiplication
5. `+` `-` - Addition
6. `<` `<=` `>` `>=` - Comparison
7. `==` `!=` - Equality
8. `&&` - Logical AND
9. `||` - Logical OR
10. `??` - Null coalescing
Use parentheses to clarify complex expressions.
11.16 Lambda Expressions
List methods like filter(), map(), and reduce() use implicit variables rather than arrow function syntax:
// value refers to the current element, index to its position
items.filter(value > 2)
tags.map(value.lower())
items.map(value.toString() + " (" + index.toString() + ")")
// reduce also provides acc (accumulator)
numbers.reduce(acc + value, 0)
Implementations MAY also support arrow function syntax as an extension:
tags.map(t => t.lower())
tasks.filter(t => t.status != "done")
If arrow functions are supported, implementations SHOULD parse them only within
function argument positions and treat => as part of the lambda expression itself
(not as a general-purpose operator).
11.17 Expression Examples
Simple Filters
status == "open"
priority >= 4
tags.contains("urgent")
Combined Conditions
status == "open" && priority >= 4
status == "blocked" || due_date < today()
!(status == "done")
Date Filters
due_date < today()
due_date < today() + "7d"
file.mtime > now() - "24h"
created_at.year == 2024
String Filters
title.contains("bug")
title.lower().contains("urgent")
file.name.startsWith("draft-")
id.matches("^TASK-\\d{4}$")
List Operations
tags.length > 0
tags.contains("important")
tags.containsAny("urgent", "critical")
assignees.filter(value.asFile().team == "eng").length > 0
Computed Fields (Formulas)
// Is overdue?
due_date < today() && status != "done"
// Days overdue (date subtraction returns milliseconds)
(today() - due_date) / 86400000
// Priority display
if(priority >= 4, "🔴 Critical", if(priority >= 2, "🟡 Normal", "🟢 Low"))
// Urgency score
priority * 10 + if(due_date < today(), 50, 0)
Link Traversal
parent.asFile().status == "done"
assignee.asFile().team == "engineering"
blocks.map(value.asFile().status).contains("blocked")
11.18 Error Handling
Expression evaluation errors MUST be handled gracefully and MUST NOT abort the overall query:
| Error | Behavior |
|---|---|
| Property access on null | Returns null |
| Method call on null | Returns null |
| Division by zero | Returns null and emits type_error |
| Invalid regex | Evaluation error (see §4.8 for regex flavor) |
| Type mismatch | Returns null and emits type_error |
Implementations SHOULD log evaluation errors and continue processing where possible.
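The "return null and emit `type_error`" behaviour can be sketched for the division operator. In this illustration, issues are collected in a caller-supplied list rather than logged, booleans are treated as non-numeric, and a null operand is reported as a type mismatch; these choices and the name `divide` are assumptions of the sketch.

```python
def divide(left, right, issues: list):
    """Evaluate left / right without raising; record problems in issues."""
    numeric = all(
        isinstance(x, (int, float)) and not isinstance(x, bool)  # bools excluded by assumption
        for x in (left, right)
    )
    if not numeric:
        issues.append({"code": "type_error", "message": "operands of / must be numbers"})
        return None
    if right == 0:
        issues.append({"code": "type_error", "message": "division by zero"})
        return None
    return left / right
```

A query engine built this way can keep evaluating the remaining files after a bad expression and report the collected issues afterwards.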
11.19 Expression Portability
Expressions using only spec-defined functions and operators are portable expressions. This section defines rules for maintaining portability across implementations.
Custom Functions
Implementations MAY define custom functions beyond those specified in this document. Custom functions MUST be namespaced with the ext prefix using either :: or . as a delimiter:
ext::myFunc(value) // Double-colon delimiter
ext.myFunc(value) // Dot delimiter
Both delimiter forms are equivalent. Implementations MUST accept either form for custom functions they define.
Rules
1. Namespace requirement: Implementations MUST namespace all custom functions with the `ext` prefix. Unprefixed custom functions are not permitted.
2. No shadowing: Implementations MUST NOT override or shadow built-in functions or operators defined in this specification.
3. Non-portable warnings: Implementations SHOULD emit a warning when evaluating an expression that uses non-portable functions (i.e., `ext`-prefixed functions).
4. Documentation: Type definitions and queries SHOULD note when they depend on non-portable expressions.
Example
# Portable expression — uses only spec-defined functions
filters: 'status == "open" && due_date < today()'
# Non-portable expression — uses a custom function
filters: 'ext::sentiment(title) > 0.5'
Implementations encountering an unknown ext-prefixed function MUST treat it as an evaluation error (see §11.18).
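A tool that wants to emit the non-portable warning needs to detect `ext`-prefixed calls. The sketch below does this with a regular expression over the raw expression string, which is a shortcut: a real implementation would inspect the parsed expression, and this version can be fooled by matching text inside string literals. The name `non_portable_functions` is hypothetical.

```python
import re

# "ext" followed by "::" or "." and a function call. The lookbehind skips
# property access such as file.ext.lower(), where "ext" follows a dot.
EXT_CALL = re.compile(r"(?<![\w.])ext(?:::|\.)([A-Za-z_]\w*)\s*\(")


def non_portable_functions(expression: str) -> list[str]:
    """Return the names of ext-prefixed functions called in an expression."""
    return EXT_CALL.findall(expression)
```

An empty result means the expression uses no custom functions according to this check.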
12. Operations
This section defines the behavior of Create, Read, Update, Delete, and Rename operations on collection files.
12.1 Create
Creates a new file in the collection.
Input
| Parameter | Required | Description |
|---|---|---|
| `type` | No | Type name(s) for the file |
| `frontmatter` | Yes | Field values (may be partial) |
| `body` | No | Markdown body content |
| `path` | No | Target file path (may be derived) |
Behavior
Determine type(s): Use provided type, or infer from frontmatter if
type/typeskey presentApply defaults: For each missing field with a
defaultvalue, apply the default to the effective record used for validation and outputGenerate values: For fields with
generatedstrategy:ulid: Generate ULIDuuid: Generate UUID v4now: Set to current datetime{from, transform}: Derive from source field
Validate: If validation level is not
off:- Validate against all matched type schemas
- If level is
errorand validation fails, abort
Determine path:
- If
pathprovided, use it - If type has
filename_pattern, derive from field values - Otherwise, require explicit path
- If
Check existence: If file already exists at path, abort with error
Write file:
- Serialize frontmatter to YAML
- MUST include all explicitly provided fields and all generated fields
- SHOULD omit fields that were filled solely by defaults, unless the caller explicitly requests default materialization
- Combine with body
- Write atomically (temp file + rename)
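The split between the effective record and the persisted frontmatter can be sketched as follows. This is a minimal illustration, not a reference implementation: `build_create_records` and the shape of `schema` are hypothetical, ULID generation is left out because the Python standard library has no ULID helper, and the rule that an explicitly provided value wins over a generated one is an assumption.

```python
import uuid
from datetime import datetime, timezone


def build_create_records(provided, schema, now=None):
    """Return (effective, persisted) frontmatter for a Create operation.

    `schema` is a hypothetical mapping of field name -> spec dict that may
    carry a "default" value or a "generated" strategy.
    """
    now = now or datetime.now(timezone.utc)
    persisted = dict(provided)  # explicitly provided fields are always written
    for name, spec in schema.items():
        if name in persisted:
            continue  # assumption: a provided value wins over generation
        strategy = spec.get("generated")
        if strategy == "uuid":
            persisted[name] = str(uuid.uuid4())
        elif strategy == "now":
            persisted[name] = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    effective = dict(persisted)
    for name, spec in schema.items():
        if name not in effective and "default" in spec:
            # Defaults reach validation and output but are not written to disk.
            effective[name] = spec["default"]
    return effective, persisted
```

Validation would run against `effective`, while only `persisted` is serialized, which keeps default-only fields out of the file unless the caller asks for them to be materialized.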
Output
path: "tasks/task-001.md"
frontmatter:
  id: "01ARZ3NDEKTSV4RRFFQ69G5FAV"
  title: "Fix the bug"
  status: open
  created_at: "2024-03-15T10:30:00Z"
  # ... all fields including generated
Errors
| Code | Description |
|---|---|
| `unknown_type` | Specified type doesn't exist |
| `validation_failed` | Validation errors (with details) |
| `path_conflict` | File already exists at target path |
| `path_required` | Cannot determine path |
Example
mdbase create task \
--field title="Fix login bug" \
--field priority=4 \
--field "assignee=[[alice]]"
12.2 Read
Reads a file and returns its parsed content.
Input
| Parameter | Required | Description |
|---|---|---|
| `path` | Yes | File path relative to collection root |
| `validate` | No | Whether to validate (default: per settings) |
| `include_body` | No | Include body content (default: true) |
Behavior
Load file: Read from filesystem
Parse frontmatter: Extract YAML frontmatter and body
Determine types:
- Check for explicit `type`/`types` field
- Evaluate match rules for all types
- Collect matched types
Validate (if enabled):
- Validate against all matched types
- Collect validation issues
Return record: Structured representation
- `frontmatter` is the effective frontmatter (defaults applied, computed fields excluded)
- `file.properties` (see Querying §10.5) provides raw persisted frontmatter when needed
Output
path: "tasks/task-001.md"
types: [task]
frontmatter:
  id: "task-001"
  title: "Fix the bug"
  status: open
  # ... all fields
file:
  name: "task-001.md"
  folder: "tasks"
  mtime: "2024-03-15T10:30:00Z"
  size: 1234
body: "## Description\n\nThe login form..."
validation:
  valid: true
  issues: []
Errors
| Code | Description |
|---|---|
| `file_not_found` | File doesn't exist |
| `invalid_frontmatter` | YAML parse error |
12.3 Update
Modifies an existing file's frontmatter and/or body.
Input
| Parameter | Required | Description |
|---|---|---|
| `path` | Yes | File path |
| `fields` | No | Field updates (partial) |
| `body` | No | New body content (null = no change) |
Behavior
Read existing file: Load current content
Merge fields: Apply field updates to existing frontmatter
- New fields are added
- Existing fields are replaced
- Explicit null removes the field (if `write_nulls: omit`) or writes an explicit null (if `write_nulls: explicit`)
Update generated fields: For fields with `generated: now_on_write`, update to current time
Apply defaults: For each missing field with a `default`, apply the default to the effective record used for validation and output
Validate: If enabled, validate merged frontmatter (using effective values for required checks)
Write file:
- Preserve field order where possible
- Preserve body if not provided
- Write atomically
Null Handling on Update
When updating a field to null:
| `write_nulls` setting | Behavior |
|---|---|
| `"omit"` (default) | Remove the field from frontmatter |
| `"explicit"` | Write `field: null` |
Important: Never write the empty-value form field: (see Frontmatter).
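As a rough sketch of the merge step and the two `write_nulls` behaviors, assuming frontmatter has already been parsed into a Python dict (the function name `merge_update` is hypothetical):

```python
def merge_update(existing, updates, write_nulls="omit"):
    """Merge partial field updates into existing frontmatter.

    Python dicts keep insertion order, so existing fields stay in place and
    new fields are appended, which helps preserve field order on write.
    """
    merged = dict(existing)
    for key, value in updates.items():
        if value is None and write_nulls == "omit":
            merged.pop(key, None)  # explicit null removes the field
        else:
            merged[key] = value    # add, replace, or keep an explicit null
    return merged
```

With `write_nulls="explicit"` the serializer is still responsible for emitting `field: null` and not the empty-value form.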
Output
path: "tasks/task-001.md"
frontmatter:  # Effective frontmatter (defaults applied, computed excluded)
  # ... updated frontmatter
previous:
  status: open
updated:
  status: done
Errors
| Code | Description |
|---|---|
| `file_not_found` | File doesn't exist |
| `validation_failed` | Validation errors |
Example
# Update single field
mdbase update tasks/task-001.md --field status=done
# Update multiple fields
mdbase update tasks/task-001.md \
--field status=done \
--field "completed_at=$(date -Iseconds)"
# Clear a field
mdbase update tasks/task-001.md --field assignee=null
12.4 Delete
Removes a file from the collection.
Input
| Parameter | Required | Description |
|---|---|---|
| `path` | Yes | File path |
| `check_backlinks` | No | Warn about incoming links (default: true) |
Behavior
Check existence: Verify file exists
Check backlinks (if enabled):
- Find files that link to this file
- Warn user about potential broken links
Delete file: Remove from filesystem
Output
path: "tasks/task-001.md"
deleted: true
broken_links:
  - path: "tasks/parent.md"
    field: "subtasks"
Errors
| Code | Description |
|---|---|
| `file_not_found` | File doesn't exist |
Example
# Delete with confirmation
mdbase delete tasks/task-001.md
# Delete without backlink check
mdbase delete tasks/task-001.md --no-check-backlinks
# Force delete
mdbase delete tasks/task-001.md --force
12.5 Rename (and Move)
Renames or moves a file, optionally updating references across the collection.
Input
| Parameter | Required | Description |
|---|---|---|
| `from` | Yes | Current file path |
| `to` | Yes | New file path |
| `update_refs` | No | Update references (default: per settings) |
Behavior
Validate paths: Ensure source exists and target doesn't
Rename file: Move file to new path atomically
Update references (if `rename_update_refs` is true):
- Frontmatter links: Update link fields in all files that reference the renamed file
# Before: links to tasks/old-name.md
parent: "[[old-name]]"
# After: file renamed to tasks/new-name.md
parent: "[[new-name]]"
- Body links: Update link syntax in markdown body content
<!-- Before -->
See [[old-name]] for details. Check [the task](./old-name.md).
<!-- After -->
See [[new-name]] for details. Check [the task](./new-name.md).
Reference Update Rules
Preserve link style:
- Wikilinks stay as wikilinks
- Markdown links stay as markdown links
- Relative links stay relative when possible
Update all matching references:
- By resolved path (most reliable)
- By name when unambiguous
ID-based links:
- If a simple-name link (`[[name]]`) resolves via `id_field` and the target file's `id_field` value did not change, implementations SHOULD NOT rewrite the link during rename (to avoid unnecessary churn).
Handle ambiguity:
- If a link could refer to multiple files, don't update
- Emit warning for manual review
Scope: Update references in ALL collection files, not just same folder
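The style-preserving part of these rules can be illustrated with a small text rewrite. This is a deliberately simplified sketch: the hypothetical `rewrite_links` matches simple names only, handles just `[[name]]` wikilinks and same-folder `.md` markdown links, and leaves out path resolution, `id_field` handling, and ambiguity checks, all of which a real implementation would need.

```python
import re


def rewrite_links(text, old_name, new_name):
    """Rewrite links to `old_name` so they point at `new_name`.

    Wikilinks stay wikilinks and markdown links stay markdown links.
    """
    # [[old-name]], [[old-name|alias]], [[old-name#heading]]
    wikilink = re.compile(r"\[\[" + re.escape(old_name) + r"(?=[\]|#])")
    text = wikilink.sub(lambda m: "[[" + new_name, text)
    # [label](./old-name.md) or [label](old-name.md)
    mdlink = re.compile(r"(\]\((?:\./)?)" + re.escape(old_name) + r"(\.md\))")
    text = mdlink.sub(lambda m: m.group(1) + new_name + m.group(2), text)
    return text
```

The lookahead after the name keeps a link such as `[[old-name-2]]` from being rewritten by accident.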
Output
from: "tasks/old-name.md"
to: "tasks/new-name.md"
references_updated:
  - path: "tasks/parent.md"
    field: "subtasks[0]"
    old_value: "[[old-name]]"
    new_value: "[[new-name]]"
  - path: "notes/meeting.md"
    location: "body"
    old_value: "[[old-name]]"
    new_value: "[[new-name]]"
warnings:
  - path: "archive/legacy.md"
    message: "Ambiguous link '[[name]]' not updated"
Errors
| Code | Description |
|---|---|
| `file_not_found` | Source file doesn't exist |
| `path_conflict` | Target path already exists |
Example
# Simple rename
mdbase rename tasks/old.md tasks/new.md
# Move to different folder
mdbase rename tasks/task.md archive/task.md
# Rename without updating references
mdbase rename tasks/old.md tasks/new.md --no-update-refs
# Dry run (show what would change)
mdbase rename tasks/old.md tasks/new.md --dry-run
12.6 Atomicity
All write operations (Create, Update, Delete, Rename) SHOULD be atomic:
- Create/Update: Write to temporary file, then rename
- Delete: Single filesystem delete
- Rename: Filesystem rename (atomic on most systems)
For Rename with reference updates, atomicity across multiple files is not guaranteed. Implementations SHOULD:
- Complete the rename first
- Update references file by file
- Report partial failures clearly
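The temp-file-then-rename pattern for Create and Update might look like the sketch below. The helper name `write_atomic` is hypothetical; the temp file is created in the target's own directory because `os.replace` is only atomic when source and destination are on the same filesystem.

```python
import os
import tempfile


def write_atomic(path, content):
    """Write `content` to `path` via a temp file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".md")
    try:
        # newline="" disables newline translation, so LF/CRLF are kept as given.
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
            handle.write(content)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # readers see the old or the new file, never a partial one
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave temp files behind on failure
        raise
```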
12.7 Batch Operations
Implementations MAY support batch operations for efficiency:
# Bulk update
mdbase update --where 'status == "open"' --field status=in_progress
# Bulk delete
mdbase delete --where 'tags.contains("archive")' --confirm
# Bulk move
mdbase move 'tasks/*.md' archive/
Validation Phase
Before applying any changes, implementations MUST validate ALL affected files. If any file fails validation and default_validation is error, the entire batch MUST be aborted with no files modified.
Execution Phase
After validation passes, apply changes file by file.
Partial Failure
If a file write fails during execution (I/O error, concurrent modification):
- Implementations MUST NOT roll back already-written files (filesystem operations are not transactional)
- Implementations MUST continue processing remaining files (best-effort)
- Implementations MUST report per-file results: success, failure (with error code), or skipped
Result Format
batch_result:
  total: 50
  succeeded: 47
  failed: 2
  skipped: 1
  details:
    - path: "tasks/task-001.md"
      status: "success"
    - path: "tasks/task-002.md"
      status: "failed"
      error: { code: "concurrent_modification", message: "..." }
    - path: "tasks/task-003.md"
      status: "skipped"
      reason: "Depends on failed task-002.md"
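The two-phase flow (validate everything, then apply best-effort) can be sketched as below. `run_batch`, its callables, the `io_error` code, and the `would_change` dry-run status are hypothetical names used for illustration only, and the abort-on-invalid behavior assumes `default_validation` is `error`.

```python
def run_batch(paths, validate, apply_change, dry_run=False):
    """Validate every file first, then apply changes file by file.

    `validate(path)` returns a list of issues; `apply_change(path)` writes
    the file and may raise OSError.
    """
    invalid = {}
    for path in paths:
        issues = validate(path)
        if issues:
            invalid[path] = issues
    if invalid:
        # Validation phase failed: abort with no files modified.
        return {"aborted": True, "invalid": invalid, "details": []}

    details = []
    for path in paths:
        if dry_run:
            details.append({"path": path, "status": "would_change"})
            continue
        try:
            apply_change(path)
            details.append({"path": path, "status": "success"})
        except OSError as exc:
            # No rollback; record the failure and keep going.
            details.append({
                "path": path,
                "status": "failed",
                "error": {"code": "io_error", "message": str(exc)},
            })
    return {
        "aborted": False,
        "total": len(paths),
        "succeeded": sum(1 for d in details if d["status"] == "success"),
        "failed": sum(1 for d in details if d["status"] == "failed"),
        "details": details,
    }
```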
Dry-Run Mode
Batch operations MUST support --dry-run which validates all changes and reports what would happen without modifying any files.
12.8 Formatting Preservation
When writing files, implementations treat existing formatting according to the following tiers:
MUST Preserve
- Body content (unless explicitly changed)
- Line ending style (LF vs CRLF)
SHOULD Preserve
- Frontmatter field order
- String quoting style
- Multi-line string format (literal vs folded)
- Comments (if YAML parser supports it)
MAY Normalize
- Indentation (recommend 2 spaces)
- Trailing whitespace
- Final newline (files SHOULD end with newline)
12.9 Hooks (Optional)
Implementations MAY support hooks for custom logic:
| Hook | When |
|---|---|
| `beforeCreate` | Before validation and write |
| `afterCreate` | After successful write |
| `beforeUpdate` | Before validation and write |
| `afterUpdate` | After successful write |
| `beforeDelete` | Before deletion |
| `afterDelete` | After successful deletion |
| `beforeRename` | Before rename |
| `afterRename` | After successful rename and ref updates |
Hooks receive operation context and can:
- Modify values (before hooks)
- Perform side effects (after hooks)
- Abort operation (before hooks, by throwing)
This is an OPTIONAL feature; implementations need not support hooks.
12.10 Concurrency
Read-Modify-Write Cycle
When updating a file, implementations MUST detect concurrent modifications. The recommended approach is optimistic concurrency using file mtime:
- Read file, record mtime
- Apply changes in memory
- Before writing, check that file mtime has not changed
- If mtime changed, abort with `concurrent_modification` error
- Write atomically (temp file + rename)
Implementations MAY use content hashing instead of mtime for more reliable conflict detection.
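A minimal sketch of the mtime-based cycle is shown below. The names `update_with_mtime_check` and `ConcurrentModification` are hypothetical, and the fixed `.tmp` suffix is a simplification. A small window remains between the final check and the rename; closing it would need locking or a content-hash comparison.

```python
import os


class ConcurrentModification(Exception):
    code = "concurrent_modification"


def update_with_mtime_check(path, transform):
    """Read, transform in memory, and write back only if mtime is unchanged."""
    before = os.stat(path).st_mtime_ns           # read file, record mtime
    with open(path, encoding="utf-8", newline="") as handle:
        original = handle.read()
    updated = transform(original)                # apply changes in memory
    if os.stat(path).st_mtime_ns != before:      # re-check before writing
        raise ConcurrentModification(path)       # never silently overwrite
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8", newline="") as handle:
        handle.write(updated)
    os.replace(tmp_path, path)                   # atomic write
```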
Conflict Behavior
On detecting a concurrent modification, implementations MUST abort the operation and report concurrent_modification. Implementations MUST NOT silently overwrite concurrent changes.
Implementations MAY offer a retry mechanism (re-read, re-apply, re-check) but MUST NOT retry automatically without user/caller consent.
Cross-File Operations
Rename with reference updates touches multiple files. These are NOT atomic across files. Implementations MUST:
- Complete the primary rename first
- Update references file by file
- Use mtime checking on each referenced file before updating
- Report partial failures — which files were updated and which were skipped due to conflicts
File Locking
Implementations MAY use advisory file locks for write operations. If used:
- Locks MUST be released on operation completion (including error paths)
- Lock timeouts SHOULD be documented
- Implementations MUST NOT require locking for read operations
13. Caching and Indexing
Caching and indexing are optional features that accelerate queries on large collections. This section defines cache behavior and requirements.
13.1 Core Principle
Files are the source of truth.
Caches are derived data. They MUST be:
- Rebuildable from files alone
- Deletable without data loss
- Optional for correctness (only affect performance)
If you delete the cache folder, the collection still works—queries just run slower.
13.2 When Caching Helps
Caching significantly improves performance for:
| Operation | Without Cache | With Cache |
|---|---|---|
| Query by type | Scan all files | Index lookup |
| Query by field | Scan all files | Index lookup |
| Path prefix filter | Filesystem scan | Index lookup |
| Link resolution | Search all files | Direct lookup |
| Backlink queries | Scan all files | Reverse index |
| Full-text body search | Read all files | Text index |
For small collections (< 100 files), caching overhead may not be worthwhile. For large collections (1000+ files), caching is strongly recommended.
13.3 Cache Requirements
If an implementation supports caching, it MUST follow these rules:
13.3.1 Derivable
The cache MUST be completely rebuildable from:
- Collection files (markdown)
- Configuration (mdbase.yaml)
- Type definitions
No information should exist only in the cache.
13.3.2 Optional
All operations MUST work without the cache, possibly slower:
- Queries scan files directly
- Backlinks are computed on demand
- Link resolution searches the collection
13.3.3 Detectable Staleness
The implementation MUST detect when the cache is stale:
- File modified after cache entry
- File deleted but still in cache
- New file not in cache
- Config changed since cache build
13.3.4 Explicit Rebuild
Users MUST be able to force a full cache rebuild:
mdbase cache rebuild
13.3.5 Deletable
Deleting the cache folder MUST NOT affect:
- File contents
- Collection integrity
- Operation correctness
13.4 Cache Location
The default cache location is .mdbase/ at the collection root, configurable via settings.cache_folder.
my-collection/
├── mdbase.yaml
├── _types/
├── tasks/
├── notes/
└── .mdbase/              # Cache folder
    ├── index.sqlite      # Main index (example)
    ├── links.json        # Link graph (example)
    └── meta.json         # Cache metadata
Gitignore
The cache folder SHOULD be gitignored. Add to .gitignore:
.mdbase/
Cache files are machine-specific and should not be version controlled.
13.5 What to Cache
Implementations MAY cache:
| Data | Purpose |
|---|---|
| File metadata | Fast file lookups |
| Parsed frontmatter | Avoid re-parsing |
| Type assignments | Fast type queries |
| Field values | Field-based queries |
| Link graph | Link resolution, backlinks |
| Full-text index | Body content search |
Minimum Recommended
At minimum, caching implementations SHOULD index:
- File paths and mtimes (for staleness detection)
- Type assignments (for type queries)
- Link relationships (for backlinks)
13.6 Cache Invalidation
Staleness Detection
For each file, track:
- Last known mtime
- Content hash (optional, more reliable)
On query:
- Check if file mtime matches cached mtime
- If different, mark entry stale
- Re-parse file and update cache
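The mtime comparison can be sketched as a scan that classifies entries as modified, added, or deleted. Here `find_stale` and the cache shape (relative path mapped to a cached `mtime_ns`) are assumptions for illustration; the sketch skips only the default `.mdbase` folder and ignores any other exclude settings.

```python
import os


def find_stale(cache, root):
    """Compare cached mtimes with the filesystem under `root`.

    `cache` maps collection-relative paths to the mtime (ns) seen at index time.
    """
    on_disk = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".mdbase"]  # skip the cache folder
        for name in filenames:
            if name.endswith(".md"):
                full = os.path.join(dirpath, name)
                rel = os.path.relpath(full, root).replace(os.sep, "/")
                on_disk[rel] = os.stat(full).st_mtime_ns
    return {
        "modified": sorted(p for p in on_disk if p in cache and cache[p] != on_disk[p]),
        "added": sorted(p for p in on_disk if p not in cache),
        "deleted": sorted(p for p in cache if p not in on_disk),
    }
```

Each bucket maps onto an incremental action: re-index modified files, index added ones, and drop deleted ones.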
Change Triggers
Cache entries should be invalidated when:
| Change | Invalidation |
|---|---|
| File modified | Re-index that file |
| File created | Index new file |
| File deleted | Remove from index |
| File renamed | Update path, check links |
| Config changed | Full rebuild |
| Type definition changed | Re-index affected files |
Incremental vs Full Rebuild
Incremental: Update only changed entries (fast, normal operation)
Full rebuild: Recreate entire cache (slow, guaranteed consistent)
Implementations SHOULD support both.
13.7 Backlinks and Caching
The file.backlinks property requires knowing which files link TO a given file. This requires a reverse link index.
Building the Backlink Index
For each file A in the collection:
- Parse A's frontmatter and body
- Extract all links from A
- For each link target B:
- Add A to B's backlinks set
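The loop above can be sketched in a few lines. This illustration makes strong simplifying assumptions: `build_backlinks` is a hypothetical helper, it only extracts wikilinks, and it resolves them by unique file name, skipping anything ambiguous or unresolved. Full link resolution is defined elsewhere in the specification.

```python
import re
from collections import defaultdict

WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def build_backlinks(files):
    """Build a reverse link index.

    `files` maps collection-relative path -> full file text (frontmatter and
    body). Returns target path -> sorted list of paths that link to it.
    """
    by_name = defaultdict(list)
    for path in files:
        stem = path.rsplit("/", 1)[-1]
        if stem.endswith(".md"):
            stem = stem[:-3]
        by_name[stem].append(path)

    backlinks = defaultdict(set)
    for source, text in files.items():
        for name in WIKILINK.findall(text):
            candidates = by_name.get(name.strip(), [])
            if len(candidates) == 1:  # skip unresolved or ambiguous names
                backlinks[candidates[0]].add(source)
    return {target: sorted(sources) for target, sources in backlinks.items()}
```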
Storage
{
  "tasks/task-001.md": {
    "backlinks": [
      "tasks/parent.md",
      "notes/meeting.md"
    ]
  }
}
Performance Note
Without caching, computing backlinks requires scanning every file. For large collections, this is prohibitively slow. Implementations SHOULD:
- Document that `file.backlinks` requires caching for good performance
- Warn when backlink queries are slow
- Suggest enabling caching
13.8 Cache Commands
Implementations SHOULD provide cache management commands:
# Show cache status
mdbase cache status
# Output: Cache valid, 1234 files indexed, last built 5 min ago
# Rebuild entire cache
mdbase cache rebuild
# Clear cache
mdbase cache clear
# Update cache incrementally
mdbase cache update
# Verify cache integrity
mdbase cache verify
13.9 Cache Implementation Options
Implementations may use various storage backends:
SQLite (Recommended)
.mdbase/index.sqlite
Pros: ACID, queryable, single file, widely supported
Cons: Binary file (not human-readable)
JSON Files
.mdbase/files.json
.mdbase/links.json
.mdbase/types.json
Pros: Human-readable, simple
Cons: Full rewrite on update, no concurrent access
Memory Only
No persistent cache; rebuild on each run.
Pros: No disk I/O, always fresh
Cons: Slow startup for large collections
13.10 Concurrent Access
When multiple processes may access the collection:
Read-Only Access
Multiple readers can safely share a cache. Use file locking or SQLite's WAL mode.
Write Access
When one process writes:
- Acquire exclusive lock
- Perform operation
- Update cache
- Release lock
Implementations SHOULD document concurrency behavior.
13.11 Cache Warming
For large collections, initial cache build can take time. Implementations MAY support:
Eager warming: Build cache on first access
mdbase cache build
Lazy warming: Build entries on first query for each file
Background warming: Build cache asynchronously while serving queries
13.12 Cache Versioning
Cache format may change between implementation versions. Include a version marker:
{
  "version": "1.0",
  "spec_version": "0.1.0",
  "built_at": "2024-03-15T10:30:00Z"
}
When loading cache:
- Check version compatibility
- If incompatible, trigger full rebuild
- Log version mismatch for debugging
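A compatibility check on load could be as small as the following sketch. The constant `SUPPORTED_CACHE_VERSION`, the function `cache_needs_rebuild`, and the exact-match policy are assumptions; an implementation may accept a range of versions instead.

```python
import json

SUPPORTED_CACHE_VERSION = "1.0"  # hypothetical version understood by this build


def cache_needs_rebuild(meta_text, spec_version="0.1.0"):
    """Return True when cache metadata is unreadable or incompatible."""
    try:
        meta = json.loads(meta_text)
    except json.JSONDecodeError:
        return True  # unreadable metadata: treat the cache as unusable
    if not isinstance(meta, dict):
        return True
    return (
        meta.get("version") != SUPPORTED_CACHE_VERSION
        or meta.get("spec_version") != spec_version
    )
```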
14. Conformance
This section defines conformance levels and testing requirements for implementations.
14.1 Conformance Levels
Implementations may claim conformance at different levels. Each level builds on all previous levels.
Level 1: Core
Required capabilities:
- Parse `mdbase.yaml` configuration
- Locate and scan markdown files in collection
- Parse YAML frontmatter from files
- Handle null values correctly (per §3)
- Load type definitions from types folder
- Single-type matching via explicit declaration (`type` field)
- Validate fields against type schemas
- Implement Create, Read, Update, Delete operations
- Type coercion (per §7.16)
Test coverage: Basic parsing, validation, CRUD operations, concurrency (basic mtime conflict detection)
Level 2: Matching
Additional capabilities:
- Path-based type matching (`path_glob`)
- Field presence matching (`fields_present`)
- Field value matching (`where` conditions in match rules)
- Multi-type matching (files matching multiple types)
- Multi-type validation with constraint merging (per §6.5)
Test coverage: Match rule evaluation, multi-type scenarios, constraint merging
Level 3: Querying
Additional capabilities:
- Core query model (filter, sort, limit, offset)
- Expression evaluation (all operators in §11)
- String, list, and object methods
- Date arithmetic (including date subtraction)
- Duration parsing (`duration()`)
- Null coalescing (`??`)
Test coverage: Core query correctness, expression edge cases, body_search, computed_fields
Level 4: Links
Additional capabilities:
- Parse all link formats (wikilink, markdown, bare path)
- Resolve links to files
- `asFile()` traversal in expressions
- `file.hasLink()` and `file.hasTag()` functions
- `file.links` property
Test coverage: Link parsing, resolution, traversal
Level 5: References
Additional capabilities:
- Rename with reference updates
- Backlink computation (`file.backlinks`)
- Body link detection and update
Test coverage: Reference update correctness, backlink accuracy
Level 6: Full
All capabilities including:
- Caching with staleness detection (per §13)
- Batch operations
- Watch mode with event delivery (per §15)
- Type creation via tooling
- Nested collection detection (per §2)
Test coverage: Performance, cache correctness, watching, edge cases
14.1.1 Optional Profiles (Non-Normative)
Implementations MAY support optional profiles beyond the core levels. The Query+ profile adds advanced query features (formulas, groupBy, summaries, property_summaries, properties) as defined in Querying §10.7. Support for Query+ is not required for conformance.
14.2 Conformance Claims
Implementations SHOULD clearly state their conformance level:
mdbase-tool v1.0.0
Conformance: Level 4 (Links)
Specification: 0.1.0
Implementations MAY implement features from higher levels while claiming a lower level, but SHOULD NOT claim a level without passing all tests for that level.
14.3 Test Suite
A conformance test suite is provided as a collection of test cases. Each test group specifies a spec_ref identifying the specification section(s) under test. Individual test cases MAY also include a spec_ref to pinpoint the exact clause being validated.
The spec_ref field uses section numbers (e.g., "§7.2", "§3.4, §7.2"). When a test case omits spec_ref, it inherits the group-level reference.
name: "required field validation"
level: 1
category: validation
spec_ref: "§7.2"
setup:
  config: |
    spec_version: "0.1.0"
  types:
    task.md: |
      ---
      name: task
      fields:
        title:
          type: string
          required: true
      ---
  files:
    tasks/valid.md: |
      ---
      type: task
      title: "Valid task"
      ---
    tasks/invalid.md: |
      ---
      type: task
      ---
tests:
  - name: "valid file passes validation"
    operation: validate
    input:
      path: "tasks/valid.md"
    expect:
      valid: true
      issues: []
  - name: "missing required field fails validation"
    spec_ref: "§7.2"
    operation: validate
    input:
      path: "tasks/invalid.md"
    expect:
      valid: false
      issues:
        - code: missing_required
          field: title
Test Categories
| Category | Description | Spec Reference |
|---|---|---|
| `config` | Configuration parsing and validation | §4 |
| `types` | Type definition loading and inheritance | §5 |
| `matching` | Type matching rules | §6 |
| `validation` | Schema validation | §9 |
| `expressions` | Expression evaluation | §11 |
| `queries` | Query execution | §10 |
| `links` | Link parsing and resolution | §8 |
| `operations` | CRUD operations | §12 |
| `references` | Reference updates | §12.5 |
| `caching` | Cache behavior | §13 |
| `concurrency` | Concurrent modification detection | §12.10 |
| `watching` | Watch mode event delivery | §15 |
| `body_search` | Body content filtering | §10.5 |
| `computed_fields` | Computed field evaluation | §5.12 |
14.4 Required Test Coverage
For each conformance level, implementations MUST pass:
| Level | Required Categories |
|---|---|
| 1 | config, types (basic), validation (basic), operations, concurrency |
| 2 | + matching |
| 3 | + expressions, queries, body_search, computed_fields |
| 4 | + links |
| 5 | + references |
| 6 | + caching, watching |
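Because each level builds on the previous ones, the full category list for a level is the running union of the rows above. A small sketch, with hypothetical names `LEVEL_CATEGORIES` and `required_categories`, and with the "(basic)" qualifiers on `types` and `validation` dropped for brevity:

```python
LEVEL_CATEGORIES = {
    1: ["config", "types", "validation", "operations", "concurrency"],
    2: ["matching"],
    3: ["expressions", "queries", "body_search", "computed_fields"],
    4: ["links"],
    5: ["references"],
    6: ["caching", "watching"],
}


def required_categories(level):
    """Return every test category required to claim the given level."""
    if level not in LEVEL_CATEGORIES:
        raise ValueError(f"unknown conformance level: {level}")
    required = []
    for current in range(1, level + 1):
        required.extend(LEVEL_CATEGORIES[current])  # levels are cumulative
    return required
```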
14.5 Test Execution
The test suite can be run against any implementation:
# Run all tests
mdbase-test run --impl ./my-impl
# Run specific level
mdbase-test run --impl ./my-impl --level 3
# Run specific category
mdbase-test run --impl ./my-impl --category validation
# Generate conformance report
mdbase-test report --impl ./my-impl --output report.html
14.6 Implementation Notes
Edge Cases to Handle
Implementations should correctly handle:
- Empty frontmatter (`---\n---`)
- File without frontmatter
- Frontmatter with only null values
- Empty collection (no files)
- Type with no fields defined
- File matching zero types
- File matching multiple conflicting types
- Circular type inheritance (should error)
- Self-referential links
- Links to non-existent files
- Very long field values
- Unicode in field names and values
- Files with unusual characters in names
Performance Expectations
While not strictly required, implementations SHOULD meet the following performance targets:
| Operation | Target | Collection Size |
|---|---|---|
| Read single file | < 10ms | Any |
| Query by type | < 100ms | 1000 files |
| Query with filter | < 500ms | 1000 files |
| Link resolution | < 10ms | Any |
| Backlink query | < 1s | 1000 files (cached) |
Error Messages
Implementations SHOULD provide helpful error messages:
❌ Validation failed: tasks/task-001.md
  Field 'priority' has invalid value
    Expected: integer between 1 and 5
    Actual: "high" (string)
  At line 5 in frontmatter:
    priority: high
              ^^^^
  Hint: Use a number like `priority: 3`
14.7 Extensions
Implementations MAY extend the specification with additional features:
- Custom field types
- Additional expression functions
- Query output formats
- Integration hooks
- Custom validation rules
Extensions SHOULD:
- Be clearly documented as non-standard
- Not conflict with standard behavior
- Be optional (spec-compliant usage should work without them)
Extensions SHOULD NOT:
- Change the meaning of standard features
- Require non-standard syntax for basic operations
- Break interoperability with compliant tools
14.8 Reporting Issues
If the specification is ambiguous or conflicts with practical implementation needs, please report issues to the specification maintainers. The goal is a spec that is:
- Clear and unambiguous
- Implementable in any language
- Useful for real-world applications
15. Watching
This section defines the watch mode event model for monitoring a collection for changes. Watch mode is a required capability for Level 6 conformance.
15.1 Overview
Watch mode enables implementations to monitor a collection for filesystem changes and emit structured events. This supports real-time UIs, continuous validation, and incremental cache updates.
15.2 Event Types
| Event | Trigger | Payload Fields |
|---|---|---|
| `file_created` | New markdown file detected | path, types, frontmatter |
| `file_modified` | Existing file content changed | path, types, frontmatter, changed_fields |
| `file_deleted` | Markdown file removed | path, last_known_types |
| `file_renamed` | File moved/renamed (if detectable) | from, to, types |
| `type_changed` | Type definition file modified | type_name, affected_files |
| `config_changed` | `mdbase.yaml` modified | previous_hash, new_hash |
| `validation_error` | File fails validation after change | path, issues |
When present, frontmatter in events is the effective frontmatter (defaults applied, computed excluded).
15.3 Event Payload Structure
All events include a common set of fields:
| Field | Type | Description |
|---|---|---|
| `event` | string | Event type (e.g., `file_created`) |
| `timestamp` | datetime | When the event was emitted |
| `path` | string | File path relative to collection root |
Additional fields per event type are listed in §15.2.
Example Payloads
file_created:
event: file_created
timestamp: "2024-03-15T10:30:00Z"
path: "tasks/task-042.md"
types: [task]
frontmatter:  # Effective frontmatter (defaults applied, computed excluded)
  title: "New task"
  status: open
file_modified:
event: file_modified
timestamp: "2024-03-15T10:31:00Z"
path: "tasks/task-042.md"
types: [task]
frontmatter:  # Effective frontmatter (defaults applied, computed excluded)
  title: "New task"
  status: in_progress
changed_fields: [status]  # Raw persisted frontmatter keys that changed
file_deleted:
event: file_deleted
timestamp: "2024-03-15T10:32:00Z"
path: "tasks/task-042.md"
last_known_types: [task]
file_renamed:
event: file_renamed
timestamp: "2024-03-15T10:33:00Z"
from: "tasks/task-042.md"
to: "archive/task-042.md"
types: [task]
15.4 Debouncing
Implementations MUST debounce filesystem events — multiple rapid changes to the same file MUST be coalesced into a single event.
- Recommended debounce window: 100–500ms (implementation-defined)
- After the debounce window, the implementation reads the file's current state and emits one event reflecting the net change
- If a file is created and then immediately deleted within the debounce window, no event is emitted
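The coalescing rules can be sketched as a pure function over a recorded burst of raw watcher notifications. Everything here is illustrative: `coalesce`, the `(timestamp_ms, path, kind)` tuple shape, and the 200 ms default are assumptions, and a live watcher would use timers and re-read the file's current state before emitting.

```python
def coalesce(raw_events, window_ms=200):
    """Collapse rapid per-file notifications into one net event each.

    `raw_events` is a time-sorted list of (timestamp_ms, path, kind) tuples,
    where kind is "created", "modified", or "deleted".
    """
    groups = []       # bursts in order of first occurrence
    open_group = {}   # path -> most recent burst for that path
    for ts, path, kind in raw_events:
        group = open_group.get(path)
        if group is not None and ts - group["last_ts"] <= window_ms:
            group["last_kind"] = kind   # still inside the window: extend it
            group["last_ts"] = ts
        else:
            group = {"path": path, "first_kind": kind,
                     "last_kind": kind, "last_ts": ts}
            open_group[path] = group
            groups.append(group)

    events = []
    for group in groups:
        first, last = group["first_kind"], group["last_kind"]
        if first == "created" and last == "deleted":
            continue  # created then deleted within the window: no event
        if first == "created":
            name = "file_created"
        elif last == "deleted":
            name = "file_deleted"
        else:
            name = "file_modified"
        events.append((group["path"], name))
    return events
```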
15.5 Rename Detection
Filesystem watchers typically see a delete followed by a create rather than a rename.
- Implementations SHOULD detect renames by correlating a delete and create within a short window (e.g., same content hash, or same `id_field` value)
- If rename detection succeeds, emit a single `file_renamed` event
- If rename detection fails, implementations MUST emit separate `file_deleted` and `file_created` events
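Content-hash correlation, one of the two strategies mentioned above, might be sketched like this. `detect_renames` is a hypothetical helper; it assumes the watcher still has the last known content of deleted files and that both dicts cover the same short correlation window.

```python
import hashlib


def detect_renames(deleted, created):
    """Pair deletes with creates that have identical content.

    `deleted` maps old path -> last known bytes; `created` maps new path -> bytes.
    Returns a list of event dicts in the order deletes, then unmatched creates.
    """
    def digest(content):
        return hashlib.sha256(content).hexdigest()

    created_by_hash = {}
    for path, content in created.items():
        created_by_hash.setdefault(digest(content), []).append(path)

    events, matched = [], set()
    for old_path, content in deleted.items():
        candidates = [p for p in created_by_hash.get(digest(content), [])
                      if p not in matched]
        if len(candidates) == 1:  # unambiguous match: treat as a rename
            matched.add(candidates[0])
            events.append({"event": "file_renamed", "from": old_path,
                           "to": candidates[0]})
        else:                     # no match or ambiguous: fall back
            events.append({"event": "file_deleted", "path": old_path})
    for path in created:
        if path not in matched:
            events.append({"event": "file_created", "path": path})
    return events
```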
15.6 Event Delivery
- Implementations MUST support callback/listener registration for events
- Events MUST be delivered in order per file — events for the same file are delivered sequentially. Events for different files MAY be delivered concurrently
- If event processing (in a listener callback) fails, the error MUST NOT stop the watcher. Implementations SHOULD log the error and continue
15.7 Interaction with Caching
When a cache is present (see §13):
- Watch events SHOULD trigger incremental cache updates
- Cache updates MUST complete before the event is delivered to listeners, so that listeners see consistent state when they query the collection
Appendix A: Complete Examples
This appendix provides complete, working examples of collections and their components.
A.1 Minimal Collection
The simplest valid collection:
minimal/
├── mdbase.yaml
└── hello.md
mdbase.yaml:
spec_version: "0.1.0"
hello.md:
---
title: Hello World
---
This is a minimal collection with one untyped file.
A.2 Task Management Collection
A complete task management setup with types, queries, and examples.
Structure
tasks-project/
├── mdbase.yaml
├── _types/
│   ├── base.md
│   ├── task.md
│   ├── person.md
│   └── urgent.md
├── people/
│   ├── alice.md
│   └── bob.md
├── tasks/
│   ├── feature-login.md
│   ├── bug-crash.md
│   └── subtasks/
│       └── login-ui.md
└── .mdbase/
    └── (cache files)
Configuration
mdbase.yaml:
spec_version: "0.1.0"
name: "Project Tasks"
description: "Task tracking for the main project"
settings:
  exclude:
    - ".git"
    - "node_modules"
    - ".mdbase"
    - "*.draft.md"
  default_validation: "warn"
  default_strict: false
  id_field: "id"
  rename_update_refs: true
  write_nulls: "omit"
Type Definitions
_types/base.md:
---
name: base
fields:
  id:
    type: string
    required: true
    generated: ulid
  created_at:
    type: datetime
    generated: now
  updated_at:
    type: datetime
    generated: now_on_write
---
# Base Type
Common fields for all tracked entities. Provides automatic ID generation and timestamps.
_types/task.md:
---
name: task
description: A task or todo item with lifecycle tracking
extends: base
match:
  path_glob: "tasks/**/*.md"
filename_pattern: "{id}.md"
fields:
  title:
    type: string
    required: true
    min_length: 1
    max_length: 200
    description: Short, descriptive task title
  status:
    type: enum
    values: [open, in_progress, blocked, done, cancelled]
    default: open
    description: Current lifecycle state
  priority:
    type: integer
    min: 1
    max: 5
    default: 3
    description: "1 = lowest, 5 = highest"
  assignee:
    type: link
    target: person
    description: Person responsible for this task
  due_date:
    type: date
    description: When the task should be completed
  tags:
    type: list
    items:
      type: string
    default: []
    description: Categorization tags
  parent:
    type: link
    target: task
    description: Parent task (for subtasks)
  blocks:
    type: list
    items:
      type: link
      target: task
    default: []
    description: Tasks that this task blocks
  estimate_hours:
    type: number
    min: 0
    description: Estimated hours to complete
---
# Task
Tasks represent discrete units of work tracked through their lifecycle.
## Status Values
| Status | Description |
|--------|-------------|
| `open` | Not started |
| `in_progress` | Currently being worked on |
| `blocked` | Waiting on external dependency |
| `done` | Completed successfully |
| `cancelled` | Will not be done |
## Priority Scale
- **5**: Critical - drop everything
- **4**: High - do this week
- **3**: Medium - normal priority
- **2**: Low - when time permits
- **1**: Someday - nice to have
## Example
```yaml
---
type: task
title: Implement user authentication
status: in_progress
priority: 4
assignee: "[[alice]]"
due_date: 2024-04-01
tags: [feature, security]
estimate_hours: 16
---
## Requirements
- OAuth 2.0 support
- Remember me functionality
- Password reset flow
```
_types/person.md:
---
name: person
description: A team member
extends: base
match:
  path_glob: "people/**/*.md"
fields:
  name:
    type: string
    required: true
    description: Full name
  email:
    type: string
    description: Email address
  team:
    type: string
    description: Team or department
  role:
    type: string
    description: Job title or role
  active:
    type: boolean
    default: true
    description: Whether currently on the team
---
# Person
Team member records for assignment and reference.
_types/urgent.md:
---
name: urgent
description: Marks items requiring immediate attention
match:
  where:
    tags:
      contains: "urgent"
fields:
  escalation_contact:
    type: string
    description: Who to contact for escalation
  sla_hours:
    type: integer
    description: Hours until SLA breach
---
# Urgent
Items tagged "urgent" automatically get this type applied.
This enables additional tracking fields for urgent items.
Sample Files
people/alice.md:
---
type: person
id: alice
name: Alice Chen
email: alice@example.com
team: engineering
role: Senior Developer
active: true
---
Alice is the tech lead for the backend team.
## Expertise
- Authentication systems
- API design
- Performance optimization
tasks/feature-login.md:
---
type: task
id: feature-login
title: Implement user login system
status: in_progress
priority: 4
assignee: "[[alice]]"
due_date: 2024-04-01
tags: [feature, security, auth]
estimate_hours: 24
---
# Login System Implementation
## Overview
Build complete authentication system with OAuth support.
## Subtasks
- [[subtasks/login-ui]] - Frontend components
- Database schema design
- API endpoints
- Testing
tasks/bug-crash.md:
---
types: [task, urgent]
id: bug-crash
title: Fix crash on startup
status: open
priority: 5
assignee: "[[bob]]"
tags: [bug, urgent, production]
escalation_contact: alice@example.com
sla_hours: 4
---
# Critical: App Crashes on Startup
## Symptoms
App crashes immediately when user opens it.
## Impact
100% of users affected.
## Workaround
None known.
A.3 Query Examples
Core Examples
All Open Tasks
query:
types: [task]
where: 'status == "open"'
order_by:
- field: priority
direction: desc
My Tasks (Assigned to Alice)
query:
types: [task]
where:
and:
- 'assignee.asFile().id == "alice"'
- 'status != "done"'
order_by:
- field: due_date
direction: asc
Tasks Due This Week
query:
types: [task]
where:
and:
- "due_date >= today()"
- "due_date <= today() + '7d'"
- 'status != "done"'
order_by:
- field: due_date
direction: asc
High Priority Blockers
query:
types: [task]
where:
and:
- "priority >= 4"
- 'status == "blocked"'
Urgent Items (Multi-Type)
query:
where: 'types.contains("urgent")'
order_by:
- field: sla_hours
direction: asc
Query+ Examples
Overdue Tasks (Query+)
query:
types: [task]
where:
and:
- "due_date < today()"
- 'status != "done"'
- 'status != "cancelled"'
formulas:
days_overdue: "(today() - due_date) / 86400000" # date subtraction returns milliseconds
order_by:
- field: formula.days_overdue
direction: desc
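The `days_overdue` formula divides by 86400000 because one day is 24 × 60 × 60 × 1000 milliseconds. The sketch below reproduces that arithmetic outside the expression language; the function name `days_overdue` and the use of Python's `datetime` are illustrative assumptions, not part of this specification.

```python
from datetime import date

MS_PER_DAY = 24 * 60 * 60 * 1000  # 86400000, the divisor used in the formula


def days_overdue(due_date: date, today: date) -> float:
    """Mirror `(today() - due_date) / 86400000`, assuming subtraction yields milliseconds."""
    elapsed_ms = (today - due_date).total_seconds() * 1000
    return elapsed_ms / MS_PER_DAY


# A task due on 2024-04-01, evaluated on 2024-04-04, is three days overdue.
print(days_overdue(date(2024, 4, 1), date(2024, 4, 4)))
```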
Workload by Person (Query+)
query:
types: [task]
where: 'status != "done" && exists(assignee)'
formulas:
assignee_name: "assignee.asFile().name"
To group by assignee (Query+):
query:
types: [task]
where: 'status != "done" && exists(assignee)'
formulas:
assignee_name: "assignee.asFile().name"
groupBy:
property: formula.assignee_name
direction: ASC
property_summaries:
estimate_hours: Sum
A.4 Knowledge Base Collection
A personal wiki / knowledge base setup.
Structure
knowledge-base/
├── mdbase.yaml
├── _types/
│ ├── document.md
│ ├── concept.md
│ ├── source.md
│ └── daily.md
├── concepts/
│ ├── machine-learning.md
│ └── distributed-systems.md
├── sources/
│ └── attention-paper.md
├── daily/
│ └── 2024/
│ └── 03/
│ └── 15.md
└── inbox/
└── random-thought.md
Types
_types/document.md:
---
name: document
description: General note or document
match:
path_glob: "**/*.md"
fields:
title:
type: string
tags:
type: list
items:
type: string
default: []
related:
type: list
items:
type: link
default: []
---
# Document
Base type for all notes. Matched by default for any markdown file.
_types/concept.md:
---
name: concept
description: A concept or topic being studied
extends: document
match:
path_glob: "concepts/**/*.md"
fields:
aliases:
type: list
items:
type: string
default: []
description: Alternative names for this concept
status:
type: enum
values: [stub, developing, mature]
default: stub
sources:
type: list
items:
type: link
target: source
default: []
---
# Concept
Represents a topic or idea being studied. Evolves from stub to mature.
_types/daily.md:
---
name: daily
description: Daily journal entry
extends: document
match:
path_glob: "daily/**/*.md"
filename_pattern: "{date}.md"
fields:
date:
type: date
required: true
mood:
type: enum
values: [great, good, okay, rough, bad]
highlights:
type: list
items:
type: string
default: []
---
# Daily Note
Journal entries organized by date.
A.5 CLI Workflow Example
# Initialize a new collection
mkdir my-project && cd my-project
mdbase init
# Create a type
mdbase type create task
# Create a task
mdbase create task \
--field title="Build the thing" \
--field priority=4
# List all tasks
mdbase query --type task
# Find overdue tasks
mdbase query --type task --where 'due_date < today() && status != "done"'
# Update a task
mdbase update tasks/01ABC.md --field status=done
# Rename with reference updates
mdbase rename tasks/old-name.md tasks/new-name.md
# Validate the collection
mdbase validate
# Rebuild cache
mdbase cache rebuild
Appendix B: Expression Grammar
This appendix provides a formal grammar for the expression language used in filters, match conditions, and formulas.
B.1 Grammar Notation
This grammar uses Extended Backus-Naur Form (EBNF):
- `=` defines a production rule
- `|` denotes alternatives
- `[ ]` denotes optional elements
- `{ }` denotes zero or more repetitions
- `" "` denotes literal strings
- `( )` groups elements
- `(* *)` and `/* */` are comments
B.2 Complete Grammar
(* Top-level *)
expression = null_coalescing_expression ;
(* Null coalescing - low precedence *)
null_coalescing_expression = or_expression { "??" or_expression } ;
(* Logical operators *)
or_expression = and_expression { "||" and_expression } ;
and_expression = not_expression { "&&" not_expression } ;
not_expression = "!" not_expression
| comparison_expression ;
(* Comparison operators *)
comparison_expression = additive_expression [ comparison_op additive_expression ] ;
comparison_op = "==" | "!=" | "<" | ">" | "<=" | ">=" ;
(* Arithmetic operators *)
additive_expression = multiplicative_expression { ( "+" | "-" ) multiplicative_expression } ;
multiplicative_expression = unary_expression { ( "*" | "/" | "%" ) unary_expression } ;
unary_expression = "-" unary_expression
| postfix_expression ;
(* Property access and function calls *)
postfix_expression = primary_expression { postfix_op } ;
postfix_op = "." identifier [ call_arguments ] (* Method or property *)
| "[" expression "]" (* Index access *)
| call_arguments ; (* Function call *)
call_arguments = "(" [ argument_list ] ")" ;
argument_list = expression { "," expression } ;
(* Primary expressions *)
primary_expression = literal
| identifier
| "(" expression ")"
| if_expression
| list_literal ;
(* If expression *)
if_expression = "if" "(" expression "," expression "," expression ")" ;
(* Literals *)
literal = string_literal
| number_literal
| boolean_literal
| null_literal ;
string_literal = '"' { string_char } '"'
| "'" { string_char } "'" ;
string_char = /* any character except quote or backslash */
| escape_sequence ;
escape_sequence = "\\" ( '"' | "'" | "\\" | "n" | "r" | "t" ) ;
number_literal = integer_literal [ exponent ]
| float_literal ;
integer_literal = [ "-" ] digit { digit } ;
float_literal = [ "-" ] digit { digit } "." digit { digit } [ exponent ] ;
exponent = ( "e" | "E" ) [ "+" | "-" ] digit { digit } ;
boolean_literal = "true" | "false" ;
null_literal = "null" ;
list_literal = "[" [ expression { "," expression } ] "]" ;
(* Identifiers *)
identifier = ( letter | "_" ) { letter | digit | "_" } ;
letter = "a" | "b" | ... | "z" | "A" | "B" | ... | "Z" ;
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
B.3 Operator Precedence
From highest to lowest precedence:
| Level | Operators | Associativity | Description |
|---|---|---|---|
| 1 | `( )` | n/a | Grouping |
| 2 | `. [] ()` | Left-to-right | Property access, index, call |
| 3 | `! -` (unary) | Right-to-left | Logical NOT, negation |
| 4 | `* / %` | Left-to-right | Multiplication, division, modulo |
| 5 | `+ -` | Left-to-right | Addition, subtraction |
| 6 | `< <= > >=` | Left-to-right | Comparison |
| 7 | `== !=` | Left-to-right | Equality |
| 8 | `&&` | Left-to-right | Logical AND |
| 9 | `\|\|` | Left-to-right | Logical OR |
| 10 | `??` | Left-to-right | Null coalescing |
B.4 Reserved Words
The following identifiers are reserved:
true
false
null
if
note
file
formula
this
These cannot be used as bare field names without bracket notation (e.g., use note["file"] to access a frontmatter field named file).
The keywords `note`, `file`, `formula`, and `this` serve as namespace prefixes for property access (see Querying §10.5). The `note.` prefix accesses raw persisted frontmatter; bare field names access effective frontmatter. When a frontmatter field name collides with a namespace keyword, use the `note` namespace with bracket notation: `note["file"]`, `note["formula"]`.
`file` Namespace Properties
The `file` namespace provides access to file metadata. The following properties are valid under `file.`:
file.name file.basename file.path file.folder
file.ext file.size file.ctime file.mtime
file.body file.links file.backlinks file.tags
file.properties file.embeds
file.body is a string containing the raw markdown body content (everything after the frontmatter closing ---). It supports all string methods defined in §11.5.
Chained Method Calls
The grammar supports chained method calls through the recursive postfix_op production. This explicitly includes chained .asFile() for multi-hop link traversal:
assignee.asFile().manager.asFile().name
This is parsed as a sequence of postfix operations:
postfix_expression
├── identifier: "assignee"
├── method_call: "asFile" ()
├── property_access: "manager"
├── method_call: "asFile" ()
└── property_access: "name"
Implementations MUST enforce a maximum traversal depth (default: 10) to prevent unbounded chains. See §8.7 for traversal rules.
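One way to enforce the traversal limit is to count hops while resolving the chain. The following is a minimal sketch, not a normative algorithm: the helper `traverse_links`, the exception class, the `manager` field, and the record for "carol" are all hypothetical, and link values are stored as plain file ids rather than resolved wikilinks.

```python
MAX_TRAVERSAL_DEPTH = 10  # default limit stated in the text above


class ExpressionDepthExceeded(Exception):
    """Corresponds to the expression_depth_exceeded error code (Appendix C.4)."""


def traverse_links(record, link_fields, files, max_depth=MAX_TRAVERSAL_DEPTH):
    """Follow one link field per hop, as each chained .asFile() call would.

    `files` maps a file id to its frontmatter dict. Storing link values as
    plain ids is a simplification of real link resolution.
    """
    current = record
    for hop, field in enumerate(link_fields, start=1):
        if hop > max_depth:
            raise ExpressionDepthExceeded(
                f"chained asFile() calls exceed {max_depth}-hop limit"
            )
        current = files[current[field]]
    return current


# Hypothetical data: "carol" and the `manager` field are illustrative only.
files = {
    "alice": {"name": "Alice Chen", "manager": "carol"},
    "carol": {"name": "Carol Diaz", "manager": "carol"},
}
task = {"assignee": "alice"}

# Equivalent of: assignee.asFile().manager.asFile().name
manager_name = traverse_links(task, ["assignee", "manager"], files)["name"]
print(manager_name)
```

Checking the hop count before each resolution means a self-referential link (as in the "carol" record) fails with a bounded error instead of looping.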
B.5 Whitespace and Comments
Whitespace (spaces, tabs, newlines) is ignored except within string literals.
Comments are not supported in the expression language. (Use YAML comments in query files instead.)
B.6 String Escaping
Within string literals:
| Escape | Meaning |
|---|---|
| `\\` | Backslash |
| `\"` | Double quote |
| `\'` | Single quote |
| `\n` | Newline |
| `\r` | Carriage return |
| `\t` | Tab |
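The escape table can be applied with a single left-to-right scan over the text between the quotes. This is a sketch under one assumption that the specification does not state: an unrecognised escape (for example `\x`) or a trailing backslash is rejected with an error. The function name `unescape` is hypothetical.

```python
# Escape character -> decoded character, taken from the table in B.6.
ESCAPES = {"\\": "\\", '"': '"', "'": "'", "n": "\n", "r": "\r", "t": "\t"}


def unescape(body: str) -> str:
    """Decode escape sequences in the contents of a string literal (quotes removed)."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # Assumption: unknown or dangling escapes are errors.
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                raise ValueError(f"invalid escape at position {i}")
            out.append(ESCAPES[body[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(repr(unescape(r"line one\nline two")))
```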
B.7 Duration Literals
Duration literals are strings with a special format used in date arithmetic:
duration_literal = string_literal ; (* Must match duration pattern *)
duration_pattern = number duration_unit ;
duration_unit = "y" | "year" | "years"
| "M" | "month" | "months"
| "w" | "week" | "weeks"
| "d" | "day" | "days"
| "h" | "hour" | "hours"
| "m" | "minute" | "minutes"
| "s" | "second" | "seconds" ;
Examples: "7d", "2 weeks", "1h", "30m"
Note: Durations are regular strings; the arithmetic operators recognize them contextually.
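Because durations are ordinary strings, an evaluator has to test whether a string operand matches the duration pattern before treating it as one. The sketch below shows one possible check. It assumes that whitespace between the number and the unit is permitted (as the `"2 weeks"` example suggests) and that units are case-sensitive so `M` (month) and `m` (minute) stay distinct; the names `parse_duration` and `UNIT_ALIASES` are hypothetical.

```python
import re

# Every spelling from the duration_unit production, mapped to one canonical name.
UNIT_ALIASES = {
    "y": "years", "year": "years", "years": "years",
    "M": "months", "month": "months", "months": "months",
    "w": "weeks", "week": "weeks", "weeks": "weeks",
    "d": "days", "day": "days", "days": "days",
    "h": "hours", "hour": "hours", "hours": "hours",
    "m": "minutes", "minute": "minutes", "minutes": "minutes",
    "s": "seconds", "second": "seconds", "seconds": "seconds",
}

DURATION_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def parse_duration(text):
    """Return (amount, canonical_unit), or None if `text` is not a duration."""
    match = DURATION_RE.match(text)
    if match is None:
        return None
    unit = UNIT_ALIASES.get(match.group(2))  # case-sensitive lookup
    if unit is None:
        return None
    return float(match.group(1)), unit


for sample in ["7d", "2 weeks", "1h", "30m", "open"]:
    print(sample, "->", parse_duration(sample))
```

Returning `None` for non-matching strings lets the caller fall back to ordinary string handling, which fits the contextual recognition described in the note above.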
B.8 Parse Examples
Simple Comparison
status == "open"
Parse tree:
comparison_expression
├── additive_expression
│ └── primary_expression
│ └── identifier: "status"
├── comparison_op: "=="
└── additive_expression
└── primary_expression
└── string_literal: "open"
Combined Logic
priority >= 3 && status != "done"
Parse tree:
and_expression
├── comparison_expression
│ ├── identifier: "priority"
│ ├── ">="
│ └── number_literal: 3
└── comparison_expression
├── identifier: "status"
├── "!="
└── string_literal: "done"
Method Chain (Arrow Extension Only)
tags.filter(t => t.startsWith("bug")).length > 0
Parse tree (only valid if the optional arrow-function extension is enabled):
comparison_expression
├── postfix_expression
│ ├── postfix_expression
│ │ ├── identifier: "tags"
│ │ └── method_call: "filter"
│ │ └── lambda_expression
│ │ ├── parameter: "t"
│ │ └── method_call
│ │ ├── identifier: "t"
│ │ └── method: "startsWith"
│ │ └── string_literal: "bug"
│ └── property_access: "length"
├── ">"
└── number_literal: 0
Conditional
if(priority > 3, "high", "normal")
Parse tree:
if_expression
├── condition
│ └── comparison_expression
│ ├── identifier: "priority"
│ ├── ">"
│ └── number_literal: 3
├── then_value
│ └── string_literal: "high"
└── else_value
└── string_literal: "normal"
B.9 Implementation Notes
Tokenization
Recommended token types:
STRING : '"' ... '"' | "'" ... "'"
NUMBER : [0-9]+ ('.' [0-9]+)? ([eE] [+-]? [0-9]+)?
IDENTIFIER : [a-zA-Z_][a-zA-Z0-9_]*
BOOLEAN : 'true' | 'false'
NULL : 'null'
IF : 'if'
OPERATOR : '==' | '!=' | '<=' | '>=' | '<' | '>' | '&&' | '||' | '!' | '+' | '-' | '*' | '/' | '%' | '??' | '=>'
(* include '=>' only if the optional arrow-function extension is enabled *)
PUNCTUATION : '(' | ')' | '[' | ']' | '.' | ','
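The recommended token types map naturally onto a single regular-expression scanner. The sketch below is one possible tokenizer, not a reference implementation: it omits `=>` (arrow extension disabled), accepts the exponent form of numbers from B.2, and reclassifies the identifiers `true`, `false`, `null`, and `if` after matching. The names `TOKEN_SPEC`, `KEYWORDS`, and `tokenize` are hypothetical.

```python
import re

TOKEN_SPEC = [
    ("STRING", r'"(?:[^"\\]|\\.)*"' + "|" + r"'(?:[^'\\]|\\.)*'"),
    ("NUMBER", r"\d+(?:\.\d+)?(?:[eE][+-]?\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    # Two-character operators come first so that ">=" is not split into ">" and "=".
    ("OPERATOR", r"==|!=|<=|>=|&&|\|\||\?\?|[<>!+\-*/%]"),
    ("PUNCTUATION", r"[()\[\].,]"),
    ("WS", r"\s+"),
]
KEYWORDS = {"true": "BOOLEAN", "false": "BOOLEAN", "null": "NULL", "if": "IF"}
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split an expression into (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "WS":
            continue  # whitespace is ignored outside string literals (B.5)
        if kind == "IDENTIFIER":
            kind = KEYWORDS.get(text, kind)
        tokens.append((kind, text))
    return tokens


for token in tokenize('priority >= 3 && status != "done"'):
    print(token)
```

A leading minus sign is emitted as an `OPERATOR` token followed by a `NUMBER`, leaving the parser to decide between subtraction and negation.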
Error Recovery
When parsing fails, implementations SHOULD:
- Report the position of the error
- Provide context (surrounding tokens)
- Suggest likely fixes for common errors
Example error message:
Expression parse error at position 15:
status == "open" &&
^
Expected: expression
Found: end of input
Hint: Expression is incomplete after '&&'
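A message in this shape can be assembled from the source text and an error offset. The sketch below treats the position as a zero-based character offset and places the caret directly under it; both choices are assumptions, since this appendix does not define how positions are counted. The function name `format_parse_error` is hypothetical.

```python
def format_parse_error(source, position, expected, found, hint=None):
    """Build a multi-line parse error with a caret under the failing offset."""
    lines = [
        f"Expression parse error at position {position}:",
        source,
        " " * position + "^",  # assumption: position is a zero-based offset
        f"Expected: {expected}",
        f"Found: {found}",
    ]
    if hint:
        lines.append(f"Hint: {hint}")
    return "\n".join(lines)


message = format_parse_error(
    "priority >=",
    11,
    expected="expression",
    found="end of input",
    hint="Expression is incomplete after '>='",
)
print(message)
```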
B.10 Optional Arrow-Function Extension
Implementations MAY support arrow-function syntax in list methods. If supported, the following grammar is added for lambda expressions used within argument lists:
lambda_expression = identifier "=>" expression
| "(" [ parameter_list ] ")" "=>" expression ;
parameter_list = identifier { "," identifier } ;
Appendix C: Error Codes
This appendix defines standard error codes for validation issues and operation errors.
C.1 Validation Error Codes
Field Errors
| Code | Description | Example |
|---|---|---|
| `missing_required` | Required field is absent or null | Field title is required |
| `type_mismatch` | Value doesn't match declared type | Expected integer, got string |
| `constraint_violation` | Value violates min/max/pattern/etc. | Value 7 exceeds max of 5 |
| `invalid_enum` | Value not in enum options | "pending" not in [open, done] |
| `unknown_field` | Field not in schema (strict mode) | Unknown field "custom" |
| `deprecated_field` | Field is marked deprecated | Field "old_name" is deprecated |
| `duplicate_id` | `id_field` value is not unique in collection | `id: task-001` appears in multiple files |
| `duplicate_value` | Field value violates cross-file unique constraint | `slug: "my-post"` appears in multiple files |
List Errors
| Code | Description | Example |
|---|---|---|
| `list_too_short` | List has fewer than min_items | Minimum 1 item required |
| `list_too_long` | List has more than max_items | Maximum 10 items allowed |
| `list_duplicate` | Duplicate in list with unique=true | Duplicate value "a" |
| `list_item_invalid` | List item fails validation | Item [2] type mismatch |
String Errors
| Code | Description | Example |
|---|---|---|
| `string_too_short` | String shorter than min_length | Minimum 1 character required |
| `string_too_long` | String longer than max_length | Maximum 200 characters allowed |
| `pattern_mismatch` | String doesn't match pattern | Must match "^[A-Z].*" |
Number Errors
| Code | Description | Example |
|---|---|---|
| `number_too_small` | Number below min | Value -1 below min of 0 |
| `number_too_large` | Number above max | Value 100 above max of 10 |
| `not_integer` | Expected integer, got float | 3.5 is not an integer |
Link Errors
| Code | Description | Example |
|---|---|---|
| `invalid_link` | Link cannot be parsed | Malformed wikilink |
| `link_not_found` | Link target doesn't exist | Target "[[missing]]" not found |
| `link_wrong_type` | Target is wrong type | Expected person, found task |
| `ambiguous_link` | Multiple candidates for simple name link after tiebreakers | "[[note]]" matches notes/note.md and archive/note.md |
Date/Time Errors
| Code | Description | Example |
|---|---|---|
| `invalid_date` | Cannot parse as date | "tomorrow" is not ISO date |
| `invalid_datetime` | Cannot parse as datetime | Invalid datetime format |
| `invalid_time` | Cannot parse as time | Invalid time format |
C.2 Type System Errors
| Code | Description | Example |
|---|---|---|
| `unknown_type` | Type name not defined | Type "taks" not found |
| `circular_inheritance` | Type inheritance forms cycle | task → base → task |
| `missing_parent_type` | Parent type doesn't exist | Parent "base" not found |
| `type_conflict` | Multi-type field incompatibility | "status" defined as string and enum |
| `invalid_type_definition` | Type file has invalid schema | Missing required "name" field |
| `circular_computed` | Circular dependency between computed fields | full_name depends on display depends on full_name |
C.3 Operation Errors
File Operations
| Code | Description | Example |
|---|---|---|
| `file_not_found` | File doesn't exist | tasks/missing.md not found |
| `path_conflict` | File already exists at target path (on create or rename) | tasks/task.md already exists |
| `path_required` | Cannot determine file path | No path provided or derivable |
| `invalid_path` | Path is malformed | Path contains invalid characters |
| `invalid_frontmatter` | Frontmatter YAML cannot be parsed | YAML syntax error in frontmatter |
| `validation_failed` | Frontmatter fails validation against type schema(s) | Contains individual validation issues (see §C.1) |
| `permission_denied` | Filesystem permission error | Cannot write to file |
| `concurrent_modification` | File was modified by another process during operation | File mtime changed between read and write |
| `path_traversal` | Link resolution attempted to escape collection root | [[../../../etc/passwd]] escapes root |
Rename Operations
| Code | Description | Example |
|---|---|---|
| `rename_ref_update_failed` | Reference update failed for one or more files | Could not update links in X |
Configuration Errors
| Code | Description | Example |
|---|---|---|
| `invalid_config` | Config file malformed | YAML parse error |
| `missing_config` | No mdbase.yaml found | Not a collection |
| `unsupported_version` | spec_version not supported | Version 2.0 not supported |
C.4 Expression Errors
| Code | Description | Example |
|---|---|---|
| `invalid_expression` | Expression syntax error | Unexpected token |
| `unknown_function` | Function doesn't exist | Unknown function "foo" |
| `wrong_argument_count` | Wrong number of arguments | if() requires 3 arguments |
| `type_error` | Type error in expression | Cannot add string and number |
| `expression_depth_exceeded` | Expression traversal exceeded maximum depth | Chained asFile() calls exceed 10-hop limit |
C.5 Formula Errors
| Code | Description | Example |
|---|---|---|
| `circular_formula` | Formula references form cycle | a refs b refs a |
| `invalid_formula` | Formula expression invalid | Parse error in formula |
| `formula_evaluation_error` | Runtime error in formula | Division by zero |
C.6 Error Response Format
Errors SHOULD be returned in a consistent format:
Single Error
{
"error": {
"code": "file_not_found",
"message": "File 'tasks/missing.md' not found",
"path": "tasks/missing.md"
}
}
Validation Errors
{
"valid": false,
"errors": [
{
"path": "tasks/task-001.md",
"field": "priority",
"code": "constraint_violation",
"message": "Value 7 exceeds maximum of 5",
"severity": "error",
"expected": { "max": 5 },
"actual": 7,
"type": "task",
"line": 5,
"column": 11,
"end_line": 5,
"end_column": 12
},
{
"path": "tasks/task-001.md",
"field": "custom_field",
"code": "unknown_field",
"message": "Field 'custom_field' is not defined in type 'task'",
"severity": "warning",
"type": "task",
"line": 8,
"column": 1,
"end_line": 8,
"end_column": 27
}
],
"warnings": 1,
"errorCount": 1
}
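The summary fields in the validation response can be derived from the issue list itself. The sketch below assumes that `valid` is false only when at least one issue has severity `error`, which matches the severity table in C.7 but is not stated explicitly for this response format. The function name `build_validation_response` is hypothetical.

```python
import json


def build_validation_response(issues):
    """Wrap a list of issue dicts in the response envelope shown above."""
    error_count = sum(1 for issue in issues if issue["severity"] == "error")
    warning_count = sum(1 for issue in issues if issue["severity"] == "warning")
    return {
        "valid": error_count == 0,  # assumption: only errors fail validation
        "errors": issues,
        "warnings": warning_count,
        "errorCount": error_count,
    }


issues = [
    {
        "path": "tasks/task-001.md",
        "field": "priority",
        "code": "constraint_violation",
        "message": "Value 7 exceeds maximum of 5",
        "severity": "error",
    },
    {
        "path": "tasks/task-001.md",
        "field": "custom_field",
        "code": "unknown_field",
        "message": "Field 'custom_field' is not defined in type 'task'",
        "severity": "warning",
    },
]
response = build_validation_response(issues)
print(json.dumps(response, indent=2))
```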
C.7 Error Severity
| Severity | Description | Effect |
|---|---|---|
| `error` | Definite problem | Fails validation at error level |
| `warning` | Potential problem | Reported but doesn't fail |
| `info` | Informational | Logged, no effect |
C.8 Human-Readable Messages
Error messages SHOULD be:
- Clear: State what went wrong
- Specific: Include relevant values
- Actionable: Suggest how to fix
Good example:
Field 'priority' has value 7, but maximum allowed is 5.
Change the value to 5 or less.
Bad example:
Constraint violation on priority.
C.9 Exit Codes (CLI)
For CLI implementations:
| Code | Description |
|---|---|
| 0 | Success |
| 1 | General error |
| 2 | Validation error(s) |
| 3 | Configuration error |
| 4 | File not found |
| 5 | Permission denied |
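A CLI needs some rule for turning an error code from C.1 to C.5 into one of these exit codes. This appendix does not define that mapping, so the table in the sketch below is an assumption chosen to match the exit code descriptions; the names `EXIT_CODES` and `exit_code_for` are hypothetical.

```python
# Assumed mapping from error codes (Appendix C) to CLI exit codes (C.9).
EXIT_CODES = {
    "validation_failed": 2,
    "invalid_config": 3,
    "missing_config": 3,
    "unsupported_version": 3,
    "file_not_found": 4,
    "permission_denied": 5,
}


def exit_code_for(error_code=None):
    """Return 0 when there is no error, a specific code when mapped, else 1."""
    if error_code is None:
        return 0
    return EXIT_CODES.get(error_code, 1)


for code in [None, "validation_failed", "missing_config", "invalid_expression"]:
    print(code, "->", exit_code_for(code))
```

Anything without a dedicated exit code, such as `invalid_expression`, falls through to the general error code 1.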
Appendix D: Compatibility Notes
This appendix describes compatibility with existing tools and migration paths from other systems.
D.1 Obsidian Bases Compatibility
This specification was designed with Obsidian Bases compatibility as a goal. Many expression and query patterns are directly compatible.
Compatible Features
| Feature | This Spec | Obsidian Bases |
|---|---|---|
| Property access | `status`, `file.name` | Same |
| Comparison | `==`, `!=`, `<`, `>`, `<=`, `>=` | Same |
| Boolean logic | `&&`, `\|\|`, `!` | Same |
| Date functions | `now()`, `today()` | Same |
| Date arithmetic | `date + "7d"` | Same |
| String methods | `.contains()`, `.startsWith()` | Same |
| List methods | `.contains()`, `.length` | Same |
| Link traversal | `link.asFile()` | Same |
| File metadata | `file.mtime`, `file.path`, `file.tags` | Same |
| Context | `this.file`, `this.property` | Same |
| Logical structure | `and:`, `or:`, `not:` in YAML | Same |
| Type checking | `.isType("string")` | Same |
| Type conversion | `number()`, `list()`, `.toString()` | Same |
| List methods | `.unique()`, `.reduce()`, `.reverse()` | Same |
| String methods | `.lower()`, `.upper()`, `.split()` | Same |
| Summaries | `values` keyword, default functions | Same |
| Grouping | `groupBy` (single property) | Same |
Extended Features
This specification adds features not in Obsidian Bases:
- Type definitions as markdown files (version-controlled schemas)
- Multi-type matching with constraint merging
- Formal validation with error codes and levels
- Rename with reference updates
- CRUD operations specification
- Generated fields (ULID, UUID, timestamps)
- Filename patterns with slug generation
- Match rules for automatic type assignment
- Nested collection detection
- Security considerations (ReDoS, resource limits)
Differences
| Aspect | This Spec | Obsidian Bases |
|---|---|---|
| Type storage | Markdown files in types folder | Obsidian internal |
| Configuration | `mdbase.yaml` | Obsidian settings |
| Views | Not specified (query only) | Table, Board, Gallery, etc. |
| Grouping | `groupBy` clause (single property, per §10.7) | Built-in groupBy |
| Summaries | `property_summaries` and custom `summaries` (per §10.7, §11.14) | Built-in summary functions |
| Lambda style | Implicit variables (`value`, `index`, `acc`); arrow syntax optional | Implicit variables |
| Method names | `.lower()`, `.upper()`, `.title()` | Same |
Optional Compatibility Profile (Non-Normative)
Implementations MAY provide an optional "Bases compatibility" profile that mirrors Obsidian Bases query and expression behavior. This is not a required part of conformance. If provided, tools SHOULD document:
- Which Bases features are supported
- Any behavioral differences
- How to enable the profile (if applicable)
Migration from Bases Queries
Most Bases queries work directly:
# Bases query
types: [task]
where: 'status == "open"'
order_by:
- field: due_date
direction: asc
# This spec: identical!
D.2 Dataview Compatibility
Dataview is a popular Obsidian plugin with its own query language. Here's how to migrate common patterns.
Query Migration
| Dataview | This Spec |
|---|---|
| `FROM "tasks"` | `folder: "tasks"` |
| `WHERE status = "open"` | `where: 'status == "open"'` |
| `WHERE contains(tags, "urgent")` | `where: 'tags.contains("urgent")'` |
| `SORT due_date ASC` | `order_by: [{field: due_date, direction: asc}]` |
| `LIMIT 10` | `limit: 10` |
Full Example
Dataview:
TABLE title, status, due_date
FROM "tasks"
WHERE status != "done"
SORT due_date ASC
LIMIT 20
This Spec:
query:
folder: "tasks"
where: 'status != "done"'
order_by:
- field: due_date
direction: asc
limit: 20
Unsupported Dataview Features
| Feature | Notes |
|---|---|
| Inline fields (`field::`) | Not supported; use frontmatter |
| TABLE format | Implementations define output format |
| LIST format | Implementations define output format |
| TASK queries | Use `where` with checkbox fields |
| CALENDAR view | Implementation-specific |
| DataviewJS | Not applicable |
D.3 Hugo/Jekyll Front Matter
Static site generators use frontmatter similarly but with different conventions.
Hugo
Hugo uses specific frontmatter keys:
| Hugo | This Spec |
|---|---|
| `title` | Same (user-defined) |
| `date` | Same (user-defined) |
| `draft` | Same (user-defined) |
| `weight` | Same (user-defined) |
| `taxonomies` | Use list fields |
Migration: Hugo content mostly works directly. Define types that match your content structure.
Jekyll
Jekyll collections map naturally:
# Jekyll _config.yml
collections:
posts:
output: true
projects:
output: true
# This spec mdbase.yaml
spec_version: "0.1.0"
settings:
types_folder: "_types"
# Define post and project types
D.4 Notion Export Compatibility
When exporting from Notion to markdown:
Database Properties → Frontmatter
Notion databases export properties as frontmatter:
---
title: My Page
Status: In Progress
Due Date: 2024-03-15
Tags:
- important
- review
---
Type Creation
Create types matching your Notion databases:
# _types/notion-task.md
---
name: notion-task
fields:
title:
type: string
Status:
type: enum
values: [Not Started, In Progress, Done]
"Due Date":
type: date
Tags:
type: list
items:
type: string
---
Note: Notion uses spaces in property names. Use bracket notation in queries: note["Due Date"].
D.5 Logseq Compatibility
Logseq uses a block-based structure with page properties.
Page Properties
Logseq page properties map to frontmatter:
title:: My Page
status:: open
tags:: #task #urgent
Becomes:
---
title: My Page
status: open
tags: [task, urgent]
---
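The page-property mapping above can be sketched as a small converter. It handles only the shape shown in this example: one `key:: value` pair per line, with `tags` written as space-separated hashtags. Other Logseq value forms are out of scope, and the function name `logseq_properties_to_frontmatter` is hypothetical.

```python
def logseq_properties_to_frontmatter(text):
    """Convert `key:: value` lines into a frontmatter-style dict."""
    frontmatter = {}
    for line in text.splitlines():
        key, separator, value = line.partition("::")
        if not separator:
            continue  # not a property line
        key, value = key.strip(), value.strip()
        if key == "tags":
            # Assumption: tags are space-separated hashtags, as in the example.
            frontmatter[key] = [tag.lstrip("#") for tag in value.split()]
        else:
            frontmatter[key] = value
    return frontmatter


page = "title:: My Page\nstatus:: open\ntags:: #task #urgent"
print(logseq_properties_to_frontmatter(page))
```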
Block Properties
Logseq block properties don't have a direct equivalent. Consider:
- Converting important blocks to separate files
- Using structured frontmatter objects
- Using the body for detailed content
D.6 Tana Compatibility
Tana exports use a specific JSON/markdown format. Key mappings:
| Tana | This Spec |
|---|---|
| Supertags | Types |
| Fields | Frontmatter fields |
| References | Links |
D.7 Migration Strategies
Incremental Migration
- Start untyped: Import files without types
- Add types gradually: Create types for one category at a time
- Enable validation: Move from `off` to `warn` to `error`
- Enforce strictness: Enable `strict: true` when ready
Automated Migration
# 1. Initialize collection
mdbase init
# 2. Scan existing files and suggest types
mdbase infer-types --output _types/
# 3. Review and adjust generated types
# (manual step)
# 4. Validate and fix issues
mdbase validate --fix --dry-run
mdbase validate --fix
Handling Legacy Fields
For fields that don't fit the new schema:
# Option 1: Type with any field for legacy data
fields:
legacy:
type: any
deprecated: true
# Option 2: Loose strictness during migration
strict: false # Allow unknown fields
D.8 Tool-Specific Notes
VS Code
- Extensions can parse `mdbase.yaml` for IntelliSense
- Frontmatter validation possible via YAML schemas
- Query preview via custom webviews
Vim/Neovim
- YAML syntax highlighting works for frontmatter
- Custom commands can invoke CLI tools
- LSP integration possible for validation
Emacs
- Org-mode users: consider bidirectional sync
- Markdown-mode with YAML support
- Custom functions for query execution
D.9 Interoperability Best Practices
- Stick to common field types: String, integer, date, list work everywhere
- Avoid tool-specific features: Keep frontmatter portable
- Use standard date formats: ISO 8601 always
- Keep links simple: Wikilinks are most portable
- Document your schema: Types are self-documenting
- Version your types: Track schema changes in git