Database documentation tool for private databases

A database documentation tool for private databases has a harder job than a documentation tool for public or cloud-accessible databases. It must help the team understand the schema without asking the company to weaken the network model.

That is the point of the Taavik agent architecture. The database documentation tool runs from a hosted workspace, but the scan is performed by an on-premise agent. Credentials stay in the private environment, protected by the security model described on the sealed credentials page.

The result is a shared documentation workspace that can describe private databases without turning the cloud into a direct database client.

What makes private database documentation different

Private databases usually sit behind firewalls, VPNs, private subnets, customer networks, or internal routing. That is intentional. Production databases should not become reachable just because the team wants a better data dictionary.

A documentation workflow has to respect that boundary.

Teams often try one of three approaches:

export schema snapshots by hand.
copy database metadata into a wiki.
install a self-hosted catalog platform.

Each approach can work for a while. The problem is maintenance. Snapshots go stale. Wikis drift away from the schema. Self-hosted platforms add operational cost.

An agent-based documentation tool gives the team another option: keep the database private, scan metadata from inside the network, and centralize the documentation view.

The role of the agent

The agent is the local runtime. It is installed where the database is reachable. It connects with the configured database role, scans the schema metadata, and sends the catalog back to the workspace for rendering.
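The scan step can be sketched in a few lines. This is a minimal illustration, not the Taavik agent's actual implementation: it uses Python's built-in sqlite3 module as a stand-in for the private database, and the function names `scan_schema` and `build_payload` are hypothetical. It shows that the agent reads only metadata and that the payload sent to the workspace contains no credentials or row data.

```python
import json
import sqlite3


def scan_schema(conn: sqlite3.Connection) -> dict:
    """Collect table and column metadata only; no row data is read."""
    catalog = {"tables": []}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        columns = []
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        rows = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
        for _cid, name, col_type, notnull, default, pk in rows:
            columns.append({
                "name": name,
                "type": col_type,
                "nullable": not notnull,
                "default": default,
                "primary_key": bool(pk),
            })
        catalog["tables"].append({"name": table_name, "columns": columns})
    return catalog


def build_payload(catalog: dict) -> str:
    """Serialize the catalog for upload (hypothetical wire format)."""
    return json.dumps(catalog, sort_keys=True)


# Stand-in for a database that is only reachable inside the private network.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
print(build_payload(scan_schema(demo)))
```

A real agent would query the provider's own catalog views instead of SQLite pragmas, but the shape is the same: connect locally, read metadata, send out a serialized catalog.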

That split is important. The workspace coordinates the experience: documentation, object pages, search, manual notes, and change history. The agent owns the database connection path.

For teams with private databases, this model is usually easier to approve than opening inbound database access to a hosted documentation vendor.

What the documentation includes

The generated documentation should cover the parts of the schema that people ask about every week:

tables.
columns.
data types.
nullability.
defaults.
indexes.
foreign keys.
views and routines where supported.
recent schema changes.

This gives the team a current baseline. Then engineers can add manual descriptions to explain business meaning.
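The items listed above could be modeled as a small set of record types. The sketch below is one possible shape, with hypothetical class and field names; the actual catalog format used by the workspace is not described here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    name: str
    data_type: str
    nullable: bool = True
    default: Optional[str] = None


@dataclass
class Index:
    name: str
    columns: List[str]
    unique: bool = False


@dataclass
class ForeignKey:
    columns: List[str]
    referenced_table: str
    referenced_columns: List[str]


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)
    indexes: List[Index] = field(default_factory=list)
    foreign_keys: List[ForeignKey] = field(default_factory=list)


# Example: an orders table that references customers.
orders = Table(
    name="orders",
    columns=[
        Column("id", "bigint", nullable=False),
        Column("customer_id", "bigint", nullable=False),
        Column("status", "text", nullable=False, default="'new'"),
    ],
    indexes=[Index("orders_pkey", ["id"], unique=True)],
    foreign_keys=[ForeignKey(["customer_id"], "customers", ["id"])],
)
```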

A good documentation tool should not overwrite those descriptions every time the schema changes. Generated metadata and human context should live together, but they should not fight each other.
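One way to keep the two from fighting is to store manual notes separately, keyed by table and column name, and join them to the generated metadata at render time. The following is a minimal sketch under that assumption; `merge_descriptions` and `orphaned_notes` are illustrative names, not part of any documented API.

```python
def merge_descriptions(scanned: dict, notes: dict) -> dict:
    """Attach human-written notes to freshly scanned column metadata.

    scanned: {table: {column: {"type": ...}}}, replaced on every scan
    notes:   {(table, column): description}, never modified by a scan
    """
    merged = {}
    for table, columns in scanned.items():
        merged[table] = {}
        for column, meta in columns.items():
            description = notes.get((table, column), "")
            merged[table][column] = {**meta, "description": description}
    return merged


def orphaned_notes(scanned: dict, notes: dict) -> dict:
    """Notes whose column no longer exists; surfaced for review, not deleted."""
    return {
        key: text
        for key, text in notes.items()
        if key[1] not in scanned.get(key[0], {})
    }


notes = {
    ("orders", "status"): "Lifecycle state set by the fulfilment service.",
    ("orders", "legacy_code"): "Old ERP identifier.",
}
scan = {"orders": {"id": {"type": "bigint"}, "status": {"type": "text"}}}

page = merge_descriptions(scan, notes)
stale = orphaned_notes(scan, notes)
```

Because the scan never writes to `notes`, a type change on `orders.status` updates the generated metadata while the description stays intact. The note for the dropped `legacy_code` column is reported as orphaned rather than silently lost.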

Why generated Markdown is practical

Markdown is a good format for database documentation because it is readable and structured. A generated Markdown page can show a table summary, a column list, indexes, relationships, and recent changes without requiring a custom authoring workflow.

It also creates a consistent shape across databases. A PostgreSQL table, a SQL Server table, and a MySQL table can be rendered in a similar pattern even when the provider metadata differs.

That consistency lowers the cost of onboarding. New engineers learn one documentation shape, then apply it across connections.
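A generated table page can be as simple as a heading and a column table. The sketch below assumes a provider-neutral dictionary per table; the function name and dictionary keys are hypothetical.

```python
def render_table_markdown(table: dict) -> str:
    """Render one table's column list as a Markdown section."""
    lines = [
        f"## {table['name']}",
        "",
        "| Column | Type | Nullable | Default |",
        "| --- | --- | --- | --- |",
    ]
    for col in table["columns"]:
        nullable = "yes" if col["nullable"] else "no"
        default = col["default"] if col["default"] is not None else ""
        lines.append(f"| {col['name']} | {col['type']} | {nullable} | {default} |")
    return "\n".join(lines)


orders = {
    "name": "orders",
    "columns": [
        {"name": "id", "type": "integer", "nullable": False, "default": None},
        {"name": "status", "type": "text", "nullable": False, "default": "'new'"},
    ],
}
print(render_table_markdown(orders))
```

The renderer only sees the neutral dictionary, so the same function can serve PostgreSQL, SQL Server, and MySQL connections once each provider's metadata has been mapped into that shape.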

How change detection keeps trust

The main reason teams abandon documentation is not missing features. It is lost trust.

If a table page shows the old column name, the reader immediately wonders what else is wrong. If a wiki does not mention last week's schema change, the team goes back to direct database inspection.

A private database documentation tool should therefore include change detection. Every scan should compare the current schema with the previous snapshot and record the differences that readers care about, such as added, removed, or altered tables, columns, indexes, and foreign keys.

When documentation and schema changes live together, the reader can answer two questions at once: what exists now, and what changed recently.
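Comparing two snapshots is mostly set arithmetic. The sketch below assumes each snapshot is a simplified dictionary of `{table: {column: type}}`; a real comparison would also cover indexes, foreign keys, defaults, and nullability. The function name is illustrative.

```python
def diff_snapshots(previous: dict, current: dict) -> list:
    """Return human-readable differences between two schema snapshots."""
    changes = []
    for table in sorted(current.keys() - previous.keys()):
        changes.append(f"table added: {table}")
    for table in sorted(previous.keys() - current.keys()):
        changes.append(f"table removed: {table}")
    for table in sorted(previous.keys() & current.keys()):
        old_cols, new_cols = previous[table], current[table]
        for col in sorted(new_cols.keys() - old_cols.keys()):
            changes.append(f"column added: {table}.{col}")
        for col in sorted(old_cols.keys() - new_cols.keys()):
            changes.append(f"column removed: {table}.{col}")
        for col in sorted(old_cols.keys() & new_cols.keys()):
            if old_cols[col] != new_cols[col]:
                changes.append(
                    f"column changed: {table}.{col} "
                    f"({old_cols[col]} -> {new_cols[col]})"
                )
    return changes


previous = {"orders": {"id": "integer", "state": "text"}}
current = {
    "orders": {"id": "bigint", "status": "text"},
    "refunds": {"id": "integer"},
}
for change in diff_snapshots(previous, current):
    print(change)
```

Note that a rename such as `state` to `status` appears here as one removal plus one addition. Detecting renames reliably needs extra information that a name-based diff does not have.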

Where security review usually focuses

Security teams will usually ask four questions:

where do credentials live?
who can access the documentation?
what metadata is stored in the workspace?
how is the agent authenticated?

Those questions are useful. They force the documentation workflow to match the private database boundary.

The safest operating model is boring and explicit: use a restricted database role, install the agent in the approved network, limit workspace access by team, and review the metadata that will be stored in the hosted workspace.
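Reviewing what gets stored is easier when the upload is governed by an explicit allowlist. The sketch below is an assumption about how such a filter could look, not a description of the product: the field names and the `redact_for_upload` helper are hypothetical.

```python
# Fields approved by security review; anything else is dropped before upload.
ALLOWED_COLUMN_FIELDS = frozenset({"name", "type", "nullable", "default"})


def redact_for_upload(columns: list, allowed=ALLOWED_COLUMN_FIELDS) -> list:
    """Keep only allowlisted metadata fields for each column."""
    return [
        {key: value for key, value in column.items() if key in allowed}
        for column in columns
    ]


scanned = [
    {
        "name": "email",
        "type": "text",
        "nullable": False,
        "default": None,
        "sample_values": ["a@example.com"],  # must never leave the network
    }
]
safe = redact_for_upload(scanned)
```

An allowlist fails closed: a new field added to the scanner stays inside the network until someone approves it.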

When to choose this model

An agent-based documentation tool fits when the team wants shared documentation but the database should stay private.

It is especially useful for:

internal SaaS databases.
customer-hosted installations.
regulated operational databases.
environments with no inbound database access.
teams replacing manual wiki maintenance.

Start documenting a private database safely.