For data teams building medallion + data mesh

The company brain for AI data teams.

Stop breaking changes at merge. Resolve incidents in minutes. Answer any data question with AI grounded in your contracts.

Powered by LakeLogic OSS. Metadata only — your data never leaves your lakehouse.

14K+ PyPI downloads: organic, no paid acquisition
Apache 2.0: Open source. GitHub: LakeLogic/LakeLogic
3 engines: Run anywhere. Polars · DuckDB · Spark
YAML: One contract. Bronze → Silver → Gold

Where LakeLogic sits

Between observability and governance. Contract-first.

Observability

Monte Carlo · Bigeye · Anomalo

Watches pipelines after they run. Detects anomalies once they've already hit production.

  • Post-hoc detection
  • You learn after the dashboard breaks
  • No PR-time enforcement

Contract Control Plane

LakeLogic

Defines trust before the run, enforces it during the run, proves it after. Across every engine.

  • Block unsafe changes in PRs
  • Quarantine bad rows at runtime
  • Zeus diagnoses incidents and reduces MTTR
  • Polars · DuckDB · Spark · Delta · Iceberg
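
As a rough illustration of what blocking an unsafe change in a PR can mean, the sketch below compares two versions of a contract's column definitions and reports changes that would break downstream readers. It is plain Python and does not use LakeLogic's API. The `find_breaking_changes` helper, the dictionary layout, and the column names are all hypothetical.

```python
def find_breaking_changes(old_columns, new_columns):
    """Compare two {column_name: type_name} mappings (hypothetical layout).

    Removed columns and changed types are treated as breaking.
    Newly added columns are treated as safe and are not reported.
    """
    problems = []
    for name, old_type in old_columns.items():
        if name not in new_columns:
            problems.append(f"column removed: {name}")
        elif new_columns[name] != old_type:
            problems.append(
                f"type changed: {name} {old_type} -> {new_columns[name]}"
            )
    return problems


# Hypothetical contract versions: the main branch vs. the pull request.
main_branch = {"order_id": "string", "amount": "decimal", "email": "string"}
pull_request = {"order_id": "string", "amount": "float", "created_at": "timestamp"}

breaking = find_breaking_changes(main_branch, pull_request)
for problem in breaking:
    print(problem)
```

A CI job could fail the pull request whenever a check like this returns a non-empty list, which is one way to surface the problem before merge instead of after a dashboard breaks.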

Governance & Catalog

Collibra · Atlan · Informatica

Documents pipelines after they exist. Heavyweight rollouts, separate from engineering workflow.

  • Documentation-led
  • Lives outside the PR workflow
  • 6-month rollouts, 6-figure prices

Most teams buy separate tools for detection, governance, and remediation. LakeLogic compresses the three jobs into one operating layer — built around the contract, not the dashboard.

Trust by Default

Data contracts that enforce themselves.

Define quality rules, owners, SLAs, and PII flags in a visual editor business users can navigate — while engineers commit the same definition as plain text alongside the code.

  • Bad data never reaches your dashboards. Quarantine routing on every pipeline run.
  • Business users and engineers edit the same contract. Visual editor for ownership, SLAs, and PII tags; plain text for code reviews.
  • Generate a contract in minutes from code you already have. Zeus reads your pipelines (AI-assisted).
  • Change reviews built into your workflow. Git + pull requests, no new tools to learn.
  • Open standard that works with your existing tooling. Plain YAML, no proprietary format.
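
To make quarantine routing concrete, here is a minimal sketch in plain Python. It is not LakeLogic's contract format or API: the rule names, the `RULES` list, and the `route_rows` helper are assumptions made up for illustration. Each row is checked against every rule, and rows that fail any rule are set aside with the names of the rules they failed.

```python
import re

# Hypothetical quality rules: each one is a (rule_name, predicate) pair.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

RULES = [
    ("order_id_not_null", lambda row: row.get("order_id") is not None),
    ("amount_non_negative",
     lambda row: isinstance(row.get("amount"), (int, float)) and row["amount"] >= 0),
    ("email_well_formed",
     lambda row: isinstance(row.get("email"), str)
     and EMAIL_PATTERN.match(row["email"]) is not None),
]


def route_rows(rows, rules=RULES):
    """Split rows into clean rows and quarantined rows with failure reasons."""
    good, quarantined = [], []
    for row in rows:
        failed = [name for name, check in rules if not check(row)]
        if failed:
            quarantined.append({"row": row, "failed_rules": failed})
        else:
            good.append(row)
    return good, quarantined


rows = [
    {"order_id": 1, "amount": 19.99, "email": "a@example.com"},
    {"order_id": None, "amount": 5.0, "email": "b@example.com"},
    {"order_id": 3, "amount": -2.0, "email": "not-an-email"},
]
good, quarantined = route_rows(rows)
print(len(good), "clean rows;", len(quarantined), "quarantined")
```

Keeping the failed rule names next to each quarantined row means the bad records can be inspected and replayed later instead of being silently dropped.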

Runs on

One contract. Three execution paths.

The same YAML runs as native code or on any Spark runtime, and reads from / writes to your warehouse — without rewriting transformations.

Native engines

In-process. Zero JVM. Today.

Apache Spark
DuckDB
Polars

Spark runtimes

Anywhere PySpark runs. Today.

Databricks
Microsoft Fabric

Also runs on: AWS EMR, AWS Glue, Azure Synapse, GCP Dataproc — anywhere you can pip install lakelogic.

Warehouses

Read / write today

Native SQL pushdown on the roadmap.

Snowflake
BigQuery
Redshift

How: pull data in via the integrated load layer, transform on Polars/DuckDB/Spark, write results back.

Company Brain

Every incident makes Zeus smarter.

Over 6 months, teams go from 45-minute incident resolution to 5 minutes. The Knowledge Base becomes institutional memory that never leaves with people.

↻ Next incident auto-suggested
Memory match found

Prior playbook: stripe_email_normalize

Match confidence: 95%
Fix confidence: 92%
Pattern seen: 3 times
Prior fix: SET email = LOWER(TRIM(email))

10× faster incident resolution
80% auto-suggested fixes at scale
0 knowledge lost to turnover

Zeus ROI

What that’s worth to your team.

Move the sliders to match your team. The number on the right is the engineer-hours Zeus reclaims every year.

Your setup

Pipelines: 25 (slider range 1 to 500)
Incidents per month: 30 (slider range 0 to 150)

A healthy 25-pipeline team typically sees 10–40 incidents/month. The pipelines slider above just sets context.

Current time to resolve an incident: 6 hrs (slider range 1 hr to 24 hrs)
Hourly engineering cost: $100 (slider range $50 to $300)

Assumption: Zeus diagnoses ~80% of incidents to a resolution in under an hour. Untouched incidents fall back to your current time-to-resolve. Adjust your hourly cost to match fully-loaded salary + benefits.

Annual impact

Reclaimed eng cost / year

$144,000

≈ 1,440 engineer-hours reclaimed — 0.7 FTE of capacity returned to feature work.

Incidents / yr

360

Cost without Zeus

$216,000

Cost with Zeus

$72,000

Hours saved / yr

1,440

Estimates only. Real savings depend on incident mix, on-call structure, and how quickly your team adopts Zeus suggestions.
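
The annual impact figures above follow directly from the stated assumption, and the arithmetic can be sketched in a few lines of Python. The `zeus_roi` function is a hypothetical helper written for this page, not part of any LakeLogic package. It models "under an hour" as exactly one hour, and it assumes 2,080 working hours per FTE per year (40 hours × 52 weeks) for the capacity figure.

```python
def zeus_roi(incidents_per_month, hours_per_incident, hourly_cost,
             zeus_share_pct=80, zeus_hours=1, hours_per_fte=2080):
    """Estimate yearly engineer-hours and cost reclaimed (hypothetical helper).

    zeus_share_pct: percent of incidents Zeus diagnoses to a resolution.
    zeus_hours: hours spent on an incident Zeus resolves (assumed to be 1).
    hours_per_fte: assumed working hours per engineer per year.
    """
    incidents_per_year = incidents_per_month * 12
    hours_without = incidents_per_year * hours_per_incident
    # Incidents Zeus does not resolve fall back to the current time-to-resolve.
    hours_with = incidents_per_year * (
        zeus_share_pct * zeus_hours
        + (100 - zeus_share_pct) * hours_per_incident
    ) / 100
    hours_saved = hours_without - hours_with
    return {
        "incidents_per_year": incidents_per_year,
        "cost_without": hours_without * hourly_cost,
        "cost_with": hours_with * hourly_cost,
        "hours_saved": hours_saved,
        "reclaimed_cost": hours_saved * hourly_cost,
        "fte_returned": hours_saved / hours_per_fte,
    }


# Default slider values from the calculator: 30 incidents/month, 6 hrs, $100/hr.
result = zeus_roi(incidents_per_month=30, hours_per_incident=6, hourly_cost=100)
for key, value in result.items():
    print(f"{key}: {value:,.1f}")
```

With the default inputs this reproduces the 360 incidents, $216,000 and $72,000 cost figures, and 1,440 hours saved shown above. The pipelines slider does not enter the calculation.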

Have questions? Most teams do.

The short answers below cover what we get asked most often. If you don't see yours, the founders read every inbound — reach out directly.

Talk to the founders

Built for the security team too

Your data never leaves your lakehouse. Period.

Metadata only

We process schemas, lineage, rule names, row counts. Never row-level data. Your warehouse stays your warehouse.

Open source core

The runtime engine is Apache 2.0 on GitHub. Audit the code. Self-host the OSS forever. No vendor lock-in by design.

GDPR-ready primitives

PII flagging, masking strategies, and right-to-be-forgotten erasure are first-class — built into the contract, not bolted on.
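
For readers who want a feel for what these primitives involve, the sketch below shows one possible masking strategy (a salted SHA-256 hash) and a simple erasure filter in plain Python. It is an illustration under stated assumptions, not LakeLogic's implementation: `PII_COLUMNS`, `mask_value`, `mask_row`, `erase_subject`, the salt, and the column names are all hypothetical.

```python
import hashlib

# Hypothetical: columns that a contract has flagged as PII.
PII_COLUMNS = {"email"}


def mask_value(value, salt="demo-salt"):
    """Replace a string with a truncated, salted SHA-256 digest."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:16]


def mask_row(row, pii_columns=PII_COLUMNS):
    """Return a copy of the row with flagged, non-null columns masked."""
    return {
        key: mask_value(value) if key in pii_columns and value is not None else value
        for key, value in row.items()
    }


def erase_subject(rows, subject_id, key="customer_id"):
    """Drop every row belonging to one data subject (erasure request)."""
    return [row for row in rows if row.get(key) != subject_id]


rows = [
    {"customer_id": 1, "email": "a@example.com", "amount": 10},
    {"customer_id": 2, "email": "b@example.com", "amount": 20},
]
masked = [mask_row(row) for row in rows]
remaining = erase_subject(rows, subject_id=1)
print(masked[0]["email"], len(remaining))
```

Hashing of this kind is pseudonymization, not anonymization: the same input always maps to the same token, which keeps joins working but means the salt must be protected.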

SOC 2 on the roadmap

Pre-launch and pursuing SOC 2 Type II. Until then: minimal data surface, regional deployment, signed DPA on request.

Need a security questionnaire, DPA, or architecture deep-dive? Contact us — the founders read every inbound and reply within a business day.

Two Products · One Vision

Build your company brain.

Join the data teams building institutional memory for their data platforms — powered by contracts, lineage, and AI agents.

Contact Us

LakeLogic Open Source

The declarative, executable contract engine. Apache 2.0 — free forever, runs on Polars, Spark, or DuckDB.

Get LakeLogic OSS

LakeLogic Cloud

Observability, Zeus AI, and enterprise governance — fully managed. Zero infrastructure to run.

Join Cloud waitlist

Migration Path

On OSS already? On launch, drop your docs/contracts/*.yaml straight into Cloud.

Reserve early access