Lensgrid — A faster Datasette for ML datasets.

A read-side dataset explorer optimized for the way ML researchers actually browse data: faceted filters, fast random samples, and column-stat views in one shot.



Looking for

Founding ML engineer

Help me design the dataset-exploration primitives that ML researchers need in their daily work. You should have at least a year of applied-ML experience in which you inspected raw data daily. Compensation is revenue share: I'd rather have your judgement than your runway.

Open

Revenue share

The problem

ML teams spend a frustrating amount of time on "let me just check what's in this dataset", and the answer is usually a Jupyter notebook and 20 minutes of pandas. Datasette is wonderful but optimized for SQLite and civic data; it's slow on the multi-million-row Parquet files ML teams work with. Streamlit and Gradio dashboards are too custom: every team rebuilds the same column-stats and sampling UI. Roboflow and Label Studio focus on the labeling step, not the exploration step. There's an open lane for a fast, opinionated read-side tool.

Our solution

Lensgrid is a desktop app (Tauri-based, runs locally) that opens any Parquet, CSV, or Arrow file and gives you instant column stats (cardinality, top-k, null rate, distribution), faceted filters that compose, and fast random sampling. No SQL knowledge is required, but there is a SQL escape hatch for power users. It is built around Apache DataFusion, so a 50M-row Parquet file opens in <2 seconds and filters run in <500ms.
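
As a rough illustration of the three primitives named above (column stats, composable facet filters, random sampling), here is a minimal pure-Python sketch. It is not Lensgrid's implementation, which is described as built around Apache DataFusion. The function names `column_stats`, `facet`, `compose`, and `reservoir_sample` are hypothetical, and rows are assumed to be small in-memory dicts rather than Parquet data.

```python
import random
from collections import Counter
from typing import Any, Callable, Dict, Iterable, List, Optional, Set

# Assumption: a row is a plain dict; a real engine would work on columnar batches.
Row = Dict[str, Any]
Predicate = Callable[[Row], bool]


def column_stats(rows: List[Row], column: str, k: int = 3) -> Dict[str, Any]:
    """Return cardinality, top-k values, and null rate for one column."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    null_rate = (len(values) - len(non_null)) / len(values) if values else 0.0
    return {
        "cardinality": len(counts),
        "top_k": counts.most_common(k),
        "null_rate": null_rate,
    }


def facet(column: str, allowed: Set[Any]) -> Predicate:
    """Build a filter that keeps rows whose column value is in `allowed`."""
    return lambda row: row.get(column) in allowed


def compose(*predicates: Predicate) -> Predicate:
    """Combine facet filters with AND so they can be stacked freely."""
    return lambda row: all(p(row) for p in predicates)


def reservoir_sample(rows: Iterable[Row], n: int, seed: Optional[int] = None) -> List[Row]:
    """Uniform random sample of up to n rows in one pass (Algorithm R)."""
    rng = random.Random(seed)
    sample: List[Row] = []
    for i, row in enumerate(rows):
        if i < n:
            sample.append(row)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                sample[j] = row
    return sample


# Hypothetical toy dataset, for illustration only.
rows = [
    {"label": "cat", "split": "train"},
    {"label": "dog", "split": "train"},
    {"label": "cat", "split": "test"},
    {"label": None, "split": "train"},
]
stats = column_stats(rows, "label")
train_labeled = compose(facet("split", {"train"}), facet("label", {"cat", "dog"}))
filtered = [r for r in rows if train_labeled(r)]
preview = reservoir_sample(filtered, 1, seed=0)
```

Each facet is an independent predicate, so adding or removing a filter never requires rewriting the others. Reservoir sampling is used here because it needs only one pass and no row count up front; the listing does not say which sampling method Lensgrid uses.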

Alice Builder

@alice_qa



1 role open · Remote
