Curate training data
Upload PDF, PowerPoint, or text files; paste or edit source text; then generate QA or chat-style rows to review with your team. Datasets can be scoped to a group.
dotori
How it works
From raw inputs to production models in four steps.
The journey
Four chapters. Each stage maps to real screens in your workspace.
Upload PDF, PPTX, or TXT, or paste text in the builder. Edit the source, then generate Q&A or multi-turn chat rows your team can review before training.
One workspace to ingest documents or pasted text, shape QA and chat datasets, run training jobs, evaluate checkpoints, and roll out models, with lineage and access controls so your data stays within your boundary.
Run adapters and checkpoints on your own stack. You choose what goes in and what ships out.
Weights and artifacts stay in your boundary. No need to paste sensitive docs into public APIs.
See which job produced which checkpoint, compare runs, and hand off models without guesswork.
Groups, roles, and shared workspaces so experts contribute data and feedback without becoming ML engineers.
N-gram scores against your dataset references are included on every plan; paid tiers unlock LLM-as-judge ratings and side-by-side comparison of reference and fine-tuned output before you ship.
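For readers unfamiliar with n-gram scoring, here is a minimal sketch of one common variant: clipped n-gram precision, as used inside BLEU-style metrics. This page does not say which n-gram metric the product computes, so the function names, whitespace tokenization, and default n=2 below are illustrative assumptions, not the product's implementation.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams in a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n=2):
    """Fraction of candidate n-grams that also occur in the reference.

    Hypothetical helper: counts are clipped so a repeated n-gram in the
    candidate is credited at most as often as it appears in the reference.
    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        # Candidate is shorter than n tokens, so there is nothing to score.
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


if __name__ == "__main__":
    reference = "Refunds are issued within 14 days of purchase"
    model_output = "Refunds are issued within 30 days of purchase"
    print(ngram_precision(model_output, reference, n=2))
```

A score near 1.0 means the model output reuses most of the reference phrasing; it measures surface overlap only, which is presumably why judged ratings are offered as a complement.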
API keys, training workers, and activity trails structured for teams that need to show their work.
Interview content
Hallucinations, legal exposure, and one-size-fits-all responses are costly in enterprise settings. Firms and hospitals need customized models with policy controls, traceability, and domain alignment.
Create an account, invite your team, and open the dataset builder to upload files, paste text, or generate training examples. No credit card is required to explore the workspace.