How to Use Label Studio: A Complete, No‑Fluff Guide for 2025
If you’re building computer vision, NLP, or multimodal AI, you’ll likely hit the same bottleneck: high‑quality labeled data. Label Studio, an open‑source data labeling platform, gives you flexible control over image, text, audio, time series, and video annotations without locking you into a single ML stack. In this practical, step‑by‑step tutorial, we’ll show you how to use Label Studio—from installation to export—so you can move from “blank project” to “production‑ready labels” with confidence.
We’ll keep it practical and solution‑oriented: short steps, clear decisions, and tips for avoiding common gotchas.
What You’ll Learn
- How to install and launch Label Studio
- How to create your first project and choose a labeling template
- How to import data (local files, cloud buckets, URLs)
- How to set up the labeling interface for images, text, audio, or video
- How to manage labelers, reviews, and quality assurance
- How to export annotations to formats compatible with your training pipelines
Worth noting: If you’re orchestrating multi‑model research or drafting dataset documentation, an AI copilot like Sider.AI can help generate task guidelines or auto‑summaries of annotation policies to keep teams aligned. You can check it out at Sider.ai.
Why Label Studio?
- Flexible schema: Define custom labeling config for bounding boxes, polygons, keypoints, text spans, relations, audio regions, and more.
- Broad data types: Images, text, audio, HTML, time series, and video.
- Team workflows: Assign tasks, enable consensus, review annotations, and manage quality.
- Extensible: Integrate with storage backends, webhooks, and model‑assisted labeling.
For official overview and downloads, see the Label Studio homepage.
Step 1: Install Label Studio
You can run Label Studio locally with Python or Docker. Pick one approach:
Option A: Python (pip)
# Create a virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install Label Studio
pip install label-studio
# Launch
label-studio start
Then open the local URL printed in the terminal.
Option B: Docker
docker run -it -p 8080:8080 heartexlabs/label-studio:latest
If you’re new to Label Studio, the official “Getting Started” guide is concise and updated regularly, and the quick start focuses on the minimal steps to label a sample dataset.
Pro tip: For teams, consider a managed database (PostgreSQL) and mounted storage for resilience.
Step 2: Create a Project
- Log in to the UI and click “Create Project.”
- Give it a clear name (e.g., “Retail Shelf Detection v1”) and description (include dataset version and purpose).
- Choose “Labeling Setup.” You can:
  - Start from a template (e.g., object detection, NER, sentiment, audio regions)
  - Or write a custom XML config to tailor tools and classes
The quick start wizard helps you pick a template, rename classes, and save the config.
Step 3: Import Your Data
You can import data via the UI or API. Common paths:
- Upload local files (drag‑and‑drop)
- Provide URLs to remote files
- Connect cloud storage (S3, GCS, Azure Blob) via settings
- Use the REST API for programmatic ingestion
Data records usually include a data payload that points to your asset, for example an "image" key holding the asset URL, or "text": "This is a sentence." for text tasks. Keep filenames stable to simplify mapping during export.
Quality tip: Version your dataset and keep a manifest of source → annotation export so you can reproduce training runs.
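As a minimal sketch of the data payload described above, the snippet below builds a JSON task list you could upload or send through the API. It assumes the labeling config reads the asset from an `image` key (as in `$image`). The `build_tasks` helper and the example.com URLs are hypothetical and only illustrate the shape.

```python
import json


def build_tasks(urls, data_key="image"):
    """Wrap each asset URL in the {"data": {...}} shape used for task import."""
    return [{"data": {data_key: url}} for url in urls]


# Hypothetical asset URLs, for illustration only.
urls = [
    "https://example.com/shelf/0001.jpg",
    "https://example.com/shelf/0002.jpg",
]

tasks = build_tasks(urls)
tasks_json = json.dumps(tasks, indent=2)
print(tasks_json)
```

Saving `tasks_json` next to your dataset manifest gives you a record of exactly which assets went into a given import.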
Step 4: Configure the Labeling Interface
The labeling interface defines tools and classes. You’ll see an XML‑like config in which you select components such as RectangleLabels, PolygonLabels, KeyPointLabels, TextArea, Choices, Audio, TimeSeries, etc.
Examples:
Image Object Detection
<View>
<Image name="img" value="$image"/>
<RectangleLabels name="label" toName="img">
<Label value="Product" background="#34D399"/>
<Label value="PriceTag" background="#60A5FA"/>
</RectangleLabels>
</View>
Text Named Entity Recognition (NER)
<View>
<Text name="txt" value="$text"/>
<Labels name="label" toName="txt">
<Label value="ORG"/>
<Label value="PERSON"/>
<Label value="LOC"/>
</Labels>
</View>
Audio Region Labeling
<View>
<Audio name="audio" value="$audio"/>
<Labels name="label" toName="audio">
<Label value="Speech"/>
<Label value="Noise"/>
<Label value="Music"/>
</Labels>
</View>
Start with the template closest to your task and iterate. Keep class names stable across versions to ease dataset merges.
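If your class list lives in code, one way to keep names stable is to generate the config from it. The sketch below rebuilds the object detection example above with Python's standard XML library. The `detection_config` helper is hypothetical, and the tag and attribute names simply mirror the example rather than covering every option.

```python
import xml.etree.ElementTree as ET


def detection_config(classes, image_key="image"):
    """Build a bounding-box labeling config from a list of class names."""
    if len(set(classes)) != len(classes):
        raise ValueError("class names must be unique")
    view = ET.Element("View")
    ET.SubElement(view, "Image", name="img", value=f"${image_key}")
    labels = ET.SubElement(view, "RectangleLabels", name="label", toName="img")
    for cls in classes:
        ET.SubElement(labels, "Label", value=cls)
    return ET.tostring(view, encoding="unicode")


config = detection_config(["Product", "PriceTag"])
print(config)
```

The printed XML can be pasted into the project's labeling setup. Keeping the class list in version control makes it easier to see when a name changed.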
Step 5: Labeling Best Practices
- Define clear guidelines: Include examples of correct vs. incorrect annotations and edge cases.
- Use hotkeys: Build speed and consistency by learning the keyboard shortcuts for your tools.
- Calibrate early: Have 2–3 labelers annotate the same 50–100 items, compare results, and refine the guide.
- Add pre‑annotations: If you have a baseline model, import predictions to speed up corrections.
- Balance throughput and quality: Use consensus or review queues when stakes are high.
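For the pre‑annotation tip, here is a rough sketch of turning a baseline model's pixel‑space boxes into a task with predictions attached. It assumes rectangle results use percentage coordinates relative to the image size and that the config names match the detection example (`label` and `img`). The helper names, the `baseline-v0` version string, and the URL are hypothetical, so check the prediction format against the docs for the version you run.

```python
def box_to_result(box, img_w, img_h, from_name="label", to_name="img"):
    """Convert one pixel-space box (x, y, w, h, label) to a percent-based result."""
    x, y, w, h, label = box
    return {
        "from_name": from_name,
        "to_name": to_name,
        "type": "rectanglelabels",
        "original_width": img_w,
        "original_height": img_h,
        "value": {
            "x": 100.0 * x / img_w,
            "y": 100.0 * y / img_h,
            "width": 100.0 * w / img_w,
            "height": 100.0 * h / img_h,
            "rotation": 0,
            "rectanglelabels": [label],
        },
    }


def build_prediction_task(image_url, boxes, img_w, img_h, model_version="baseline-v0"):
    """Bundle the asset reference and the model's boxes into one importable task."""
    return {
        "data": {"image": image_url},
        "predictions": [{
            "model_version": model_version,
            "result": [box_to_result(b, img_w, img_h) for b in boxes],
        }],
    }


# Hypothetical model output: one box on a 1000x500 image.
task = build_prediction_task(
    "https://example.com/shelf/0001.jpg",
    [(100, 50, 200, 100, "Product")],
    img_w=1000,
    img_h=500,
)
print(task["predictions"][0]["result"][0]["value"])
```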
By the way, for writing crisp, consistent annotation guidelines or converting domain knowledge into labeler‑friendly checklists, Sider.AI can draft and refine instructions quickly while keeping a changelog teams can follow.
Step 6: Manage Labelers, Reviews, and QA
Label Studio supports teams:
- Assign tasks to specific annotators
- Enable review/approval workflows
- Track progress and labeler performance
- Use consensus (multiple annotations per task) to measure agreement
Set explicit acceptance criteria (e.g., IoU threshold for boxes, span boundary rules, minimum audio region duration) and enforce them during review.
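To make an IoU acceptance criterion concrete, the sketch below computes intersection over union for two axis‑aligned boxes and applies a threshold. The 0.5 default and the `passes_review` helper are illustrative assumptions, not values Label Studio enforces. Pick a threshold that fits your task.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax1, ay1, aw, ah = a
    bx1, by1, bw, bh = b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    # Overlap is zero when the boxes do not intersect on either axis.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def passes_review(annotator_box, reference_box, threshold=0.5):
    """Accept a box when it overlaps the reference by at least the threshold."""
    return iou(annotator_box, reference_box) >= threshold


print(iou((0, 0, 10, 10), (5, 0, 10, 10)))
print(passes_review((0, 0, 10, 10), (1, 1, 10, 10)))
```

The same function works for comparing two annotators on a consensus task, as long as both boxes use the same units.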
Common QA checks:
- Missing labels or wrong classes
- Inconsistent bounding box tightness
- Overlapping entities in NER
- Drifting definitions over time (update the guide!)
Step 7: Export Annotations
When your batch is ready, export annotations for training. Label Studio stores annotations in JSON internally and lets you export to multiple formats. See the official export docs for the current list and steps.
Typical formats include:
- Raw Label Studio JSON (most complete and lossless)
- COCO (for detection/segmentation)
- YOLO (for object detection)
- CSV/TSV for simpler tasks
Important notes:
- Some tools (e.g., brush/segmentations) don’t map cleanly to certain formats—COCO and YOLO may not support free‑form brushes directly. See community guidance on segmentation export caveats.
- Converters exist for transforming Label Studio JSON to YOLO, but gaps can occur depending on the labeling tool used and the metadata you retained.
Practical export flow:
- Run a small test export early; validate that your training script parses it.
- Lock your export preset (class order, resolution assumptions, etc.).
- Document any conversion steps (scripts, version hashes) for reproducibility.
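A small conversion script is one way to run that early test export. The sketch below turns rectangle results from a raw JSON export into YOLO‑style lines (class index, then normalized center x, center y, width, height). It assumes each exported task carries an `annotations` list whose results store percentage coordinates, ignores rotation, and skips every other tool type. The function names and sample task are hypothetical.

```python
def rect_to_yolo(value, class_index):
    """value holds x, y, width, height as percentages (top-left origin)."""
    cx = (value["x"] + value["width"] / 2) / 100.0
    cy = (value["y"] + value["height"] / 2) / 100.0
    w = value["width"] / 100.0
    h = value["height"] / 100.0
    return f"{class_index} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def task_to_yolo_lines(task, class_order):
    """Convert every rectangle in one exported task, using a fixed class order."""
    index = {name: i for i, name in enumerate(class_order)}
    lines = []
    for annotation in task.get("annotations", []):
        for result in annotation.get("result", []):
            if result.get("type") != "rectanglelabels":
                continue  # brushes, polygons, etc. are not handled here
            value = result["value"]
            label = value["rectanglelabels"][0]
            # An unknown label raises KeyError, which surfaces class mismatches early.
            lines.append(rect_to_yolo(value, index[label]))
    return lines


# Hypothetical exported task with a single box.
sample_task = {
    "annotations": [{
        "result": [{
            "type": "rectanglelabels",
            "value": {"x": 10.0, "y": 10.0, "width": 20.0, "height": 20.0,
                      "rectanglelabels": ["Product"]},
        }],
    }],
}

yolo_lines = task_to_yolo_lines(sample_task, ["Product", "PriceTag"])
print(yolo_lines)
```

Passing `class_order` explicitly is the design choice that matters here: it pins the label‑to‑index mapping in your own code instead of relying on whatever order an exporter happens to use.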
Step 8: Integrate With Your ML Pipeline
- Use the API to pull completed annotations into your training jobs.
- Keep splits deterministic: attach metadata like split: train/val/test to tasks.
- Version everything: dataset manifests, annotation exports, model configs.
- Close the loop: run error analysis, identify failure clusters, and schedule relabeling rounds.
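One simple way to keep splits deterministic is to derive them from a hash of a stable key such as the filename, so the same item always lands in the same split no matter when it was imported. This is a sketch under that assumption. The `assign_split` helper and the 80/10/10 proportions are illustrative choices.

```python
import hashlib


def assign_split(key, val_pct=10, test_pct=10):
    """Map a stable key (e.g. a filename) to train/val/test via a hash bucket."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # 0..99, stable across runs and machines
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "val"
    return "train"


# Hypothetical filenames; the resulting value can be stored in each task's data.
for name in ["shelf/0001.jpg", "shelf/0002.jpg", "shelf/0003.jpg"]:
    print(name, assign_split(name))
```

Because the split depends only on the key, adding new data later never reshuffles items that were already assigned.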
Workflow pattern:
- Mine hard examples from model errors
This active‑learning loop boosts quality faster than brute‑force labeling.
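The article does not prescribe a mining method, so the sketch below uses one common heuristic: send known misclassifications first, then the least confident predictions, up to a relabeling budget. The record fields (`id`, `predicted`, `confidence`, `label`) and the `mine_hard_examples` helper are hypothetical.

```python
def mine_hard_examples(records, budget=2):
    """Pick ids to relabel: known errors first, then lowest-confidence predictions."""
    def hardness(record):
        wrong = record.get("label") is not None and record["label"] != record["predicted"]
        return (0 if wrong else 1, record["confidence"])

    ranked = sorted(records, key=hardness)
    return [record["id"] for record in ranked[:budget]]


# Hypothetical model outputs; "label" is present only where ground truth exists.
records = [
    {"id": "a", "predicted": "Product", "confidence": 0.98, "label": "Product"},
    {"id": "b", "predicted": "Product", "confidence": 0.91, "label": "PriceTag"},
    {"id": "c", "predicted": "PriceTag", "confidence": 0.55},
    {"id": "d", "predicted": "Product", "confidence": 0.80},
]

hard_ids = mine_hard_examples(records)
print(hard_ids)
```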
Troubleshooting Common Issues
- “My export won’t load into YOLO/COCO.”
  - Check tool compatibility (e.g., brushes vs. polygons). Convert to compatible shapes when possible and consult the export docs and community notes.
- “Labels don’t match my training class order.”
  - Fix ordering early. Standardize label names and preserve mapping in your pipeline.
- “Annotators disagree a lot.”
  - Add calibration rounds, clarify rules, and consider consensus or arbitration steps.
- “Labeling is too slow.”
  - Use pre‑annotations, hotkeys, and tool‑specific speedups (e.g., auto‑segment, snapping). Prune low‑value tasks.
A 30‑Minute Quick Start Checklist
- Install Label Studio (pip or Docker)
- Create a project with the most relevant template
- Import 50–100 sample items
- Draft guidelines with edge cases and examples
- Assign two labelers for a calibration batch
- Review disagreements and update rules
- Test export into your training code
For an official, concise walkthrough, revisit “Getting Started” and the “Quick Start” guide.
Advanced Tips for Power Users
- Custom widgets: Extend the interface for domain‑specific tools.
- Webhooks: Trigger jobs (e.g., kick off conversions or model training) when tasks are completed.
- Model‑assisted labeling: Use pre‑labels from your in‑house or cloud models to reduce manual work.
- Data privacy: Run on‑prem, restrict exports, and log access for regulated datasets.
- Analytics: Track per‑class distribution and per‑labeler metrics to spot skew.
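For the analytics tip, a per‑class count can be computed directly from a raw JSON export. The sketch below assumes each task has an `annotations` list and that label lists sit under a key named after the result type (for example `rectanglelabels` or `labels`). Results that do not follow that pattern are skipped. The `class_distribution` helper and sample data are hypothetical.

```python
from collections import Counter


def class_distribution(tasks):
    """Count how often each label appears across all annotations."""
    counts = Counter()
    for task in tasks:
        for annotation in task.get("annotations", []):
            for result in annotation.get("result", []):
                value = result.get("value", {})
                labels = value.get(result.get("type", ""))
                if isinstance(labels, list):  # ignore non-label results
                    counts.update(labels)
    return counts


# Hypothetical export with three boxes across two tasks.
export = [
    {"annotations": [{"result": [
        {"type": "rectanglelabels", "value": {"rectanglelabels": ["Product"]}},
        {"type": "rectanglelabels", "value": {"rectanglelabels": ["PriceTag"]}},
    ]}]},
    {"annotations": [{"result": [
        {"type": "rectanglelabels", "value": {"rectanglelabels": ["Product"]}},
    ]}]},
]

distribution = class_distribution(export)
print(distribution.most_common())
```

A heavily skewed distribution is a cue to sample more of the rare classes before the next labeling round.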
Conclusion: From Prototype to Production‑Ready Datasets
Label Studio helps you move quickly from concept to consistent training data: pick a template, define your schema, calibrate your team, and export in the formats your models need. Keep your guidelines living, validate exports early, and close the loop with active learning. With those habits, you’ll spend less time wrestling with formats and more time shipping models that work.
For deeper dives and templates, see:
- Export formats and caveats
FAQ
Q1: What is Label Studio used for?
Label Studio is an open‑source platform for annotating images, text, audio, time series, and video. It lets you design custom labeling interfaces and export annotations to formats your ML training pipelines can use.
Q2: How do I start a new project in Label Studio?
Create a project from the UI, select a template that matches your task, and customize the labeling config. Then import data (local files, URLs, or cloud storage) and assign tasks to annotators.
Q3: Which export formats does Label Studio support?
You can export raw JSON as well as formats like COCO, YOLO, Pascal VOC, and CSV/TSV. Some tools (like brush masks) may not map to all formats; check the export docs for details.
Q4: How can I speed up labeling in Label Studio?
Use pre‑annotations from a baseline model, learn hotkeys, and simplify your label schema. Run calibration rounds to reduce rework and set review criteria to catch errors early.
Q5: Can I run Label Studio with a team?
Yes. Assign tasks to annotators, enable reviews, and use consensus to measure agreement. Store data and annotations in reliable backends and automate exports with webhooks or the API.