How to Define Regions of Interest for Computer Vision — A Practical Guide

Computer vision systems don’t process entire images uniformly. A person-detection model running at a busy entrance doesn’t need to watch the ceiling or the car park three blocks away — it needs to watch the door. Defining where the model should focus is the job of regions of interest (ROIs), and getting them right has a measurable impact on detection accuracy, false-positive rates, and compute cost.

This guide explains the different types of spatial regions you’ll encounter in CV deployments, how to draw them precisely without writing coordinate arrays by hand, and how to export them into your pipeline.

What Is a Region of Interest?

A region of interest is a polygon, rectangle, or other shape that delimits a meaningful area within an image or video frame. In practice, four types of regions cover the vast majority of CV use cases:

Detection zones — areas where objects should be detected. A polygon drawn around a doorway tells the model “count people here.” A rectangle covering a conveyor section tells a quality-inspection system “inspect only these items.”

Exclusion zones — areas to ignore. A security camera pointed at a street will generate constant false positives from traffic if you don’t exclude the road. Drawing an exclusion region and filtering out any detections that fall inside it reduces noise immediately.

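Tripwires (counting lines) — a polyline that acts as a virtual boundary. Objects crossing it trigger a count or an alert. Crossing direction (in vs. out) can be inferred from which side of the line the object started on; some pipelines instead use two closely spaced parallel lines and check the order in which they are crossed.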

Coverage boundaries — closed polygons that document what area a camera or sensor actually covers. Used in planning documents, audit trails, and handover packages for security and facility management projects.
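
A tripwire crossing test for a single line segment can be sketched with the sign of a 2D cross product. This is an illustrative sketch, not RegionKit functionality: the names `side_of_line` and `crossing_direction` are hypothetical, and the code assumes you already track an object's previous and current centroid between frames. For a multi-segment polyline, the same test would be applied to each segment in turn.

```python
def side_of_line(a, b, p):
    """Cross product (b - a) x (p - a).

    The sign tells which side of the line a->b the point p lies on;
    zero means p is exactly on the line.
    """
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def crossing_direction(a, b, prev, curr):
    """Classify the movement prev -> curr against the tripwire segment a -> b.

    Returns +1 if the object moved from the negative side to the positive
    side, -1 for the opposite direction, and 0 if it did not cross.
    Which sign means "in" is a convention you choose per tripwire.
    """
    s_prev = side_of_line(a, b, prev)
    s_curr = side_of_line(a, b, curr)
    if s_prev * s_curr >= 0:
        return 0  # same side, or touching the line: not counted as a crossing
    # The movement crossed the infinite line; check it passed between a and b.
    if side_of_line(prev, curr, a) * side_of_line(prev, curr, b) > 0:
        return 0
    return 1 if s_curr > 0 else -1


# Horizontal tripwire from (0, 0) to (10, 0), in image pixels.
wire_a, wire_b = (0, 0), (10, 0)
direction = crossing_direction(wire_a, wire_b, prev=(5, -2), curr=(5, 3))
```

Because the sign depends on the order of the two endpoints, swapping `a` and `b` flips which direction counts as positive.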

Why Not Just Pass the Full Frame?

Running inference on a full 4K frame for a use case that only concerns a 200×300-pixel doorway is wasteful. Cropping to the ROI before inference reduces:

Latency — smaller tensors process faster
False positives — the model sees less irrelevant content
Bandwidth — if frames are transmitted before processing
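
To make the saving concrete, the sketch below computes an axis-aligned crop rectangle around an ROI polygon and maps detections from crop coordinates back to the full frame. The helper names `roi_crop_box` and `to_full_frame`, the padding value, and the doorway coordinates are all illustrative assumptions.

```python
import math


def roi_crop_box(polygon, frame_w, frame_h, pad=0):
    """Return (x0, y0, x1, y1) enclosing the polygon, padded and clamped.

    x1 and y1 are exclusive, so with a NumPy-style frame the crop
    would be frame[y0:y1, x0:x1].
    """
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x0 = max(0, math.floor(min(xs)) - pad)
    y0 = max(0, math.floor(min(ys)) - pad)
    x1 = min(frame_w, math.floor(max(xs)) + 1 + pad)
    y1 = min(frame_h, math.floor(max(ys)) + 1 + pad)
    return x0, y0, x1, y1


def to_full_frame(box, crop_box):
    """Shift a detection box from crop coordinates to full-frame coordinates."""
    bx0, by0, bx1, by1 = box
    ox, oy = crop_box[0], crop_box[1]
    return bx0 + ox, by0 + oy, bx1 + ox, by1 + oy


# A roughly 200x300-pixel doorway inside a 4K (3840x2160) frame.
doorway = [(1800, 900), (2000, 900), (2000, 1200), (1800, 1200)]
crop = roi_crop_box(doorway, frame_w=3840, frame_h=2160, pad=16)

full_pixels = 3840 * 2160
crop_pixels = (crop[2] - crop[0]) * (crop[3] - crop[1])
reduction = full_pixels / crop_pixels  # how many times fewer pixels the crop has
```

A little padding around the polygon is usually worth keeping so that objects straddling the ROI edge are not cut in half before the model sees them.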

Even when cropping isn’t possible (e.g. the model is pre-trained on fixed input sizes), post-filtering detections by checking whether the bounding-box centroid falls inside an ROI polygon is a standard and effective approach.
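
The centroid check can be sketched with a ray-casting point-in-polygon test. This is a minimal illustration in plain Python, assuming boxes are `(x0, y0, x1, y1)` tuples in pixels and the polygon is a list of `(x, y)` vertices; the function names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def bbox_centroid(box):
    """Centre point of an (x0, y0, x1, y1) bounding box."""
    x0, y0, x1, y1 = box
    return (x0 + x1) / 2, (y0 + y1) / 2


def filter_by_roi(boxes, polygon):
    """Keep only the boxes whose centroid lies inside the ROI polygon."""
    return [b for b in boxes if point_in_polygon(*bbox_centroid(b), polygon)]


zone = [(100, 100), (300, 100), (300, 400), (100, 400)]
boxes = [(150, 150, 250, 350), (500, 500, 600, 600)]
kept = filter_by_roi(boxes, zone)
```

Points that fall exactly on a polygon edge may be classified either way by this test, which is rarely a problem for detection filtering but worth knowing when writing assertions around zone boundaries.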

Drawing Regions Without Writing Coordinate Arrays

Manually computing polygon vertices for anything more complex than a rectangle is impractical. The standard workflow is:

Load a representative still frame from the camera or sensor
Draw the regions visually on that frame
Export the polygon coordinates in the format your pipeline expects

RegionKit handles all three steps in the browser — no install, no account, no server upload.

Step 1 — Load the Frame

Drag a JPEG or PNG camera frame onto the canvas, paste it from your clipboard, or load it from a URL. The image becomes the background layer; your annotation layers sit on top.

Step 2 — Draw Your Regions

Use the Polygon tool (P) for detection zones and coverage areas — click each vertex, then double-click or press Enter to close the shape. For tripwires, use the Polyline tool (L) — open-ended, so it doesn’t close automatically.

For rectangular detection zones, the Rectangle tool (R) is faster: click the first corner, then click the opposite corner.

Name each region using the label field in the Properties panel. Add tags like zone_type:detection or camera_id:cam-03 for downstream metadata filtering.
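
Downstream, tags in this `key:value` style can be split into a dictionary for filtering. The sketch below assumes the tags arrive as a list of strings; how RegionKit actually stores them in its export is not specified here, and `parse_tags` and `matches` are hypothetical helpers.

```python
def parse_tags(tags):
    """Turn ['zone_type:detection', ...] into {'zone_type': 'detection', ...}.

    Tags without a colon are skipped.
    """
    parsed = {}
    for tag in tags:
        key, sep, value = tag.partition(":")
        if sep:
            parsed[key.strip()] = value.strip()
    return parsed


def matches(tags, **wanted):
    """True if every wanted key is present in the tags with the given value."""
    parsed = parse_tags(tags)
    return all(parsed.get(key) == value for key, value in wanted.items())


region_tags = ["zone_type:detection", "camera_id:cam-03"]
is_cam3_detection = matches(region_tags, zone_type="detection", camera_id="cam-03")
```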

Step 3 — Organise by Type

Use layers to separate region types: one layer for detection zones, one for exclusion zones, one for tripwires. You can toggle layer visibility to check coverage without deleting anything, and lock finished layers so you don’t accidentally move regions while continuing to add others.

Step 4 — Export Coordinates

Export via the toolbar:

Native JSON — full fidelity, includes layers, labels, tags, and all polygon vertices. Best format for storing and re-importing the scene.
COCO JSON — standard format for polygon and bounding-box annotations, compatible with many CV frameworks.
YOLO TXT — normalised bounding boxes, one line per shape. Use this if your pipeline specifically requires YOLO coordinate format.
PNG — flat composite image with regions drawn on the frame, useful for documentation and handover.

Coordinate Systems

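COCO uses absolute pixel coordinates measured from the top-left corner of the image, while YOLO uses coordinates normalised to the range 0 to 1 by the image width and height. When you use pixel coordinates to filter detections at runtime, apply them to images with the same resolution as the one you annotated on. If your pipeline downscales frames before processing, scale the pixel ROI coordinates by the same factor. Normalised YOLO coordinates only need to be multiplied by the width and height of the frame you are processing, provided the framing and aspect ratio are unchanged.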
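
Both conversions are short. The sketch below turns a YOLO-format line (`class cx cy w h`, all normalised to 0..1) into pixel corners and rescales a pixel polygon from the annotated resolution to the processing resolution. The function names and example values are illustrative assumptions.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Parse 'class cx cy w h' (normalised) into (class_id, x0, y0, x1, y1) in pixels."""
    cls, cx, cy, w, h = line.split()
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    return int(cls), cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def scale_polygon(polygon, src_size, dst_size):
    """Rescale pixel vertices from src_size (w, h) to dst_size (w, h)."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    return [(x * sx, y * sy) for x, y in polygon]


# A YOLO box centred in a 1920x1080 frame, a quarter wide and half tall.
pixel_box = yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", img_w=1920, img_h=1080)

# ROI annotated on a 4K frame, pipeline processes 1080p frames.
roi_4k = [(3840, 2160), (1000, 500)]
roi_1080p = scale_polygon(roi_4k, src_size=(3840, 2160), dst_size=(1920, 1080))
```

Scaling x and y independently also covers pipelines that change the aspect ratio by stretching, but not ones that letterbox or crop; those need an offset as well as a scale.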

For native JSON exports, coordinates are in image pixels from the top-left origin. A Python snippet to load and use them:

import json

with open('scene.json') as f:
    scene = json.load(f)

zones = [
    ann for ann in scene['annotations']
    if ann['metadata'].get('label') == 'detection-zone'
]

# Each polygon zone has:
# ann['data']['points'] — flat list [x0, y0, x1, y1, ...]
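
Most geometry code expects `(x, y)` pairs rather than a flat list. Assuming the flat `[x0, y0, x1, y1, ...]` layout described in the comment above, a small hypothetical helper such as `to_pairs` can do the conversion:

```python
def to_pairs(flat_points):
    """Convert [x0, y0, x1, y1, ...] into [(x0, y0), (x1, y1), ...]."""
    if len(flat_points) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    # Even indices are x values, odd indices are y values.
    return list(zip(flat_points[0::2], flat_points[1::2]))


vertices = to_pairs([10, 20, 30, 40, 50, 60])
```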

Best Practices

Draw on the actual operating frame. Regions drawn on a wide-angle uncropped frame don’t transfer correctly to a cropped or warped stream. Always annotate on the frame as the model will see it.

Overlap detection and exclusion zones deliberately. A detection zone can overlap an exclusion zone: inside the intersection the model still detects objects, but any result that falls in the exclusion area is filtered out afterwards. This layered approach is more flexible than trying to draw exclusion cutouts inside detection polygons.
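
The layering rule reduces to "inside at least one detection zone and inside no exclusion zone". The sketch below takes the containment test as a parameter so it works with any shape type; `keep_detection` and `rect_contains` are hypothetical names, and axis-aligned rectangles stand in for polygons only to keep the example short.

```python
def keep_detection(point, detection_zones, exclusion_zones, contains):
    """Act on a detection only if it is in a detection zone and in no exclusion zone.

    `contains(zone, point)` is any containment test, for example a
    point-in-polygon function for polygon zones.
    """
    in_detection = any(contains(zone, point) for zone in detection_zones)
    in_exclusion = any(contains(zone, point) for zone in exclusion_zones)
    return in_detection and not in_exclusion


def rect_contains(rect, point):
    """Containment test for an (x0, y0, x1, y1) rectangle."""
    x0, y0, x1, y1 = rect
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1


detection = [(0, 0, 400, 300)]
exclusion = [(300, 0, 500, 300)]  # overlaps the right edge of the detection zone

in_zone_only = keep_detection((100, 100), detection, exclusion, rect_contains)
in_overlap = keep_detection((350, 100), detection, exclusion, rect_contains)
```

Here `in_zone_only` is kept and `in_overlap` is dropped, without the detection polygon ever needing a cutout.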

Version your scenes. Export native JSON after each significant change. The file contains the background image reference, all regions, layers, and metadata — it’s a complete record of your zone configuration at a point in time.

Use shared vertices for adjacent zones. When two regions share a boundary (e.g. Zone A and Zone B meet at a common wall), link their shared vertices in RegionKit. Moving one endpoint updates both polygons simultaneously, keeping the boundary clean.

What’s Next

Once you have a set of exported region coordinates, the next step is integrating them into your detection pipeline. The typical pattern is to load the ROI polygons once at startup, run inference on each frame, and use a point-in-polygon test to decide which detections to act on.

For a worked example with YOLO-format output and a Python filtering snippet, see the companion post on YOLO annotation with RegionKit.