AI Commercial Detection at Setplex - 97% Precision in Live and VOD

2023–2025 · Setplex · AI Video · AdTech

Context

In OTT and FAST channels, knowing where ads sit in a video stream is foundational. Ad replacement, dynamic insertion, contextual targeting, and compliance reporting all depend on accurate segmentation between content and advertisement. Traditional approaches rely on metadata or reference fingerprints, so they fail on long-tail content, live channels, and creative variations the system has never seen.

We took a different approach at Setplex: detect ads from the signal itself, using deep learning over audio and visual features.

Technical approach

The detection model combines CNN-based visual feature extraction with LSTM-based temporal modelling to classify each frame (or short window) as content or advertisement. Audio features add a complementary signal - ad audio has distinct loudness, dynamics, and structural patterns versus long-form content.
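To make the audio side concrete, here is a minimal sketch of the kind of per-window loudness and dynamics features such a model could consume. It is an illustration only, not the production feature set: the function name, the one-second window, and the choice of RMS level and crest factor are all assumptions, and the input is assumed to be mono samples scaled to [-1.0, 1.0].

```python
import math
from typing import Dict, List


def window_loudness_features(
    samples: List[float], sample_rate: int, window_s: float = 1.0
) -> List[Dict[str, float]]:
    """Compute RMS level (dBFS) and crest factor (dB) for consecutive windows.

    Hypothetical helper: assumes mono float samples in [-1.0, 1.0].
    A trailing partial window is dropped.
    """
    size = int(sample_rate * window_s)
    if size <= 0:
        raise ValueError("window must contain at least one sample")

    eps = 1e-12  # avoids log10(0) on silent windows
    features = []
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        rms = math.sqrt(sum(x * x for x in window) / size)
        peak = max(abs(x) for x in window)
        features.append({
            "start_s": start / sample_rate,
            # Average level of the window relative to full scale.
            "rms_dbfs": 20 * math.log10(max(rms, eps)),
            # Peak-to-RMS ratio: low values suggest heavily compressed audio.
            "crest_db": 20 * math.log10(max(peak, eps) / max(rms, eps)),
        })
    return features
```

Heavily compressed audio tends to show a high RMS level with a low crest factor, which is one plausible reason loudness and dynamics complement the visual signal. In practice these values would be one input among several rather than a classifier on their own.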

Critically, the model is non-reference: it does not need a library of known creatives to detect that something is an ad. That makes it usable in scenarios where inventory changes constantly or where reference data isn’t available.

The production system runs as an edge-optimised inference service (Python + FastAPI + Redis) achieving sub-second latency across distributed OTT nodes. CI/CD pipelines (GitLab + MLflow) drive continuous retraining with automated rollback and environment versioning, and Grafana/Prometheus dashboards track latency, model drift, and campaign KPIs.
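The retraining-with-rollback idea can be sketched as a simple promotion gate. This is a hypothetical illustration, not the actual pipeline logic and not a GitLab or MLflow API: `EvalReport`, `select_version`, and both thresholds are assumed names and values, chosen only to mirror the precision and sub-second latency figures mentioned in this write-up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalReport:
    """Offline evaluation summary for one model version (hypothetical schema)."""
    model_version: str
    precision: float        # fraction in [0, 1] on a held-out set
    p95_latency_ms: float   # 95th percentile inference latency


def select_version(
    production: EvalReport,
    candidate: EvalReport,
    min_precision: float = 0.97,          # illustrative threshold
    max_p95_latency_ms: float = 1000.0,   # illustrative "sub-second" budget
) -> str:
    """Return the version that should serve traffic after a retraining run."""
    meets_floor = candidate.precision >= min_precision
    fast_enough = candidate.p95_latency_ms <= max_p95_latency_ms
    no_regression = candidate.precision >= production.precision
    if meets_floor and fast_enough and no_regression:
        return candidate.model_version
    # Any failed check keeps (or rolls back to) the current production model.
    return production.model_version
```

Keeping the decision in one pure function makes the rollback rule easy to test in CI before any deployment step runs.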

Production scale

The system is deployed across 80+ live channels, classifying advertisements in real time at 97% precision. Downstream services use its output for:

Ad replacement - swap a default ad with a personalised or geo-targeted creative at playback time.
Contextual targeting - match ads to surrounding content based on detected scene/audio context.
Compliance and reporting - verify ad delivery and break structure across live and VOD inventory.
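Downstream consumers such as ad replacement typically need break boundaries rather than per-window labels, and the precision figure is the standard ratio TP / (TP + FP). The sketch below shows both under stated assumptions: the function names and the fixed window length are hypothetical, and the source does not say whether precision was measured per window or per ad break, so per-window counting here is only one possible reading.

```python
from typing import List, Tuple


def ad_segments(is_ad: List[bool], window_s: float = 1.0) -> List[Tuple[float, float]]:
    """Merge consecutive ad-labelled windows into (start_s, end_s) segments."""
    segments = []
    start = None
    for i, flag in enumerate(is_ad):
        if flag and start is None:
            start = i  # an ad run begins
        elif not flag and start is not None:
            segments.append((start * window_s, i * window_s))
            start = None
    if start is not None:  # stream ended while still inside an ad run
        segments.append((start * window_s, len(is_ad) * window_s))
    return segments


def precision(predicted: List[bool], actual: List[bool]) -> float:
    """Per-window precision: of the windows flagged as ad, how many were ads."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    if tp + fp == 0:
        return 0.0  # convention chosen here when nothing was flagged
    return tp / (tp + fp)
```

High precision matters most for ad replacement, where a false positive means overwriting real content with a creative.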

What this taught me

AI in AdTech is often presented as plug-and-play. The reality is harder: live streams, manifest quirks, edge cases in transcoding, and the gap between “model accuracy in a notebook” and “model accuracy in production at scale” make end-to-end delivery a different problem from model development. Bridging that gap - from research to deployable product, with retraining, observability, and rollback baked in - is where the real engineering work lives.
