Case Study

AI Commercial Detection at Setplex - 97% Precision in Live and VOD

Spearheaded a CNN+LSTM commercial-detection engine for live and on-demand video, deployed across 80+ live channels in production at 97% precision - supporting automated ad replacement, contextual matching, and continuous monitoring at scale.

2023 - 2025 · Head of AVOD - AI & Backend Systems Lead · AI · Video Intelligence · AdTech · Deep Learning · OTT

Outcomes

  • Achieved 97% precision in real-time advertisement identification across live and VOD content
  • Deployed across 80+ live TV channels in production, enabling continuous ad monitoring, automated replacement, and contextual matching
  • Built CI/CD with continuous model retraining (GitLab + MLflow), automated rollbacks, and environment versioning
  • Designed an edge-optimised inference service (Python + FastAPI + Redis) achieving sub-second response latency across distributed OTT nodes
  • Automated the retraining and calibration pipeline (PyTorch + MLflow), keeping precision and recall consistent as video datasets changed
  • Ran observability on Grafana and Prometheus, tracking latency, model drift, and campaign KPIs in real time

Context

In OTT and FAST channels, knowing where ads are in a video stream is foundational. Ad replacement, dynamic insertion, contextual targeting, and compliance reporting all depend on accurate segmentation between content and advertisement. Traditional approaches rely on metadata or reference fingerprints - they fail on long-tail content, live channels, and creative variations the system has never seen.

We took a different approach at Setplex: detect ads from the signal itself, using deep learning over audio and visual features.

Technical approach

The detection model combines CNN-based visual feature extraction with LSTM-based temporal modelling to classify each frame (or short window) as content or advertisement. Audio features add a complementary signal - ad audio has distinct loudness, dynamics, and structural patterns versus long-form content.
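As a rough illustration of the CNN+LSTM pattern described above, the sketch below encodes each frame with a small CNN and passes the resulting sequence through an LSTM to produce one ad probability per frame. It is a minimal sketch, not the production model: the class name `AdDetector`, the layer sizes, and the input resolution are assumptions chosen for brevity, and the audio branch is omitted.

```python
import torch
from torch import nn


class AdDetector(nn.Module):
    """Hypothetical per-frame CNN encoder followed by an LSTM over time."""

    def __init__(self, feat_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        # Small illustrative CNN; a real encoder would likely be a deeper backbone.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (N, 32, 1, 1)
            nn.Flatten(),             # -> (N, 32)
            nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, height, width)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.reshape(b * t, c, h, w))  # (b*t, feat_dim)
        seq, _ = self.lstm(feats.reshape(b, t, -1))       # (b, t, hidden_dim)
        return self.head(seq).squeeze(-1)                 # (b, t) logits


model = AdDetector().eval()
clip = torch.randn(2, 8, 3, 64, 64)  # 2 clips of 8 random frames each
with torch.no_grad():
    probs = torch.sigmoid(model(clip))  # per-frame ad probability
```

Here `probs` has one value per frame in each clip. An audio branch could be fused by concatenating its per-window features with the CNN features before the LSTM, though how the original system combines the two signals is not specified here.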

Critically, the model is non-reference: it does not need a library of known creatives to detect that something is an ad. That makes it usable in scenarios where inventory changes constantly or where reference data isn’t available.

The production system runs as an edge-optimised inference service (Python + FastAPI + Redis) achieving sub-second latency across distributed OTT nodes. CI/CD pipelines (GitLab + MLflow) drive continuous retraining with automated rollback and environment versioning, and Grafana/Prometheus dashboards track latency, model drift, and campaign KPIs.
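One way to picture the retraining-with-rollback loop is as a promotion gate: a retrained model replaces the current one only if it holds up on a labelled evaluation set. The sketch below is an assumption about how such a gate could look, not the actual pipeline; the function names, the 0.97 precision floor (which simply mirrors the reported figure), and the recall tolerance are all hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels, where 1 = advertisement."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def should_promote(candidate, production,
                   min_precision=0.97, max_recall_drop=0.01):
    """Hypothetical gate: promote only if precision holds and recall barely moves.

    `candidate` and `production` are (precision, recall) tuples measured
    on the same evaluation set.
    """
    cand_p, cand_r = candidate
    prod_p, prod_r = production
    return (cand_p >= min_precision
            and cand_p >= prod_p
            and cand_r >= prod_r - max_recall_drop)


# Toy evaluation: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = precision_recall(y_true, y_pred)

promote_ok = should_promote((0.98, 0.90), (0.97, 0.905))
promote_bad = should_promote((0.95, 0.95), (0.97, 0.90))
```

In a CI/CD job, a failed gate would leave the current model version in place, which is the simplest form of automated rollback.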

Production scale

The engine runs on 80+ live channels, classifying advertisements in real time at 97% precision. Downstream services use its output for:

  • Ad replacement - swap a default ad with a personalised or geo-targeted creative at playback time.
  • Contextual targeting - match ads to surrounding content based on detected scene/audio context.
  • Compliance and reporting - verify ad delivery and break structure across live and VOD inventory.
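Downstream use cases like these need ad breaks with start and end times rather than raw per-window scores. The sketch below shows one plausible post-processing step: threshold the scores, merge consecutive ad windows into segments, and drop segments too short to be a real break. The `Segment` type, the one-second window, the 0.5 threshold, and the five-second minimum are illustrative assumptions, not values from the production system.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start_s: float
    end_s: float


def ad_segments(probs, window_s=1.0, threshold=0.5, min_len_s=5.0):
    """Merge consecutive above-threshold windows into ad segments."""
    segments = []
    start = None  # index of the first window in the current ad run
    for i, p in enumerate(probs):
        is_ad = p >= threshold
        if is_ad and start is None:
            start = i
        elif not is_ad and start is not None:
            segments.append(Segment(start * window_s, i * window_s))
            start = None
    if start is not None:  # stream ended while still inside an ad run
        segments.append(Segment(start * window_s, len(probs) * window_s))
    # Discard runs shorter than the minimum plausible ad length.
    return [s for s in segments if s.end_s - s.start_s >= min_len_s]


scores = [0.1, 0.2, 0.9, 0.95, 0.8, 0.9, 0.97, 0.3, 0.1, 0.9, 0.2]
breaks = ad_segments(scores)
# breaks == [Segment(start_s=2.0, end_s=7.0)]; the lone spike at index 9 is dropped
```

Filtering short runs trades a little recall for precision, which suits ad replacement, where a false positive would overwrite real content.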

What this taught me

AI in AdTech is often presented as plug-and-play. The reality is harder: live streams, manifest quirks, edge cases in transcoding, and the gap between “model accuracy in a notebook” and “model accuracy in production at scale” make end-to-end delivery a different problem from model development. Bridging that gap - from research to deployable product, with retraining, observability, and rollback baked in - is where the real engineering work lives.