What latency do we need for our live broadcast?

It depends on interactivity requirements. Standard latency (20-45 seconds) works for concerts and lectures. Low latency (3-8 seconds) suits most professional events with live polls and chat. Ultra-low latency (under 1 second via WebRTC) is needed only where real-time viewer response is central, such as interactive auctions or live shopping.

What redundancy is required for professional live streaming?

Redundancy is needed at every level: dual encoders, at least two separate ingest points in different data centers, active-active transcoding configuration, multi-CDN setup with automatic failover, and a primary fiber connection with 4G/5G backup at the production site.

What is the difference between live streaming and video on demand (VOD)?

VOD files are already transcoded and distributed when the viewer presses play - failures are invisible. With live streaming, content is produced in real time with no opportunity for retakes. It requires time synchronization, real-time scaling, fault tolerance without a buffer, and active latency management.

Do we need interactive features for our live broadcast?

For most events, yes. Real-time chat, live polls, and Q&A features distinguish a streamed experience from passive consumption and drive engagement data that remains valuable long after the event. They do require separate infrastructure outside the video pipeline.

Live streaming for events: infrastructure, technology, and decisions that hold

There is a moment in every live broadcast that no one on the production team mentions afterward but that everyone remembers: the three seconds when the signal drops, when commentators fill the silence with half-formed sentences, and tens of thousands of viewers stare at a spinning buffer icon. Preventing that moment is what this article is about.

Live streaming for events - conferences, sports competitions, concerts, product launches, shareholder meetings - places fundamentally different demands on infrastructure than video on demand. Errors happen in real time, in front of an audience, with no opportunity to take another shot. The technical stack must be correctly dimensioned before the broadcast begins, because there is no room for improvisation once it is running.

This guide is written for those responsible for making the broadcast work: technical directors, product managers, and event producers who need to understand the infrastructure well enough to make the right decisions and place the right demands on their vendors.

Live versus VOD - the critical differences

Video on demand and live streaming share a significant amount of surface-level technology - codecs, container formats, CDN delivery - but they differ fundamentally in error model and system design requirements.

With VOD, the file is already transcoded and distributed by the time the viewer presses play. If a CDN node fails, another can serve the same content. Buffers can be generous. The failure is invisible to the end user.

With live streaming, the content is a continuous stream of segments produced in real time. There is no file to fall back on. Every segment that is not delivered on time is a visible interruption. The system must handle:

Time synchronization. Every part of the pipeline - encoder, ingest, transcoding, origin, CDN, player - must maintain a coordinated timeline.
Real-time scaling. A large conference can go from zero to hundreds of thousands of concurrent connections in minutes. The CDN must handle that without advance notice.
Fault tolerance without a buffer. There is no pre-produced material to fall back on. Redundancy must be built into every component.
Latency sensitivity. Depending on the use case, the difference between ten seconds and one second of delivery lag can be critical to the user experience.

These differences are what justify treating live streaming as a distinct infrastructure problem, separate from how you approach VOD delivery.

Infrastructure: encoding, ingest, and CDN

A live streaming pipeline has three primary components: encoding, ingest, and distribution. Each introduces latency and potential failure points. Understanding them separately is the prerequisite for dimensioning correctly.

Encoding

Encoding is the process of converting a raw signal - camera output, HDMI feed, screen capture - into a compressed format suitable for network delivery. This happens in either hardware encoders (dedicated devices at the production site) or software encoders running on server or cloud infrastructure.

For events with high image quality requirements and low latency demands, hardware encoding is the standard choice. Hardware encoders handle H.264 and H.265 encoding using dedicated circuits, providing lower and more predictable processing times than software alternatives under load.

Multi-bitrate encoding - producing several parallel streams of the same content at different resolutions and bitrates - happens either in the encoder itself or in a downstream cloud transcoding layer. The result is that client-side players can switch quality levels adaptively based on available bandwidth, which is standard for all professional live solutions.
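
As a rough illustration of the adaptive switching described above, the sketch below picks a rendition from a multi-bitrate ladder given a measured bandwidth. The `Rendition` type, the ladder values, and the 0.8 safety factor are assumptions made for the example, not recommendations from any particular encoder or player.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int
    bitrate_kbps: int


# Hypothetical ladder; real ladders depend on content and audience.
LADDER = [
    Rendition("1080p", 1080, 6000),
    Rendition("720p", 720, 3000),
    Rendition("480p", 480, 1500),
    Rendition("240p", 240, 500),
]


def pick_rendition(measured_kbps, ladder=LADDER, safety=0.8):
    """Return the highest-bitrate rendition that fits the bandwidth budget.

    The budget is the measured bandwidth scaled by a safety factor, so a
    small dip in throughput does not immediately stall playback. If nothing
    fits, fall back to the lowest rendition rather than stopping.
    """
    budget = measured_kbps * safety
    candidates = [r for r in ladder if r.bitrate_kbps <= budget]
    if candidates:
        return max(candidates, key=lambda r: r.bitrate_kbps)
    return min(ladder, key=lambda r: r.bitrate_kbps)
```

Real players also weigh buffer level and recent throughput history; this sketch covers only the basic bandwidth comparison.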

Ingest

The ingest point is where the encoded signal is received by the streaming infrastructure. The dominant protocols are RTMP (still widespread due to broad encoder support), SRT (Secure Reliable Transport, which handles network instability more effectively), and RIST (Reliable Internet Stream Transport, used for critical broadcasts over unreliable network links).

The choice of ingest protocol affects latency and fault tolerance. SRT and RIST build error correction in at the protocol level, making them well-suited for broadcasts over public internet links where packet loss is a real scenario - such as from a field studio or a remote location.

Ingest servers should be geographically positioned close to the source to minimize network latency. For global events with multiple production sites, use a separate ingest point per region and carry the signal over an internal backbone network to a central transcoding layer.

CDN

The CDN layer is responsible for distributing the finished segments to viewers regardless of where they are located. For live streaming, CDN performance is not primarily a question of file size - it is about the ability to handle a massive number of concurrent connections with minimal cache fragmentation.

Live streaming segments are short (typically 2-10 seconds depending on the latency target) and have a very short cache lifetime. The CDN therefore needs enough edge capacity, and it must collapse simultaneous requests for the same segment into a single origin fetch (request coalescing, often combined with an origin shield). Otherwise, thousands of players requesting a new segment at almost the same moment can overload the origin.
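
One common way to keep simultaneous requests for the same segment from all reaching the origin is request coalescing at the cache layer. The following is a minimal sketch of the idea using Python's asyncio, not a description of how any specific CDN implements it. The `CoalescingCache` class, the `fake_origin` function, and the segment name are hypothetical, and cache expiry is left out for brevity.

```python
import asyncio


class CoalescingCache:
    """Collapse concurrent requests for one segment into one origin fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # async callable: key -> bytes
        self._inflight = {}              # key -> asyncio.Task in progress
        self._cache = {}                 # key -> bytes (no TTL in this sketch)

    async def get(self, key):
        if key in self._cache:
            return self._cache[key]
        task = self._inflight.get(key)
        if task is None:
            # First requester starts the origin fetch; later ones share it.
            task = asyncio.create_task(self._fetch(key))
            self._inflight[key] = task
        try:
            data = await task
        finally:
            # Clear the in-flight entry so a failed fetch can be retried.
            self._inflight.pop(key, None)
        self._cache[key] = data
        return data


async def demo(viewers=1000):
    origin_hits = 0

    async def fake_origin(key):
        # Stand-in for a real origin request (assumption for the example).
        nonlocal origin_hits
        origin_hits += 1
        await asyncio.sleep(0.01)
        return f"data:{key}".encode()

    cache = CoalescingCache(fake_origin)
    results = await asyncio.gather(
        *(cache.get("seg_100.m4s") for _ in range(viewers))
    )
    return origin_hits, results
```

Running `asyncio.run(demo())` simulates a thousand players asking for the same new segment at once; the origin is contacted a single time and every caller receives the same bytes.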

Choose a CDN with demonstrated capacity for live streaming at scale - not only VOD - and verify with load testing before the event day.

Latency tiers: standard, low, and ultra-low

Latency in live streaming is measured from camera capture to the image appearing on the viewer's screen. There are three established tiers, each with their own technical consequences and appropriate use cases.

Standard latency (20-45 seconds)

HLS (HTTP Live Streaming) in its standard configuration with segment lengths of 6-10 seconds typically yields 20-45 seconds of latency from source to screen. This is fully acceptable for events where interactivity is not critical - a concert, a ceremony, a lecture - and provides maximum stability and CDN efficiency.
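
To see where the 20-45 seconds come from, a rough latency budget helps. The sketch below assumes the player holds about three segments in its buffer before starting playback and that a segment must be complete before it is published. The one-second figures for encoding and network transfer are placeholders, not measurements.

```python
def estimate_latency_s(segment_s, buffered_segments=3,
                       encode_s=1.0, network_s=1.0):
    """Rough glass-to-glass latency estimate for segmented HTTP streaming.

    Components (all assumptions for illustration):
      encode_s          - encoder and transcoder processing time
      segment_s         - one full segment must be produced before publishing
      network_s         - ingest, origin, and CDN transfer time
      buffered_segments - segments the player buffers before playback starts
    """
    packaging_s = segment_s
    player_buffer_s = buffered_segments * segment_s
    return encode_s + packaging_s + network_s + player_buffer_s
```

Under these assumptions, 6-second segments give roughly 26 seconds and 10-second segments roughly 42 seconds. The segment length dominates the total, which is why the lower latency tiers work with shorter segments or partial segments.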

At this latency, spoilers are a real risk for sports events: viewers who follow social media or messaging apps in parallel can learn of a goal before they see it. If that is a concern, choose a lower latency tier.

Low latency (3-8 seconds)

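Low-latency HLS (LL-HLS) and low-latency DASH achieve 3-8 seconds of latency by publishing each segment in small parts or chunks that the player can fetch before the full segment is complete. LL-HLS uses blocking playlist reloads and preload hints so the player does not have to poll repeatedly for updates, while low-latency DASH relies on chunked transfer of CMAF chunks. This enables more meaningful interactivity - live polls, real-time comments, synchronized shared experiences - without requiring a radically different infrastructure.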

This is the tier we recommend for most professional event broadcasts. Infrastructure complexity is manageable, and the user experience is sufficiently synchronized for most interactive use cases.

Ultra-low latency (under 1 second)

WebRTC enables sub-second latency and is used in video conferencing, interactive gaming experiences, and broadcasts where real-time viewer response is central - such as interactive auctions or live shopping.

The trade-off is scalability. WebRTC is peer-to-peer by design and requires specialized media server architectures (SFU - Selective Forwarding Unit) to scale to large viewer counts. It is more expensive to operate and requires more infrastructure expertise. For an event with hundreds of thousands of viewers, the standard or low-latency tier is almost always the correct choice unless the interactivity requirement explicitly justifies WebRTC.

Redundancy and failover

A live streaming infrastructure without redundancy is not a live streaming infrastructure - it is an outage waiting to happen. Redundancy must be planned at every level of the pipeline so that no component is a single point of failure.

Encoder redundancy. Run primary and backup encoders in parallel from the same source signal. If the primary unit fails, the backup takes over without manual intervention. This requires the ingest layer to accept dual streams and switch automatically.
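
The automatic switch at the ingest layer can be sketched as a small watchdog that tracks when each encoder last delivered data. The `DualIngestSelector` class, the source names, and the two-second timeout are hypothetical choices for this example; real ingest services expose their own failover settings.

```python
import time


class DualIngestSelector:
    """Choose between primary and backup encoder feeds by recent activity."""

    def __init__(self, timeout_s=2.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock  # injectable so failover can be tested
        self._last_seen = {"primary": None, "backup": None}
        self.active = "primary"

    def on_packet(self, source):
        """Record that a feed ('primary' or 'backup') just delivered data."""
        self._last_seen[source] = self._clock()

    def _healthy(self, source):
        seen = self._last_seen[source]
        return seen is not None and self._clock() - seen <= self.timeout_s

    def select(self):
        """Return the feed to use, switching only if the active one is stale."""
        if not self._healthy(self.active):
            other = "backup" if self.active == "primary" else "primary"
            if self._healthy(other):
                self.active = other
        return self.active
```

The selector is deliberately non-revertive: once it has switched to the backup, it stays there until the backup itself goes stale. That avoids flapping between feeds when the primary encoder recovers intermittently.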

Ingest redundancy. Send the signal to at least two separate ingest points, ideally in different data centers. If one ingest point drops, the other continues delivering.

Transcoding redundancy. Cloud transcoding in managed services typically handles internal redundancy, but verify that your vendor runs active-active (not active-passive) configuration for critical broadcasts.

CDN redundancy. A multi-CDN setup - distributing traffic across two or more CDN providers with automatic failover - is standard practice for high-stakes events. It is more expensive but eliminates CDN outage as a single point of failure.
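
A minimal sketch of multi-CDN traffic steering follows, assuming a health check elsewhere reports which providers are currently up. The provider names and weights are invented for the example, and real setups usually steer through DNS or a manifest service rather than a function call.

```python
import random


def choose_cdn(weights, healthy, rng=random):
    """Pick a CDN for one viewer session.

    weights: dict of CDN name -> share of traffic when everything is healthy.
    healthy: dict of CDN name -> bool from an external health check.
    Unhealthy CDNs are skipped, so their traffic shifts to the remaining ones.
    """
    candidates = {
        name: weight
        for name, weight in weights.items()
        if weight > 0 and healthy.get(name, False)
    }
    if not candidates:
        raise RuntimeError("no healthy CDN available")
    names = list(candidates)
    return rng.choices(names, weights=[candidates[n] for n in names], k=1)[0]
```

With weights of `{"cdn-a": 70, "cdn-b": 30}`, marking `cdn-a` unhealthy sends every new session to `cdn-b` without any manual change.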

Network redundancy at the production site. A primary dedicated fiber connection with 4G/5G backup is the minimum for events outside dedicated broadcast studios. Consider bonded cellular for critical remote productions.

Always test failover scenarios explicitly in advance. A redundant setup that has never been tested is a setup that will probably not function as intended when it matters.

Interactive features: chat, polls, and Q&A

Live streaming is not television - audiences expect the opportunity to participate. Interactive features are not merely nice-to-have; they are what distinguishes a streamed experience from passive consumption and they drive engagement data that remains valuable long after the event concludes.

Chat

Real-time chat at large scale requires infrastructure separate from the video pipeline. WebSocket-based chat systems with back-pressure handling and a moderation layer are standard. Key considerations include:

Automated moderation to handle flows moving faster than human moderators can follow.
Message throttling per user to prevent chat from being dominated by individual actors at scale.
Integration with identity systems if chat should be linked to authenticated users.
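
Per-user throttling is often implemented as a token bucket, which permits short bursts while capping the sustained rate. The sketch below is one possible in-memory version; the class names, the rate of one message every two seconds, and the burst of three are assumptions for illustration.

```python
import time


class TokenBucket:
    """Allow short bursts while capping the sustained message rate."""

    def __init__(self, rate_per_s, burst, clock=time.monotonic):
        self.rate_per_s = rate_per_s
        self.burst = burst
        self._clock = clock
        self._tokens = float(burst)
        self._updated = clock()

    def allow(self):
        now = self._clock()
        elapsed = now - self._updated
        self._updated = now
        # Refill in proportion to elapsed time, never above the burst size.
        self._tokens = min(self.burst, self._tokens + elapsed * self.rate_per_s)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class ChatThrottle:
    """Keep one token bucket per user id."""

    def __init__(self, rate_per_s=0.5, burst=3, clock=time.monotonic):
        self._rate_per_s = rate_per_s
        self._burst = burst
        self._clock = clock
        self._buckets = {}

    def allow(self, user_id):
        bucket = self._buckets.get(user_id)
        if bucket is None:
            bucket = TokenBucket(self._rate_per_s, self._burst, self._clock)
            self._buckets[user_id] = bucket
        return bucket.allow()
```

A chat server would call `throttle.allow(user_id)` before accepting each message and reject or delay the message when it returns `False`. In a multi-node deployment the buckets would need to live in shared storage, which this sketch does not cover.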

Polls

Live polls are one of the most effective tools for increasing engagement and collecting data. They require a separate data infrastructure to handle a potentially massive number of simultaneous responses within a short time window - think tens of thousands of responses within ten seconds of a question being presented.
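
The core of a poll backend is small: accept one vote per user and keep a running count. The in-memory sketch below shows only that logic, with a hypothetical `PollTally` class. Handling tens of thousands of responses in seconds would additionally require a shared, horizontally scaled store, which is outside the scope of this example.

```python
from collections import Counter


class PollTally:
    """Count at most one vote per user for a fixed set of options."""

    def __init__(self, options):
        self._options = tuple(options)
        self._choice_by_user = {}

    def vote(self, user_id, option):
        """Record a vote; return False for unknown options or repeat voters."""
        if option not in self._options or user_id in self._choice_by_user:
            return False
        self._choice_by_user[user_id] = option
        return True

    def results(self):
        """Return a count for every option, including those with zero votes."""
        counts = Counter(self._choice_by_user.values())
        return {option: counts[option] for option in self._options}
```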

Connect polling data to analytics to understand which parts of an event generate the most engagement. That is valuable information for future production decisions.

Q&A

Structured Q&A - where the audience submits questions that are then moderated and answered by speakers - is particularly valuable at conferences and shareholder meetings. Implement upvoting features that let the audience surface the questions they consider most important, which streamlines the moderator's work and gives speakers signals about what the audience actually wants to know.
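
The upvoting mechanism can be sketched as a board that stores one set of voters per question and ranks by vote count. `QuestionBoard` and its methods are hypothetical names, and moderation (approving questions before they become visible) is left out to keep the example short.

```python
class QuestionBoard:
    """Collect audience questions and rank them by distinct upvotes."""

    def __init__(self):
        self._texts = {}    # question id -> question text
        self._voters = {}   # question id -> set of user ids who upvoted
        self._next_id = 1

    def submit(self, text):
        qid = self._next_id
        self._next_id += 1
        self._texts[qid] = text
        self._voters[qid] = set()
        return qid

    def upvote(self, qid, user_id):
        """Add one upvote per user; return False if the user already voted."""
        voters = self._voters[qid]  # raises KeyError for an unknown question
        if user_id in voters:
            return False
        voters.add(user_id)
        return True

    def top(self, n=5):
        """Return (text, votes) pairs, most upvoted first, oldest first on ties."""
        ranked = sorted(
            self._texts, key=lambda qid: (-len(self._voters[qid]), qid)
        )
        return [(self._texts[qid], len(self._voters[qid])) for qid in ranked[:n]]
```

A moderator view would call `top()` periodically to see which questions the audience currently considers most important.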

After the event: automatic VOD archival

The broadcast is over. The work is not.

A well-designed live streaming solution automatically archives the broadcast as VOD immediately after the stream concludes. This means the same content delivered live - with the same metadata, chapter markers, and interaction data - is available for on-demand consumption without manual post-production.

This creates immediate residual value: viewers who missed the live broadcast can watch immediately, speakers can share specific sessions, and marketing efforts can continue driving traffic to the content for weeks and months afterward.

Plan for automatic transcription and keyword indexing of archived material. This improves searchability, accessibility, and the ability to repurpose content segments in other formats.

Link the VOD archive to your streaming platform solution and ensure your player supports timestamp sharing and chapter navigation - those are the features that determine how effectively the archived content is actually used.
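
Timestamp sharing and chapter navigation come down to two small operations: adding a start offset to a link and finding the chapter that contains a given position. The sketch below assumes a `t` query parameter in whole seconds and a chapter list sorted by start time. Both are conventions chosen for the example, so check what your player actually expects.

```python
from bisect import bisect_right
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def share_url(base_url, seconds):
    """Return base_url with a 't' parameter set to the given offset."""
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    query["t"] = str(int(seconds))  # truncate to whole seconds
    return urlunsplit(parts._replace(query=urlencode(query)))


def chapter_at(chapters, seconds):
    """Return the title of the chapter containing the given position.

    chapters: list of (start_seconds, title) tuples sorted by start time.
    Returns None if the position lies before the first chapter.
    """
    starts = [start for start, _ in chapters]
    index = bisect_right(starts, seconds) - 1
    return chapters[index][1] if index >= 0 else None
```

For example, `share_url("https://example.com/watch?v=abc", 754.6)` produces a link that starts playback at 754 seconds, and `chapter_at` tells the player which chapter marker to highlight at that position.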

Summary

A professional live streaming solution for events is not a product - it is a pipeline of technical decisions that must hold together. Encoding, ingest, CDN, latency tier, redundancy, and interactivity are not separate product choices; they are interconnected decisions where each one affects the others.

The key decisions that determine the outcome:

Choose your latency tier based on actual interactivity requirements, not on what sounds most impressive.
Build redundancy at every level - plan for the failure, not against it.
Test failover scenarios explicitly, not theoretically.
Design for VOD archival from the start, not as an afterthought.
Choose ingest protocol based on the network conditions at the production site.

Broadcast infrastructure is a domain where the competency to place the right demands on vendors is as important as choosing the right vendors. At Shapp, we work with live streaming and OTT solutions and help organizations build infrastructure that holds when it counts.

Want to discuss the infrastructure for your next event? Get in touch and we will walk through the requirements together.