Sports leagues are busy announcing billion-dollar rights deals with streaming providers, which are on a quest for more eyeballs and ad revenue amid falling subscriptions (see the MLS and UEFA rights announcements).
This move provides fans with many benefits: freedom from long-term contracts, the ability to cancel any service at any time, and the option to watch games anywhere, on any device.
But streaming live sports is one of the hardest video delivery tasks, and sports fans are the least forgiving and most vocal audience, especially when the video quality falls short of their expectations. That's no surprise: these subscribers are the most emotionally engaged; they invest in expensive live TV/streaming packages and ever-larger TVs; and they often watch games socially. That puts extra pressure on services to deliver quality comparable to what viewers are accustomed to with VOD content.
This weekend, the New York Yankees took 2 of the 3 games vs. the Boston Red Sox. But when it comes to live sports streaming, who won in the Viewer Experience battle?
To answer this question, we performed a comparative ranking analysis of five of the eight live streaming services that showed Yankees-Red Sox games throughout the July 15-17 weekend: YouTube TV, Hulu, Sling, Prime Video, Sportsnet, FuboTV, DirecTV Stream, and MLB.TV. We ensured that the top profile was always fetched and scored, so that delivery quality (such as rebuffering and profile switching) did not impact the Viewer Experience. In other words, the scores reflect the best quality each streaming service offered. Using the SSIMPLUS Viewer Score (SVS), which rates viewer quality from 0-100, here is what we found:
- The average SSIMPLUS Viewer Score across all eight live streams was 74.5, indicating less-than-stellar video quality. None of the services delivered a viewer experience that would make fans feel as if they were at the stadium, no matter the TV screen size.
| Date | Yankees vs. Red Sox Game | Service | SSIMPLUS Viewer Score |
| --- | --- | --- | --- |
| July 15 | Game 1 | Provider 1 | 79.3 |
| July 15 | Game 1 | Provider 2 | 73.3 |
| July 15 | Game 1 | Provider 3 | 76.1 |
| July 16 | Game 2 | Provider 1 | 76.2 |
| July 16 | Game 2 | Provider 2 | 70.3 |
| July 17 | Game 3 | Provider 3 | 75.9 |
| July 17 | Game 3 | Provider 4 | 72.6 |
| July 17 | Game 3 | Provider 5 | 72.3 |
*Note: Provider 1 for Game 1 is not the same service as Provider 1 for Game 2.
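As a quick sanity check, the 74.5 weekend average quoted above follows directly from the eight per-stream scores in the table:

```python
# The eight SSIMPLUS Viewer Scores from the table above.
scores = [79.3, 73.3, 76.1, 76.2, 70.3, 75.9, 72.6, 72.3]

average = sum(scores) / len(scores)
print(f"average SVS: {average:.1f}")              # average SVS: 74.5
print(f"range: {min(scores)} - {max(scores)}")    # range: 70.3 - 79.3
```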
- Significant variability in quality within each game and across providers. The overall SSIMPLUS Viewer Score for each game masks a bigger problem: there are huge differences in video quality throughout the games, with quality dipping as low as an SVS of 40. Also, no provider performed consistently well across all of the games throughout the weekend.
Why is a Viewer Score of 40 an issue? Numerous studies show that poor video quality leads to negative emotions, less engagement, and more frustration for viewers. These reactions are felt more strongly by sports fans, who are emotionally tied to their favorite team and expect to immerse themselves in the game as if they were at the stadium.
Below is a quick overview with visual examples of what we measured this weekend - the quality was not up to what subscribers would expect:
SSIMPLUS Viewer Score of 54 – A view of the dugout after a Yankees home run; the faces are fuzzy and blurred, disappointing for fans who would like to see their team's celebration clearly. Here is a closer look at the quality on the screen.
SVS of 66 – Subscribers are unable to see players' faces clearly.
SVS of 71 – Shows how frustrating it is for subscribers to watch a player at bat.
SVS of 57 – Fans feel as though they are watching a video game from the 1980s; the diamond, the faces, and the bleachers are very blurry, which makes the viewer experience frustrating.
- When it comes to the video quality of the ads inserted during the games, the scores were even worse; in one of the games we measured ads with a SSIMPLUS Viewer Score of 30. The visuals below demonstrate the poor quality. No brand manager would be happy to see their ad displayed at this level of quality.
Ad engagement will be very low when viewers cannot clearly read the website URL in the video advertisement.
SVS of 67 was measured for this ad, which is for one of the biggest Gen Z brands in the world.
An ad with very visible banding in the background. Banding is one of the most frustrating video quality impairments.
This is an ongoing analysis and we will add more measurements. Subscribe to our blog to receive the next analysis.
What and how we are monitoring
- How we perform the test:
The goal of this report is to assess content processing and encoding performance across a sports workflow. The performance of content delivery and playback is excluded by ensuring ideal delivery and playback conditions while scoring the content. When the content was streamed, we made sure we were fetching the top profiles at all times, which means the performance of ISPs, content delivery, and players did not negatively affect the viewer experience. Since video encoding and processing performance has huge room for improvement, assessing content delivery and playback performance is left for future work. The ABR profile was fixed to the top profile only, and the results (including visual examples) are provided just for the top profile. The content is captured and scored in raw form using HDMI capture cards installed in a computer.
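Pinning playback to the top profile can be sketched as a manifest parse that always selects the highest-bandwidth representation. The manifest below is a hypothetical, stripped-down example (real MPDs are namespaced and far richer), not one from a measured service:

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal DASH manifest fragment, for illustration only.
MPD = """
<MPD>
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="540p"  bandwidth="1800000" width="960"  height="540"/>
      <Representation id="720p"  bandwidth="3500000" width="1280" height="720"/>
      <Representation id="1080p" bandwidth="6500000" width="1920" height="1080"/>
    </AdaptationSet>
  </Period>
</MPD>
"""

def top_profile(mpd_xml: str):
    """Return the Representation element with the highest declared bandwidth,
    i.e. the profile a player pinned to "top profile only" would fetch."""
    root = ET.fromstring(mpd_xml)
    return max(root.iter("Representation"), key=lambda r: int(r.get("bandwidth")))

best = top_profile(MPD)
print(best.get("id"), best.get("bandwidth"))  # 1080p 6500000
```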
SSIMPLUS Scale: what's a good score?
The SSIMPLUS Viewer Score is a linear scale from 0-100, where 0 means bad (a worse quality cannot be imagined) and 100 means excellent (no visible impairments). SVS is highly correlated with how an actual human being - an average viewer watching on their device - would evaluate the video quality. An overall score of 74.5 (the average we measured for one of the MLB weekends) across the monitored streaming platforms is "good enough" - not pleasing to the eye, and mildly annoying - which is why these services continue to serve millions of viewers. The viewing experience (and the objective scores) is significantly worse than on premium SVOD platforms, which generally score above an SVS of 80. There is also significant variation in quality across the various streaming options and, as the visual examples show, significant variation in each platform's quality over the course of a game.
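A coarse mapping from score to viewer impact can make the scale concrete. The band boundaries below are illustrative, chosen to match the narrative in this post (premium SVOD generally above 80, ~74.5 merely "good enough", dips to 40 clearly frustrating); they are not official SSIMPLUS definitions:

```python
def svs_band(score: float) -> str:
    """Map an SVS (0-100) to an illustrative quality label.
    Boundaries are this sketch's assumptions, not SSIMPLUS spec."""
    if score >= 80:
        return "premium (comparable to SVOD)"
    if score >= 70:
        return "good enough, mildly annoying"
    if score >= 55:
        return "visibly impaired"
    return "frustrating"

print(svs_band(74.5))  # good enough, mildly annoying
print(svs_band(40))    # frustrating
```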
Impairment identification and localization (where in the delivery chain are the impairments originating)
We can localize any video quality issue if we monitor at the appropriate points. The current monitoring setup for benchmarking streaming services "watches" content with software probes, as viewers do, at the last point of the delivery chain. The score does not localize drops in video quality (whether the drops are due to bad sources or poor compression; to do that, we need to monitor at upstream points in the workflow), but it does identify what viewers are perceiving, which is a great first step toward improving viewer experience. Based on our vast experience with content delivery chains and perceptual quality assessment, we can predict where an issue may originate after a deeper analysis of the impairments.
- Technical specs (resolutions, codec, bitrate)
We monitor each service at the highest resolution it is capable of delivering, forcing that resolution in the respective app while ensuring the setup supports very high bitrates. Sports streaming services generally deliver at 1280x720, 1920x1080, and 3840x2160 resolutions at 59.94 or 60 frames per second. The most common codec is H.264/AVC, and the most common ABR standard is MPEG-DASH. Bitrates are typically between 5 and 8 Mbps, using variable-bitrate approaches.
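To put those bitrates in perspective, a rough bits-per-pixel calculation shows how little coding budget a 1080p60 sports stream gets at 6 Mbps (the numbers are illustrative averages from the ranges above, not measurements from a specific service):

```python
def bits_per_pixel(bitrate_bps: int, width: int, height: int, fps: float) -> float:
    """Average coded bits available per pixel per frame."""
    return bitrate_bps / (width * height * fps)

# 1080p60 at 6 Mbps leaves roughly 0.05 bits per pixel, one reason
# fast-moving sports content (crowds, grass texture) is so hard to
# encode cleanly at these bitrates.
bpp = bits_per_pixel(6_000_000, 1920, 1080, 60)
print(f"{bpp:.3f} bits/pixel")  # 0.048 bits/pixel
```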
- Other measurement considerations
SSIMPLUS scores are device adaptive. The data presented in these blog posts correspond to the viewer experience on a 65" 4K TV. Higher resolutions benefit from greater content detail, when available; 4K content has a much higher likelihood of reaching an SVS of 100, which it does when the content contains no perceptually visible impairments.
The scores provided in this blog are overall scores across the event. We have in-depth per-second and per-frame scores that show quality variation across content.
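With per-second scores available, quality dips like the 40-SVS moments mentioned earlier can be flagged automatically. A minimal sketch, run here on synthetic (not measured) per-second data:

```python
def find_dips(per_second_svs, threshold=60):
    """Return (start_sec, end_sec, min_svs) for each contiguous run of
    seconds whose score falls below the threshold."""
    dips, start = [], None
    for t, s in enumerate(per_second_svs):
        if s < threshold and start is None:
            start = t                      # dip begins
        elif s >= threshold and start is not None:
            dips.append((start, t - 1, min(per_second_svs[start:t])))
            start = None                   # dip ends
    if start is not None:                  # dip runs to end of trace
        dips.append((start, len(per_second_svs) - 1, min(per_second_svs[start:])))
    return dips

# Synthetic trace: steady mid-70s SVS with one dip into the 40s.
trace = [76, 75, 74, 58, 41, 47, 63, 72, 77]
print(find_dips(trace))  # [(3, 5, 41)]
```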
- How does the score compare from game to game or content to content?
The score considers saliency by modeling visual attention when assessing perceptual quality. Generally speaking, impairments in areas that matter more to viewers drag the score down further than impairments in areas that do not matter as much. As a result of this unique ability of the score, baseball scores are comparable to F1 scores and to any other sport we monitor.
- How we take the screen captures and what their measurements actually show
Content is scored in raw format in real time using HDMI capture cards. The images shared in this blog are frames extracted from the captures, triggered by quality thresholds. The scores for the frames presented in this blog post are "video scores" measuring the overall viewer experience when a specific frame is played on a 65" 4K TV screen, not scores for just that exact frame (which can also be provided).