Streaming Platforms Fail to Provide All-Star Quality

July 20, 2022

Last night, July 19th, our software virtual eyes watched the MLB All-Star game across two different streaming platforms. This is what they saw: the average quality was 72.6 SSIMPLUS Viewer Score (SVS), compared to the weekend Yankees vs. Red Sox games’ average of 74.6 SVS.

Provider 1 - 68.7 SVS

Provider 2 - 76.4 SVS

Download the captured frames in high resolution (zip file).

Provider 1 - A lot of blurred, fuzzy images, especially during player movement. 51 SSIMPLUS Viewer Score.

Provider 1 - Encoders don’t like confetti. 60 SVS.

Provider 2 - Even static moments, such as pre-pitch preparation, had sub-par quality: the faces in the batter's box are smudged, you can hardly make out the batter’s features, the infield dirt looks fuzzy, the fans in the bleachers are blurred, and the grass is one green sea rather than the individual blades you would see with good video quality. The whole picture lacks crispness and the details are smeared; it is those details that make for a more engaging and pleasing viewer experience.

Capture zoom - Who is this player? You can hardly make out the NY Yankees logo on the cap.

The umpire camera gives an interesting angle on the game, but its quality of 70 SVS is not good.

Provider 1 - Ad quality was no better - an SVS of 70. Consumer research shows that people engage more with ads when the video quality is high. A poor viewer experience does not help with the advertiser’s goals of building better relationships with consumers and delighting their senses.

What and how we are monitoring

  1. How we perform the test:

    The goal of the report is to assess content-processing and encoding performance across a sports workflow. The performance of content delivery and playback is excluded by ensuring ideal delivery and playback conditions while scoring the content. When the content was streamed, we made sure we were fetching the top profiles at all times, which means that the performance of ISPs, content delivery, and players did not negatively affect viewer experience. As video encoding and processing performance has huge room for improvement, assessing content delivery and playback performance is left for future work. The ABR profile was fixed to the top profile only, and the results (including the visual examples) are provided for that profile only. The content is captured and scored in raw form using HDMI capture cards installed in a computer.
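Pinning playback to the top ABR profile, as described above, can be sketched as follows. This is an illustrative sketch only: the rendition fields and ladder values are hypothetical, not the actual manifests we monitored.

```python
# Illustrative: always select the top rendition from an ABR ladder so that
# delivery/player adaptation cannot lower quality during scoring.
# The dict keys and bitrate values below are hypothetical examples.
def top_profile(renditions):
    """Return the rendition with the highest bandwidth from an ABR ladder."""
    return max(renditions, key=lambda r: r["bandwidth"])

ladder = [
    {"resolution": "1280x720",  "bandwidth": 3_000_000},
    {"resolution": "1920x1080", "bandwidth": 6_000_000},
    {"resolution": "3840x2160", "bandwidth": 16_000_000},
]
print(top_profile(ladder)["resolution"])  # → 3840x2160
```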

  2. SSIMPLUS Scale: what's a good score?

    The SSIMPLUS Viewer Score is a linear scale from 0 to 100, where 0 means bad (a worse quality cannot be imagined) and 100 means excellent (no visible impairments). SVS correlates strongly with how an actual human being, an average viewer watching on their device, would rate the video quality. An overall score of 74.5 (the average we measured for one of the MLB weekends) across the monitored streaming platforms is "good enough", even if not pleasing to the eye and mildly annoying, which is why these platforms continue to serve millions of viewers. The viewing experience (and the objective scores) are significantly worse than on premium SVOD platforms, which generally score above an SVS of 80. There is also significant variation in quality across the various streaming options, and, as the visual examples show, significant variation in the quality of each streaming platform over the course of a game.
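As a rough illustration of the scale described above, the bands below are our own reading of the thresholds mentioned in this post (e.g. premium SVOD generally above 80), not an official SSIMPLUS mapping:

```python
# Illustrative only: bucket an SVS value into coarse quality bands.
# Band boundaries are assumptions drawn from the prose above, not a spec.
def svs_band(score):
    if not 0 <= score <= 100:
        raise ValueError("SVS is defined on a 0-100 scale")
    if score >= 80:
        return "premium"      # typical of leading SVOD platforms
    if score >= 70:
        return "good enough"  # watchable, but with visible impairments
    return "poor"             # noticeably degraded viewing experience

print(svs_band(74.6))  # → good enough
```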

  3. Impairment identification and localization (where in the delivery chain are the impairments originating)

    We can localize any video quality issue if we monitor at the appropriate points. The current monitoring setup for benchmarking streaming services "watches" content with software probes, just as viewers do, at the last point of the delivery chain. The score therefore does not localize drops in video quality (to determine whether a drop is due to a bad source or poor compression, we would need to monitor at upstream points in the workflow), but it does identify what viewers are perceiving, which is a great first step towards improving viewer experience. Based on our extensive experience with content delivery chains and perceptual quality assessment, we can predict where an issue may originate after a deeper analysis of the impairments.

  4. Technical specs (resolutions, codec, bitrate)

    The services we compared are monitored at the highest resolution they are capable of delivering; we force that resolution in the respective apps while ensuring that the setup supports very high bitrates. Sports streaming services generally deliver at 1280x720, 1920x1080, or 3840x2160 resolution at 59.94 or 60 frames per second. The most common codec is H.264/AVC, and the most common ABR standard is MPEG-DASH. Bitrates are typically between 5 and 8 Mbps using variable-bitrate encoding.

  5. Other measurement considerations

    SSIMPLUS scores are device-adaptive. The data presented in these blog posts correspond to the viewer experience on a 65" 4K TV. Higher resolutions benefit from higher content detail, when available: 4K content is much more likely to reach an SVS of 100, and it does so when the content has no perceptually visible impairments.

    The scores provided in this blog are overall scores across the event. We have in-depth per-second and per-frame scores that show quality variation across content.
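A minimal sketch of how per-frame scores could roll up into per-second and overall scores. The frame scores and the 60 fps window below are illustrative assumptions, not our actual measurement data:

```python
# Illustrative: average per-frame SVS values within each one-second window,
# then average the per-second values into an overall event score.
def per_second_scores(frame_scores, fps=60):
    """Group a flat list of per-frame scores into per-second averages."""
    return [sum(chunk) / len(chunk)
            for chunk in (frame_scores[i:i + fps]
                          for i in range(0, len(frame_scores), fps))]

frames = [70] * 60 + [80] * 60        # two seconds of hypothetical scores
seconds = per_second_scores(frames)   # [70.0, 80.0]
overall = sum(seconds) / len(seconds) # 75.0
```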

  6. How does the score compare from game to game or content to content?

    The score accounts for saliency by modeling visual attention when assessing perceptual quality. Generally speaking, impairments in areas that matter more to viewers drag the score down further than impairments in areas that matter less. Because of this unique ability of the score, baseball scores are comparable to F1 scores and to those of any other sport we monitor.

  7. How we take the screen captures and what their measurements actually show

    Content is scored in raw format in real time using HDMI capture cards. The images shared in this blog are frames extracted from captures triggered by quality thresholds. The scores shown for these frames are “video scores” measuring the overall viewer experience when a specific frame is played on a 65” 4K TV screen, not scores for the exact frame alone (which we can also provide).
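Threshold-triggered capture, as described above, can be sketched like this. The per-frame scores and the 70 SVS threshold are illustrative assumptions:

```python
# Illustrative: flag the indices of frames whose per-frame score falls
# below a chosen SVS threshold, so those frames can be extracted later.
def frames_to_capture(scores, threshold=70):
    """Return indices of frames scoring below the capture threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]

scores = [82, 75, 68, 51, 73, 60]     # hypothetical per-frame SVS values
print(frames_to_capture(scores))      # → [2, 3, 5]
```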

Read our detailed analysis of the Yankees vs. Red Sox weekend games viewer experience.

This is an ongoing analysis and we will add more measurements. Subscribe to our blog to receive the next analysis.

Learn how you can improve the viewer experience of your live sports streaming service.

Download the streaming sports white paper