When you’re committed to quality, it shows.
Over the weekend we monitored three different streaming services covering one of the toughest sports to televise – Formula 1. With fast motion, switches between helmet and onboard cams, swooping helicopter shots and more, F1 could be a video quality expert’s worst nightmare. But F1 TV’s stream of the French Grand Prix lapped the field with an average SSIMPLUS Viewer Score (SVS) of 80.2 – higher than any sports streaming service we have benchmarked so far. We subscribed to an F1 TV Pro account, which is their option for streaming F1 races live. The other two services delivered scores of 68.8 SVS and 67 SVS respectively.
F1 telecasts are massive endeavors: 2 hours of breathtaking races delivered by 126 high-tech cameras around the track. F1 is a high-end product/sport with a global high-end audience – 440 million in 2021 – and top-level sponsors and advertisers whose requirements are highly demanding.
F1 TV’s success was notable because it is the first service to score above an SVS of 80 across all the games we have monitored this summer (read about our MLB All-Star comparison and the Yankees vs. Red Sox game analysis).
In recent MLB telecasts, the average SSIMPLUS Viewer Score (SVS) was 73 for the All-Star Game and 75 for the July 15 weekend games. These scores are far from ideal and can leave viewers unhappy with their subscription. Over the weekend the two other F1 telecasters averaged 69 and 67 SVS; the 67 was the lowest we have seen so far from any of the monitored sports streams, and overall the streams we reviewed averaged 72 SVS. F1 TV’s SVS of 80 is good, but it can still be improved significantly to deliver a more pleasing experience to viewers.
What does quality above a SSIMPLUS Viewer Score of 80 look like? This shot from the F1 TV stream with an SVS of 85 shows crisp lines, no blur in the background, and cars that are clearly seen.
There was still a lot of variability in the F1 TV quality - for example, the feed from this onboard camera had issues in the background where the spectator stands are one hazy blur and the sky is full of banding.
This report presents the viewers' perspective on video quality (accurately measured using the SSIMPLUS Viewer Score) to help streaming providers. Sports enthusiasts in the quest for better quality may switch to F1 TV, and they are unlikely to give the other streaming providers any leeway for what may be source or encoding issues. Streaming providers can tackle video quality head-on by knowing how they compare to the other options available to their subscribers. We can also help services assess source quality by deploying the same product (SSIMPLUS Live Monitor) used for this comparison.
We don't discuss bitrate in this report as we've seen many instances, through analyzing millions of video hours, where higher bitrate did not equal better video quality. Thus, focusing on bitrate can be very misleading. What we observed is that F1 TV had the same resolution, codec and similar bitrate as Provider 1 who had a 69 SVS, so bitrate is not the reason for the big difference in the viewer experience.
Read more about the way we measure in the section below.
This was typical quality from Provider 1 – this frame scored 67 SVS – the cars are very blurred.
The cars and logos are badly distorted: the advertisers in this chicane would not be happy to see their logos blurred to the point that you can hardly make them out (Pirelli is just legible, but the second logo is so blurred we could not identify it).
This pit stop preparation of the Red Bull team was all blurred in the Provider 1 stream with a SSIMPLUS Viewer Score of 64.
Provider 2 performed even worse with an average SVS of 67 - this is actually the lowest score for an event we have seen in any of the sports streams we’ve monitored so far.
In this image from Provider 2, no element is of good quality - the pit crew is blurred, as are the people in the stands above.
This closeup reveals how bad the video quality is at an SVS of 52. Imagine F1 fans watching such a fuzzy picture for more than 2 hours on their 65” TVs.
Why does F1 TV hold a 12-13 point SVS lead over the other two streaming services?
F1’s emphasis on quality is similar to the philosophies of IMAX and The Golf Channel: these companies care about video quality and use it to differentiate their premium product from the rest of the market. It will be interesting to see whether this also holds true for the NFL+ D2C service due to launch soon. Even though the new service will deliver games only to phones and tablets, the SSIMPLUS Viewer Score is device adaptive, so its scores can be compared to scores measured on a TV.
Beyond an overall commitment to video quality, which we believe F1 TV has, what else explains its result? And why is the result not higher? We measure the viewer experience at the last point of the delivery chain, as perceived by the viewer. To analyze how each part of the delivery chain is performing, we would need to deploy our virtual probes at the upstream points – source and post-encoding. We can help fix and automate quality by monitoring upstream and localizing viewer experience issues in real time.
What can the other streaming services do to catch up?
- These scores show that when a service is committed to quality, it can be done properly. Yes, there is still room for improvement, but the differences between F1 TV and the rest are colossal. A service can differentiate itself by starting with the viewer experience in mind when building its delivery chain.
- With the proper software probes at specific points such as source and post-encoding, simple checks can show where quality is degrading the most so that appropriate action can be taken. First comes localization (where in the chain the issue occurred), then identification (what the problem is and why it occurred).
Subscribe to our blog to receive the next sports streaming report.
What and how we are monitoring
- How we perform the test:
The goal of the report is to assess content processing and encoding performance across a sports workflow. The performance of content delivery and playback is excluded by ensuring ideal conditions for both while scoring the content. When the content was streamed, we made sure we were fetching the top profiles at all times, which means that the performance of ISPs, content delivery, and players did not negatively affect viewer experience. Because video encoding and processing performance still has huge room for improvement, assessing content delivery and playback performance is left for future work. The ABR profile was fixed to the top profile only, and the results (including visual examples) are provided just for that profile. The content is captured and scored in raw form using HDMI capture cards installed on a computer.
SSIMPLUS Scale: what's a good score?
The SSIMPLUS Viewer Score is a linear scale from 0 to 100, where 0 means bad (worse quality cannot be imagined) and 100 means excellent (no visible impairments). SVS is highly correlated with how an actual human being – an average viewer watching on their device – would evaluate the video quality. An overall score of 74.5 (the average we measured for one of the MLB weekends) across the monitored streaming platforms is "good enough" – not pleasing to the eye, and mildly annoying, but acceptable enough that these platforms continue to serve millions of viewers. The viewing experience (and the objective scores) are significantly worse than on premium SVOD platforms, which generally score above an SVS of 80. There is also significant variation in quality across the various streaming options, and, as the visual examples show, significant variation within each streaming platform across a single game.
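To make the 0-100 scale concrete, here is a small sketch that maps the scores measured in this report onto rough quality bands. The band boundaries and labels are our own illustrative groupings based on the descriptions above, not official SSIMPLUS categories:

```python
# Illustrative interpretation of SVS values on the 0-100 scale.
# The band labels below are our own rough groupings for this sketch,
# not official SSIMPLUS categories.

def describe_svs(score: float) -> str:
    """Map an SSIMPLUS Viewer Score to a rough quality description."""
    if not 0 <= score <= 100:
        raise ValueError("SVS is defined on a 0-100 scale")
    if score >= 80:
        return "premium (typical of top SVOD platforms)"
    if score >= 70:
        return "good enough, with visible room for improvement"
    if score >= 60:
        return "noticeably impaired"
    return "poor"

# Scores measured in this report:
for name, svs in [("F1 TV", 80.2), ("Provider 1", 68.8), ("Provider 2", 67.0)]:
    print(f"{name}: {svs} -> {describe_svs(svs)}")
```

Under this grouping, only F1 TV lands in the band typical of premium SVOD platforms, which matches the comparison in the report.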
The SSIMPLUS Viewer Score (SVS) is the most accurate measurement of how an actual human being watching video content on their device would rate the quality of the video. It is based on more than 20 years of research into the human visual system by some of the most renowned experts in the field, including the inventor of the original SSIM, Professor Zhou Wang, who is also our co-founder with more than 80,000 citations in the field. Our team has won numerous awards for this work including two Emmys. Currently, our products SSIMPLUS LIVE Monitor and SSIMPLUS VOD Monitor are deployed by 5 of the top streaming services in the world. You can read more here.
Impairment identification and localization (where in the delivery chain are the impairments originating)
We can localize any video quality issue if we monitor at the appropriate points. The current benchmarking setup "watches" content with software probes the way viewers do, at the last point of the delivery chain. The score does not localize drops in video quality (to determine whether drops are due to bad sources or poor compression, we need to monitor at upstream points in the workflow), but it does identify what viewers are perceiving, which is a great first step towards improving the viewer experience. Based on our vast experience with content delivery chains and perceptual quality assessment, we can predict where an issue may be originating after a deeper analysis of the impairments.
- Technical specs (resolutions, codec, bitrate)
We monitor each service at the highest resolution it is capable of delivering, forcing that resolution in the respective apps while ensuring that the setup supports very high bitrates. Sports streaming services generally deliver at 1280x720, 1920x1080, or 3840x2160 resolution at 59.94 or 60 frames per second. The most common codec is H.264/AVC, the most common ABR standard is MPEG-DASH, and bitrates are typically between 5 and 8 Mbps using variable bitrate encoding.
- Other measurement considerations
SSIMPLUS scores are device adaptive. The data presented in these blog posts correspond to the viewer experience on a 65" 4K TV. Higher resolutions do benefit from higher content detail, when available: 4K content is much more likely to reach an SVS of 100, and it will reach 100 when there are no perceptually visible impairments in the content.
The scores provided in this blog are overall scores across the event. We have in-depth per-second and per-frame scores that show quality variation across content.
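The aggregation of per-frame scores into per-second and event-level scores can be sketched as below. This is a minimal illustration of straightforward averaging, assuming one score per frame at 60 fps; the function names are ours and the actual SSIMPLUS aggregation method is not public:

```python
# Sketch: rolling per-frame SVS values up into per-second and overall
# scores. Assumes one score per frame at a fixed frame rate; plain
# averaging is our simplifying assumption for illustration.

def per_second_scores(frame_scores, fps=60):
    """Average per-frame scores into one score per second of content."""
    return [
        sum(frame_scores[i:i + fps]) / len(frame_scores[i:i + fps])
        for i in range(0, len(frame_scores), fps)
    ]

def overall_score(frame_scores):
    """Event-level score as the mean of all per-frame scores."""
    return sum(frame_scores) / len(frame_scores)

# Two seconds of hypothetical content: one good second, one degraded second.
frames = [80.0] * 60 + [60.0] * 60
print(per_second_scores(frames))  # -> [80.0, 60.0]
print(overall_score(frames))      # -> 70.0
```

The per-second series is what reveals the quality variation across a game that a single event-level average hides.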
- How does the score compare from game to game or content to content?
The score considers saliency by modeling visual attention when assessing perceptual quality. Generally speaking, impairments in areas that matter more to viewers drag the score down further than impairments in areas that do not matter as much. As a result of this unique ability of the score, baseball scores are comparable to F1 scores and to any other sport we monitor.
- How we take the screen captures and what their measurements actually show
Content is scored in raw format in real time using HDMI capture cards. The images shared in this blog are frames extracted from captures triggered by quality thresholds. The scores for the frames presented in this blog post are "video scores" measuring the overall viewer experience when a specific frame is played on a 65" 4K TV screen, not scores for the exact frame alone, which can also be provided.
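Threshold-triggered capture of the kind described above can be sketched as follows. All names here are illustrative; the actual trigger logic inside SSIMPLUS Live Monitor is not public:

```python
# Sketch of threshold-triggered capture: flag frames for inspection
# whenever their score drops below a quality threshold. The function
# and parameter names are our own, for illustration only.

def frames_to_capture(scored_frames, threshold=65.0):
    """Return (frame_index, score) pairs that fall below the threshold."""
    return [(i, s) for i, s in enumerate(scored_frames) if s < threshold]

# Five hypothetical per-frame scores; frames 2 and 3 dip below 65.
print(frames_to_capture([85, 72, 64, 52, 78], threshold=65))
# -> [(2, 64), (3, 52)]
```

Flagging frames this way is what lets a monitoring setup surface visual examples like the ones in this post without a human watching the full event.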