Where does YouTube TV rank for live sports video quality? Part 1
How does YouTube TV quality compare to the competition?
We’ve seen a massive shift in live sports viewing from broadcast to streaming over the last few years. Billion-dollar streaming rights deals for Major League Baseball, professional and college football, European football, and other sports make headlines almost daily.
Most sports fans can choose to watch their favorite games from at least two to three different streaming providers—all without the burden of year-long cable contracts. The move from PayTV to streaming subscriptions provides greater flexibility and convenience for sports fans to watch their games on any screen. Still, that convenience will not be enough to keep them happy and engaged if it does not go hand-in-hand with excellent video quality.
Sports fans are among the most demanding audiences. They are quick to slam providers on social media the moment perceptual video quality falls short of their high standards. While exclusive broadcast rights once kept viewers watching regardless of quality, the ability to stream the same content from different providers means they can quickly cancel one subscription and switch to another. Streaming journalists and analysts also scrutinize providers closely and often post about their video quality on social media, blogs, and podcasts.
Benchmarking Live Sports Streaming Viewer Experience
To help streaming providers deliver better perceptual video quality than their competition, and reduce churn, we’ve launched a live sports video quality measurement service to find out:
- How good are the best sports streaming services (and how bad are the worst ones?)
- How do streaming providers compare to each other regarding the perceptual video quality they deliver?
- Is there a service that consistently provides better video quality than the rest?
- Which are the best and worst performers for each sport?
- Are there significant gaps and variations in video quality?
YouTube TV was a natural choice for this live sports monitoring project because it delivers content across several sports, including MLB, NFL, F1, English Premier League, golf, and tennis. It’s also the leading vMVPD, with over 5 million users, and one of the few services offering more than 100 channels, including access to the most-watched North American and European sports.
Read our blog to learn more about the video quality comparison for each sport.
What was YouTube TV’s video quality performance compared to its competition?
Here is our summary of YouTube TV’s performance across five monitored sports compared to the best-performing service for each sport. We’ve also provided the difference in perceptual quality between YouTube TV and the best-performing service for the content.
SSIMPLUS Viewer Score (SVS) is the most accurate perceptual quality measure. SVS uses a linear scale from 0-100, where 0 means very poor, and 100 is a stream without any visible impairments. SVS is highly correlated to how average viewers, watching on their device, will evaluate the video quality. A difference of more than five Viewer Score points is readily noticeable. Read more details on how we measure.
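As an illustration of the scale described above, the following sketch maps scores to rough quality bands and applies the five-point "readily noticeable" rule. The band boundaries and helper names are ours for illustration, not part of SSIMPLUS itself.

```python
def svs_band(score: float) -> str:
    """Map an SSIMPLUS Viewer Score (0-100) to a rough qualitative band.

    Bands are illustrative, based on the scale described above: 0 is very
    poor, 100 is a stream with no visible impairments, and the best
    services deliver at 80 or above. Intermediate cutoffs are assumptions.
    """
    if not 0 <= score <= 100:
        raise ValueError("SVS is defined on a 0-100 scale")
    if score >= 80:
        return "top-tier"    # what the best services deliver
    if score >= 70:
        return "acceptable"  # watchable, but visibly below the leaders
    if score >= 55:
        return "poor"        # blur and artifacts clearly visible
    return "very poor"       # severe, distracting impairments


def noticeably_different(score_a: float, score_b: float) -> bool:
    """A gap of more than five Viewer Score points is readily noticeable."""
    return abs(score_a - score_b) > 5


# Example: YouTube TV's MLB average (72) vs. the best provider (80)
print(svs_band(72), svs_band(80))    # acceptable top-tier
print(noticeably_different(72, 80))  # True
```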
Overall, YouTube TV’s video quality performance ranged from the mid-60s to the high 70s in Viewer Scores. In not a single game, out of the 21 that were monitored, did its quality go above an 80 Viewer Score. For each of the monitored sports, the best streaming services delivered at 80 or above; if a service wants to be among the top providers, 80 and above is the Viewer Score level to target. YouTube TV was compared to other providers across more than 20 games and races, and the results were consistent across all sports, so this cannot be explained away as a one-off issue.
The table below contains some of the highlights from each sport.
|Sport|YouTube TV Viewer Score|Best-performing service Viewer Score|Difference|
|---|---|---|---|
|Major League Baseball (8 games)|72|80|8|
|Formula 1 (Hungarian Grand Prix)|64|80+|16+|
|Golf|77.4|75.8|+1.6**|
|Tennis (US Open)|70|80|10|
|English Premier League|76|82|6|

**In this case YouTube TV performed better than the other monitored service, which scored 75.8.
Major League Baseball
- YouTube TV came in last out of the four providers monitored across eight different games, with an average SSIMPLUS Viewer Score of 72 out of 100. The best provider delivered at an 80 Viewer Score.
- YouTube TV's lowest score, a 68.7 Viewer Score, came in the MLB All-Star Game, where it again performed the worst.
What do these scores mean for the viewer? How big is the difference between a 70 and an 80 Viewer Score? The best way is to look at some visual examples from the games.
YouTube TV with a 54 Viewer Score from the MLB Yankees Game.
YouTube TV’s average SSIMPLUS Viewer Score of 70 for this game means the quality dipped even lower in parts of it. Refer to Image 1 to see what a 54 Viewer Score looks like to viewers. This blurred image, with significant compression artifacts, is even more frustrating for fans given the context: it comes right after the Yankees scored. Fans want to experience the moment of celebration, and poor video quality diminishes that emotional engagement. These images are best viewed on a large-screen TV, 55” or bigger, since that is how most viewers experience sports content; at that size, the blurriness is even more pronounced.
This is how the image will look on a large-screen TV in subscribers’ homes.
Formula 1
We monitored four Grand Prix races starting in July: the French, Hungarian, Belgian, and Dutch. YouTube TV’s average Viewer Score across the four was 67.8, and the service came last among all monitored providers in every race, with Viewer Scores of 67, 64, 70.5, and 69.8 respectively. The best monitored service always delivered at 80 or above. In a couple of the races, we observed critical video quality impairments, such as interlacing effects, which made the content difficult to watch.
For example, when two cars were racing close to each other, you could not tell which one was ahead because of this impairment. Logos on the chicanes were obscured and unreadable. Even the speedometer alerts appeared fuzzy - you could not tell if the speed was 100km/h or 150km/h.
F1 fans have several streaming providers to choose from, and we found that F1 TV delivers the highest quality, setting a high bar for the rest to catch up to. F1 TV consistently delivered Viewer Scores of 80 and above for each race. In many situations, the video quality difference between YouTube TV and its competition was more than 30, or even 50, points. Delivering content at lower quality than competitors puts a streaming service at a higher risk of churn.
YouTube TV with 45 Viewer Score from the F1 French Grand Prix. If you are seeing double, it is not you, it's the streaming service; this is due to interlacing issues that we discuss in more detail in this blog post.
F1 TV delivers this frame with an excellent Viewer Score of 87. The image is crisp and sharp, and the logos have clean lines.
Golf
YouTube TV had better quality than the other monitored provider, but the Viewer Scores were 77.4 vs. 75.8; both are below what should be expected for a premium sport. Golf fans are some of the most demanding in terms of viewer experience due to a combination of:
- The high expectations driven by the quality bar the Golf Channel has set since the mid-1990s.
- They comprise one of the most affluent demographics that can purchase bigger and better screens.
- It is a challenge to track a small ball when there are video quality impairments on the screen.
Is Scheffler playing for real or in a video game? Is this a green screen behind him? This is what a Viewer Score of 52 means to the viewer - the grass behind is “burned” and gives the surreal feeling of the player pasted on a green screen.
Tennis
YouTube TV’s quality was not stellar for tennis either, another sport with demanding viewers. The average quality of the monitored YouTube TV US Open matches was a 70 Viewer Score, again falling below the 80-and-above level that top providers deliver.
Viewer Scores in the low 70s mean an inferior experience that can make viewers switch to another provider.
A Viewer Score of 51 at this very emotional moment as Serena Williams wins her first round match at her last US Open. The audience behind is one blur of white and blue dots.
Viewer Score of 44 - this is a replay from a previous US Open. Serena’s face is fuzzy.
Viewer Score of 56 - lower quality generates lower viewer engagement and viewership. Everyone loses - the viewer, the league, the advertisers, and the streaming service. See how distorted the players are; you can hardly make out that this is Serena Williams playing in the far end.
This is what US Open fans would have seen for parts of matches lasting an hour and a half or more.
You would think that talking heads shots would be easier to stream in higher quality, but this studio on Central Court had a Viewer Score of 62. Look at how the faces of both interviewer and player are fuzzy, and there is an interlacing effect.
English Premier League
YouTube TV delivered an average Viewer Score of 76 for the monitored games, compared to an average of 82 for Peacock. In the Man United vs. Arsenal game, for example, YouTube TV scored 76.5 vs. 83.3 for Peacock. The two numbers might look close, but the lower score means that for parts of the game, YouTube TV’s quality, as shown in the images below, was in the low 50s. Past research, including an Akamai study, has shown that big deviations in video quality lead to higher viewer disengagement even when the average score is in the high 70s; dropping from the high 70s into the 60s and low 50s is very frustrating to viewers.
Some of the major issues include:
- Tackles in front of the goal were difficult to watch, again with interlacing issues as in F1, which made players look as if they had four legs and four arms.
- It was hard to tell players apart or follow the ball. It is essential for fans to clearly see whether players are jostling for the ball without committing fouls. Was there rough action that should have drawn a yellow or red card? Should a penalty kick have been awarded? Soccer fans can debate such situations for years; just think of Maradona’s “Hand of God” goal at the 1986 World Cup. Being able to see these actions clearly is crucial for fans.
This emotional moment for Nottingham Forest was ruined with a Viewer Score of 48.
Viewer Score of 61 - Crystal Palace vs. Arsenal was difficult to watch. Look how the colors of the Crystal Palace jersey flow into each other as if they are floating, the grass turf looks like a sea, and the faces are badly distorted.
Viewer Score of 50 - we don’t even want to comment on this one; it was difficult to watch, trying to make out who’s who on the soccer field.
Compare this with the Peacock Viewer Score of 76, still below the 80 Viewer Score benchmark that we advise providers to target, but the players, the field, and the logos on the side panels are much crisper, and not a strain to watch.
YouTube TV Viewer Score Recap
YouTube TV’s overall quality across the five sports was not always up to its claim of being a “Premium Live Streaming Service”, at least not for the video quality of its sports content.
Just last week, YouTube TV announced a $10 cut in its monthly price for new subscribers and a 21-day free trial, signaling the start of more fierce competition in the vMVPD part of the streaming market. But will subscribers stay if the video quality is subpar? Viewers will vote with their eyes and wallets.
Be the first to know how YouTube TV compared to other providers for NFL games. Subscribe to our blog.
In the next blog series, we will investigate the potential reasons behind these video quality issues and what can be done to fix them.
Want to learn more about how you can ensure your live streaming business does not allow bad video quality to creep in? Download our Live Sports Benchmarking White Paper.
What and how we are monitoring
- How we perform the test:
The goal of the report is to assess content processing and encoding performance across a sports workflow. The performance of content delivery and playback is excluded by ensuring ideal delivery and playback conditions while scoring the content. When the content was streamed, we made sure we were fetching top profiles at all times, which means that the performance of ISPs, content delivery, and players did not negatively affect the video quality. Because video encoding and processing performance have huge room for improvement, assessing content delivery and playback performance is left for future work. The ABR profile was fixed to the top profile, and the results (including visual examples) are provided for that profile only. The content is captured and scored in raw form using HDMI capture cards installed on a computer.
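To show what "fetching the top profile" means in practice, here is a minimal sketch (not our actual monitoring tooling) that picks the highest-bandwidth video representation from a DASH manifest. The manifest snippet and representation IDs are hypothetical.

```python
import xml.etree.ElementTree as ET

# DASH MPD elements live in this XML namespace
MPD_NS = "{urn:mpeg:dash:schema:mpd:2011}"

# Hypothetical manifest with three video renditions
SAMPLE_MPD = """<?xml version="1.0"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="video">
      <Representation id="v1" bandwidth="2500000" width="1280" height="720"/>
      <Representation id="v2" bandwidth="6000000" width="1920" height="1080"/>
      <Representation id="v3" bandwidth="4000000" width="1920" height="1080"/>
    </AdaptationSet>
  </Period>
</MPD>"""

def top_profile(mpd_xml: str):
    """Return (id, bandwidth, width, height) of the highest-bandwidth rep."""
    root = ET.fromstring(mpd_xml)
    reps = root.iter(f"{MPD_NS}Representation")
    best = max(reps, key=lambda r: int(r.get("bandwidth")))
    return (best.get("id"), int(best.get("bandwidth")),
            int(best.get("width")), int(best.get("height")))

print(top_profile(SAMPLE_MPD))  # ('v2', 6000000, 1920, 1080)
```

Pinning playback to this top rendition (and verifying it stays pinned) is what removes ISP and player behavior from the measurement.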
SSIMPLUS Scale: what's a good score?
The SSIMPLUS Viewer Score is a linear scale from 0-100, where 0 means bad (worse quality cannot be imagined) and 100 means excellent (no visible impairments). SVS is highly correlated to how an actual person - an average viewer watching on their device - will evaluate the video quality. An overall score of 74.5 (the average we measured across the monitored streaming platforms for one of the MLB weekends) is "good enough" - watchable, if not pleasing to the eye and at times mildly annoying - which is why these services continue to serve millions of viewers. Still, this perceptual video quality (and the objective scores) are significantly worse than premium SVOD platforms, which generally score above an SVS of 80. There is also significant variation in quality across the various streaming options and, as the visual examples show, significant variation within each platform across a single game.
The SSIMPLUS Viewer Score (SVS) is the most accurate measurement of how an actual human being watching video content on their device would rate the quality of the video. It is based on more than 20 years of research into the human visual system by some of the most renowned experts in the field, including Professor Zhou Wang, inventor of the original SSIM algorithm and our co-founder, whose work has been cited more than 80,000 times. Our team has won numerous awards for this work, including two Emmys. Currently, our products - SSIMPLUS LIVE Monitor and SSIMPLUS VOD Monitor - are deployed by five of the top streaming services in the world, and the company was recently acquired by IMAX. You can read more here.
Impairment identification and localization (where in the delivery chain are the impairments originating)
We can localize any video quality issue if we monitor at the appropriate points in the delivery chain. The current benchmarking setup "watches" content with software probes at the last point of the delivery chain, just as viewers do. The score does not localize drops in video quality (to determine whether drops are due to bad sources or poor compression, we would need to monitor at upstream points in the workflow), but it does capture what viewers are perceiving, which is a great first step toward improving viewer experience. Based on our experience with content delivery chains and perceptual quality assessment, we can predict where an issue may originate after a deeper analysis of the impairments.
- Technical specs (resolutions, codec, bitrate)
The services we compared are monitored at the highest resolution they are capable of delivering; we force that resolution in the respective apps while ensuring that the setup supports very high bitrates. Sports streaming services generally deliver at 1280x720, 1920x1080, or 3840x2160 resolution at 59.94 or 60 frames per second. The most common codec is H.264/AVC, and the most common ABR standard is MPEG-DASH. Bitrates are typically between 5 and 9 Mbps, using variable-bitrate encoding.
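A rough way to relate these bitrates and resolutions is bits per pixel per frame: bitrate divided by width x height x frame rate. The sketch below applies the figures quoted above; the thresholds for "enough" bits per pixel depend heavily on codec and content, so this is only a back-of-the-envelope comparison.

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int,
                   fps: float) -> float:
    """Bits spent per pixel per frame: a rough measure of how hard
    the encoder must work at a given resolution and frame rate."""
    return bitrate_bps / (width * height * fps)

# Figures quoted above: 5-9 Mbps streams at ~60 fps
for w, h, label in [(1280, 720, "720p"), (1920, 1080, "1080p"),
                    (3840, 2160, "4K")]:
    low = bits_per_pixel(5_000_000, w, h, 60)
    high = bits_per_pixel(9_000_000, w, h, 60)
    print(f"{label}: {low:.3f}-{high:.3f} bpp")
```

At these bitrates, 1080p60 sits around 0.04-0.07 bits per pixel, which is modest for high-motion sports content and helps explain why compression artifacts become visible in busy scenes.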
- Other measurement considerations
SSIMPLUS scores are device-adaptive. The data presented in these blog posts corresponds to the viewer experience on a 65" 4K TV. Higher resolutions benefit from higher content detail when it is available: 4K content is far more likely to reach an SVS of 100, and will do so when there are no perceptually visible impairments.
The scores provided in this blog are overall scores across the event. We have in-depth per-second and per-frame scores that show quality variation across content.
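The sketch below shows why per-second scores matter: an event-level average can look respectable while hiding painful dips. The per-second numbers here are made up for illustration, not taken from our measurements.

```python
# Hypothetical per-second Viewer Scores for a stretch of a game
scores = [78, 80, 79, 54, 52, 60, 77, 81, 79, 80]

average = sum(scores) / len(scores)
worst = min(scores)
below_70 = sum(1 for s in scores if s < 70) / len(scores)

print(f"average: {average:.1f}")        # 72.0
print(f"worst second: {worst}")         # 52
print(f"time below 70: {below_70:.0%}") # 30%
```

A viewer of this stream would remember the 30% of the time spent in the 50s and low 60s, not the 72 average.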
- How does the score compare from game to game or content to content?
The score considers saliency by modeling visual attention when assessing perceptual quality. Generally speaking, impairments in areas that matter more to viewers drag the score down further than impairments in areas that do not matter as much. As a result of this unique ability of the score, baseball scores are comparable to F1 scores and to any other sport we monitor.
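The idea of saliency weighting can be sketched as a weighted pooling of per-region scores. This toy example is ours; SSIMPLUS's actual attention model is far more sophisticated, but the sketch shows why an impairment on the ball or players drags the score down more than one in the crowd.

```python
def weighted_score(region_scores, saliency_weights):
    """Pool per-region quality scores with visual-attention weights
    (weights must sum to 1). Purely illustrative pooling scheme."""
    assert abs(sum(saliency_weights) - 1.0) < 1e-9
    return sum(s * w for s, w in zip(region_scores, saliency_weights))

# Two frames, regions = [action area, background], attention 70%/30%.
# Both have the same plain average (70), but different score locations:
action_blurred = weighted_score([50, 90], [0.7, 0.3])  # impairment on action
crowd_blurred = weighted_score([90, 50], [0.7, 0.3])   # impairment in crowd
print(action_blurred, crowd_blurred)  # 62.0 78.0
```

Because the pooled score penalizes impairments where viewers are actually looking, scores remain comparable across sports with very different scene compositions.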
How are the screen captures taken and what do their scores actually show?
Content is scored in raw format, in real time, using HDMI capture cards. The images shared in this blog are frames extracted from those captures, triggered by quality thresholds. The scores shown for the frames are “video scores” measuring overall video quality when a specific frame is played on a 65” 4K TV screen, not scores for the exact frame alone (which can also be provided).