Where does YouTube TV rank for live sports video quality? Part 2
We recently shared the results of YouTube TV’s video quality performance across five different sports — MLB, F1, golf, tennis, and the English Premier League (soccer). As promised, we’re continuing the series and turning our virtual eyes to the National Football League (NFL).
The importance of NFL to PayTV and streaming providers
The NFL is the most popular sports league in the U.S. The 2021 regular season attracted more than 17 million at-home viewers per game, with over 370 billion total minutes consumed*. The most-watched game of the 2021 season was a CBS broadcast of the Las Vegas Raiders playing the Dallas Cowboys, with 40.8 million viewers tuning in.
Which providers are competing for NFL fans?
In the 2022 season, NFL viewership is split among several TV channels and streamers, through a combination of various PayTVs, vMVPDs, and streaming platforms:
- Amazon Prime has exclusive Thursday Night Football rights for at least 11 years at $1 billion USD per year**
- The remainder of the games are split among CBS, FOX, NBC, ESPN, the NFL Network, and numerous local networks***
- These channels are included in vMVPD packages from YouTube TV, Hulu+LIVE TV, SlingTV, and FuboTV, as well as in streamers’ offerings like Paramount+ and Peacock Premium***
This extensive list means NFL fans can choose among various providers to watch their favorite teams. There are exceptions, of course, such as the exclusive Thursday Night Football and Sunday Night Football games and local blackouts, so not all games can be directly compared for viewer experience, and no single service provides access to all games.
Most nights, there are more than six different options for watching a game. Thus, NFL fans can easily compare services and see with their own eyes which provider delivers the best video quality. Since the regular season began, fans have been very vocal about their viewer experience frustrations on Twitter and other social media platforms. Industry analysts like Dan Rayburn have also been keeping close tabs on the providers’ video delivery quality.
Where does YouTube TV rank in terms of NFL video quality?
We wanted to find out if YouTube TV’s NFL video quality was similar to the levels the service delivered for the other five sports we covered in our previous blog.
The average Viewer Score across all the monitored YouTube TV NFL games was 78. For each of the monitored sports, including the NFL, the best streaming services deliver a Viewer Score of 80 or above, so that is the level a service needs to target to rank among the top providers. For the NFL, the top service consistently delivered a Viewer Score above 85, averaging 85.5 across four different games.
We compared YouTube TV performance to other providers across more than 15 observations.
Here are some highlights from a few games.
| Week | Game | YouTube TV | Provider 2 |
|---|---|---|---|
| Preseason | Broncos vs Cowboys | 80 | 78 |
| Preseason | Bears vs Seahawks | 73 | 76 |
| Sept 29 | Dolphins vs Bengals | 82 | 85 |
These Viewer Scores are better than what we measured for other sports, but again we see a lot of variability, with some parts of the delivery dipping to a Viewer Score as low as 27.
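As a quick arithmetic check on the highlights table, the sketch below averages the three listed games per provider and compares each average with the 80 Viewer Score target mentioned earlier. It uses only the scores from the table; the function and variable names are illustrative, and three games are far too few to stand in for the full set of more than 15 observations.

```python
from statistics import mean

# Viewer Scores copied from the highlights table (three games only).
highlights = {
    "YouTube TV": [80, 73, 82],
    "Provider 2": [78, 76, 85],
}

TARGET_SCORE = 80  # level the post associates with top providers


def summarize(scores_by_provider, target):
    """Return {provider: (average, meets_target)} for each provider."""
    summary = {}
    for provider, scores in scores_by_provider.items():
        average = mean(scores)
        summary[provider] = (average, average >= target)
    return summary


for provider, (average, meets_target) in summarize(highlights, TARGET_SCORE).items():
    status = "meets" if meets_target else "is below"
    print(f"{provider}: average {average:.1f}, {status} the {TARGET_SCORE} target")
```

On these three games alone, both providers average just under 80 (about 78.3 for YouTube TV and 79.7 for Provider 2).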
What was the video quality of the ads?
We also identified many quality issues in the ad breaks.
Advertisers spend a lot to be featured in NFL ad blocks: data from the Standard Media Index (SMI) suggests that 30-second ads range from $480,000 to $600,000 USD. They also invest a lot of time and effort in making ads that sometimes cost millions of dollars to produce, and they use extensive marketing research to ensure each shot delivers the desired message. As such, advertisers expect the quality of their finished product to be preserved. Some of the ads in the monitored NFL games showed loss of detail, banding, pixelation, compression artifacts, and other video quality issues, with Viewer Scores as low as 27. These ads were hard to watch and fell short of delivering viewer engagement.
The visual examples below will give you an idea of the Viewer Scores and what they mean for the viewer:
Where is the ball? Viewer Score of 53.
You cannot make out the players or who is tackling whom on the left, nor can you tell whether the player on the sideline has the ball. Viewer Score of 65.
This is hard to watch: everything is blurred, probably due to interlacing issues. Viewer Score of 47.
This is how Image 3 would look on a big-screen TV.
This ad had a lot of visual impairments: blurred grassland, heavy compression artifacts, and blockiness - 27 Viewer Score.
This ad is also difficult to watch - 40 Viewer Score. The frame is full of blockiness, false contouring, and severe loss of detail.
The fact that YouTube TV has video quality issues across all six monitored sports (read our blog on the other five sports) shows that there are overall technical issues or problems in the workflow setup that degrade the viewer experience, regardless of the sport or channel the vMVPD is delivering.
Next week we'll take a deep dive into these observed problems and look at what the main culprits could be for the poor live-streaming video quality. This analysis will be based on our extensive monitoring of more than 18 million hours of live video every month.
Be the first to know what could be the root causes for YouTube TV’s live streaming troubles - Subscribe to our blog.
In the next blog series, we will investigate the potential reasons behind these video quality issues and what can be done to fix them.
Want to learn more about how you can ensure your live streaming business does not allow bad video quality to creep in? Download our Live Sports Benchmarking White Paper.
What and how we are monitoring
- How we perform the test:
The goal of the report is to assess content processing and encoding performance across a sports workflow. The performance of content delivery and playback is excluded by ensuring ideal conditions for delivery and playback while scoring the content. When the content was streamed, we made sure that we were fetching the top profile at all times, which means that the performance of ISPs, content delivery, and players did not negatively affect the video quality. Because video encoding and processing performance has huge room for improvement, assessing content delivery and playback performance is left for future work. The ABR profile was fixed to the top profile only, and the results (including visual examples) are provided just for the top profile. The content is captured and scored in raw form using HDMI capture cards installed on a computer.
SSIMPLUS Scale: what's a good score?
The SSIMPLUS Viewer Score (SVS) is a linear scale from 0 to 100, where 0 means bad (worse quality cannot be imagined) and 100 means excellent (no visible impairments). SVS is highly correlated with how an actual person, an average viewer watching on their device, will evaluate the video quality. An overall score of 74.5 (the average we measured for one of the MLB weekends) across the streaming platforms monitored is "good enough", even if not pleasing to the eye and mildly annoying, which is why these platforms continue to serve millions of viewers. The perceptual video quality (and the objective scores) is significantly worse than that of premium SVOD platforms (generally above an SVS of 80). There is also significant variation in quality across streaming options and, as the visual examples show, within each streaming platform over the course of a game.
The SSIMPLUS Viewer Score (SVS) is the most accurate measurement of how an actual human being watching video content on their device would rate the quality of the video. It is based on more than 20 years of research into the human visual system by some of the most renowned experts in the field, including Professor Zhou Wang, the inventor of the original SSIM algorithm and our co-founder, whose work has more than 80,000 citations. Our team has won numerous awards for this work, including two Emmys. Currently, our products, SSIMPLUS LIVE Monitor and SSIMPLUS VOD Monitor, are deployed by five of the top streaming services in the world, and the company was recently acquired by IMAX. You can read more here.
Impairment identification and localization (where in the delivery chain are the impairments originating)
We can localize any video quality issue if we monitor at the appropriate points. The current monitoring setup for benchmarking streaming services "watches" content with software probes at the last point of the delivery chain, as viewers do. The score therefore does not localize drops in video quality, for example whether they are due to bad sources or poor compression; to do that, we would need to monitor at upstream points in the workflow. It does, however, identify what viewers are perceiving, which is a great first step towards improving viewer experience. Based on our vast experience with content delivery chains and perceptual quality assessment, we can predict where an issue may be originating after a deeper analysis of the impairments.
- Technical specs (resolutions, codec, bitrate)
The services we compared are monitored at the highest resolution they are capable of delivering, with that resolution forced in the respective apps and a setup that supports very high bitrates. Sports streaming services generally deliver at 1280x720, 1920x1080, or 3840x2160 resolution at 59.94 or 60 frames per second. The most common codec is H.264/AVC, and the most common ABR standard is MPEG-DASH. Typical bitrates are between 5 and 9 Mbps using variable-bitrate approaches.
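To give a feel for how tight these bitrates are, the sketch below works out the average number of encoded bits available per pixel at each of the resolutions listed above. It assumes the 5 to 9 Mbps range applies at every resolution, which the figures above do not specify, and bits per pixel is only a rough rule of thumb for compression pressure, not part of the SSIMPLUS score. The function name is illustrative.

```python
def bits_per_pixel(bitrate_mbps, width, height, fps):
    """Average encoded bits available per pixel per frame."""
    bits_per_second = bitrate_mbps * 1_000_000
    pixels_per_second = width * height * fps
    return bits_per_second / pixels_per_second


# Resolutions, frame rate, and bitrate range quoted in the text.
resolutions = [(1280, 720), (1920, 1080), (3840, 2160)]
FPS = 59.94
LOW_MBPS, HIGH_MBPS = 5, 9

for width, height in resolutions:
    low = bits_per_pixel(LOW_MBPS, width, height, FPS)
    high = bits_per_pixel(HIGH_MBPS, width, height, FPS)
    print(f"{width}x{height} @ {FPS} fps: {low:.3f} to {high:.3f} bits per pixel")
```

At a fixed bitrate, each step up in resolution leaves fewer bits per pixel, so the encoder has to compress harder to fit the same budget.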
- Other measurement considerations
SSIMPLUS scores are device adaptive. The data presented in the blog posts correspond to the viewer experience on a 65" 4K TV. Higher resolutions do benefit from higher content detail, if available. 4K content is much more likely to reach an SVS of 100, which it does when there are no perceptually visible impairments in the content.
The scores provided in this blog are overall scores across the event. We have in-depth per-second and per-frame scores that show quality variation across content.
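The relationship between per-second scores and an event-level score can be sketched as follows. This is a minimal illustration that assumes the overall score is a plain arithmetic mean of per-second scores, which this post does not state; the sample values are made up, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def summarize_event(per_second_scores):
    """Summarize a per-second score series into event-level figures."""
    return {
        "overall": mean(per_second_scores),   # assumed: simple average
        "lowest": min(per_second_scores),     # worst moment in the event
        "highest": max(per_second_scores),    # best moment in the event
        "spread": pstdev(per_second_scores),  # how much quality varies
    }


# Made-up per-second Viewer Scores, for illustration only.
sample_scores = [82, 84, 81, 47, 27, 53, 80, 83]

for name, value in summarize_event(sample_scores).items():
    print(f"{name}: {value:.1f}")
```

A single overall number can hide short, severe dips, which is why the lowest score and the spread are reported alongside the average in this sketch.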
- How does the score compare from game to game or content to content?
The score considers saliency by modeling visual attention when assessing perceptual quality. Generally speaking, impairments in areas that matter more to viewers drag the score down further than impairments in areas that do not matter as much. As a result of this unique ability of the score, baseball scores are comparable to F1 scores and to any other sport we monitor.
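The general idea of attention-weighted scoring can be illustrated with a simple weighted average. The sketch below is not the SSIMPLUS algorithm, whose internals are not described here; it only assumes a hypothetical grid of local quality values and a matching grid of saliency weights, and shows that the same impairment costs more when it lands where viewers are looking.

```python
def weighted_score(quality_map, saliency_map):
    """Pool local quality values into one score using saliency weights."""
    weighted_sum = 0.0
    total_weight = 0.0
    for quality_row, saliency_row in zip(quality_map, saliency_map):
        for quality, weight in zip(quality_row, saliency_row):
            weighted_sum += quality * weight
            total_weight += weight
    if total_weight == 0:
        raise ValueError("saliency weights must not all be zero")
    return weighted_sum / total_weight


# Hypothetical 2x2 frame: the top-left region is where viewers look
# (for example, the ball carrier); the rest is background.
saliency = [[0.7, 0.1],
            [0.1, 0.1]]

# Same impairment (local quality 40 instead of 90), in different places.
impaired_where_it_matters = [[40, 90],
                             [90, 90]]
impaired_in_background = [[90, 40],
                          [90, 90]]

print(round(weighted_score(impaired_where_it_matters, saliency), 1))
print(round(weighted_score(impaired_in_background, saliency), 1))
```

With these made-up weights, the impairment in the salient region pulls the pooled score down much further than the identical impairment in the background.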
How are the screen captures taken and what do their scores actually show?
Content is scored in raw format, in real time, using HDMI capture cards. The images shared in this blog are frames extracted from captures that were triggered based on quality thresholds. The scores shown for these frames are “video scores” that measure overall video quality when the specific frame is played on a 65” 4K TV screen, not the score for the exact frame alone, which can also be provided.
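Threshold-triggered capture can be sketched as finding runs of consecutive seconds whose score falls below a cutoff. The exact trigger logic and threshold values are not given in this post, so the cutoff of 60, the sample scores, and the function name below are all assumptions made for illustration.

```python
def find_capture_triggers(per_second_scores, threshold):
    """Return (start_second, end_second, lowest_score) for each run below threshold."""
    triggers = []
    run_start = None
    for second, score in enumerate(per_second_scores):
        if score < threshold:
            if run_start is None:
                run_start = second  # a new low-quality run begins
        elif run_start is not None:
            run = per_second_scores[run_start:second]
            triggers.append((run_start, second - 1, min(run)))
            run_start = None
    if run_start is not None:
        # The series ended while still below the threshold.
        run = per_second_scores[run_start:]
        triggers.append((run_start, len(per_second_scores) - 1, min(run)))
    return triggers


# Made-up per-second Viewer Scores and a hypothetical threshold.
scores = [82, 84, 58, 47, 81, 83, 27, 40]
THRESHOLD = 60

for start, end, lowest in find_capture_triggers(scores, THRESHOLD):
    print(f"capture seconds {start}-{end}, lowest score {lowest}")
```

Grouping consecutive low seconds into one run keeps a single sustained dip from producing a separate capture for every second.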