Where Does YouTube TV Rank For Live Sports Video Quality? Part 3

In our first two blog posts, we reviewed YouTube TV’s performance over six of the most popular sports in the U.S.—the NFL, MLB, Formula One, English Premier League, tennis, and golf.

Our key finding: YouTube TV’s overall quality across the six sports did not always live up to its claim of being a “Premium Live Streaming Service,” at least not for its sports content. Some of the scores from each sport are summarized in the table at the end of this post.

Why is YouTube TV’s video quality lower than the competition’s?

Media supply chains consist of many different parts, from content processing to playout. These numerous stages leave them prone to a wide range of issues that impact video quality.

Our comprehensive testing shows that YouTube TV’s average video quality performance is in the low 70s. We’re going to look at some of the potential reasons for that, and why YouTube TV is falling behind the top services that manage to deliver sports at a Viewer Score of 80 and above.

We monitor more than 18 million hours of live video every month, which gives us a solid base from which to draw conclusions.

When our virtual probes are deployed at different points in the live workflow, we can pinpoint with great accuracy where a video quality drop happened and what its main cause was. We do not have inside information on YouTube TV’s specific live workflow setup, but we can infer the likely reasons for the issues from the nature and behavior of the impairments.

Issue #1: Lack of transparency and data on perceptual quality. 

Media supply chains are generally divided into many separate stages: content creation, contribution, distribution, preparation, delivery, playback, and consumption. More often than not, different internal and external organizations are responsible for one or more of these stages.

Almost always, these organizations are not aware of the performance of the stages they’re not responsible for. This lack of transparency and real-time data sharing often results in a poor or suboptimal video experience for subscribers. 

Without actionable data to say with confidence whether Process A is better than Process B, important workflow decisions are made in the dark and operational teams end up working blind. Having correlated data, from the stadium to the playout device, that shows the effect of each process on perceptual video quality brings many business and operational opportunities to streaming services, from faster issue resolution to better workflow performance. In the end, this leads to happier subscribers and higher ARPU.

Figure 1: A dashboard from the SSIMPLUS Live Monitor UI showing a real sports streaming workflow monitored end-to-end. MP1 (monitoring point 1) measures the quality of the source, with a Viewer Score of 83. MP2 measures the post-encoder HLS ladder, with Viewer Scores ranging from 44 to 74. MP3 measures the HDMI output of the set-top box, i.e. what the player is selecting. In this case the player selected the highest-bitrate variant, but this is not always the case, and viewers might experience poor video quality.
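To make this concrete, here is a minimal sketch (in Python, using hypothetical data structures rather than any actual SSIMPLUS API) of how scores collected at successive monitoring points can be compared to localize where quality is being lost. The values mirror the workflow in Figure 1, with MP2 represented by its best ladder variant.

from dataclasses import dataclass

@dataclass
class Measurement:
    point: str           # e.g. "MP1 (source)", "MP2 (post-encoder)", "MP3 (set-top box HDMI)"
    viewer_score: float  # perceptual quality score on a 0-100 scale

def localize_quality_drop(measurements: list[Measurement], threshold: float = 5.0) -> list[str]:
    """Flag each hop in the workflow where the score drops by more than `threshold` points."""
    findings = []
    for upstream, downstream in zip(measurements, measurements[1:]):
        drop = upstream.viewer_score - downstream.viewer_score
        if drop > threshold:
            findings.append(f"{drop:.1f}-point drop between {upstream.point} and {downstream.point}")
    return findings

# Values taken from the workflow in Figure 1 (MP2 shown as its best ladder variant).
workflow = [
    Measurement("MP1 (source)", 83),
    Measurement("MP2 (post-encoder, top variant)", 74),
    Measurement("MP3 (set-top box HDMI)", 72),
]
for finding in localize_quality_drop(workflow):
    print(finding)

The code itself is trivial; the point is the principle. Once every hop reports a perceptual score on the same scale, localizing a quality drop becomes a simple comparison instead of guesswork.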

Issue #2: Having issues with the source and/or not using the best source

Sports streaming services can typically choose from at least two different source feeds for each game. Having the ability to measure which source has better video quality is crucial: as content goes through the different stages of the live workflow, there are more opportunities for it to degrade as it is processed.

Choosing the best source is an important ability. But classical metrics like VMAF, PSNR, and the original SSIM are full-reference metrics, and therefore assume that the source is “pristine,” which is rarely true for live workflows.

Only a no-reference metric that accurately predicts the actual viewer experience, such as SSIMPLUS, can assess the quality of a source across all content types and a wide range of codecs and bitrates.

If there is only one source available, it becomes even more important to ensure that what YouTube TV receives from the different live broadcasters does not already have video quality issues that will only deteriorate further downstream.
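As a rough illustration of the idea (hypothetical feed names and scores, and a stand-in selection function, not the SSIMPLUS API), source selection reduces to scoring each candidate feed with a no-reference metric and picking the best one, with a warning when even the best feed starts below an acceptable quality floor:

def pick_best_source(feed_scores: dict[str, float], floor: float = 80.0) -> str:
    """Pick the feed with the highest no-reference score; warn if even the best
    feed is below the quality floor, since downstream stages only make it worse."""
    best_feed = max(feed_scores, key=feed_scores.get)
    if feed_scores[best_feed] < floor:
        print(f"Warning: best available source ({best_feed}, "
              f"{feed_scores[best_feed]:.1f}) is below the {floor} floor")
    return best_feed

# Hypothetical per-feed scores produced by a no-reference metric such as SSIMPLUS.
feeds = {"broadcaster A (New York)": 86.4, "broadcaster B (Toronto)": 81.9}
print("Selected source:", pick_best_source(feeds))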

Figure 2: When we monitor multiple services carrying the same event, each feed is identified by its broadcaster logo. By comparing feeds that originate from different locations, we can tell whether an issue lies with the source.

Issue #3: Not testing your video processing workflow extensively across content types and not setting it up with the end-viewer quality in mind

Pay-TV providers and streaming services generally test their encoders and media processing workflows before they select and configure them. But that testing is often done with a limited amount of content. These trials typically do not cover all sports, or even the content variation inside a single game. A soccer match, for example, contains a wide array of camera shots, from wide panning motion to zoomed-in action in front of the goal.

Encoding settings geared towards saving bits

It is commonly observed that the encoding configuration that delivered the “best quality” at a specific bitrate for a test reel may result in much lower video quality on other kinds of sports content. In YouTube TV’s case, our subjective and objective assessments found the video quality to be low across most sports, with a few exceptions.

These results suggest that YouTube TV’s overall encoding approach is mostly geared towards saving bits, using a variable bitrate (VBR) scheme that does not perform well across all sports content.

Not using a perceptual quality-aware system

A higher bitrate is not always needed to deliver higher quality. This is where having a perceptual quality-aware automated system can help a service choose the best configuration for each sport that achieves the targeted video quality with the minimum amount of bits. 
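Below is a minimal sketch of that selection logic, assuming each candidate configuration has already been measured on representative content for the sport in question. The configuration names, bitrates, and scores are hypothetical, and this is not any particular product’s API.

def cheapest_config_meeting_target(candidates: list[dict], target_score: float) -> dict | None:
    """Among candidate encoder configurations (each already measured on representative
    content), return the lowest-bitrate one whose Viewer Score meets the target."""
    viable = [c for c in candidates if c["viewer_score"] >= target_score]
    return min(viable, key=lambda c: c["bitrate_kbps"]) if viable else None

# Hypothetical measurements for one sport's top ladder variant.
candidates = [
    {"name": "1080p CBR 8 Mbps",   "bitrate_kbps": 8000, "viewer_score": 84.2},
    {"name": "1080p VBR ~6 Mbps",  "bitrate_kbps": 6000, "viewer_score": 78.9},
    {"name": "1080p CBR 6.5 Mbps", "bitrate_kbps": 6500, "viewer_score": 80.7},
]
print(cheapest_config_meeting_target(candidates, target_score=80))

With a target Viewer Score of 80, the sketch picks the 6.5 Mbps configuration: the cheaper VBR option misses the target, and the 8 Mbps option spends bits the viewer would not notice.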

Various content-aware encoding approaches claim to adjust encoding behavior based on content complexity, but none of them has, at its center, a video quality metric that correlates well with the human visual system and can measure how successfully the encoder achieves the target quality.

Figure 2 shows the performance of a source and encoder at a bitrate of ~20 Mbps across various content types. The top and bottom curves plot the video quality of the source and the output of the encoder, respectively. The source quality varies significantly between the three scenes: a live NBA game, in-studio analysis, and a remote interview.

Not surprisingly, the source performs the best when the content is captured in a controlled environment at a studio and performs the worst when the remote interview is conducted. The difference in the score between the source and output is driven by encoding performance. The higher the difference, the lower the encoding performance and vice versa. The encoder fails to deliver consistent performance across the content types, even though it provides a content-aware encoding approach. SSIMPLUS is the only metric that can provide such an in-depth analysis of content quality across multiple monitoring points in an apples-to-apples fashion.
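The same calculation is easy to express in code. This sketch uses illustrative numbers (the chart’s exact values are not reproduced here) to show how the source-versus-output delta exposes encoding performance per scene type:

# Illustrative per-scene scores in the spirit of the chart described above.
scenes = {
    "live NBA game":      {"source": 78.0, "encoded": 66.0},
    "in-studio analysis": {"source": 92.0, "encoded": 87.0},
    "remote interview":   {"source": 70.0, "encoded": 61.0},
}

for scene, scores in scenes.items():
    delta = scores["source"] - scores["encoded"]  # larger delta = worse encoding performance
    print(f"{scene}: source {scores['source']}, output {scores['encoded']}, "
          f"encoding loss {delta:.1f} points")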

Annoying impairments due to lack of a proper video quality assessment

Video processing workflows often introduce annoying impairments that make it difficult for viewers to enjoy the content. We observed this behavior repeatedly when assessing YouTube TV’s video quality. A key example is the de-interlacing artifacts commonly observed across multiple sports and games delivered by YouTube TV. Such impairments were not always present in streams from other services delivering the same content. The best way to ensure that your workflow is free of annoying impairments, such as the combing effect caused by poor de-interlacing, is to monitor continuously and automate video quality assessment.
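Automating that assessment can be as simple as flagging segments whose score falls below a floor or drops sharply from one segment to the next, the kind of signature that sudden impairments such as combing tend to leave. A minimal sketch, with hypothetical thresholds and scores:

def flag_impaired_segments(segment_scores: list[float], floor: float = 60.0,
                           max_drop: float = 10.0) -> list[int]:
    """Return indices of segments whose score falls below an absolute floor or
    drops sharply versus the previous segment."""
    flagged = []
    prev = None
    for i, score in enumerate(segment_scores):
        if score < floor or (prev is not None and prev - score > max_drop):
            flagged.append(i)
        prev = score
    return flagged

# Hypothetical per-segment Viewer Scores for one channel.
print(flag_impaired_segments([72, 71, 47, 52, 70, 69]))  # flags segments 2 and 3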

Figure 3: An example of interlacing artifacts in YouTube TV’s Formula 1 stream, most commonly caused by poor de-interlacing.

Figure 4: Another example of YouTube TV’s encoder struggling to keep details crisp and clear. The grass details are completely lost.

What’s next?

Each service has its own unique workflow elements and setup that could affect video quality. The best way to know which part of the workflow is deteriorating sports fans’ viewing experience is to measure methodically at each critical point in an automated fashion.

In our next post, we will continue to evaluate YouTube TV’s performance, looking at four additional issues and sharing ways that sports streamers can optimize and automate their workflows to improve video quality and reduce churn. Subscribe to our blog so you don’t miss it.


Want to learn more about how you can ensure your live streaming business does not allow bad video quality to creep in? Download our Live Sports Benchmarking White Paper.

 

The table below contains some of the highlights from each sport.

Sport               Game / date             YouTube TV     Best performing service   Difference*
                                            Viewer Score   Viewer Score
NFL                 October 16, 2022        78             83                        -5
MLB                 September 7, 2022       71.5           85.5                      -14
Formula 1           Hungarian Grand Prix    64.1           79.5                      -15.4
PGA Tour            August 28, 2022         77.4           77.4**                    0
US Open (tennis)    August 30, 2022         68.1           76                        -7.9
Premier League      September 4, 2022       76.5           83.3                      -6.8

*A difference of 5 or more Viewer Score points is readily noticeable.
**In this case, YouTube TV performed better than the other monitored service, which scored 75.8.