The VideoVerse

TVV Ep 12 - The evolution of web video elements with Steve Heffernan, creator of Video.js and co-founder of Mux/Zencoder.

February 01, 2023 · Visionular · Season 1, Episode 12

Welcome to our podcast! Today we have Steve Heffernan, creator of Video.js and co-founder of Mux, joining us. We'll be talking about the challenges that come with scaling video technology for websites, and how this technology has become an integral part of web design in recent years. Steve will share his experience on what it takes to make video more affordable and efficient, as well as how businesses can leverage these technologies for their own growth. Get ready for a fascinating journey into the world of video streaming!

Watch the full video version.

Learn more about Visionular and get more information on AV1.


[Announcer] Welcome to The VideoVerse.

Nathan: Hey, everyone. Welcome to this episode of The VideoVerse. As usual, I am joined here with my esteemed colleague, Zoe Liu. And today, we have a guest I'm really excited to talk to. Some of you will know who he is, but I almost guarantee every one of you will know what he's done, even if you haven't heard of him yet. But Steve, why don't you give us a really quick intro, who you are and what you currently do, and then we'll dive back a little bit into your past too.

Steve: Cool, great. Yeah, great to be here. I'm Steve Heffernan. I'm one of the co-founders of Mux, where we build APIs and other tools for developers working with video. Before Mux, I was a co-founder at Zencoder, the cloud encoding service, and alongside Zencoder, I created Video.js, the open source web video player.

Nathan: And I guarantee everyone has used at least one of those products in the past year. If you have anything to do with video delivery, you've at least used one of them for sure. So Steve, obviously you have a really interesting history when it comes to video delivery, and I'll just use that as my broad term: whether it's the player, the encoding, or what Mux is doing now on the development side. If we go way, way back, as far as we can think, what was your first encounter with video encoding and video delivery? How did you fall into this?

[00:01:41 First encounter with video encoding and delivery]

Steve: Yeah, totally. So I got into streaming video in the early 2000s when we still had the browser plug-ins. We still had Windows Media Player, QuickTime, RealPlayer. Flash was on the scene, and it was gaining user share. Flash came in and became that one plugin that 99% of users had, and so you could finally stream video to one single experience as opposed to across all these different plugins. But that's the context of when I got into it. In the early 2000s, I was working at my university, I went to Azusa Pacific University, and I was working on the website there. And we wanted to post student testimonials, why they loved the college, and we wanted to post the videos online. And so they set me to the task of doing that, and nobody had done that before, and YouTube-

Nathan: I was gonna say, pretty cutting edge, right?

Steve: Yeah, a little bit. Like, we definitely weren't the first university doing anything with video, but YouTube didn't exist, none of the tools existed. You really were building from scratch. And of course, as a young developer, I totally over-architected the solution and built way more than I needed to just get a video playing on the website. But yeah, Flash, so Flash was there.

And so first, I had to learn how to build a Flash video player. While Flash was supported across all browsers, it actually didn't come with built-in controls. There were no default video controls for Flash. You had to build the play button and the time seek bar and all of that from scratch, including all of the logic behind the scenes to track how that worked.

And so it was a really good primer for just, this is how video playback works, and the fundamental details of how the UIs come together, at least. But then at the same time, I had to learn video encoding 'cause we had to create these video formats that would play on the web. And I forget, I think at the time there were two. There was Sorenson Spark, which was the original Flash codec.

Nathan: Yes. I used that. Yeah, yeah.

Steve: And VP8. VP8 was the one that... No, VP6, it was VP6. And then H.264 came on the scene at some point in Flash. But that was the kind of progression of codecs that Flash could play at the time. And so, what was the tool? Sorenson Squeeze was the tool I was using at the time. We had an office iMac. I would set Sorenson Squeeze to encode these 480p videos and it would take all night long. So I'd just set it to go overnight, the videos would encode on the iMac, and I'd come back in the morning, the videos would be done, and I'd be able to upload them to the servers and see them play back in this Flash video player. So yeah, that was a really cool experience, a really big learning experience, and I went from there.

Nathan: And so I'm guessing, and boy, I tell you what, as I was telling Zoe just before we got on here about what I'm about to ask you, I feel I was the beneficiary of so much of what you worked on. I was doing what you were doing, where you encode it overnight a bunch of times, only to come in the next morning and realize that it looked like garbage, and you have to try again. And you encode it again if you have time. But I'm guessing, around that point, the next phase was your dive into what became Video.js, the HTML5 player, right? Tell us why that was a critical moment in video delivery. Or am I jumping too far ahead?

[00:06:19 Critical moment in video delivery]

Steve: Well, I'm trying to think if there are a few bits of context before there that are relevant. I'll just go through it and let me see if it's interesting. So I started on Video.js in early 2010, essentially. And right before that we had two things happening. One was the cloud in general. 2008 to 2010, this is the birth of the cloud, of Amazon S3 and EC2. And that's kinda where Zencoder ultimately came from. But even before that, my co-founders at Zencoder and I partnered with On2 Technologies around 2008. On2 Technologies is the creator of the VP codecs, so VP6, VP8. And we had a partnership with them around 2008 to build a cloud encoding service called Flix Cloud at the time.

And it was API-to-cloud transcoding, and it was powered by Flix Engine, the On2 command line based encoder. And we had that running on EC2 in the cloud, and people could send videos to it and it would process them in the same way that cloud encoding services work today. That was one of the first cloud encoding services. And around that time, the exciting thing was VP8. It was the new codec with better compression than its predecessors. Certainly at the time, they were claiming it was better compression than H.264. And we built a whole service around it. And then On2 was acquired by Google.

Google wanted to open source VP8 and basically make it royalty-free for everybody worldwide. And VP8 became the WebM codec, not codec, the codec of the WebM format. So there's this interesting progression there. Google bought On2, and they wanted nothing to do with our little Flix Cloud service that was running on the side of what On2 was doing otherwise. And so they were like, "Hey, shut that down." And so we did that, but at the same time we decided, okay, we'll build Zencoder instead. And so that's how Zencoder was born.

And with that, we applied to Y Combinator, which is the popular startup incubator out in Mountain View. And we got in. We were surprised, but we got in, and getting into an incubator like that gives you the excuse to just shirk all other responsibilities, move to Mountain View, and focus on one thing for three months. And so we did that. We left our families and girlfriends, who were around the country, moved to Mountain View for three months, and just built Zencoder. We rented this really interesting house in the forest in the Santa Cruz mountains. You take a small road to a smaller road to a smaller road in the middle of these redwood trees, with the banana slugs and the turkeys, and we're just hacking on Zencoder. My two co-founders had families, and on the weekends they would go home to visit them and leave me in the middle of this forest just doing whatever.

At the time, HTML5 video was this new thing. In early 2010 it was probably supported in 20% of users' browsers at most; you had to have the most recent version of Safari and things like that. And so I took one of those weekends to hack on HTML5 video and see if I could build a player on it in the same way that I had built the controls for Flash back in the early aughts.
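For context, the heart of that kind of weekend hack is wiring your own controls to the HTML5 video element's JavaScript API. Here's a minimal sketch of the pattern, not the original tutorial's code; the #play button and #seek range input are assumed markup:

```ts
// Minimal sketch of custom controls on the HTML5 video element (assumed
// markup: a <video>, a <button id="play">, and an <input id="seek" type="range">).
const video = document.querySelector('video')!;
const playBtn = document.querySelector<HTMLButtonElement>('#play')!;
const seekBar = document.querySelector<HTMLInputElement>('#seek')!;

// Toggle playback through the element's standard API.
playBtn.addEventListener('click', () => {
  if (video.paused) void video.play();
  else video.pause();
});
video.addEventListener('play', () => (playBtn.textContent = 'Pause'));
video.addEventListener('pause', () => (playBtn.textContent = 'Play'));

// Keep the seek bar in sync with playback, and seek when the user drags it.
video.addEventListener('timeupdate', () => {
  seekBar.value = String((video.currentTime / video.duration) * 100 || 0);
});
seekBar.addEventListener('input', () => {
  video.currentTime = (Number(seekBar.value) / 100) * video.duration;
});
```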

And so, yeah. That was a really fun exercise. I turned that into a tutorial. That tutorial got picked up by Daring Fireball, which was a popular blog at the time. 20,000 people visited the site. And it was like, oh my gosh, that's so many people. And I turned that into a library that became Video.js, essentially. So that's the origin of Video.js. But from there, if you want me to, I can just kind of spill into HTML5 and why that's the important thing. Does that make sense?

Nathan: Yeah, let's talk about that. I remember as somebody who was implementing it, I was a customer, if you will. I mean, it was open, but it was a huge deal for us. But I don't wanna say why it was a huge deal for us. I'm curious on your end why this was such a big deal to the industry 'cause obviously, it changed everything.

Zoe: Yeah, I really want to mention that I didn't know that. I even found some overlap with Steve in some way, because On2 was acquired by Google and became the WebM team. I was with that team for 3 1/2 years before I co-founded Visionular. That was the team I was working on.

Steve: Oh, that's great. That's great.

Zoe: Yes. 'Cause that was the team that got VP9 out of VP8 and finally got AV1 out, together with a few other members.

Steve: Yeah. So we were working with John Luther at the time.

Zoe: I didn't work with him, but I knew the name really well.

Steve: That's really great. Small industry.

Nathan: Such a small one.

Zoe: Yes. Because I was really... It's just like you're telling a story from before I was born, kind of like that.

Nathan: Exactly.

Steve: That's great.

Zoe: Well, I didn't touch the VPx series of codecs until I joined Google. I knew of them, but prior to that, I was mainly working on H.263 first and then H.264. That was my prior experience. I never got my hands on any of the open source video codecs until I joined Google.

Steve: That's great. That's really cool.

Nathan: It is interesting.

Zoe: Go ahead. You're welcome to share more fun stories down here.

Nathan: Yeah, I was gonna say, it is interesting that back then, Google was trying to do the open source thing, and some things, it seems, have never changed, right? They're still pushing that. That's cool. Go ahead.

[00:13:30 Importance of HTML5]

Steve: Well, yeah, your question was, what was the impact? What was the importance of HTML5? I'll say honestly, from my perspective, initially it just kinda felt right to begin with. The reason I started playing with HTML5 video was not because I was expecting a big industry change or opportunity or anything like that. It was that here is a native HTML element that is meant to play back video, and it is built into the browser. It doesn't require this third party application to pull into the browser, and you just use JavaScript and HTML and CSS to build your video player. I liked Flash a lot, but it always felt like this third party ecosystem that you had to deal with in order to do video in the browser.

And so for me as a developer, that's what was initially most exciting about HTML5 video. The fundamental positive is that because you didn't have to run this essentially separate application of Flash to get video working, it could be more performant than the Flash video player, at least in the scheme of what was required of your computer to run both the browser and Flash in order to get video to play. And that's essentially what Apple pointed to. They may have had a lot of other reasons, but one of the main things Apple pointed to when they decided to basically put their foot down and say Flash will not exist on the iPhone was the performance aspects of Flash. And so that decision by Apple to not allow it on the iPhone is ultimately what killed Flash.

I still think HTML5 video would've ultimately taken over just because of that nativeness to the browser, but I think it would've taken many more years to get there. Apple really helped fast forward that progress. But it did mean HTML5 video was still really immature when, I'm trying to remember when they really kind of cut Flash off. But in Flash, we had built this huge ecosystem of player technology. We had advertising, we had DRM, we had lots of interesting interactive features that were all working in Flash, open source and proprietary stuff. Across the board, there was this big thriving ecosystem around video player development.

And HTML5 video didn't support any of that for many years. We couldn't do adaptive streaming in HTML5 video until Media Source Extensions, which was maybe 2015. You couldn't do advertising easily. DRM didn't come until around 2015, 2016. So we were playing catch-up with HTML5 video for a lot of years. There was this really awkward phase where we had all this in Flash, and now we've gotta transition from Flash to HTML5 video, and we've gotta do it fast because the iPhone doesn't support Flash. But not everything's there.

And so that was an interesting period. But certainly before iOS put their foot down and said, "We're not gonna support Flash," the industry was really reticent about HTML5 video. Most people I talked to said, "Oh, only 20% of browsers support it, and we have all this stuff in Flash already. HTML5 is never gonna take over." There was that kind of attitude. And so, after building this library, I went out and did a lot of speaking at Streaming Media and other conferences, trying to convince people: HTML5 video is gonna be a thing, get ready for it. And some people were excited about it, and some people were like, "I don't wanna touch that. That's just more complexity for our video players that we don't need." But yeah. That's something.

Nathan: Yeah, it feels like there's a kind of history repeating itself over and over again, because technology is technology. Everything you just described, we're kind of in that same position right now with some of the new codecs coming out and some of the new technologies within all of that. AV1, everything, there's a lot of, "Oh, it's too complex, it's too resource intensive." And yet it's changing so fast that I think we're probably gonna see a similar type of situation there. You had shared a little bit of this history-repeating-itself idea around the concept of web components in the browser, which is different than what Flash was. We all remember those "you gotta update your Flash" notifications, and we don't enjoy those. Talk to us about, in today's technology world, what this means for us video delivery-wise with these web components in the browser. And I'm purposely being vague so that you can unpack it.

Steve: Thank you, appreciate that. Yeah, I think, again... Okay, firstly, there's this new technology in browsers called web components. There are great underlying technologies underneath that term, but the summary is that it allows us to finally create native-looking HTML tags, HTML elements, in the browser ourselves, as opposed to relying on the default set of HTML tags that have been there for the last 30 years or so.

Before web components, everything was a div. Every custom button, every custom UI thing was UI built onto this generic div HTML tag. And you ended up with a div soup, where it's divs embedded in divs, and all of the semantic, readable structure of HTML just kind of went out the door, because we just didn't have the tools to build our own HTML components into the browser. And so web components unlocks that ability. And the significance of that for video players is that, up until this point, the HTML video element has been the center of the universe for video in a browser.

And there are a lot of good APIs that unlock what you can do with it, like Media Source Extensions, and you can build custom controls on top of that. But it's still kind of limiting in ways, because you can't easily do video outside of the video element. There are a few things that I think web components really unlocks... Well, let me take a step back there. Because of the way the video element is, you end up with all of these proprietary systems built to create all of these different controls and features on top of the video element.

So the YouTube player has the video element at the base, but then there's this whole proprietary system built on top of that that only YouTube uses. And Video.js is open source, but it's also a whole proprietary system built on top of the video element that is not compatible with other video players, because you really do have to build everything from scratch there, given the limitations of the API. But what web components allows us to do is essentially point to the video element API and say: every video player out there, if you're gonna expose an API in JavaScript, make it work exactly like the video element API.

What that does is make it so that if I create a custom play button, I can make an HTML tag that is a play button, drop that in the browser, and point it at either the video element or one of these players that's mimicking the video element API, and that play button will work across both players. And in the same way, you can build additional video players that work with an ecosystem of UI components that are all expecting this one API. It's kind of a complex idea, but at the end of the day, what it allows for is an ecosystem of development where, as long as we're aligning on this central API design, we can now start working together.
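To make that concrete, here's a minimal sketch of such a component. The my-play-button element and its target attribute are hypothetical names, and the button only assumes the video element's API surface (paused, play(), pause(), and the play/pause events), so it works against a native video element or any player mimicking that API:

```ts
// Hypothetical custom element: a play button that depends only on the
// standard video element API, not on any particular player implementation.
class MyPlayButton extends HTMLElement {
  private media: HTMLVideoElement | null = null;

  connectedCallback() {
    // Locate the media element via an assumed "target" attribute (an id),
    // falling back to the first <video> on the page.
    const targetId = this.getAttribute('target');
    this.media =
      (targetId && (document.getElementById(targetId) as HTMLVideoElement)) ||
      document.querySelector('video');

    this.addEventListener('click', this.toggle);
    this.media?.addEventListener('play', this.render);
    this.media?.addEventListener('pause', this.render);
    this.render();
  }

  toggle = () => {
    if (!this.media) return;
    if (this.media.paused) void this.media.play();
    else this.media.pause();
  };

  render = () => {
    this.textContent = this.media && !this.media.paused ? 'Pause' : 'Play';
  };
}

customElements.define('my-play-button', MyPlayButton);
// Usage: <video id="v" src="..."></video> <my-play-button target="v"></my-play-button>
```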

All of these different player development groups that are solving the same problems in the browser over and over again can start sharing code more easily, in an ecosystem where we can find these different things and apply them together. It's interesting, a parallel is, I'm not an expert on native iOS video player development in Swift, or ExoPlayer on Android, but you have these somewhat more closed ecosystems where you just know: if I'm working on Android, I'm using ExoPlayer and these components. If I'm on iOS, I'm using the built-in iOS controls and really relying on the Apple team to unlock different player features.

And so there's been a benefit to those ecosystems in actually pushing video player progress forward, because each is a little bit closed, but you can build in this predictable ecosystem. Whereas the web is meant to be open, and yet we're building all of these proprietary video players. We're not actually working together on the video components. What web components actually opens up is for all of these different groups to now start working on the same problems together. So I don't know if that explains it a little bit better. Does that make sense?

Nathan: Yeah, I think so. And I'm trying to wrap my head around it. What's really interesting is that in one way it's opening things up, because people can start working together more, the whole open source community concept. Of course, the danger with anything open source is you get some really good stuff and some not-so-good stuff, 'cause it's everybody and anybody. But the Apple way is almost to lock everything down, which makes it much more controlled but also much more reliable and predictable. This feels like a weird kind of mixture of those two. Am I thinking right?

Steve: Yeah, it's definitely on the end of the spectrum where we can now lean more heavily on the idea that we are working on the open web and we should be sharing things more than building proprietary systems.

Nathan: I was gonna say that that would speed up development time considerably, I imagine, 'cause we're not reinventing the wheel over and over.

Steve: Yeah, that's the thought. We have some work to do to get there, but I feel as strongly about web components and what they're unlocking for video as I did about HTML5 and what HTML5 unlocked for video. It's in a place where, for a lot of people, it's still blurry how that ultimately comes together. But I'm really excited for the potential of where we're going. And we've started to build some things around it.

Nathan: Are we gonna find you at conferences on your soapbox preaching this like you did HTML5?

Steve: Yeah, I have a few talks out there, and I think I'm definitely beating a dead horse at this point. I'm probably getting very annoying on the subject, but yeah, I'm excited about it and telling people about it. We've started to build some projects around it. For example, we have a few media HTML elements, like a wrapper for the YouTube player that exposes the same API as the video element, and then we have a set of controls that we call Media Chrome, controls that will work with the video element or any other player that works with the same API as the video element. And so that's this interesting kicking-off point, where we're trying to kick off this ecosystem and help people understand what it means and start using these things.
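As a rough illustration of what that looks like in practice, here's a sketch based on the media-chrome project's documented pattern; treat the exact markup as illustrative rather than authoritative, and the video URL is a placeholder:

```ts
// Sketch of the Media Chrome pattern: controls that talk to whatever sits in
// the "media" slot through the video element API.
import 'media-chrome'; // side-effect import registers the <media-*> elements

document.body.innerHTML = `
  <media-controller>
    <video slot="media" src="/videos/example.mp4" preload="auto"></video>
    <media-control-bar>
      <media-play-button></media-play-button>
      <media-time-range></media-time-range>
      <media-mute-button></media-mute-button>
    </media-control-bar>
  </media-controller>
`;
// Swap the <video> for a wrapper element that mimics the same API (like the
// YouTube wrapper Steve mentions) and the same control bar keeps working.
```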

Nathan: Yeah, that's interesting. Well, Steve, I wanna shift gears just slightly here. This has been really fun, walking down video delivery memory lane, and obviously Video.js was a huge piece of this, Zencoder is a piece of this. But today, you're at Mux. And if I'm not mistaken, a big part of where Mux got its start, or I don't know how you wanna phrase it, had to do with video data for the developer. Is that fair to say? Talk to us a little bit about that.

[00:27:45 Start solving for people who want to build more video-related things]

Steve: Cool, yeah. So this was around the end of 2015. Mux's co-founders and I came together, and we first decided we wanted to stay in video, which was a choice in itself. Like, do we wanna go do some random social startup or something like that, or do we wanna stay in video? And we decided to stay in video. A, because we had all this expertise and knowledge, and B, video is just so interesting. There's no end to the depth of what you might learn, from advanced codec math to video player UIs and the complexity around that. There's this infinite depth when it comes to video technology.

And so I think we were just really excited in general to stay in video technology, and we were exploring: where's a problem in video technology that we could jump in and start solving for people like us who want to build more video-related things? At the time, my co-founder Matt McClure, I think it was 2013, kicked off the San Francisco video technology meetup, which I think has had a meetup every single month since 2013 without a miss. That's pretty amazing. No thanks to me; Matt is the brain behind that. And that's been really cool to see.

Through that meetup, because there are a lot of people working on video technology specifically in San Francisco, we were able to develop friendships with engineers working at YouTube and at Netflix, and understand a little bit more of what they were working with at that level of scale and professionalism around video.

And one of the things that stood out to us was just this access to data and the power that they had around data platforms, which we had not really seen outside of YouTube and Netflix specifically. There's this ability to see everything that was happening around the world on video players, on client devices, and understand quality of experience metrics: how much rebuffering are people seeing, what playback failures are happening, and details like that.

But then also being able to experiment, make a tweak to the UI or to the adaptive algorithm or to the codecs being used, and see the ultimate impact on watch time, the overall engagement with the video platform. Do people watch more? Do they watch less if I make this change to my adaptive algorithm? That was just such a powerful concept, to be able to make those changes, see the impact, and continue to optimize your platform toward the best experience you possibly can. And so we were excited about that and decided to build a service around it, allowing other people to get access to the same type of data.

And so today that product is called Mux Data. We launched it in 2016, and we've helped stream the last couple of Super Bowls with that product, and the World Cup, and a lot of other services are using it to do the same kinds of things that YouTube and Netflix can do with their powerful data platforms.

And that's been really cool to get into. It was a completely different angle on video than video players and video encoding. It was like, okay, now it's video data, and it's a whole new world of analytics and metrics and the math behind that. So it was both a sweet spot of video and a whole new problem area to learn and understand how to do well.

Nathan: No kidding. Yeah, and it's actually interesting to think that, number one, there was a day when that wasn't readily available to people. They couldn't get those kinds of insights. They streamed a video and then it was like, well, I hope it did well; I don't really know anything about it. And number two, while that has become, I don't know if I can call video data table stakes yet. To me it's like, how can you do anything with video, from a marketing to a business to a development standpoint, without the data? But it does still seem, in the video world, correct me if I'm wrong, that there are still a lot of people who don't realize the insight that's available, or if they do realize it, they don't know what to do with it. Do you think that's a fair assumption?

Steve: It definitely depends on what type of data we're talking about. Because when we talk about video data, to your point, there's this wide swath of anything from pure usage metrics, to engagement metrics and what are the most popular parts of this video, to rebuffering and player errors and things like that on the other end of the spectrum.

And so it really depends. Anybody that's doing a video streaming platform at scale, they're deep into the metrics, probably across the board, both on the engagement and the quality of experience side of things. But we also talk to a lot of smaller startups, or teams just building a video feature into their social application or something along those lines.

And especially at the startup stage, you just have all these fires and big initiatives that are ahead of reducing rebuffering by 10%, to the point where you're not doing much with that level of data yet. You're probably first looking at engagement, just understanding: are people actually watching video in this application, and is it boosting engagement of the app? Before you're really focused on improving the overall experience.

Zoe: Yeah, I actually have quite a few questions, but being respectful of time, I just want to throw one out. You've been talking starting with Video.js, and now we are talking about Mux Data. So I'm just curious whether you use your own Mux Data product to gauge Video.js, for example, compared with other JavaScript players?

Steve: Like, why don't we use Mux Data to basically...

Zoe: No, I'm just curious because, using Mux Data, there's quite a lot you'd want to validate. For example, you just mentioned it's mainly the quality of experience and the engagement of the users. And just now we talked about players. So I just want to know, from the player perspective, as gauged by Mux Data, how different players perform, as manifested in the Mux Data.

[00:35:17 How different players perform, as manifested by Mux Data]

Steve: Yeah, it's a good question. There's one complexity first with Video.js: it's open source, and so, well, I would love to just drop Mux Data onto every Video.js install and be able to understand the performance around the world. But people don't tend to want an open source project to come with a data beacon built into it. And so we certainly haven't done that with Video.js, and we don't plan to.

But certainly, we often work with specific customers at Mux who are evaluating different players and comparing Video.js versus Bitmovin and THEOplayer and some of the other professional players out there. And it really is interesting to see the performance comparisons that they do there. And it's very context-specific. I wish I could tell you, oh, this player is always number one, but it's very context-specific. It depends on the type of content you have, the length of content you have, whether you're trying to go for low latency or don't mind latency. All these details ultimately come into the specific context of which player's gonna be right for you.

Zoe: Yeah, so I think that's a reason you have Mux Data, right? Because you said it's really hard to justify whether one player is always the best choice, because of the different scenarios, different requirements. And so Mux Data has different perspectives to manifest the performance in different scenarios.

Steve: Yeah. I will say we just launched a beta of a new open source player specifically for our Mux Video product. And so now we are actually in a place where we get to use our Mux Data product to optimize a player that we are in full control of. Five years after building Mux Data, we're in a place where we can really put it through its paces with a project that we're building ourselves. So that's been really fun.

Zoe: Yeah, so it's basically a closed loop. Let me dig into the player, because along the line, from my experience, we are basically doing the encoder, which is mainly the server side. And so we are wondering about the player side. For example, we always talk about time to first frame, and freezing, those kinds of performance metrics. So from the player side, because you created Video.js in the beginning, and this is mainly about technology: what do you think would matter the most? For example, regarding time to first frame, what kinds of technologies are down there? And also the freeze rate, because with video, we are talking about a three-dimensional signal.

If you just get very good quality but it plays like a slideshow, that's no good; video is more about smoothness, about the flow. So freezing is also very important. And overall, we know that Mux is doing Mux Video not only for files but also for live. And you also announced that you're doing really low delay, RTC kind of streaming. So that means there's also delay down there. So there's the delay to first frame, the freeze rate, and the overall delay. From the player point of view, can you share some technologies for how these kinds of performance metrics can be further optimized by the player?

And then further, I would be interested in, I know Mux Data can manifest all of this. So on one side, how is the player being optimized to make this kind of performance even better? And on the other side, how does Mux Data give feedback to the player to make it even better?


[00:39:46 Optimize performance from player perspective]

Steve: Great. So basically, how can we optimize some of these time to first frame and rebuffering or freezing metrics more from the player perspective, right?

Zoe: Right, just a little bit of detail from the technology point of view. I guess many players are trying to achieve that.

Steve: Absolutely, let me think about that. There's one layer that we start from. On the Mux Data side of things, we try to take, at a high level, the perspective of the viewer first, as opposed to the technology. It's really easy to start looking at the technology and just measuring the technology, but we try to stay at the level of: what did the viewer experience?

And so even with time to first frame, there's a subtle difference in this time-to-playback, time-to-first-frame startup metric. If you're just looking at the time from when the manifest was requested to when the first frame showed up, you're really measuring the technology at that point. Whereas if you're measuring from when the user clicks play, or the player decides to autoplay, the subtle difference there is the opportunity to preload the video. You can have a strategy where you preload the video ahead of time.

And then when the user clicks play, it starts immediately, 'cause the video's already been loaded. And that's the subtlety: if we're just measuring the manifest load time, that would give you a different story than asking, what did the viewer experience? So we definitely try to start from that perspective, and preloading the video is the obvious thing. If you're really trying to create a good startup experience, that's the easy one to point to.
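Here's a minimal sketch of that distinction, measuring startup time from the viewer's click rather than from the network request; the #play button is assumed markup, and requestVideoFrameCallback is used only where the browser supports it:

```ts
// Measure viewer-perceived time to first frame: the clock starts at the
// click, not at the manifest request. Preloading makes the two diverge.
const video = document.querySelector('video')!;
const playButton = document.querySelector<HTMLButtonElement>('#play')!;

video.preload = 'auto'; // start fetching media before the user asks to play

playButton.addEventListener('click', () => {
  const clicked = performance.now();
  void video.play();
  // Where supported, requestVideoFrameCallback fires when a frame is actually
  // presented, which is closer to what the viewer experiences than 'playing'.
  (video as any).requestVideoFrameCallback?.(() => {
    console.log(`time to first frame: ${Math.round(performance.now() - clicked)} ms`);
  });
});
```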

But yeah, there's lots of complexity from there. You have interfaces on mobile where we're swiping between many videos, and different applications that I know aim for sub-one-second time to first frame, where if it's above one second, that's a failure on startup time. And so, outside of even just the player itself, you're juggling the memory of multiple players on the page, how you're swapping between those players, and preloading the videos in the background.

If you're predicting this person is likely gonna swipe to this next one, or this is the obvious next one they're gonna go to, then at some point you start preloading that video so that when they do swipe, it's ready to start going right away. So there are lots of interesting things you can get into on the client side to start optimizing the viewer experience and getting ahead of what you think the user is gonna do.
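A sketch of that feed pattern might look like the following, keeping a small pool of video elements and warming the predicted next one; the pool size, URLs, and function names are all illustrative:

```ts
// Two-element pool: one plays the current video while the other silently
// preloads the predicted next video in the background.
const pool = [document.createElement('video'), document.createElement('video')];
let current = 0;

function show(url: string, predictedNextUrl?: string) {
  const active = pool[current];
  if (active.getAttribute('src') !== url) active.src = url; // may already be warm
  void active.play();

  if (predictedNextUrl) {
    const standby = pool[(current + 1) % pool.length];
    standby.pause();
    standby.preload = 'auto';
    standby.src = predictedNextUrl;
    standby.load(); // begin buffering now, before any swipe happens
  }
}

// On swipe, the warmed standby element becomes the active one.
function onSwipe(nextUrl: string, urlAfterThat?: string) {
  pool[current].pause();
  current = (current + 1) % pool.length;
  show(nextUrl, urlAfterThat);
}
```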

When you get into the realtime video technologies, that's a whole nother level. We've certainly found that most people don't think of a difference between, let's say, live streaming and realtime video. What we're doing right now, people would call live video. But going one degree more technical, I would call this realtime video, whereas I would call the Super Bowl live streaming. The Super Bowl can have 30 seconds' worth of latency, and they're fine with that; they might want to bleep out a cuss word that an announcer says and have the space to do that.

So for them, latency is okay, not too much latency, but way more than a Zoom call or a Google Hangouts call, where you really have to be sub-300 milliseconds to have a really good conversation. Those are just vastly different scenarios, and at the end of the day, completely different technologies. The realtime video side of things is WebRTC, and you're making direct connections between clients, or with a media server in between.

You're not using traditional CDNs. Traditional CDNs have dramatically optimized the distribution of content and commoditized the price of delivering video at scale, and you can't use any of that with WebRTC. So you get lower latency, but added cost, when you start using that technology compared to HLS and DASH, which are still chunking up the files, putting them on a CDN, caching them around the world, and using that commodity hardware to make video delivery more efficient.

And so we are in a really interesting place right now as an industry, where we are trying to make WebRTC more scalable and affordable, so you can reach tens of thousands of people with WebRTC at that sub-one-second latency. And at the same time, we're trying to decrease the latency of the traditional technologies, HLS and DASH and segmented streaming, to try to get to the same place.

Either way, we just wanna get there with both of those technologies. And there are some strategies you can use in between. Strategically, if I was gonna build a platform like Twitch today, where you have lots of long-tail streamers, there might be zero to one person watching a given live stream. You don't wanna send that through an expensive encoding process where you're making multiple renditions and no one's ever gonna watch that video, for potentially millions of streamers.

That would be inefficient. And so you might start with: okay, let's just connect the streamer directly through WebRTC and connect the first five to 10 viewers to the streamer directly. And then once you get beyond that, we'll kick in HLS, start creating an encoded stream, and start scaling up the stream from there. Yeah, so it's complicated from that standpoint. I'm kind of rambling on that one, but that's a whole interesting landscape in itself.
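As an architectural sketch of that long-tail strategy (all names here, like connectViewerOverWebRTC and startHlsEncode, are hypothetical, and the viewer threshold is illustrative):

```ts
// Serve the first handful of viewers peer-to-peer over WebRTC; only spin up
// an HLS encode (multiple renditions + CDN delivery) once the audience
// justifies the encoding cost.
interface Viewer { player: { load(url: string): void } }
interface LiveStream {
  viewerCount: number;
  hlsActive: boolean;
  hlsPlaybackUrl: string;
}

declare function connectViewerOverWebRTC(s: LiveStream, v: Viewer): Promise<void>;
declare function startHlsEncode(s: LiveStream): Promise<void>;

const WEBRTC_VIEWER_LIMIT = 10; // illustrative threshold

async function onViewerJoin(stream: LiveStream, viewer: Viewer) {
  if (!stream.hlsActive && stream.viewerCount < WEBRTC_VIEWER_LIMIT) {
    // Cheap path: no transcode, sub-second latency, cost grows per viewer.
    await connectViewerOverWebRTC(stream, viewer);
  } else {
    if (!stream.hlsActive) {
      // Audience outgrew WebRTC: encode once, push segments to the CDN.
      await startHlsEncode(stream);
      stream.hlsActive = true;
    }
    viewer.player.load(stream.hlsPlaybackUrl);
  }
  stream.viewerCount++;
}
```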

Zoe: But you already mentioned quite a bit down there. You mentioned that when people talk about delay, once the delay keeps reducing to a certain level, the use cases and scenarios become completely different, which also requires the underlying technology to become completely different, like you mentioned. Some of the technologies nowadays, even low-delay streaming, may not be able to address it, and then we have to go with WebRTC. And you also mentioned the infrastructure down there, like CDNs, and also the renditions, which are nice to have. But you also mentioned, if it's long-tail content, then maybe we start by being conservative: just connect with one, two, three, five to 10 viewers directly, see what's going on, and if there are more coming, then we may convert to the other solution.

Steve: Yeah, the fundamental thing about video is that it's still hard to scale, and it's also still really expensive compared to text and images and things like that. Video is still really expensive. It puts you in these places where you have to make strategic economic decisions around how you're doing encoding, when you're doing encoding, and how you're doing the streaming.

And I'm really excited about where I feel the video industry is going, because it feels like we're moving to a place where video is just a primitive of the web, in the same way that text and images are just expected to be part of a website. I think video is going that same direction, where you just expect video to be part of a website and part of what every company is doing; you're gonna be doing something with video. But it's that expense of video, that scalability of video, that we're still pushing up against. And if we can unlock some of that efficiency and bring the costs down so that it's more in the realm of the affordability of just hosting images, then I think we're gonna see a lot more use of video, and I think it'll be really cool.

Nathan: Yeah. And it seems like we're on the cusp of a bunch of those things coming together here too. There's a lot of new tech that is doing just that, it seems to me. Very cool.

Zoe: The demand is definitely high. We all knew that over the past two or three years, during the pandemic, a lot of video was consumed. Now people are starting to talk about maybe, at least some of us on some days, going back to the office, but the world is already like this. For myself, I think we can't fully get away from online anymore. We can't go back to where we were before.

Nathan: Yeah, right. Exactly, it is what it is.

Zoe: And yeah, with more video being consumed, there are a lot of challenging but really rewarding, fun, and fascinating problems down there.

Nathan: Yeah, absolutely. Steve, this has been super fun. Thank you so much for coming on the show and being willing to unpack your experience, and share the highs and the lows, the challenges. I know there are probably a lot of listeners who either had no clue, "Wow, I had no idea that's where it came from," or who were right in there with you and are happy to hear somebody else was fighting those same battles. Thank you so much. And we look forward to talking to you.

First encounter with video encoding and delivery
Critical moment in video delivery
Importance of HTML5
Start solving for people who want to build more video-related things
How different players perform, as manifested by Mux Data
Optimize performance from player perspective