Future proofing video with high-res stills

On Saturday I wrote about how we're now capturing the world so completely that people of the future will be able to wander around it in accurate VR. Let's go further and see how we might shoot the video resolutions of the future, today.

Almost everybody carries a 1080p HD camera -- nearly all phones and pocket cameras shoot it. HD looks great, but the video displays of the future will do 4K, 8K and full eye-resolution VR, so our video of today will look blurry, the way old NTSC video looks blurry to us now. In a bizarre twist, in the middle of the 20th century everything was shot on film, at a resolution comparable to HD. But from the 70s to the 90s our TV shows were shot on NTSC tape, and thus dropped in resolution. That's why you can watch Star Trek in high-def but not "The Wire."

I predict that complex software in the future will be able to do a very good job of increasing the resolution of video. One way it will do this is through making full 3-D models of things in the scene using data from the video and elsewhere, and re-rendering at higher resolution. Another way it will do this is to take advantage of the "sub-pixel" resolution techniques you can do with video. One video frame only has the pixels it has, but as the camera moves or things move in a shot, we get multiple frames that tell us more information. If the camera moves half a pixel, you suddenly have a lot more detail. Over lots of frames you can gather even more.
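To make the half-pixel idea concrete, here is a deliberately idealized one-dimensional sketch. It assumes each low-res pixel is a pure point sample, the shift between the two frames is exactly half a low-res pixel and is known in advance, and there is no noise. Real super-resolution has to estimate the motion and cope with blur, so treat this as an illustration of the principle rather than a working algorithm. The names `sample` and `interleave` are made up for this example.

```python
def sample(scene, offset, step=2):
    """Point-sample every `step`-th value of the scene, starting at `offset`."""
    return scene[offset::step]


def interleave(frame_a, frame_b):
    """Merge two frames whose sample grids differ by half a low-res pixel."""
    merged = []
    for a, b in zip(frame_a, frame_b):
        merged.extend([a, b])
    return merged


# The "true" scene detail: 8 samples along one scan line.
scene = [3, 9, 4, 8, 5, 7, 6, 6]

# Two 4-pixel frames. The second is taken after the camera has moved
# by half a low-res pixel (one sample of the true scene).
frame_a = sample(scene, 0)
frame_b = sample(scene, 1)

# Neither frame alone shows the alternating detail, but together they do.
recovered = interleave(frame_a, frame_b)
print(frame_a, frame_b, recovered)
```

Each 4-pixel frame on its own looks like a smooth ramp. Only when the two are combined does the fine alternating detail of the scene come back.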

This will already happen with today's videos, but what if we help the software out? For example, if you have still photographs of the things in the video, clever software can use them to fill in more detail. At first the result will look strange, but eventually the uncanny valley will be crossed and it will just look sharp. Today I suspect most people shooting video on still cameras also shoot some stills, so this will help, but there's not quite enough information if things are moving quickly, or if new sides of objects are exposed. A still of your friend can help render them in high-res in a video, but not if they turn around. For that the software just has to guess.

We might improve this process by designing video systems that capture high-res still frames as often as they can and embed them in the video. Storage is cheap, so why not?

A typical digital video/still camera has 16 to 20 million pixels today. When it shoots 1080p HD video, it combines those pixels together, so that there are 6 to 10 still pixels going into every video pixel. Ideally this is done by hardware right in the imaging chip, but it can also be done to a lesser extent in software. A few cameras already shoot 4K, and this will become common in the next couple of years. In this case, they may just use the pixels one for one, since it's not so easy to map a 16-megapixel 3:2 still array into a 16:9, 8-megapixel 4K image. You can't just combine 2 pixels per pixel.
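The ratios above can be checked with a bit of arithmetic. This sketch assumes a 3:2 sensor, a full-width 16:9 crop for video, and the standard frame sizes of 1920x1080 for HD and 3840x2160 for 4K. The function names are invented for illustration.

```python
import math


def crop_16x9_pixels(megapixels, aspect=3 / 2):
    """Pixels remaining after a full-width 16:9 crop of the sensor."""
    total = megapixels * 1_000_000
    width = math.sqrt(total * aspect)      # sensor width in pixels
    height_16x9 = width * 9 / 16           # height of the 16:9 crop
    return width * height_16x9


def still_pixels_per_video_pixel(megapixels, video_w, video_h):
    """How many sensor pixels feed each pixel of the video frame."""
    return crop_16x9_pixels(megapixels) / (video_w * video_h)


hd_16 = still_pixels_per_video_pixel(16, 1920, 1080)   # about 6.5
hd_20 = still_pixels_per_video_pixel(20, 1920, 1080)   # about 8.1
uhd_16 = still_pixels_per_video_pixel(16, 3840, 2160)  # about 1.6

print(round(hd_16, 1), round(hd_20, 1), round(uhd_16, 1))
```

Under these assumptions the HD figures fall inside the 6 to 10 range. The 4K figure comes out at an awkward 1.6 or so, which is why there is no clean way to combine a whole number of sensor pixels into each 4K pixel.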

Most still cameras won't shoot a full-resolution video (i.e. a 6K or 8K video) for several reasons:

  • As designed, you simply can't pull that much data off the chip per unit time. It's a huge amount of data. Even with today's cheap storage, it's also a lot to store.
  • Still cameras tend to compress each frame as a JPEG, but for video you want a video compression algorithm, even if you could afford the storage for a stream of JPEGs.
  • Nobody has displays that can show 6K or 8K video, and only a few people have 4K displays -- though this will change -- so demand is not high enough to justify these costs.
  • When you combine pixels, you get less noise and can shoot in lower light. That's why your camera can make a decent night-time video without blurring, but it can't shoot a decent still in that lighting.

What is possible is a sensor which is able to record video (at the desired 30fps or 60fps rate) and also pull off full-resolution stills at some lower frame rate, as long as the scene is bright enough. That frame rate might be something like 5 or even 10 fps as cameras get better. In addition, hardware compression would combine the stills and the video frames to eliminate the great redundancy, though only to a limited extent because our purpose is to save information for the future.
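To get a feel for why this hybrid is so much cheaper than full-resolution video, here is a rough count of output pixels per second. It assumes a 4896x3264 sensor (about 16 megapixels, 3:2), 30 fps video and 5 fps stills. It only counts pixels delivered, and says nothing about how a real sensor reads out or compresses them.

```python
def pixel_rate(width, height, fps):
    """Output pixels per second for a stream of frames."""
    return width * height * fps


# Assumed sensor size: roughly 16 megapixels at 3:2.
SENSOR_W, SENSOR_H = 4896, 3264

full_res_video = pixel_rate(SENSOR_W, SENSOR_H, 30)  # every frame at full res
hd_video = pixel_rate(1920, 1080, 30)                # ordinary 1080p stream
stills = pixel_rate(SENSOR_W, SENSOR_H, 5)           # full-res stills at 5 fps

hybrid = hd_video + stills
fraction = hybrid / full_res_video

print(f"full-res video: {full_res_video / 1e6:.0f} Mpixel/s")
print(f"HD + stills:    {hybrid / 1e6:.0f} Mpixel/s ({fraction:.0%} of full-res)")
```

With these numbers the HD video plus stills comes to roughly 30% of the pixels that full-resolution video would need, before any compression of the redundancy between the stills and the video frames.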

Thus, if we hand the software of the future an HD video along with 3 to 5 frames/second of 16-megapixel stills, I am confident it will be able to make a very decent 4K video from it most of the time, and often a decent 6K or 8K video. As noted, a lot of that can happen even without the stills, but they will improve the situation. The situations where it can't -- fast-changing objects -- are also situations where video gets blurred and we are tolerant of lower resolution.

It's a bit harder if you are already shooting 4K. To do this well, we might like a 38 megapixel still sensor, with 4 pixels for every pixel in the video. That's the cutting edge in high-end consumer gear today, and will get easier to buy, but we now run into the limitations of our lenses. Most lenses can't deliver 38 million pixels -- not even many of the high-end professional photographer lenses can do that. So it might not deliver that complete 8K experience, but it will get a lot closer than you can from an "ordinary" 4K video.
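As a sanity check on that sensor size, this sketch works out how many megapixels are needed for a given number of still pixels per video pixel. It assumes a 3:2 sensor, a full-width 16:9 crop and a 3840x2160 4K frame. The function name is made up for this example.

```python
def sensor_megapixels_needed(video_w, video_h, still_per_video_pixel,
                             sensor_aspect=3 / 2):
    """Sensor size (in megapixels) so a 16:9 crop has the requested density."""
    crop_pixels = video_w * video_h * still_per_video_pixel
    # A full-width 16:9 crop keeps this fraction of the sensor's rows.
    crop_fraction = sensor_aspect * 9 / 16   # 0.84375 for a 3:2 sensor
    return crop_pixels / crop_fraction / 1_000_000


needed = sensor_megapixels_needed(3840, 2160, 4)
print(round(needed, 1))  # 39.3
```

That lands in the same ballpark as the 38 megapixels quoted above. The exact figure depends on the sensor's aspect ratio and how the video crop is taken.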

If you haven't seen 8K video, it's amazing. Sharp has been showing their one-of-a-kind 8K video display at CES for a few years. It looks much more realistic than 3D videos of lower resolution. 8K video can subtend over 100 degrees of viewing angle at one pixel per minute of arc, which is about the resolution of the sensors in your eye. (Not quite, as your eye also does sub-pixel tricks!) At 60 degrees -- which is more than any TV is set up to subtend -- it's the full resolution of your eyes, and provides an actual limit on what we're likely to want in a display.
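The viewing-angle figures follow from simple division. This sketch assumes 8K means 7680 pixels across, and treats the pixels as evenly spread in angle. That is only approximately true for a flat screen, so the numbers are rough.

```python
ARCMIN_PER_DEGREE = 60
H_PIXELS_8K = 7680  # assumed horizontal resolution of "8K"


def field_of_view_degrees(h_pixels, pixels_per_arcmin=1.0):
    """Angle the screen can fill at a given pixel density."""
    return h_pixels / (pixels_per_arcmin * ARCMIN_PER_DEGREE)


def pixels_per_arcmin(h_pixels, fov_degrees):
    """Pixel density when the screen fills a given viewing angle."""
    return h_pixels / (fov_degrees * ARCMIN_PER_DEGREE)


fov = field_of_view_degrees(H_PIXELS_8K)        # 128.0 degrees
density = pixels_per_arcmin(H_PIXELS_8K, 60)    # a bit over 2 per arcminute

print(fov, round(density, 2))
```

So at one pixel per minute of arc an 8K screen could fill about 128 degrees, and when it fills 60 degrees there are a little more than two pixels for every minute of arc.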

And we could be shooting video for that future display today, before the technology to shoot that video natively exists.
