Slide 86 of 88
Sneaky Turtle

What's the input they're using to train this network?

mithrandir

Found this nice summary from NVIDIA. It seems they have the capacity to generate very nicely optimized output frames, though not in real time. So they pre-render a scene with all the desired anti-aliasing, feature enhancement, image sharpening, and display scaling applied.

At runtime, they want to render at a lower resolution in real time, then use the DLSS model to output a superior-quality frame.

The input seems to be the low-resolution real-time rendered frame and a motion-vector image computed against the previous superior-quality frame.

For training, this is paired with the pre-rendered target output. The model itself is a CNN-based autoencoder.

Source (https://www.nvidia.com/en-us/geforce/news/nvidia-dlss-2-0-a-big-leap-in-ai-rendering/)
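For intuition, here is a minimal sketch of the training setup described above: a toy convolutional autoencoder that takes the low-resolution frame plus a motion-vector image and is trained against the pre-rendered high-quality target. The layer sizes, 2x scale factor, and L1 loss are illustrative assumptions, not NVIDIA's actual DLSS architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyDLSSNet(nn.Module):
    """Toy convolutional autoencoder: low-res frame + motion vectors -> high-res frame."""
    def __init__(self):
        super().__init__()
        # Encoder: 3 RGB channels + 2 motion-vector channels
        self.enc = nn.Sequential(
            nn.Conv2d(5, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # Decoder: upsample 2x to the target (display) resolution
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, low_res_rgb, motion_vectors):
        x = torch.cat([low_res_rgb, motion_vectors], dim=1)   # (N, 5, H, W)
        return self.dec(self.enc(x))                          # (N, 3, 2H, 2W)

# One hypothetical training step against the pre-rendered "ground truth" frame.
net = ToyDLSSNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

low_res   = torch.rand(1, 3, 270, 480)   # real-time rendered frame (placeholder data)
motion    = torch.rand(1, 2, 270, 480)   # per-pixel motion vectors (placeholder data)
target_hi = torch.rand(1, 3, 540, 960)   # offline, high-quality render (placeholder data)

pred = net(low_res, motion)
loss = F.l1_loss(pred, target_hi)
loss.backward()
opt.step()
```

At runtime, only the forward pass would run each frame; the motion vectors are what let the network reuse detail from previously rendered frames.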

jochuang

As an extension of this idea taken to the extreme, you could forgo any detail at all and have a network "hallucinate" textures to render a game/environment: https://news.developer.nvidia.com/nvidia-invents-ai-interactive-graphics/?ncid=so-you-ndrhrhn1-66582. Obviously not the same pipeline here, but similar in spirit.

gtier

Are these models stored with the GPU in VRAM or in firmware?

jochuang

@gtier Most likely they're stored in VRAM, sharing video memory with the rest of the graphics stack. They're usually executed on specialized tensor cores, separate from the CUDA cores used for rendering.
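As a rough illustration of the VRAM point (using PyTorch as a stand-in; the real DLSS network is loaded and managed by the driver), model weights are just tensors that occupy video memory alongside framebuffers and textures:

```python
import torch

if torch.cuda.is_available():
    before = torch.cuda.memory_allocated()
    # Moving the parameters onto the GPU places them in VRAM,
    # next to framebuffers, textures, and other graphics resources.
    model = torch.nn.Conv2d(5, 64, kernel_size=3, padding=1).cuda()
    after = torch.cuda.memory_allocated()
    print(f"VRAM used by model weights: {(after - before) / 1024:.1f} KiB")
```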
