BVH updates using treelet sampling for
real-time raytracing of animated scenes

Wilhem BarbierMathias Paulin

Bounding volume hierarchies


Bounding volume hierarchies for animated scenes

primitives move over time, no insertion/deletion
  • Rebuild every frame
    • consistent BVH quality
    • doesn't scale to large scenes
  • Refitting: keep tree topology, recompute bounding boxes
    • Fast
    • The BVH can degrade over time
  • Refitting + BVH optimization
    • Idea: refitting, then "repair" the degraded hierarchy
    • BVH optimization: take a BVH as input, and output a better BVH

BVH quality measure: Surface Area Heuristic (SAH)

$$SAH = c_T \sum_{\substack{N_i\text{ interior}\\\text{node}}} Area(N_i) + c_I \sum_{N_l\text{ leaf}} |N_l|Area(N_l)$$

BVH optimization on the GPU

Tree rotations

  • fast
  • struggles on some scenes with complex motion

Treelet restructuring

  • more powerful optimization
  • GPU-friendly

Treelet restructuring

The optimization takes place over the entire BVH: too slow

Idea: sample treelets only where the BVH is degraded

Choosing candidate treelets

Objective: sample treelets that benefit from restructuring

Sample roots, then grow treelets

Root sampling:

$$\begin{align*}p(N) &= \Delta Area(N_L) \\&+ \Delta Area(N_R)\end{align*}$$
$$\begin{align*}p(N) &= Area(N_L) - CachedArea(N_L) \\&+ Area(N_R) - CachedArea(N_R) \end{align*}$$
$$\begin{align*}p(N) &= Area(N_L) - CachedArea(N_L) \\&+ Area(N_R) - CachedArea(N_R) \\&+ \alpha(p(N_L) + p(N_R))\end{align*}$$
$p$ is computed bottom-up during refitting, then used as sampling weights

Growing treelets: choose nodes with max area

We obtain a set of candidate treelets $\mathbf{T}$

Conflicts between treelets

Treelets intersect $\implies$ can't restructure in parallel

Solution:

  • Restructure treelets in parallel without writing to the global BVH
  • Compute reward (SAH decrease) for each treelet
  • Use this reward to compute the best subset of non-intersecting treelets $\mathbf{T'} \subset \mathbf{T}$
  • Finally apply the modifications for the subset $\mathbf{T'}$

Defining T'

$\mathbf{T'}$ must maximize the SAH reduction after restructuring:

\[ \begin{align*} &\operatorname*{maximize}_{\mathbf{T'} \subset \mathbf{T}}\sum_{i=1}^{|\mathbf{T'}|} \Delta SAH(T'_i) \\ & \text{s.t } T'_i \cap T'_j = \empty \quad\forall i,j \in \{ 1, \dots, |T'| \}, i \neq j\\ \end{align*} \]

Bruteforce: $2^{|\mathbf{T}|}$ possibilities

Naive greedy conflict resolution doesn't work well in practice

Computing T'

$T_1$
$T_2$
$T_3$
$T_4$
$T_5$
Build the conflict graph $G = (V,E)$ where \[ \begin{align*} V &= \{1, ..., |\mathbf{T}| \} \\ E &= \{ \,\{i,j\} \,|\, T_i \cap T_j \neq \empty\,\} \\ w(v) &= \Delta SAH(T'_v) \end{align*} \]
\[ \begin{align*} &\operatorname*{maximize}_{V' \subset V}\sum_{v \in V'} w(v) \\ & \text{s.t }\, \forall (i,j) \in V'^2, i \neq j, \{ i, j \} \not\in E\\ \end{align*} \]

Maximum weight independent set

Maximum Weight Independent Set

Input: Graph $G = (V,E)$ with weights $w$ on the vertices

Output: \[ \begin{align*} &\operatorname*{maximize}_{V' \subset V}\sum_{v \in V'} w(v) \\ & \text{s.t }\, \forall (i,j) \in V'^2, i \neq j, \{ i, j \} \not\in E\\ \end{align*} \]

Well known combinatorial optimization problem 😀

NP-hard 😨for general graphs

Our conflict graphs actually have a very special structure that we can exploit...

Chordal graphs

Our conflict graphs are intersection graphs of subtrees

not chordal
chordal

MWIS of chordal graphs: reduction

Based on the Isolated Weight Transfer reduction [Lamm2018]

Isolated vertex: $v$ is an isolated vertex if all neighbors of $v$ are connected by an edge
Maximal isolated vertex: $v$ is a maximal isolated vertex if it is isolated and has the maximum weight among its isolated neighbors

1
2
8 6
4
5 1

Let $v$ be a maximal isolated vertex,

  1. $\forall u \in N(v)$, if $w(u) \leq w(v)$: remove $u$
  2. Remove $v$ and $\forall u \in N(v)$,
    change its weight to $w(u)−w(v)$

Property 1: chordal graphs have at least one isolated vertex.

Property 2: a subgraph of a chordal graph is chordal.

This means that the reduction will converge to the empty graph

MWIS of chordal graphs: expansion

1
2
6 8
4
5

$\mathcal{I}_3 = \{\}$

$\mathcal{I}_2 = \{$
$\}$
$\mathcal{I}_1 = \{$
$\}$

Sequence of graphs $G_1, ..., G_N$ with MWIS $\mathcal{I}_1, ..., \mathcal{I}_N$

$$\mathcal{I}_k = \mathcal{I}_{k+1} \cup M_k$$

$$\begin{align*}M_k = \{ v \in G_k,\,& v \text{ is maximal isolated} \\ &\text{ and } N(v) \cap \mathcal{I}_{k+1} = \emptyset\} \end{align*}$$


We propose a GPU algorithm based on this reduction

Results

BreakingLion ExplodingDragon ClothBall BigScene

Results

Update time:

  • Faster than rebuild ($\times$ 1.6-2.7 speedup)
  • Slower than rotations ($\times$ 1.5-9.0 slowdown)

SAH:

  • Rotations struggle on difficult scenes
  • Treelet sampling quality is consistent across scenes

Raytracing time:

  • Our method performs worse than predicted by SAH
  • We are currently investigating this phenomenon

Conclusion


We propose a novel GPU algorithm to update a BVH based on sampling treelets.

Choosing the best subset of non-intersecting treelets is equivalent to finding the MWIS of a chordal graph. We compute this MWIS in parallel on the GPU.

The resulting system gives hierarchies with consistently low SAH.

We are currently investigating why the raytracing performance is worse than predicted by the SAH.



Thank you for your attention!

Graph representation

We need efficient parallel implementations of the following operations:

  • Check if two vertices are neighbors
  • Iterate over the neighbors of a vertex
  • Delete a vertex
  • Reinsert a previously deleted vertex

We use both a dense and a sparse representation of the adjacency matrix

  • Dense representation: 2D matrix
  • Sparse representation (compressed sparse row): explicitly store the neighbors of every vertex

Vertices with nonpositive weights are considered deleted

MWIS computation: reduction phase

3 kernels for the reduction phase, 3 kernels for the expansion phase

Reduction (loop until the graph is empty):

  • Find the isolated vertices
    • Two threads per edge
    • Check if the neighbors of $v$ are connected to $u$
  • Check if max isolated vertex
    • One thread per vertex
    • If vertex is isolated: check if it is maximal
  • Reduce graph
    • One thread per max isolated vertex
    • Traverse neighbors and subtract weights
1 -1 -3
2 0 -2
8 2 0
4 0 -2
5 1 -1
Iteration 1:
4
2
Iteration 2:
2

MWIS computation: expansion phase

3 kernels for the reduction phase, 3 kernels for the expansion phase

Expansion (loop for the same number of iterations):

  • Expand graph
    • One thread per neighbor $u$
      of a max isolated vertex $v$
    • Add $w(v)$ to $w(u)$
  • Check if the neighbors are in the solution set
    • One thread per neighbor $u$
      of a max isolated vertex
    • Check if the solution set contains $u$
  • Add to solution set
    • One thread per max isolated vertex $v$
    • If none of the neighbors of $v$ are in
      the solution set, add $v$ to solution set
-3 -1 1
-2 0 2
0 2 8
-2 0 4
-1 1 5
Iteration 1:
4
2
Iteration 2:
2
Solution set: