this skips 40ms+ grid upload cost, and seems generally just a few ms more costly than sampling clusters.
however, there are sync glitches for reasons unknown.
for some reason joining them leads to "invalid spirv" validation errors (and broken lights).
split the bindings back, leaving shaders essentially unchanged, while still keeping a single physical buffer
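a rough sketch of the "separate bindings, one physical buffer" layout, in plain C with no Vulkan calls. it assumes the bindings are storage buffers, so each sub-range offset has to respect the device's `minStorageBufferOffsetAlignment`. the struct, function names and sizes here are made up for illustration and are not the renderer's actual code:

```c
#include <stdint.h>
#include <stddef.h>

/* One sub-range of the shared buffer (hypothetical type). These two values
 * map onto the offset/range fields of VkDescriptorBufferInfo. */
typedef struct {
	uint64_t offset;
	uint64_t size;
} buffer_region_t;

/* Round value up to a multiple of alignment (alignment must be a power of two). */
static uint64_t alignUp(uint64_t value, uint64_t alignment) {
	return (value + alignment - 1) & ~(alignment - 1);
}

/* Lay out `count` regions back to back in one buffer, aligning each start.
 * Returns the total buffer size needed. */
static uint64_t layoutRegions(buffer_region_t *regions, const uint64_t *sizes,
		size_t count, uint64_t alignment) {
	uint64_t cursor = 0;
	for (size_t i = 0; i < count; ++i) {
		cursor = alignUp(cursor, alignment);
		regions[i].offset = cursor;
		regions[i].size = sizes[i];
		cursor += sizes[i];
	}
	return cursor;
}
```

each region then gets its own descriptor binding pointing at the same buffer handle, so shaders keep seeing separate buffers.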
(look how easy that was! yay passes!
... still need to do spir-v parsing to extract bindings though)
perf (c1a0 lobby, 720p, 6900XT)
- total ray tracing time: 15.2ms
- primary: 0.7ms v:80/80 s:60/128 lds:2048 o:12/16 (-4v +1o)
- dir poly: 13.8ms v:256/256 s:98/128 lds:2048 o:4/16 (-28v +1o)
- dir point: 0.9ms v:85/96 s:68/128 lds:2048 o:10/16 (-6v +1o)
dir point and poly are not synchronized and overlap. but poly takes most
of the time, and point can only ramp up gradually at the very tail of
poly.
it stays roughly the same, vgpr 256, etc.
perf is a tiny bit better (12ms vs 14ms for all poly lights in c1a0
lobby w/ shadows), but it may be a sampling artifact
known issues:
- visible cluster boundaries which affect sampling outcomes
(essentially clusters act like very coarse shadows, and that's visible)
moving brush models are not supported yet, affects perf measurements
also change:
- pass plane equation instead of just normal
- pass area separately
- pack vertices offset+count into single integer
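a minimal sketch of the plane and packing items, in C. the 27/5 bit split and all function names are hypothetical, not the actual shader interface. the plane is stored as (n, d) with d = -dot(n, p0), so dot(n, p) + d gives the signed distance from p to the polygon plane:

```c
#include <stdint.h>

/* Hypothetical bit split: low 5 bits hold vertex count, upper 27 bits hold offset. */
#define POLY_COUNT_BITS 5u
#define POLY_COUNT_MASK ((1u << POLY_COUNT_BITS) - 1u)

/* Pack vertex offset and count into a single 32-bit integer.
 * Caller must ensure count <= 31 and offset < 2^27. */
static uint32_t packVertices(uint32_t offset, uint32_t count) {
	return (offset << POLY_COUNT_BITS) | (count & POLY_COUNT_MASK);
}

static uint32_t unpackOffset(uint32_t packed) {
	return packed >> POLY_COUNT_BITS;
}

static uint32_t unpackCount(uint32_t packed) {
	return packed & POLY_COUNT_MASK;
}

/* Build plane (nx, ny, nz, d) from a unit normal and any point on the polygon. */
static void planeFromNormalPoint(const float n[3], const float p[3], float plane[4]) {
	plane[0] = n[0];
	plane[1] = n[1];
	plane[2] = n[2];
	plane[3] = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2]);
}

/* Signed distance from point p to the plane (positive on the normal side). */
static float planeDistance(const float plane[4], const float p[3]) {
	return plane[0] * p[0] + plane[1] * p[1] + plane[2] * p[2] + plane[3];
}
```

with the full plane available, a shader can reject shading points behind the light with one dot product instead of also fetching a vertex.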
582us, 224(224)v, 97(128)s, 4096lds, 2/16o (-53v => +1o)
known issues:
- noise is fixed (static pattern, does not change between frames)
- overall light is too dark
- some lights are facing the wrong direction
test_brush2, green room, direct light:
336us, 192(192)v, 97(128)s, 4096lds, 2/16o (-45v => +1o)
also start refactoring light collection
broken:
- reloading lights after patching
- wagonchik lights (attached to non-static models)
missing:
- clustering the new poly lights
- proper sampling, only rough estimate for now
- shadows
probably a lot more
- fix normal2 packing
- work around desynced light cluster sizes
known issues:
- static seed for random
- no emissive
- no shadows
perf (direct): 6.4ms, 183(184)v, 59(128)s, lds=0, 2/16o (?!)