Fixing the pool allocator to properly signal allocation failure uncovered an
existing issue: there was not enough memory for dynamic model BLASes on
Linux/amdgpu, so the same memory region was erroneously used for more than
one BLAS. Surprisingly, this hasn't led to any noticeable issues so far.
Increasing the accels buffer size fixes the issue.
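
To illustrate why an unsignaled allocation failure can go unnoticed, here is a minimal sketch of an allocator that reports failure explicitly. It is a simplified bump-style stand-in, not the renderer's actual pool allocator; `bump_pool_t`, `bump_pool_alloc` and `ALLOC_FAILED` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel: returned when the request cannot be satisfied. */
#define ALLOC_FAILED UINT32_MAX

typedef struct {
    uint32_t size;   /* total capacity in bytes */
    uint32_t offset; /* first free byte */
} bump_pool_t;

/* Returns the offset of the new block, or ALLOC_FAILED when it does not fit.
 * alignment must be a non-zero power of two. */
static uint32_t bump_pool_alloc(bump_pool_t *pool, uint32_t size, uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* 64-bit math so that offset + alignment + size cannot wrap around. */
    const uint64_t aligned =
        ((uint64_t)pool->offset + alignment - 1) & ~((uint64_t)alignment - 1);

    if (aligned + size > pool->size)
        return ALLOC_FAILED; /* never hand out an overlapping or out-of-range region */

    pool->offset = (uint32_t)(aligned + size);
    return (uint32_t)aligned;
}
```

A caller that checks for `ALLOC_FAILED` before building into the returned region would catch an undersized buffer immediately, instead of silently reusing memory that already belongs to another allocation.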
Renames the previous METRICS to COUNTERS. These are still reset to zero
every frame.
Adds new METRICS, which are preserved across frames, maintained outside of
the speeds code, and only sampled by it once per frame.
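
The difference between the two kinds can be sketched roughly as follows. This is an assumption-laden illustration, not the actual speeds code: `speeds_entry_t`, `speeds_register`, `speeds_end_frame` and the `KIND_*` values are hypothetical names, and the real registration API may look quite different.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENTRIES 16

typedef enum { KIND_COUNTER, KIND_METRIC } entry_kind_t;

typedef struct {
    const char *name;
    entry_kind_t kind;
    int *value;   /* storage owned by the registering module */
    int sampled;  /* value captured at the end of the frame for display */
} speeds_entry_t;

static speeds_entry_t g_entries[MAX_ENTRIES];
static size_t g_entries_count = 0;

/* Modules register a pointer to their own variable. */
static void speeds_register(const char *name, entry_kind_t kind, int *value) {
    assert(g_entries_count < MAX_ENTRIES);
    g_entries[g_entries_count++] = (speeds_entry_t){ name, kind, value, 0 };
}

/* Called once per frame: sample everything, then reset only the counters. */
static void speeds_end_frame(void) {
    for (size_t i = 0; i < g_entries_count; ++i) {
        speeds_entry_t *e = &g_entries[i];
        e->sampled = *e->value;
        if (e->kind == KIND_COUNTER)
            *e->value = 0; /* counters start every frame from zero */
        /* metrics are left alone: their owner keeps them up to date */
    }
}
```

In this sketch a counter such as a per-frame draw count starts each frame at zero, while a metric such as `studio.cached_submodels` keeps whatever value its owning module last wrote.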
Also adds new metrics:
- `studio.cached_submodels` -- number of submodels in cache
- `geom.used` -- memory used by long allocations
- `geom.{vertices,indices}` -- counts of vertices/indices for long
allocations
- `geom.dyn_{vertices,indices}` -- counts of vertices/indices for
single-frame dynamic allocations
- Add variable name and registration src:line to the
`r_speeds_list_metrics` output. This makes it easier to see where
a metric comes from.
- Group metrics by module, which makes them easier to discover.
- Do not print the list immediately on command; print it later in the
frame, so that it shows the correct values for the latest frame.
The intent is to manage long-vs-single-frame allocations better.
Previously, long allocations were map-long bump allocations and couldn't be freed
mid-map, as there was neither a reference to the allocated range nor a
way to actually free it.
Add a two-mode block allocator (similar to the previous debuffer alloc) that
allows making long and once allocations. Long allocations are now
backed by the "pool" allocator and return references to the allocated range.
This commit doesn't do the deallocation yet, so map changing doesn't yet
work.
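
A rough sketch of the two-mode idea, under simplifying assumptions: one buffer is split into a long-lived region and a single-frame ("once") region, and every allocation returns a range that the caller can keep as a reference. All names here (`two_mode_alloc_t`, `range_t`, `LIFETIME_*`) are hypothetical. The long region uses a plain bump pointer as a stand-in for the pool allocator, so, like this commit, the sketch does not implement freeing of long allocations.

```c
#include <assert.h>
#include <stdint.h>

#define RANGE_INVALID UINT32_MAX

typedef enum { LIFETIME_LONG, LIFETIME_ONCE } lifetime_t;

/* Reference to an allocated range; offset is RANGE_INVALID on failure. */
typedef struct {
    uint32_t offset;
    uint32_t size;
    lifetime_t lifetime;
} range_t;

typedef struct {
    uint32_t long_size, long_used; /* region [0, long_size) */
    uint32_t once_size, once_used; /* region [long_size, long_size + once_size) */
} two_mode_alloc_t;

static range_t two_mode_alloc(two_mode_alloc_t *a, uint32_t size, lifetime_t lifetime) {
    range_t r = { RANGE_INVALID, size, lifetime };
    assert(size > 0);

    if (lifetime == LIFETIME_LONG) {
        /* Stand-in for the pool allocator: bump within the long region. */
        if (size <= a->long_size - a->long_used) {
            r.offset = a->long_used;
            a->long_used += size;
        }
    } else {
        /* Single-frame allocations live after the long region. */
        if (size <= a->once_size - a->once_used) {
            r.offset = a->long_size + a->once_used;
            a->once_used += size;
        }
    }
    return r;
}

/* Single-frame allocations are discarded wholesale at the start of a frame. */
static void two_mode_new_frame(two_mode_alloc_t *a) {
    a->once_used = 0;
}
```

Because each long allocation hands back a `range_t`, a later change could pass that range to a free function. That reference is the piece the earlier map-long bump allocations were missing.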
- explicitly group cache-related fields
- move kusochki allocation to where it's actually used
This is a step towards better BLAS management, from the bottom up.
Draft the new accel/blas apis. Consolidate everything accel-related into
vk_ray_accel.c. Start splitting into more atomic functions. Prepare for
blas-model+kusochki split. etc etc.
The new code isn't really used yet.
1. Rename models passed to TLAS to instances.
2. Remove BLAS validation: old, doesn't make sense anymore.
3. Draft general blas mgmt approach in NOTES.md