lofivor optimizations

organized by performance goal. see journal.txt for detailed benchmarks.

current ceiling

these target the rendering bottleneck, since the update loop is already fast.

technique	description	expected gain
increase batch buffer	raylib's default batch holds 8192 quads on desktop GL, 2048 on OpenGL ES 2 (RL_DEFAULT_BATCH_BUFFER_ELEMENTS). larger = fewer flushes	moderate
GPU instancing	single draw call for all entities, GPU handles transforms	significant
compute shader updates	move entity position updates to the GPU entirely (compute shaders need OpenGL 4.3)	significant
OpenGL vs Vulkan	raylib ships only OpenGL backends, so a Vulkan test would need a fork or a separate renderer	unknown
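
to make the "larger = fewer flushes" row concrete, here is a minimal sketch of the arithmetic. `flushes_per_frame` is a hypothetical helper, not a raylib call, and it assumes every entity is one quad on the same texture, so the only reason to flush is a full batch.

```c
#include <assert.h>
#include <stddef.h>

/* Number of batch flushes (draw calls) needed to draw `quads` quads
 * when one batch holds at most `batch_capacity` quads.
 * Assumption: a single texture and no other state change that would
 * force an early flush. */
size_t flushes_per_frame(size_t quads, size_t batch_capacity)
{
    assert(batch_capacity > 0);
    /* ceiling division without floating point */
    return (quads + batch_capacity - 1) / batch_capacity;
}
```

for 100k quads this gives 49 flushes at a capacity of 2048, 13 at 8192, and 7 at 16384. the gain flattens quickly, which fits the "moderate" estimate.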

technique	description	expected gain
frustum culling	skip entities outside view	depends on game design
LOD rendering	reduce detail for distant/small entities	moderate
temporal rendering	update/render subset per frame	moderate
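
a rough sketch of 2D culling, assuming entities are square sprites and the view is an axis-aligned rectangle in world units. `ViewRect`, `in_view` and `cull` are illustrative names, not part of lofivor or raylib.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* hypothetical: visible region in world units */
typedef struct { float x, y, w, h; } ViewRect;

/* true if a square sprite centered at (x, y) with half-size `half`
 * overlaps the view rectangle */
bool in_view(ViewRect v, float x, float y, float half)
{
    return x + half >= v.x && x - half <= v.x + v.w &&
           y + half >= v.y && y - half <= v.y + v.h;
}

/* Writes the indices of visible entities to `out` (capacity >= n)
 * and returns how many were written. */
size_t cull(const float *xs, const float *ys, size_t n,
            float half, ViewRect v, size_t *out)
{
    assert(xs && ys && out);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (in_view(v, xs[i], ys[i], half)) out[count++] = i;
    }
    return count;
}
```

the renderer would then draw only the indices in `out`. if every entity is always on screen, this adds a pass over the data and saves nothing, hence "depends on game design".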

the update loop is currently not the bottleneck: update stays <1ms at 100k entities. these become relevant when adding game logic, AI, or collision.

technique	description	expected gain
uniform grid	fixed-size cells (or a spatial hash), O(1) cell lookup, neighbor queries scan only nearby cells	high for dense scenes
quadtree	adaptive spatial partitioning	high for sparse scenes
broad/narrow phase	cheap AABB check before precise collision	moderate
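
a minimal uniform grid sketch, assuming a bounded 2D world and a full rebuild every frame. the grid dimensions, cell size and all names are hypothetical. the cell scan is the broad phase, and the distance check inside it stands in for the narrow phase.

```c
#include <assert.h>
#include <stddef.h>

#define GRID_W 64          /* hypothetical: cells per row */
#define GRID_H 64          /* hypothetical: cells per column */
#define CELL_SIZE 16.0f    /* hypothetical: world units per cell */
#define NO_ENTITY (-1)

typedef struct {
    int head[GRID_W * GRID_H]; /* first entity in each cell, or NO_ENTITY */
    int *next;                 /* next[i]: next entity in the same cell as i */
} Grid;

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* O(1): map a world position to a cell index, clamped to the grid */
int grid_cell(float x, float y)
{
    int cx = clampi((int)(x / CELL_SIZE), 0, GRID_W - 1);
    int cy = clampi((int)(y / CELL_SIZE), 0, GRID_H - 1);
    return cy * GRID_W + cx;
}

/* Rebuild the grid from scratch; g->next must hold n ints. */
void grid_build(Grid *g, const float *xs, const float *ys, int n)
{
    assert(g && g->next);
    for (int c = 0; c < GRID_W * GRID_H; c++) g->head[c] = NO_ENTITY;
    for (int i = 0; i < n; i++) {
        int c = grid_cell(xs[i], ys[i]);
        g->next[i] = g->head[c];   /* push i onto the cell's list */
        g->head[c] = i;
    }
}

/* Count entities within `radius` of (x, y), visiting only the cells
 * that the query circle's bounding box touches. */
int grid_count_near(const Grid *g, const float *xs, const float *ys,
                    float x, float y, float radius)
{
    int x0 = clampi((int)((x - radius) / CELL_SIZE), 0, GRID_W - 1);
    int x1 = clampi((int)((x + radius) / CELL_SIZE), 0, GRID_W - 1);
    int y0 = clampi((int)((y - radius) / CELL_SIZE), 0, GRID_H - 1);
    int y1 = clampi((int)((y + radius) / CELL_SIZE), 0, GRID_H - 1);
    int count = 0;
    for (int cy = y0; cy <= y1; cy++) {
        for (int cx = x0; cx <= x1; cx++) {
            for (int i = g->head[cy * GRID_W + cx]; i != NO_ENTITY; i = g->next[i]) {
                float dx = xs[i] - x, dy = ys[i] - y;
                if (dx * dx + dy * dy <= radius * radius) count++;
            }
        }
    }
    return count;
}
```

the linked lists live in two flat int arrays, so a rebuild allocates nothing. query cost scales with how many entities share the visited cells, which is why cell size should be close to the typical query radius.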

technique	description	expected gain
SIMD (AVX2/SSE)	vectorized position/velocity math	2-4x on update
struct-of-arrays	cache-friendly memory layout for SIMD	enables better SIMD
multithreading	thread pool for parallel entity updates	scales with cores
fixed-point math	integer math, deterministic, potentially faster	minor-moderate
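
a sketch of what the struct-of-arrays layout could look like for the position update. the `Entities` type and `update_positions` are assumptions for illustration, not lofivor's actual data model. no intrinsics are used here: with each field in its own contiguous array, an optimizing compiler can often vectorize the loop by itself.

```c
#include <assert.h>
#include <stddef.h>

/* struct-of-arrays: each field is its own contiguous array,
 * so one SIMD load fetches the same field of several entities */
typedef struct {
    float *x, *y;     /* positions */
    float *vx, *vy;   /* velocities */
    size_t count;
} Entities;

/* Advance every position by velocity * dt. */
void update_positions(Entities *e, float dt)
{
    assert(e != NULL);
    for (size_t i = 0; i < e->count; i++) {
        e->x[i] += e->vx[i] * dt;
        e->y[i] += e->vy[i] * dt;
    }
}
```

with an array-of-structs layout the same loop strides over unrelated fields, which is why the SoA row is listed as enabling better SIMD.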

technique	description	expected gain
cache-friendly layout	hot data together, cold data separate	reduces cache misses
entity pools	pre-allocated, reusable entity slots	reduces allocation overhead
component packing	minimize struct padding	better cache utilization
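
a minimal entity pool sketch, assuming a fixed capacity and slot indices that point into the entity arrays. `EntityPool`, `POOL_CAPACITY` and the function names are hypothetical. free slots are chained through an index array, so allocating and freeing are both O(1) and never touch the heap.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_CAPACITY 1024u    /* hypothetical fixed capacity */
#define POOL_NONE UINT32_MAX   /* "no slot" marker */

typedef struct {
    uint32_t next_free[POOL_CAPACITY]; /* free list threaded through slots */
    uint32_t first_free;               /* head of the free list */
    uint32_t live;                     /* slots currently in use */
} EntityPool;

void pool_init(EntityPool *p)
{
    for (uint32_t i = 0; i < POOL_CAPACITY; i++)
        p->next_free[i] = (i + 1 < POOL_CAPACITY) ? i + 1 : POOL_NONE;
    p->first_free = 0;
    p->live = 0;
}

/* O(1): returns a slot index, or POOL_NONE when the pool is full */
uint32_t pool_alloc(EntityPool *p)
{
    uint32_t slot = p->first_free;
    if (slot == POOL_NONE) return POOL_NONE;
    p->first_free = p->next_free[slot];
    p->live++;
    return slot;
}

/* O(1): hand a slot back; the caller must not free a slot twice */
void pool_free(EntityPool *p, uint32_t slot)
{
    assert(slot < POOL_CAPACITY);
    p->next_free[slot] = p->first_free;
    p->first_free = slot;
    p->live--;
}
```

the most recently freed slot is reused first, which tends to keep live entities in memory that is already warm in cache. the trade-off is holes in the arrays, so update and render loops need a way to skip dead slots.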

see journal.txt for raw benchmark data.