Add release notes
parent 0568204cb7
commit a842800ede
10 changed files with 97 additions and 0 deletions
8
releases/0.1.0-unoptimized.txt
Normal file
@@ -0,0 +1,8 @@
the baseline: one draw call per entity, pure and simple

- individual rl.drawCircle() calls in a loop
- ~5k entities at 60fps before frame times tank
- linear scaling: 10k = ~43ms, 20k = ~77ms
- render-bound (update loop stays under 1ms even at 30k)
- each circle is its own GPU draw call
- the starting point for optimization experiments
8
releases/0.2.0-texture_blitting.txt
Normal file
@@ -0,0 +1,8 @@
pre-render once, blit many: 10x improvement

- render circle to 16x16 texture at startup
- drawTexture() per entity instead of drawCircle()
- raylib batches same-texture draws internally
- ~50k entities at 60fps
- simple change, big win
- still one function call per entity, but GPU work is batched
9
releases/0.3.0-quad_batching.txt
Normal file
@@ -0,0 +1,9 @@
bypass the wrapper, go straight to rlgl: 2x more

- skip drawTexture(), submit vertices directly via rl.gl
- manually build quads: rlTexCoord2f + rlVertex2f per corner
- rlBegin/rlEnd wraps the whole entity loop
- ~100k entities at 60fps
- eliminates per-call function overhead
- vertices go straight to GPU buffer
- 20x improvement over baseline
11
releases/0.3.1-batch_buffer.txt
Normal file
@@ -0,0 +1,11 @@
bigger buffer, fewer flushes: squeezing out more headroom

- increased raylib batch buffer from 8192 to 32768 vertices
- ~140k entities at 60fps on i5-6500T
- ~40% improvement over default buffer
- fewer GPU flushes per frame
- also added: release workflows for github and forgejo
- added OPTIMIZATIONS.md documenting the journey
- added README, UI panel with FPS display
- heap allocated entity array to support 1 million entities
- per-entity RGB colors
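The "fewer GPU flushes per frame" point can be checked with rough arithmetic. This is a Python sketch rather than the project's own code: it takes the 8192 and 32768 figures from the note at face value as vertex capacities, and assumes 4 vertices per entity quad and one flush each time the buffer fills. The function name is hypothetical.

```python
import math

VERTICES_PER_QUAD = 4  # assumption: one textured quad per entity


def flushes_per_frame(entities: int, buffer_vertices: int) -> int:
    """How many times a batch buffer of this capacity fills up in one frame."""
    total_vertices = entities * VERTICES_PER_QUAD
    return math.ceil(total_vertices / buffer_vertices)


# Entity count taken from the ~140k figure in the note above.
default_flushes = flushes_per_frame(140_000, 8192)
enlarged_flushes = flushes_per_frame(140_000, 32768)
print("default buffer: ", default_flushes, "flushes")
print("enlarged buffer:", enlarged_flushes, "flushes")
```

Under these assumptions the larger buffer cuts the flush count to roughly a quarter, which is the direction the note describes; the measured gain depends on how expensive each flush is on the target GPU.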
13
releases/0.4.0-gpu_instancing.txt
Normal file
@@ -0,0 +1,13 @@
gpu instancing: a disappointing discovery

- drawMeshInstanced() with per-entity transform matrices
- ~150k entities at 60fps - barely better than rlgl batching
- negligible improvement on integrated graphics
- why it didn't help:
  - integrated GPU shares system RAM (no PCIe transfer savings)
  - 64-byte matrix per entity vs ~80 bytes for rlgl vertices
  - bottleneck is memory bandwidth, not draw call overhead
  - rlgl batching already minimizes draw calls effectively
- orthographic camera setup for 2D-like rendering
- heap-allocated transforms buffer (64MB too big for stack)
- lesson learned: not all "advanced" techniques are wins
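The size figures in these notes follow from simple arithmetic, sketched here in Python. It assumes a 4x4 matrix of 32-bit floats per entity and the 1 million entity cap mentioned in the 0.3.1 notes; the ~80-byte quad figure is copied from the note, not derived.

```python
import struct

MATRIX_BYTES = struct.calcsize("<16f")  # 4x4 matrix of 32-bit floats
RLGL_QUAD_BYTES = 80                    # approximate figure quoted in the note
MAX_ENTITIES = 1_000_000                # entity cap from the 0.3.1 notes

transforms_bytes = MAX_ENTITIES * MATRIX_BYTES
saving = 1 - MATRIX_BYTES / RLGL_QUAD_BYTES

print(f"per entity: {MATRIX_BYTES} B (matrix) vs ~{RLGL_QUAD_BYTES} B (rlgl quad)")
print(f"transform buffer: {transforms_bytes / 1e6:.0f} MB")
print(f"data saved per entity vs rlgl: {saving:.0%}")
```

A saving of about a fifth per entity is small next to the bandwidth bottleneck the note identifies, which fits the modest jump from ~140k to ~150k entities.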
17
releases/0.5.0-ssbo_instancing.txt
Normal file
@@ -0,0 +1,17 @@
ssbo breakthrough: 5x gain by shrinking the data

- pack entity data (x, y, color) into 12-byte struct
- upload via shader storage buffer object (SSBO)
- ~700k entities at 60fps (i5-6500T / HD 530)
- ~950k entities at ~57fps
- 5x improvement over previous best
- 140x total from baseline
- why it works:
  - 12 bytes vs 64 bytes (matrices) = 5.3x less bandwidth
  - 12 bytes vs 80 bytes (rlgl vertices) = 6.7x less bandwidth
  - no CPU-side matrix calculations
  - GPU does NDC conversion and color unpacking
- custom vertex/fragment shaders
- single rlDrawVertexArrayInstanced() call for all entities
- shaders embedded at build time
- removed FPS cap, added optional vsync arg
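A 12-byte record of this kind can be illustrated in Python with the `struct` module. The layout below (two 32-bit floats plus one 32-bit colour packed as 0x00RRGGBB) is an assumption for illustration; the notes only give the total size and the fields. The function names are hypothetical.

```python
import struct

# Assumed layout: x and y as 32-bit floats, colour as one 32-bit integer.
ENTITY_FORMAT = "<ffI"
ENTITY_BYTES = struct.calcsize(ENTITY_FORMAT)


def pack_entity(x: float, y: float, r: int, g: int, b: int) -> bytes:
    """Pack one entity into the compact record."""
    color = (r << 16) | (g << 8) | b
    return struct.pack(ENTITY_FORMAT, x, y, color)


def unpack_color(color: int) -> tuple:
    """Mirror of the colour unpacking that the notes say runs on the GPU."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF


record = pack_entity(120.5, 64.0, 255, 128, 0)
x, y, color = struct.unpack(ENTITY_FORMAT, record)

print(len(record), "bytes per entity")
print(f"vs 64-byte matrix: {64 / ENTITY_BYTES:.1f}x less data")
print(f"vs ~80-byte quad:  {80 / ENTITY_BYTES:.1f}x less data")
```

The two ratios printed at the end are the same 5.3x and 6.7x bandwidth figures listed in the notes.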
5
releases/0.5.1-windows_build.txt
Normal file
@@ -0,0 +1,5 @@
cross-platform release: adding windows to the party

- updated github release workflow
- builds for both linux and windows now
- no code changes, just CI/CD work
10
releases/0.6.0-zoom_zoom.txt
Normal file
@@ -0,0 +1,10 @@
zoom and pan: making millions of entities explorable

- mouse wheel zoom
- click and drag panning
- orthographic camera transforms
- memory panel showing entity buffer sizes
- background draws immediately (no flicker)
- tab key toggles UI panels
- explained "lofivor" name in README (lo-fi survivor)
- shader updated for zoom/pan transforms
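One common way to fold zoom and pan into a 2D orthographic transform is sketched below in Python. It is not taken from the project's shader: the parameter names, the choice of pan as the world point at the screen centre, and the y flip are all assumptions for illustration.

```python
def world_to_ndc(x, y, pan_x, pan_y, zoom, width, height):
    """Map a world-space point to normalized device coordinates in [-1, 1].

    pan_x, pan_y: world point shown at the screen centre (assumed convention)
    zoom:         scale factor, 1.0 meaning one world unit per pixel
    """
    ndc_x = (x - pan_x) * zoom / (width / 2)
    ndc_y = -(y - pan_y) * zoom / (height / 2)  # flip: screen y grows downward
    return ndc_x, ndc_y


# A 1280x720 view centred on (640, 360).
centre = world_to_ndc(640, 360, 640, 360, 1.0, 1280, 720)
top_right = world_to_ndc(1280, 0, 640, 360, 1.0, 1280, 720)
zoomed_edge = world_to_ndc(960, 360, 640, 360, 2.0, 1280, 720)
print(centre, top_right, zoomed_edge)
```

Doubling the zoom halves the visible world span, so a point 320 units right of centre lands on the right edge of the view.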
5
releases/0.6.1-q_to_quit.txt
Normal file
@@ -0,0 +1,5 @@
quick exit: zoom out then quit

- q key first zooms out, second press quits
- nice way to see the full entity field before closing
- minor UI text fix
11
releases/0.7.0-compute_shader.txt
Normal file
@@ -0,0 +1,11 @@
compute shader: moving physics to the GPU

- entity position updates now run on GPU via compute shader
- GPU-based RNG for entity velocity randomization
- full simulation loop stays on GPU, no CPU roundtrip
- new compute.zig module for shader management
- GpuEntity struct with position, velocity, and color
- tracy profiling integration
- FPS display turns green (good) or red (bad)
- added design docs for zoom/pan and compute shader work
- cross-platform alignment fixes for shader data
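The per-entity work a compute shader of this kind performs can be modelled on the CPU. The Python sketch below is illustrative only: the notes do not say which RNG the project uses, so a PCG-style integer hash (a common choice in shaders) stands in for it, and the field names, seeding scheme, and re-rolling the velocity on every step are assumptions made to exercise the RNG.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


def pcg_hash(value: int) -> int:
    """32-bit integer hash; the masking emulates uint wraparound in a shader."""
    state = (value * 747796405 + 2891336453) & MASK32
    word = (((state >> ((state >> 28) + 4)) ^ state) * 277803737) & MASK32
    return ((word >> 22) ^ word) & MASK32


def random_signed(seed: int) -> float:
    """Deterministic pseudo-random float in [-1, 1)."""
    return pcg_hash(seed) / 2**31 - 1.0


@dataclass
class GpuEntity:  # fields follow the note; the exact layout is an assumption
    x: float
    y: float
    vx: float
    vy: float
    color: int


def step(entity: GpuEntity, index: int, frame: int, dt: float,
         speed: float = 100.0) -> None:
    """One 'invocation': integrate position, then pick a new velocity."""
    entity.x += entity.vx * dt
    entity.y += entity.vy * dt
    seed = pcg_hash(index) ^ frame  # distinct stream per entity and frame
    entity.vx = speed * random_signed(seed)
    entity.vy = speed * random_signed(seed + 1)


entities = [GpuEntity(0.0, 0.0, 10.0, -5.0, 0xFF8000) for _ in range(4)]
for i, e in enumerate(entities):
    step(e, i, frame=0, dt=1 / 60)
print(entities[0])
```

Because each entity derives its random values from its own index and the frame number, no state is shared between invocations, which is what lets the whole loop stay on the GPU without a CPU roundtrip.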