Commit 7ae46fb
[rlsw] Micro-optimizations, tighter pipeline and cleanup (#5673)
* auto-generates all combinations of blending factors
This adds a macro system that generates one function for each possible combination of blending factors: 11*11 = 121 functions.
Blending now needs only one indirection and function call, instead of two previously (assuming the first call was inlined).
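A minimal sketch of the idea, not the actual rlsw code: it uses a reduced, hypothetical set of 3 factors (so 3*3 = 9 functions rather than 121), and every name here (`F_*`, `FACTOR_*`, `DEFINE_BLEND`, `BLEND_TABLE`, `blend_fn`) is made up for illustration. An X-macro lists each (src, dst) pair once, and that list is expanded twice: once to define the specialized functions, once to fill the dispatch table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced factor set (the commit mentions 11 factors). */
enum { F_ZERO, F_ONE, F_SRC_ALPHA, F_COUNT };

/* One expression per factor; s = source color, d = destination color. */
#define FACTOR_ZERO(s, d)       0.0f
#define FACTOR_ONE(s, d)        1.0f
#define FACTOR_SRC_ALPHA(s, d)  ((s)[3])

typedef void (*blend_fn)(float dst[4], const float src[4]);

/* Defines one specialized function for a (src factor, dst factor) pair. */
#define DEFINE_BLEND(SF, DF)                                         \
    static void blend_##SF##_##DF(float dst[4], const float src[4])  \
    {                                                                \
        const float fs = FACTOR_##SF(src, dst);                      \
        const float fd = FACTOR_##DF(src, dst);                      \
        for (int i = 0; i < 4; i++) {                                \
            dst[i] = src[i] * fs + dst[i] * fd;                      \
        }                                                            \
    }

/* Every combination, listed once and expanded twice below. */
#define FOR_EACH_PAIR(X)                                      \
    X(ZERO, ZERO)      X(ZERO, ONE)      X(ZERO, SRC_ALPHA)      \
    X(ONE, ZERO)       X(ONE, ONE)       X(ONE, SRC_ALPHA)       \
    X(SRC_ALPHA, ZERO) X(SRC_ALPHA, ONE) X(SRC_ALPHA, SRC_ALPHA)

FOR_EACH_PAIR(DEFINE_BLEND)

/* Dispatch table indexed by [src factor][dst factor]. */
#define TABLE_ENTRY(SF, DF) [F_##SF][F_##DF] = blend_##SF##_##DF,
static const blend_fn BLEND_TABLE[F_COUNT][F_COUNT] = {
    FOR_EACH_PAIR(TABLE_ENTRY)
};
```

With this layout, a state change only has to store `BLEND_TABLE[src_factor][dst_factor]` once; the per-pixel path is then a single indirect call with both factors already baked into the function.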
* rename dispatch tables for consistency
* change blend funcs validity check
Simplifies the validation of blend functions.
This can let `SW_SRC_ALPHA_SATURATE` through as a dst factor, but hey
* disables blending when it requires alpha and there is none
* review immediate rendering functions and attribute layout
* prevent state changes during immediate record
* reduce the number of ops for each vertex push + review primitive struct
* simplified draw functions
* review `sw_vertex_t`
removes `float screen[2]`; each step stores the transformed coordinates in `float coord[4]`.
This also simplifies vertex interpolation during triangle rasterization.
* reduces unnecessary interpolation costs during triangle rasterization + cleanup
* extends the simd color conversion to more cases
* affine interpolation per block
* long side check for each triangle line
My mistake in a previous commit
* style tweaks
* select the read function on texture load
This removes the per-pixel switch; it's slightly more efficient on my hardware, but probably a poor prediction
Should remain profitable or at worst the same
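As a rough illustration of selecting the read function at load time, here is a sketch under assumptions: the `texture` struct, the two pixel formats, and the function names are hypothetical and do not mirror the rlsw types. The `switch` on the format runs once in `texture_load`, and each fetch goes through the stored pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical formats, for illustration only. */
typedef enum { FMT_GRAY8, FMT_RGBA8 } pixel_format;

typedef struct texture texture;
typedef void (*read_fn)(const texture *tex, int index, float out[4]);

struct texture {
    const uint8_t *pixels;
    int pixel_count;
    pixel_format format;
    read_fn read;            /* chosen once, at load time */
};

static void read_gray8(const texture *tex, int index, float out[4])
{
    const float v = tex->pixels[index] / 255.0f;
    out[0] = out[1] = out[2] = v;
    out[3] = 1.0f;
}

static void read_rgba8(const texture *tex, int index, float out[4])
{
    const uint8_t *p = tex->pixels + 4 * index;
    for (int i = 0; i < 4; i++) out[i] = p[i] / 255.0f;
}

/* The format switch happens here, once per texture, not once per pixel. */
static void texture_load(texture *tex, const uint8_t *pixels,
                         int pixel_count, pixel_format format)
{
    tex->pixels = pixels;
    tex->pixel_count = pixel_count;
    tex->format = format;
    switch (format) {
    case FMT_GRAY8: tex->read = read_gray8; break;
    case FMT_RGBA8: tex->read = read_rgba8; break;
    }
}

/* Per-pixel path: a single indirect call, no format switch. */
static void texture_fetch(const texture *tex, int index, float out[4])
{
    assert(index >= 0 && index < tex->pixel_count);
    tex->read(tex, index, out);
}
```

Whether the indirect call beats a well-predicted switch depends on the hardware, which matches the caveat in the commit message.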
* use optional LUT for uint8_t -> float conversion
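One way such an optional lookup table could look, as a sketch only: the `USE_U8_LUT` switch and the function names are hypothetical, and the actual rlsw option may be named and wired differently. The table trades 1 KiB of memory for avoiding a division per channel.

```c
#include <assert.h>
#include <stdint.h>

#define USE_U8_LUT 1   /* hypothetical switch; set to 0 to divide instead */

#if USE_U8_LUT

static float u8_to_float_lut[256];
static int u8_lut_ready = 0;

/* Fills the 256-entry table once, before any conversion. */
static void u8_lut_init(void)
{
    for (int i = 0; i < 256; i++) {
        u8_to_float_lut[i] = (float)i / 255.0f;
    }
    u8_lut_ready = 1;
}

static float u8_to_float(uint8_t v)
{
    assert(u8_lut_ready);
    return u8_to_float_lut[v];
}

#else

/* Without the LUT, init is a no-op and each conversion divides. */
static void u8_lut_init(void) {}

static float u8_to_float(uint8_t v)
{
    return (float)v / 255.0f;
}

#endif
```

Both paths expose the same two functions, so callers do not need to know which one was compiled in.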
* makes the post-clipping vertex count and the clipping epsilon internal + a little cleanup
* moves color conversion to math part
* prevents sampling if it's a depth texture that is bound
1 parent 04c5dc4 commit 7ae46fb
1 file changed, +1243 -1110 lines changed