
Commit 7ae46fb

[rlsw] Micro-optimizations, tighter pipeline and cleanup (#5673)
* auto generates all combinations of blending factors
  This adds a macro system that generates a function for each possible combination of blending factors, resulting in 11*11 = 121 functions. This then allows for only one indirection and function call instead of two previously (assuming the first call was inlined).
* rename dispatch tables for consistency
* change blend funcs validity check
  Simplifies the validation of blend functions. Can allow `SW_SRC_ALPHA_SATURATE` as dst factor, but hey
* disables blending when it requires alpha and there is none
* review immediate rendering functions and attribute layout
* prevent state changes during immediate record
* reduce the number of ops for each vertex push + review primitive struct
* simplified draw functions
* review `sw_vertex_t`
  Removes `float screen[2]`; each step stores the transformed coordinates in `float coord[4]`. This also simplifies vertex interpolation during triangle rasterization.
* reduces unnecessary interpolation costs during triangle rasterization + cleanup
* extends the simd color conversion to more cases
* affine interpolation per block
* long side check for each triangle line
  My mistake in a previous commit
* style tweaks
* select the read function on texture load
  This removes the per-pixel switch; it's slightly more efficient on my hardware, but probably a poor prediction. Should remain profitable or at worst the same.
* use optional LUT for uint8_t -> float conversion
* makes internal the number of post-clipping vertices and the clipping epsilon + a little cleanup
* moves color conversion to math part
* prevents sampling if it's a depth texture that is bound
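The first item (one generated function per blend-factor pair, reached through a single table lookup) can be sketched with an X-macro as below. This is only an illustration under assumptions: it uses three factors instead of the eleven mentioned in the message, and every name in it (`FOR_EACH_PAIR`, `blend_table`, `select_blend`, `factor_t`) is hypothetical and not taken from rlsw.

```c
#include <stddef.h>

/* Hypothetical, reduced factor set (the commit message mentions 11). */
typedef enum { F_ZERO, F_ONE, F_SRC_ALPHA, F_COUNT } factor_t;

/* One tiny helper per factor; each returns the weight for a given source alpha. */
static inline float factor_ZERO(float src_a)      { (void)src_a; return 0.0f; }
static inline float factor_ONE(float src_a)       { (void)src_a; return 1.0f; }
static inline float factor_SRC_ALPHA(float src_a) { return src_a; }

/* Expands X(arg, f) once for every factor f. */
#define FOR_EACH_FACTOR(X, arg) X(arg, ZERO) X(arg, ONE) X(arg, SRC_ALPHA)

/* Expands X(src, dst) once for every (src, dst) pair: 3*3 = 9 expansions here. */
#define FOR_EACH_PAIR(X)       \
    FOR_EACH_FACTOR(X, ZERO)   \
    FOR_EACH_FACTOR(X, ONE)    \
    FOR_EACH_FACTOR(X, SRC_ALPHA)

typedef void (*blend_fn)(float dst[4], const float src[4]);

/* Generates one specialized blend function per pair; the factor helpers
   are small enough for the compiler to inline into each variant. */
#define DEFINE_BLEND(S, D)                                           \
    static void blend_##S##_##D(float dst[4], const float src[4]) {  \
        const float fs = factor_##S(src[3]);                         \
        const float fd = factor_##D(src[3]);                         \
        for (int i = 0; i < 4; i++) dst[i] = src[i]*fs + dst[i]*fd;  \
    }
FOR_EACH_PAIR(DEFINE_BLEND)

/* Dispatch table filled from the same pair list, so it cannot drift
   out of sync with the generated functions. */
#define TABLE_ENTRY(S, D) [F_##S][F_##D] = blend_##S##_##D,
static const blend_fn blend_table[F_COUNT][F_COUNT] = {
    FOR_EACH_PAIR(TABLE_ENTRY)
};

/* Resolved once when the blend state changes; the per-pixel path then
   makes a single indirect call. */
static blend_fn select_blend(factor_t src_factor, factor_t dst_factor) {
    return blend_table[src_factor][dst_factor];
}
```

In this sketch a caller would store the result of `select_blend(F_SRC_ALPHA, F_ONE)` when the blend state is set and invoke it for each pixel, rather than looking up a source-factor function and a destination-factor function separately.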
1 parent 04c5dc4 commit 7ae46fb

1 file changed: +1243 -1110 lines changed

