Add v3 bootrom intrinsics (fsincos/dsincos) #969

@REALERvolker1

This would provide a significant performance boost for more math-heavy code.

I actually managed to get this working rather well on my own local fork of the HAL. Because these functions return their second result in extra registers, I had to declare bit-packed primitive integer return types to get both results back; #[repr(C)] structs wouldn't work for some reason. I don't know Thumb assembly very well, so I don't know how to fix that. I copied the datasheet description into the doc comments verbatim so I wouldn't forget it.

// in float_funcs...
/// V3 bootloader fn
///
/// Calculates the sine and cosine of angle. angle is in radians, and must be in the
/// range -128 to 128. The sine value is returned in register r0 (and is thus the official
/// function return value), the cosine value is returned in register r1. This method is
/// considerably faster than calling _fsin and _fcos separately.
0x48 fsincos(angle: f32) -> u64;
// in double_funcs...
/// V3 bootloader fn
///
/// Calculates the sine and cosine of angle. angle is in radians, and must be in the range -1024
/// to 1024. The sine value is returned in registers r0/r1 (and is thus the official return value),
/// the cosine value is returned in registers r2/r3. This method is considerably faster than
/// calling _sin and _cos separately.
0x48 dsincos(angle: f64) -> u128;
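
For readers following along, here is a minimal sketch of how the packed return values map back to floats. It assumes the layout the test in this issue relies on: the sine sits in the low half of the integer (r0, or r0/r1 for doubles) and the cosine in the high half (r1, or r2/r3). The helper names `unpack_f32_pair` and `unpack_f64_pair` are hypothetical and not part of rp2040-hal. As a possible explanation for the struct problem: under the Arm procedure call standard, a composite type larger than 4 bytes is returned through memory rather than in registers, whereas a 64-bit integer comes back in r0/r1.

```rust
/// Split a packed `u64` (as returned by the hypothetical `fsincos` binding)
/// into `(sin, cos)`.
/// Assumption: r0 (sine) is the low 32 bits, r1 (cosine) the high 32 bits.
pub fn unpack_f32_pair(packed: u64) -> (f32, f32) {
    let sin = f32::from_bits(packed as u32); // truncates to the low word
    let cos = f32::from_bits((packed >> 32) as u32); // high word
    (sin, cos)
}

/// Split a packed `u128` (as returned by the hypothetical `dsincos` binding)
/// into `(sin, cos)`.
/// Assumption: r0/r1 (sine) are the low 64 bits, r2/r3 (cosine) the high 64 bits.
pub fn unpack_f64_pair(packed: u128) -> (f64, f64) {
    let sin = f64::from_bits(packed as u64); // truncates to the low half
    let cos = f64::from_bits((packed >> 64) as u64); // high half
    (sin, cos)
}
```

These are pure bit manipulations, so they can be checked on a host machine without the ROM.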

This is my test function. I sort of shoehorned it into my existing project.

pub fn fsincos(angled: f64) -> ((f32, f32), (f64, f64)) {
    use ::rp2040_hal::rom_data::{double_funcs, float_funcs};

    let anglef = angled as f32;

    let func_sinf = float_funcs::fsin(anglef);
    let func_cosf = float_funcs::fcos(anglef);

    let packedf = float_funcs::fsincos(anglef);
    let packed_sinf = f32::from_bits(packedf as u32);
    let packed_cosf = f32::from_bits((packedf >> 32) as u32);

    if func_sinf.to_bits() == packed_sinf.to_bits() && func_cosf.to_bits() == packed_cosf.to_bits()
    {
        cdc_puts("yooooo");
    }

    let func_sind = double_funcs::dsin(angled);
    let func_cosd = double_funcs::dcos(angled);

    let packedd = double_funcs::dsincos(angled);
    let packed_sind = f64::from_bits(packedd as _);
    let packed_cosd = f64::from_bits((packedd >> 64) as _);

    if func_sind.to_bits() == packed_sind.to_bits() && func_cosd.to_bits() == packed_cosd.to_bits()
    {
        cdc_puts("YOOOOOOOOOOOOO");
    }

    ((packed_sinf, packed_cosf), (packed_sind, packed_cosd))
}

In my main function...

        // main loop continues above...
        if let Some(point) = ctx.touchscreen.read_point(&mut ctx.timer) {
            let next_color = color_iter.nextc();
            color_rect.top_left = point.point - HSZ;

            _ = ctx.lcd.fill_solid(&color_rect, next_color);
            cdc_puts("Got a point!");
            let zfloat = (point.z.get() as f64) % core::f64::consts::PI;
            let sc = fsincos(zfloat);
            writeln!(
                &mut CdcWriter,
                "fsin: {}, fcos: {}, dsin: {}, dcos: {}\r",
                sc.0.0, sc.0.1, sc.1.0, sc.1.1
            )
            .unwrap();
        }
        // main loop continues below...

I won't turn this into a PR because using a 128-bit int as a return type for a float function is pretty unwieldy.
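
One way the unwieldy integer return type could be hidden is a thin wrapper that unpacks it into a tuple. The sketch below is only an illustration under stated assumptions: `sincos_f32` and `FSINCOS_MAX_ANGLE` are hypothetical names, the ROM call is passed in as a parameter so the wrapper can be exercised off-target, and the documented range of -128 to 128 is treated as inclusive, which the datasheet text does not spell out.

```rust
/// Documented input limit for the ROM `_fsincos` routine, in radians.
/// Assumption: the bounds are inclusive.
const FSINCOS_MAX_ANGLE: f32 = 128.0;

/// Hypothetical wrapper: calls a packed-`u64` sincos routine and returns
/// `(sin, cos)`, or `None` if `angle` is NaN or outside the documented range.
pub fn sincos_f32(angle: f32, rom_fsincos: impl Fn(f32) -> u64) -> Option<(f32, f32)> {
    // `contains` is false for NaN, so NaN is rejected here as well.
    if !(-FSINCOS_MAX_ANGLE..=FSINCOS_MAX_ANGLE).contains(&angle) {
        return None;
    }
    let packed = rom_fsincos(angle);
    // Assumed layout: sine in the low 32 bits (r0), cosine in the high 32 bits (r1).
    let sin = f32::from_bits(packed as u32);
    let cos = f32::from_bits((packed >> 32) as u32);
    Some((sin, cos))
}
```

On the device, the call would look something like `sincos_f32(angle, float_funcs::fsincos)`, where `float_funcs::fsincos` is the binding from the fork described above. The same shape would apply to the `u128` double variant with a 1024 limit.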