# Feature Proposal: Real-Time Streaming MIDI Output Support
## Summary
This feature request proposes adding real-time streaming MIDI output to `basic-pitch`, allowing the system to process and output MIDI data concurrently with audio input. This would significantly expand the usability of `basic-pitch` in live performance, educational, and DAW integration contexts.
## Motivation
Currently, `basic-pitch` operates as a batch audio-to-MIDI converter, requiring the full audio file to be processed before producing MIDI output. While effective for offline applications, this architecture limits the tool’s applicability in live scenarios. Demand for real-time audio-to-MIDI conversion is growing in:
- Live instrument-to-MIDI conversion for digital audio workstations (DAWs)
- Music education platforms requiring instant feedback
- Interactive composition and improvisation tools
- Low-latency MIDI controllers for experimental performance setups
Several commercial and research-grade tools provide real-time capabilities (e.g., JamOrigin MIDI Guitar, AIO MIDINet, and various ONNX-based pipelines), but few offer open-source solutions with the transcription accuracy that `basic-pitch` provides.
## Proposed Implementation
A modular, low-latency real-time streaming pipeline could be introduced as an extension of the existing model. Suggested steps include:
### Input Handling
- Use `pyaudio`, `sounddevice`, or other low-latency libraries to stream audio input directly from a microphone or system source.
- Implement windowed audio buffering with overlap to allow continuous model inference (see the sketch below).
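As a rough illustration, a minimal capture loop with `sounddevice` might look like the following. The window and hop sizes and the 22050 Hz sample rate are illustrative placeholders, not settled design choices:

```python
import queue

import numpy as np
import sounddevice as sd

SAMPLE_RATE = 22050   # Hz; illustrative, would need to match the model's expected rate
WINDOW = 4096         # samples per analysis window
HOP = 2048            # 50% overlap between consecutive windows

blocks = queue.Queue()

def audio_callback(indata, frames, time, status):
    # Runs on the audio thread: copy the mono channel into the queue.
    if status:
        print(status)
    blocks.put(indata[:, 0].copy())

def windows():
    # Assemble overlapping fixed-size windows from the incoming blocks.
    buf = np.zeros(0, dtype=np.float32)
    while True:
        buf = np.concatenate([buf, blocks.get()])
        while len(buf) >= WINDOW:
            yield buf[:WINDOW]
            buf = buf[HOP:]   # slide forward by one hop, keeping the overlap

with sd.InputStream(samplerate=SAMPLE_RATE, channels=1, callback=audio_callback):
    for window in windows():
        pass  # hand `window` to the inference stage
```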
### Inference Adaptation
- Adapt the inference loop to process fixed-size frames (e.g., 2048 or 4096 samples) in real time.
- Introduce incremental model state management to carry context across audio frames (sketched below).
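A minimal sketch of what the incremental loop could look like, assuming a hypothetical `run_model` callable that maps one window of samples to per-pitch activations (`basic-pitch` exposes no such streaming interface today, so this shows the intended shape only):

```python
import numpy as np

class StreamingInference:
    """Frame-by-frame inference with simple cross-frame smoothing."""

    def __init__(self, run_model, n_pitches=88, alpha=0.6):
        self.run_model = run_model        # window of samples -> (n_pitches,) activations
        self.alpha = alpha                # smoothing factor carried across frames
        self.state = np.zeros(n_pitches)  # running per-pitch activation estimate

    def step(self, window: np.ndarray) -> np.ndarray:
        # Exponential smoothing keeps a little temporal context between
        # frames and damps spurious single-frame detections.
        acts = self.run_model(window)
        self.state = self.alpha * self.state + (1 - self.alpha) * acts
        return self.state
```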
### Streaming Output
- Emit MIDI note events incrementally using a ring buffer or FIFO stream.
- Optionally expose a MIDI output via `mido`, `rtmidi`, or similar libraries for live routing to DAWs or synthesizers (see the sketch below).
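For the live-routing idea, a hedged sketch using `mido` could look like this; the hysteresis thresholds and the mapping of pitch index 0 to MIDI note 21 (A0, 88-key layout) are assumptions, not `basic-pitch` behavior:

```python
import mido
import numpy as np

ON_THRESHOLD = 0.5    # assumed activation level that triggers a note_on
OFF_THRESHOLD = 0.3   # hysteresis: release threshold lower than attack

port = mido.open_output()   # default system MIDI output port
active = set()              # MIDI note numbers currently sounding

def emit(activations: np.ndarray) -> None:
    # Translate per-pitch activations into note_on/note_off messages.
    for pitch, act in enumerate(activations):
        note = pitch + 21   # assume index 0 maps to MIDI note 21 (A0)
        if act >= ON_THRESHOLD and note not in active:
            port.send(mido.Message('note_on', note=note,
                                   velocity=min(int(act * 127), 127)))
            active.add(note)
        elif act < OFF_THRESHOLD and note in active:
            port.send(mido.Message('note_off', note=note))
            active.discard(note)
```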
### Latency and Performance Tuning
- Introduce a tunable latency buffer to balance transcription accuracy against real-time responsiveness.
- Profile model inference to determine optimal window sizes and overlaps under typical hardware constraints (a simple profiling sketch follows below).
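One simple way to ground the tuning discussion is to time inference against the real-time budget of each candidate window size. The sketch below reuses the hypothetical `run_model` from earlier; all numbers are hardware-dependent:

```python
import time

import numpy as np

def profile(run_model, sample_rate=22050, window_sizes=(2048, 4096, 8192), n_runs=50):
    # Compare per-window inference time against the real-time budget.
    for window in window_sizes:
        dummy = np.random.randn(window).astype(np.float32)
        start = time.perf_counter()
        for _ in range(n_runs):
            run_model(dummy)
        per_window = (time.perf_counter() - start) / n_runs
        budget = window / sample_rate   # seconds of audio covered per window
        verdict = "OK" if per_window < budget else "too slow"
        print(f"window={window}: {per_window * 1000:.1f} ms "
              f"(budget {budget * 1000:.1f} ms, {verdict})")
```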
### Optional Network Interface
- For advanced use cases, expose the real-time inference through a lightweight WebSocket or gRPC API, enabling remote control and cloud deployment (sketched below).
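As one possible shape for this, here is a minimal broadcast server using the third-party `websockets` package; the JSON message format is invented for illustration, not a proposed protocol:

```python
import asyncio
import json

import websockets

clients = set()

async def handler(websocket):
    # Track each connected client until it disconnects.
    clients.add(websocket)
    try:
        await websocket.wait_closed()
    finally:
        clients.discard(websocket)

async def broadcast(note: int, on: bool):
    # Push a note event to every connected client.
    msg = json.dumps({"note": note, "on": on})
    for ws in list(clients):
        try:
            await ws.send(msg)
        except websockets.ConnectionClosed:
            clients.discard(ws)

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await asyncio.Future()   # run until cancelled

if __name__ == "__main__":
    asyncio.run(main())
```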
## Anticipated Challenges
- Model Adaptability: Ensuring the model performs well on partial inputs without full temporal context.
- Latency Minimization: Achieving real-time responsiveness while maintaining accuracy will require careful tuning.
- False Positives: Low-duration notes may introduce noise in real-time environments, so adaptive thresholding or smoothing may be necessary (see the sketch after this list).
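To make the smoothing idea concrete, one cheap option is a minimum-duration gate that only confirms a note after it has persisted for a few hops; `MIN_FRAMES` is an arbitrary illustrative value:

```python
MIN_FRAMES = 3   # hops a pitch must persist before being emitted (illustrative)

pending = {}     # pitch -> consecutive hops seen above threshold

def debounce(active_pitches: set) -> set:
    # Only confirm pitches that have been active for MIN_FRAMES hops.
    confirmed = set()
    for pitch in active_pitches:
        pending[pitch] = pending.get(pitch, 0) + 1
        if pending[pitch] >= MIN_FRAMES:
            confirmed.add(pitch)
    for pitch in list(pending):
        if pitch not in active_pitches:
            del pending[pitch]   # reset the counter once the pitch drops out
    return confirmed
```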
## Benefits to the Ecosystem
- Adds live performance capabilities to the `basic-pitch` ecosystem
- Opens opportunities for integration with VSTs, DAWs, and educational tools
- Fills a notable gap in the open-source music transcription landscape
## Conclusion
Adding real-time streaming MIDI output to `basic-pitch` would make the tool significantly more versatile and competitive with proprietary solutions. Given its high transcription accuracy and open architecture, `basic-pitch` is well-positioned to lead in this space. This feature would serve both the open-source community and professional musicians seeking reliable, low-latency audio-to-MIDI conversion.
I’d be happy to contribute or assist with prototyping this functionality.