
running the QTIP inference code #49

@tsengalb99

@turboderp I noticed you guys had some trouble running our inference code. The main blocker is usually installing `fast-hadamard-transform` due to a ninja version issue. Here's how I set up the environment (a shell sketch of these steps follows the list):

  1. Create a new conda env.
  2. Install `torch` and `ninja` 1.11.1 (not 1.11.4 or whatever the latest is).
  3. Clone `fast-hadamard-transform`, `cd` into it, and run `python setup.py install` (`pip install -e .` also works).
  4. Install the QTIP kernels: `cd` into the `qtip-kernels` folder and run `python setup.py install`.
  5. Install everything else in `requirements.txt`.
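
For reference, here is a minimal shell sketch of those steps. The `fast-hadamard-transform` repo URL and the Python version are assumptions on my part; adjust both to your setup.

```bash
# 1. Fresh conda env (Python version is an assumption; use whatever torch supports)
conda create -n qtip python=3.10 -y
conda activate qtip

# 2. Torch plus the pinned ninja (newer ninja versions break the build)
pip install torch ninja==1.11.1

# 3. Build fast-hadamard-transform from source
#    (repo URL assumed: Dao-AILab/fast-hadamard-transform)
git clone https://github.com/Dao-AILab/fast-hadamard-transform
cd fast-hadamard-transform
python setup.py install   # or: pip install -e .
cd ..

# 4. Build the QTIP kernels from the qtip-kernels folder of this repo
cd qtip-kernels
python setup.py install
cd ..

# 5. Everything else
pip install -r requirements.txt
```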

To generate text with our kernels, run `interactive_gen.py` with the commands in the repo. To evaluate perplexity and zero-shot performance, use the scripts in the `eval` folder.
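
Before running generation or eval, a quick way to confirm the `fast-hadamard-transform` extension built correctly is to apply the transform twice with orthonormal scaling and check that you get the input back (for a Hadamard matrix, H·H = n·I, so scaling by 1/√n each time yields the identity). The `hadamard_transform` import below is the function I believe the package exposes; treat the exact API as an assumption and check the package's README.

```bash
python - <<'EOF'
import math
import torch
from fast_hadamard_transform import hadamard_transform  # assumed export

dim = 256  # last dim must be a supported size (powers of two work)
x = torch.randn(4, dim, dtype=torch.float16, device="cuda")

# With scale = 1/sqrt(dim) the transform is orthonormal, so applying it
# twice should recover the original tensor (up to fp16 rounding).
scale = 1.0 / math.sqrt(dim)
y = hadamard_transform(x, scale)
x2 = hadamard_transform(y, scale)
print("max abs error:", (x - x2).abs().max().item())
EOF
```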

Also, feel free to reach out to me ([email protected]) if you have any questions about QTIP. We're excited to see our stuff adopted in bigger projects!
