Groq is a platform for running large language models (LLMs) with token-based pricing and no infrastructure management. Groq's LPU (Language Processing Unit) Inference Engine is an end-to-end processing unit system that delivers fast inference for computationally intensive applications such as LLMs. This project showcases a Streamlit app that uses Groq-hosted models combined with Mem0 for persistent conversational memory.
Sign up for an account at GroqCloud and get an API key, which you'll need for this project. You'll also need an OpenAI API key for the embeddings model used by Mem0.
- Groq (for chat responses)
  - llama-3.3-70b-versatile
  - meta-llama/llama-4-scout-17b-16e-instruct
  - gemma2-9b-it
  - mistral-saba-24b
  - qwen-qwq-32b
  - deepseek-r1-distill-llama-70b
- Mem0 (for memory backend)
  - mixtral-8x7b-32768: for semantic memory retrieval
  - text-embedding-3-small: for embeddings
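Given the models above, a Mem0 configuration pairing a Groq chat model with OpenAI embeddings might look like the sketch below. The `llm`/`embedder` provider layout follows Mem0's documented config convention, but the exact field names are an assumption here, not code taken from this repo; verify them against the `mem0` version you install.

```python
# Sketch of a Mem0-style configuration pairing a Groq LLM with OpenAI
# embeddings. The provider/config layout follows Mem0's documented schema;
# field names are assumptions -- check them against your installed version.
config = {
    "llm": {
        "provider": "groq",
        "config": {
            "model": "mixtral-8x7b-32768",  # used for semantic memory retrieval
            "temperature": 0.1,
        },
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",  # used for embeddings
        },
    },
}

# With the mem0 package installed and both API keys set, this config
# would typically be consumed as:
#   from mem0 import Memory
#   memory = Memory.from_config(config)
```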
- Clone the repository. Alternatively, deploy to Railway, Render, or Google Cloud Run.
```sh
git clone https://github.com/alphasecio/groq.git
cd groq
```
- Set your API keys either as environment variables or via the Streamlit sidebar inputs.
- Run the app.
```sh
streamlit run streamlit_app.py
```
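The app combines Groq chat completions with memories retrieved from Mem0. As a rough sketch of that combination (the function and variable names below are illustrative, not taken from `streamlit_app.py`), retrieved memory snippets can be folded into the system prompt before calling the chat model:

```python
def build_messages(memories, user_input):
    """Fold retrieved memory snippets into the system prompt so the
    Groq chat model answers with prior conversational context.

    Illustrative sketch only -- not the actual code in streamlit_app.py.
    """
    context = "\n".join(f"- {m}" for m in memories)
    system = (
        "You are a helpful assistant. "
        "Relevant facts from earlier conversations:\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_input},
    ]

messages = build_messages(
    ["User's name is Alice", "Prefers concise answers"],
    "What's my name?",
)
# These messages would then be passed to the Groq client, e.g.:
#   client.chat.completions.create(
#       model="llama-3.3-70b-versatile", messages=messages)
```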