CosmicCoconut

Running LLMs in the browser

I’m developing a frontend portfolio website with an AI chatbot that can answer questions about me.

The main challenge I’m facing is that I don’t want to invest in hosting a backend server that makes calls to open-source LLMs.

My plan is to use libraries like webllm or reactllm to run small LLMs directly in the browser. The process involves downloading a significant amount of data (around 300 MB) into the browser, and then WebGPU in modern browsers runs the model. I successfully implemented this approach using SmolLM2-360M and the webllm library, but the model is too small and the chatbot’s responses aren’t relevant.
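For context, here’s roughly what my WebLLM setup looks like (a minimal sketch; the exact model ID string depends on the prebuilt model list shipped with the webllm version you install):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function initChatbot() {
  // Downloads the quantized weights (~300 MB) and compiles them for WebGPU.
  // The model ID below is assumed to match an entry in WebLLM's prebuilt list.
  const engine = await CreateMLCEngine("SmolLM2-360M-Instruct-q4f16_1-MLC", {
    initProgressCallback: (p) => console.log(p.text), // download/compile progress
  });

  // OpenAI-style chat completion, running entirely in the browser.
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: "You answer questions about my portfolio." },
      { role: "user", content: "What projects have you built?" },
    ],
  });
  console.log(reply.choices[0].message.content);
}

initChatbot();
```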

I need help either reducing hallucinations by incorporating RAG techniques, or with alternative ideas, such as cost-effective backend hosting options or other ways to run LLMs in the browser.
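The RAG idea I have in mind is roughly this: keep my portfolio facts in the page, retrieve the ones relevant to the question, and put them into the system prompt so the small model only has to rephrase the given context instead of recalling facts. A minimal sketch of that (the facts and the keyword-overlap scoring are placeholders; real embeddings, e.g. via transformers.js, would retrieve better):

```ts
// Placeholder facts about me; in practice these would be short chunks
// covering projects, experience, skills, etc.
const FACTS = [
  "I built this portfolio with React and a WebLLM-powered chatbot.",
  "I have three years of frontend experience, mostly in TypeScript.",
  "My side projects include a budgeting app and a chess engine.",
];

// Score each fact by how many of the question's words it contains,
// and return the top-k matches. A stand-in for embedding similarity.
function retrieve(question: string, k = 2): string[] {
  const queryWords = new Set(
    question.toLowerCase().split(/\W+/).filter(Boolean)
  );
  return FACTS
    .map((fact) => ({
      fact,
      score: fact
        .toLowerCase()
        .split(/\W+/)
        .filter((w) => queryWords.has(w)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((x) => x.fact);
}

// Build a system prompt that constrains the model to the retrieved context.
function buildSystemPrompt(question: string): string {
  const context = retrieve(question).join("\n");
  return (
    "Answer only from the context below. " +
    "If the answer is not there, say you don't know.\n\nContext:\n" + context
  );
}
```

The built system prompt would then replace the static one passed to chat.completions.create in the setup above.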

JazzyNarwhal

You don't want to pay to get a legitimate API key?

CosmicCoconut

Not planning to pay. Open-source LLMs are more than sufficient for my use case.

SquishyBanana

Try Ollama with Mistral 4B

WigglyUnicorn

Highly unlikely
