Cloud-based LLMs are convenient, but they come with security risks, and many online services also limit the number of documents you can process. Running language models locally lets you analyze code, detect bugs, and consult internal documentation directly on your computer, eliminating both the security risks and the processing restrictions.
In this short note, I will show how easy it is to set up such a system and start using it in your workflow.
For code
- https://ollama.com/ - a tool that runs open models locally
- After installing it, open a terminal
- Run a model (the code-focused codellama, for example):
```
ollama run codellama
```
- Done! You can now try your own AI without sending any data over the internet!
```
$ ollama run codellama
pulling 3a43f93b78ec... 100% ████████████████████████████▏ 3.8 GB
success
>>> Why is the sky blue?
The sky appears blue because of a phenomenon called Rayleigh scattering bla bla
```
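By the way, once Ollama is running, it also serves a local REST API (by default on http://localhost:11434), so you can call the model from your own code. Here is a minimal C# sketch; the port and the `/api/generate` endpoint are Ollama defaults, and it assumes the codellama model pulled above:

```csharp
// Minimal sketch: call the local Ollama REST API from C#.
// Assumes Ollama is running on its default port (11434)
// and that the "codellama" model has already been pulled.
using System.Net.Http.Json;
using System.Text.Json;

using var http = new HttpClient();

var response = await http.PostAsJsonAsync("http://localhost:11434/api/generate", new
{
    model = "codellama",
    prompt = "Why is the sky blue?",
    stream = false // return one complete answer instead of a token stream
});

var json = await response.Content.ReadFromJsonAsync<JsonElement>();
Console.WriteLine(json.GetProperty("response").GetString());
```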
If you want to write a multi-line prompt, wrap it in triple quotes:
>>> """
... Describe this code, please:
... if (app.Environment.IsDevelopment())
... app.MapOpenApi();
... """
The code above is a configuration for an API in a .NET Web API bla...
Ok, that's not all!
Do you use VS Code? Install the Continue extension (continue.dev) and connect it to your local model.
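The exact setup depends on your version of Continue; as a rough sketch, in the JSON config format used by earlier releases you would register the local model something like this:

```json
{
  "models": [
    {
      "title": "CodeLlama (local)",
      "provider": "ollama",
      "model": "codellama"
    }
  ]
}
```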
I created a new .NET Web API project and asked the AI to write tests:
- Select the code
- Press <CMD> + <L>
- Write your task
Simply click 'Insert at cursor' in the test file and accept the results.
With a few adjustments, the tests will work, letting you use AI without any costs (except for electricity).
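For reference, here is a hedged sketch of the kind of test you might end up with after those adjustments. The /weatherforecast endpoint, the Microsoft.AspNetCore.Mvc.Testing package, and the test class names are all assumptions based on the default .NET Web API template, not what the model literally produced:

```csharp
// Sketch of an integration test for the default .NET Web API template.
// Assumes the template's /weatherforecast endpoint and that Program is
// visible to the test project (e.g. via "public partial class Program {}").
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class WeatherForecastTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly HttpClient _client;

    public WeatherForecastTests(WebApplicationFactory<Program> factory)
        => _client = factory.CreateClient();

    [Fact]
    public async Task Get_WeatherForecast_ReturnsOk()
    {
        // The endpoint path comes from the default template.
        var response = await _client.GetAsync("/weatherforecast");
        Assert.Equal(HttpStatusCode.OK, response.StatusCode);
    }
}
```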
For documents
If your project contains documents as files, you can feed them to any local LLM (such as DeepSeek) and ask questions about the document or a specific subject in it.
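Here is a minimal sketch of that idea, reusing the local Ollama endpoint shown earlier. The file path, the question, and the deepseek-r1 model tag are just example placeholders:

```csharp
// Sketch: ask a local model a question about a document.
// Assumes Ollama on its default port; the file path and
// model tag are placeholders for your own setup.
using System.Net.Http.Json;
using System.Text.Json;

var document = await File.ReadAllTextAsync("docs/policy.txt");

using var http = new HttpClient();
http.Timeout = TimeSpan.FromMinutes(5); // reasoning models can be slow

var response = await http.PostAsJsonAsync("http://localhost:11434/api/generate", new
{
    model = "deepseek-r1",
    prompt = $"Based on the document below, answer: what is the approval process?\n\n{document}",
    stream = false
});

var json = await response.Content.ReadFromJsonAsync<JsonElement>();
Console.WriteLine(json.GetProperty("response").GetString());
```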
Alternatively, if your content is on the web, you can try the Page Assist browser extension.
It works slowly and requires some settings to be enabled, but it's designed to work with your locally running models. This option is useful if you're comfortable installing browser extensions. ;)
At the very least, it's a great and very convenient way to get a user-friendly UI for the conversation. Just take a look:
Additionally, some models support working with documents and expanding their knowledge base. Let's try asking something new:
And add some definitions to the knowledge base, then check the result:
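If you prefer plain Ollama over the extension's knowledge base, a lightweight alternative is to bake definitions into a custom model via a Modelfile. Note that this only prepends a system prompt rather than building a real knowledge base, and the model name "my-gemma" and the glossary text are placeholders:

```
# Modelfile: wrap gemma:2b with project-specific definitions
FROM gemma:2b
SYSTEM """
Glossary:
- "Widget": our internal billing component.
Use these definitions when answering.
"""
```

```
$ ollama create my-gemma -f Modelfile
$ ollama run my-gemma
```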
To research information across a dozen files
To work with a whole collection of documents, you can set the entire project as the context in VS Code and ask, for example, to find issues in the code:
Just for fun, I cloned the OWASP repo https://github.com/OWASP/CheatSheetSeries.git and waited for the indexing process to finish. I then asked the model on its own (on the left side) and asked again with all files from the cloned project included (on the right side). As you can see below, the right side contains more useful information.
Potentially, you can apply this approach to analyze your own procedures, guidelines, or policies and get recommendations drawn from a dozen documents at once. For me, it's a really valuable benefit.
PS: I deliberately used the small gemma:2b model in this test to clearly see the difference between the answers before and after enriching it with data. It also gives quick responses, unlike LRMs like deepseek-r1, where you won't get a fast reply. Although, of course, I understand that each model is tailored to its specific task; for code, codellama performed best on my computer.