How to Install & Run DeepSeek R1 Locally with GUI on Windows, Linux, and macOS | Step-by-Step Guide

What is the DeepSeek-R1 model?
DeepSeek-R1 is an advanced open-source artificial intelligence model developed by the Chinese startup DeepSeek. It is designed to excel in complex reasoning tasks, including mathematics, coding, and logical problem-solving. Notably, DeepSeek-R1 achieves performance comparable to leading models like OpenAI's o1, but with significantly lower development costs and computational requirements.
Significance of DeepSeek-R1:
Cost Efficiency: Developed with a budget of less than $6 million, DeepSeek-R1 challenges the high-cost approaches of competitors, making advanced AI more accessible.
Open-Source Accessibility: By open-sourcing DeepSeek-R1, DeepSeek promotes transparency and collaboration, allowing researchers and developers worldwide to study, modify, and enhance the model.
Technological Impact: The model's emergence has prompted a reevaluation of AI development strategies, emphasizing efficiency and innovation over sheer computational power.
Advantages of Running DeepSeek-R1 Locally:
Data Privacy: Processing data on local machines ensures that sensitive information remains secure, mitigating risks associated with transmitting data to external servers.
Customization: Running the model locally allows for tailored modifications to meet specific project requirements, facilitating experimentation and optimization.
Reduced Latency: Local deployment eliminates the need for internet-based API calls, resulting in faster response times crucial for real-time applications.
Cost Savings: Operating the model on local hardware can reduce expenses related to cloud-based services and data transfer.
Key Considerations for Running DeepSeek-R1 Locally
Before proceeding, keep the following DeepSeek-R1 models and their corresponding sizes in mind:
| Parameters | Download Size |
| --- | --- |
| 1.5B | 1.1 GB |
| 7B | 4.7 GB |
| 8B | 4.9 GB |
| 14B | 9.0 GB |
| 32B | 20 GB |
| 70B | 43 GB |
| 671B | 404 GB |
When running DeepSeek-R1 locally on your computer, you should consider the following factors:
1. Hardware Requirements
VRAM (GPU Memory):
The different model sizes range from 1.1GB (1.5B model) to 404GB (671B model).
If you have a consumer-grade GPU (e.g., RTX 3060, 3070, 4080), you should opt for the 7B model (4.7GB VRAM required) or the 8B model (4.9GB VRAM required).
Larger models (14B, 32B, 70B) require more powerful GPUs with at least 10 GB of VRAM; the 32B and 70B variants need considerably more (see the size table above).
CPU and RAM:
A powerful CPU (e.g., AMD Ryzen 9 / Intel i9) is recommended for inference.
You will need at least double the VRAM size in system RAM. For example, if you run the 7B model (4.7GB VRAM), you should have at least 16GB RAM for smooth performance.
Storage:
Ensure you have enough disk space. The 7B model alone requires 4.7GB, while the larger models (70B+) need hundreds of gigabytes.
An NVMe SSD is preferable for faster model loading.
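As a quick pre-flight check, the commands below (a sketch for Linux, assuming standard tools; `nvidia-smi` is only present when the NVIDIA driver is installed, so it is guarded) report GPU memory, system RAM, and free disk space, and apply the double-the-VRAM rule of thumb from above:

```shell
# Pre-flight hardware check before downloading a model (Linux).
command -v nvidia-smi >/dev/null 2>&1 \
  && nvidia-smi --query-gpu=memory.total --format=csv \
  || echo "No NVIDIA driver detected (CPU-only inference)"

free -h | grep Mem        # total system RAM
df -h . | tail -n 1       # free disk space on the current filesystem

# Rule of thumb: plan for roughly double the model size in system RAM.
model_gb=5                       # 8B model is about 4.9 GB, rounded up
min_ram_gb=$((model_gb * 2))
echo "8B model: plan for at least ${min_ram_gb} GB of system RAM"
```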
2. Software Requirements
CUDA / ROCm (for GPU Acceleration)
If you have an NVIDIA GPU, install the latest CUDA and cuDNN.
- When you install the NVIDIA GeForce driver, it typically includes the CUDA runtime libraries, the components needed to run CUDA-accelerated applications.
For AMD GPUs, you will need ROCm.
If you use CPU-only inference, performance will be significantly slower.
3. Model Selection
Choose a model that balances performance and hardware limitations:
1.5B: Very lightweight, suitable for older GPUs or CPU-only.
7B / 8B: Good for mid-range GPUs with 6GB+ VRAM.
14B+: Requires high-end GPUs (e.g., RTX 3090, A100, H100).
4. Optimization & Performance
Quantization: Reducing precision (e.g., GGUF 4-bit, 8-bit quantization) helps reduce VRAM usage.
Batch Size / Context Length: Adjust to balance response quality and speed.
Multi-GPU: If you have multiple GPUs, some inference frameworks support model sharding.
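To see why quantization matters, here is the back-of-the-envelope memory math for the 8B model (weights only; the KV cache and runtime overhead come on top, which is why the table above lists 4.9 GB rather than 4 GB):

```shell
# Approximate weight storage for an 8-billion-parameter model.
params_b=8                      # parameters, in billions
fp16_gb=$((params_b * 2))       # 16-bit floats: 2 bytes per parameter
q8_gb=$params_b                 # 8-bit quantization: 1 byte per parameter
q4_gb=$((params_b / 2))         # 4-bit quantization: 0.5 bytes per parameter
echo "fp16: ~${fp16_gb} GB, 8-bit: ~${q8_gb} GB, 4-bit: ~${q4_gb} GB"
```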
Ollama and DeepSeek Installation
Before starting the installation process, make sure that if you are using Windows, your NVIDIA graphics driver is up to date. You can download and install the latest driver from: https://www.nvidia.com/en-us/geforce/drivers/
To find out whether Ollama supports your GPU, you can visit: https://github.com/ollama/ollama/blob/main/docs/gpu.md
First, install Ollama and let it run in the background:
For Windows: https://ollama.com/download/windows
For macOS: https://ollama.com/download/mac
For Linux:
curl -fsSL https://ollama.com/install.sh | sh
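After installation you can confirm that Ollama is on your PATH (the check is guarded, so it degrades gracefully on a machine where Ollama isn't installed yet):

```shell
# Verify the Ollama installation; safe to run even if it is missing.
if command -v ollama >/dev/null 2>&1; then
  ollama -v                      # prints the installed version
  status=installed
else
  echo "ollama not found on PATH - install it first"
  status=missing
fi
echo "status: ${status}"
```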

Next, download and install the model version that best fits your needs based on the explanation above.
To do this:
Visit Ollama's DeepSeek-R1 Library: https://ollama.com/library/deepseek-r1
Choose your preferred model version (e.g., 8B).
Copy the provided command and paste it into your terminal.
This installation process is the same for Windows, macOS, and Linux.
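For example, the command for the 8B version looks like this (the tag is taken from the library page; the guard simply skips the download on machines where Ollama isn't installed):

```shell
model=deepseek-r1:8b             # tag from https://ollama.com/library/deepseek-r1
if command -v ollama >/dev/null 2>&1; then
  ollama pull "$model"           # download (~4.9 GB for the 8B model)
  ollama run "$model"            # interactive chat; type /bye to exit
else
  echo "Install Ollama first (see the links above)"
fi
```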


At this stage, you can start using DeepSeek-R1 directly from the command line. However, to create a more ChatGPT-like experience, we will install AnythingLLM for an enhanced user interface.
AnythingLLM Installation and Configuration
Visit AnythingLLM Desktop: https://anythingllm.com/desktop
Download and install the appropriate version for Windows, Linux, or macOS.
Configure AnythingLLM:
Open AnythingLLM after installation.
Follow the configuration steps as shown in the screenshots below to set it up properly.
This setup will enhance your experience by providing a ChatGPT-like interface for interacting with DeepSeek-R1 locally.
Go to settings:

Configure your LLM provider and then go back to its main page:

Create a new workspace, give it a name, and save it:


Go to the settings of your workspace and configure it according to the screenshot:

After that, choose Default or New Thread to start a new conversation:

Congratulations! You now have a powerful OpenAI-o1-like model running locally on your machine!
AnythingLLM Alternative
If you're looking for an alternative to AnythingLLM, you can also use LM Studio. One key advantage of LM Studio is that you can install models directly from within the app, eliminating the need for manual downloads or additional setup.
How to Get Started with LM Studio
Install LM Studio: Download and install the software from https://lmstudio.ai/
Search for Your Model: Use the built-in search feature to find DeepSeek R1 or any other model.
Install & Run: Click to install the model directly from the app and start chatting instantly.
This makes LM Studio a convenient and user-friendly option for running local AI models with minimal hassle.


Then start asking questions:

Ollama commands
To see the version of Ollama installed on your system:
ollama -v
To see a list of installed models with Ollama:
ollama list
To see how DeepSeek is performing on your system:
# First, run DeepSeek in the terminal:
ollama run deepseek-r1:8b --verbose
# To exit chat mode:
/bye
Ask a question and check the stats at the end.

To determine if the model is running on your CPU or GPU, use the following command:
ollama ps
Output:
NAME              ID              SIZE      PROCESSOR          UNTIL
deepseek-r1:8b    28f8fd6cdc67    6.3 GB    26%/74% CPU/GPU    28 seconds from now
To make sure that your system has detected your GPU, you can check the server.log file located at C:\Users\<username>\AppData\Local\Ollama.
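On Linux, where the install script sets Ollama up as a systemd service, the equivalent server logs live in the journal. This is a sketch assuming a systemd-based distribution:

```shell
# Show recent Ollama server log lines on a systemd-based Linux system.
if command -v journalctl >/dev/null 2>&1; then
  journalctl -u ollama --no-pager -n 20 || true   # || true: unit may not exist
  log_source=journal
else
  echo "journalctl not available on this system"
  log_source=none
fi
```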

Feel free to drop any questions in the comments section; I'm happy to help!


