Ollama - Local Large Language Model Integration¶

Ollama is a local large language model runtime framework that supports a variety of models including Deepseek, Qwen, and Gemma. QPython has built-in Ollama integration, enabling developers to explore GenAI development directly on their mobile devices.

Overview¶

Ollama allows you to run powerful large language models locally on your Android device. With QPython's integration, you can:

Run open-source LLMs directly on your phone
Use AI capabilities without internet connectivity
Experiment with different models for various use cases
Build AI-powered applications using familiar Python libraries

Supported Models¶

Ollama supports many popular open-source models:

Deepseek – Efficient reasoning models (recommended: deepseek-r1:1.5b for mobile)
Qwen – Alibaba's large language models
Gemma – Google's lightweight open models
And many more available on Ollama Library

Getting Started¶

Step 1: Access QPython Shell Terminal¶

Open QPython and go to the Dashboard
Long press the Terminal icon
Select QPython Shell Terminal

Step 2: Download a Model¶

In the Shell Terminal, use Ollama commands to download models. For mobile devices, we recommend smaller models for faster response times.

# Pull a model (example: deepseek-r1 with 1.5 billion parameters)
ollama pull deepseek-r1:1.5b

# Pull other models
ollama pull qwen:2.5
ollama pull gemma:2b

Step 3: Run the Model¶

Start the Ollama service to make the model available via API:

ollama serve

When running, Ollama will output the local port address (default: 11434).

Using Ollama with Python¶

Install OpenAI Library¶

Install the openai library from QPYPI:

# Using PIP Client (long press Terminal icon -> PIP Client)
pip install openai

Python Code Example¶

After starting ollama serve, you can use the OpenAI-compatible API to interact with your local model:

from openai import OpenAI

# Configure the client
client = OpenAI(
    api_key="deepseek",  # Can be any string
    base_url="https://localhost:11434/v1"  # Ollama's local address
)

# Chat with the model
response = client.chat.completions.create(
    model="deepseek-r1:1.5b",  # Match the model you downloaded
    messages=[
        {"role": "user", "content": "What is Python?"}
    ]
)

print(response.choices[0].message.content)

Recommended Models for Mobile¶

Model	Parameters	Best For
deepseek-r1	1.5b	Fast responses, general tasks
qwen:2.5	2.5b	Balanced performance
gemma:2b	2b	Lightweight tasks

Larger models will work but may respond slower on mobile devices.

Useful Ollama Commands¶

# List installed models
ollama list

# Remove a model
ollama rm deepseek-r1:1.5b

# Show model information
ollama show deepseek-r1:1.5b

# Create a custom model (Modelfile)
ollama create mymodel -f Modelfile

Learn More¶

Ollama Documentation – Official Ollama guides and command reference
Ollama Library – Browse available models
AIPyApp – AI-powered program generator in QPython
QPYPI Guide – Managing Python packages