---
parser: v2
auto_validation: true
time: 45
tags: [ tutorial>beginner, topic>artificial-intelligence, topic>machine-learning, software-product>sap-business-technology-platform, software-product>sap-ai-core ]
primary_tag: software-product>sap-ai-core
author_name: Dhrubajyoti Paul
author_profile: https://github.com/dhrubpaul
---
# Using Custom Models on SAP AI Core via Ollama
<!-- description --> In this tutorial you will learn how to deploy a custom LLM on SAP AI Core using Ollama. As an example, we will take the Gemma model from Hugging Face and deploy it on SAP AI Core.

## You will learn
- How to deploy Ollama on SAP AI Core
- How to add models to Ollama and run inference against them

## Prerequisites
- SAP AI Core setup and basic knowledge: [Link to documentation](https://developers.sap.com/tutorials/ai-core-setup.html)
- An SAP AI Core instance with a Standard Plan or Extended Plan
- Docker Desktop: [Download and Install](https://www.docker.com/products/docker-desktop)
- A GitHub account
### Architecture Overview
In this tutorial we deploy Ollama, an open-source project that serves as a powerful and user-friendly platform for running LLMs, on SAP AI Core. Ollama acts as a bridge between the complexities of LLM technology and the desire for an accessible and customizable AI experience.

![image](img/solution-architecture.png)

We can pick any model from the model hubs shown above and connect it to SAP AI Core. For this example, we will deploy Ollama on SAP AI Core, enable Gemma, and run inference against it.
### Adding the workflow file to GitHub
Workflows for SAP AI Core are created using YAML or JSON files that are compatible with the SAP AI Core schema. Let's start by adding an Argo workflow file to manage `ollama`.

Create a new repository in your GitHub account, then click **Add file** > **Create new file**.

![image](img/Picture1.png)

Type `LearningScenarios/ollama.yaml` into the **Name your file** field. This will automatically create the folder `LearningScenarios` and a workflow file named `ollama.yaml` inside it.

![image](img/Picture2.png)

> CAUTION: Do not use the name of your workflow file (`ollama.yaml`) as any other identifier within SAP AI Core.

![image](img/Picture3.png)

Now copy and paste the following snippet into the editor.
```yaml
apiVersion: ai.sap.com/v1alpha1
kind: ServingTemplate
metadata:
  name: ollama
  annotations:
    scenarios.ai.sap.com/description: "Run an ollama server on SAP AI Core"
    scenarios.ai.sap.com/name: "ollama"
    executables.ai.sap.com/description: "ollama service"
    executables.ai.sap.com/name: "ollama"
  labels:
    scenarios.ai.sap.com/id: "ollama"
    ai.sap.com/version: "0.0.1"
spec:
  template:
    apiVersion: "serving.kserve.io/v1beta1"
    metadata:
      annotations: |
        autoscaling.knative.dev/metric: concurrency
        autoscaling.knative.dev/target: 1
        autoscaling.knative.dev/targetBurstCapacity: 0
      labels: |
        ai.sap.com/resourcePlan: infer.s
    spec: |
      predictor:
        imagePullSecrets:
          - name: <YOUR_DOCKER_SECRET>
        minReplicas: 1
        maxReplicas: 1
        containers:
          - name: kserve-container
            image: docker.io/<YOUR_DOCKER_USER>/ollama:ai-core
            ports:
              - containerPort: 8080
                protocol: TCP
```
Replace `<YOUR_DOCKER_SECRET>` with `Default` and replace `<YOUR_DOCKER_USER>` with your Docker username.

**NOTE** - We'll generate the Docker image referred to here in the following steps.
### Create a Docker account, generate an access token, and install Docker
[Sign up](https://www.docker.com/) for a Docker account.

Click on the profile button (your profile name) and then select **Account Settings**.

![image](img/Picture4.png)

Select **Security** from the navigation bar and click **New Access Token**.

![image](img/Picture5.png)
### Creating a Docker image

Create a directory (folder) named `custom-llm`. Inside it, create a file named `Dockerfile` and paste the following snippet into it.
```dockerfile
# Specify the base layers (default dependencies) to use
ARG BASE_IMAGE=ubuntu:22.04
FROM ${BASE_IMAGE}

# Update and install dependencies
RUN apt-get update && \
    apt-get install -y \
    ca-certificates \
    nginx \
    curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Install ollama
RUN curl -fsSL https://ollama.com/install.sh | sh

# Expose port and set environment variables for ollama
ENV OLLAMA_HOST=0.0.0.0
ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
ENV LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

# Configure nginx as a reverse proxy in front of ollama
RUN echo "events { use epoll; worker_connections 128; } \
    http { \
        server { \
            listen 8080; \
            location ^~ /v1/api/ { \
                proxy_pass http://localhost:11434/api/; \
                proxy_set_header Host \$host; \
                proxy_set_header X-Real-IP \$remote_addr; \
                proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for; \
                proxy_set_header X-Forwarded-Proto \$scheme; \
            } \
            location ^~ /v1/chat/ { \
                proxy_pass http://localhost:11434/v1/chat/; \
                proxy_set_header Host \$host; \
                proxy_set_header X-Real-IP \$remote_addr; \
                proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for; \
                proxy_set_header X-Forwarded-Proto \$scheme; \
            } \
        } \
    }" > /etc/nginx/nginx.conf && \
    chmod -R 777 /var/log/nginx /var/lib/nginx /run

EXPOSE 8080

# Create a writable home directory for the 'nobody' user that the SAP AI Core runtime uses
RUN mkdir -p /nonexistent/.ollama && \
    chown -R nobody:nogroup /nonexistent && \
    chmod -R 770 /nonexistent

# Start nginx and the ollama service
CMD service nginx start && /usr/local/bin/ollama serve
```
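To make the routing in the nginx snippet above easier to follow, here is a small sketch of the two proxied routes as a lookup table. This is illustrative only; the `resolve` helper is our own and is not part of the image or any SDK.

```python
# The nginx config above exposes two routes on port 8080 and forwards them
# to ollama's local server on port 11434:
ROUTE_MAP = {
    "/v1/api/": "http://localhost:11434/api/",       # ollama's native API
    "/v1/chat/": "http://localhost:11434/v1/chat/",  # OpenAI-compatible chat API
}

def resolve(path):
    """Return the upstream URL a request path would be proxied to
    (longest-prefix match), or None if nginx would not match it."""
    for prefix, upstream in sorted(ROUTE_MAP.items(), key=lambda kv: -len(kv[0])):
        if path.startswith(prefix):
            return upstream + path[len(prefix):]
    return None

print(resolve("/v1/api/tags"))  # the model-list call used later in this tutorial
```

So `/v1/api/pull` on the deployment URL reaches ollama's native `/api/pull`, while `/v1/chat/completions` reaches its OpenAI-compatible chat endpoint.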
In the same directory, open a terminal and run the following commands:

1. Log in to Docker Hub:
```powershell
docker login -u <YOUR_DOCKER_USER> -p <YOUR_DOCKER_ACCESS_TOKEN>
```

2. Build the Docker image:
```powershell
docker build --platform=linux/amd64 -t docker.io/<YOUR_DOCKER_USER>/ollama:ai-core .
```

3. Push the Docker image to Docker Hub so it can be used by the deployment in SAP AI Core:
```powershell
docker push docker.io/<YOUR_DOCKER_USER>/ollama:ai-core
```
### Storing Docker secrets in SAP AI Core

This step is only required once. Storing Docker credentials enables SAP AI Core to pull (download) your Docker images from a private Docker repository. Using a private Docker image prevents others from seeing your content.

Select your SAP AI Core connection under the **Workspaces** app.

Click **Docker Registry Secrets** in the **AI Core Administration** app, then click **Add**.

A pop-up will appear; add the following JSON, filled in with your Docker credentials.

```json
{
  ".dockerconfigjson": "{\"auths\":{\"YOUR_DOCKER_REGISTRY_URL\":{\"username\":\"YOUR_DOCKER_USERNAME\",\"password\":\"YOUR_DOCKER_ACCESS_TOKEN\"}}}"
}
```
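Because the `.dockerconfigjson` value is itself a JSON document encoded as a string inside another JSON document, hand-escaping the quotes is error-prone. A minimal sketch for generating the payload instead (the registry URL and credentials shown are placeholders):

```python
import json

def make_docker_secret(registry_url, username, token):
    """Build the nested .dockerconfigjson payload expected by SAP AI Core."""
    auths = {"auths": {registry_url: {"username": username, "password": token}}}
    # The inner document must be a JSON *string*, hence the double encoding.
    return {".dockerconfigjson": json.dumps(auths)}

secret = make_docker_secret("https://index.docker.io", "jane", "dckr_pat_xxx")
print(json.dumps(secret, indent=2))
```

Paste the printed JSON into the pop-up as-is.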
### Onboarding GitHub and the application on SAP AI Core

Select your SAP AI Core connection under the **Workspaces** app in SAP AI Launchpad.

![image](img/Picture6.png)

Under the **Git Repositories** section in the **AI Core Administration** app, click **Add**.

> WARNING: If you don't see the AI Core Administration app, check that you have selected your SAP AI Core connection from the Workspaces app. If it is still not visible, ask your SAP AI Launchpad administrator to assign you the roles required to access the app.

![image](img/Picture7.png)

Enter your GitHub repository details (created in the previous step) in the dialog box that appears, and click **Add**.

![image](img/Picture8.png)

Use the following information as reference:

- **URL:** Paste the URL of your GitHub repository and add the suffix `/workflows`.
- **Username:** Your GitHub username.
- **Password:** Paste your GitHub personal access token, generated in the previous step.

> Note: The password is not validated when you add the GitHub repository; this step only saves your GitHub credentials to SAP AI Core. Credentials are validated when an application is created, or when an application refreshes its connection to SAP AI Core.

You will see your GitHub onboarding complete in a few seconds. As a next step, we will enable an application on SAP AI Core.

![image](img/Picture9.png)

Go to your SAP AI Launchpad. In the **AI Core Administration** app, click **Applications** > **Create**.

![image](img/Picture10.png)

Using the reference below as a guide, specify the details of your application. This form will create your application on your SAP AI Launchpad.

![image](img/Picture11.png)

Use the following information for reference:

- **Application Name:** An identifier of your choice. `learning-scenarios-app` is used in this tutorial because it is a descriptive name.
- **Repository URL:** Your GitHub account URL and repository suffix. This helps you select the credentials to access the repository.
- **Path:** The folder in your GitHub repository where your workflow is located. For this tutorial it is `LearningScenarios`.
- **Revision:** The unique ID of your GitHub commit. Set this to `HEAD` to have it automatically refer to the latest commit.
### Creating the configuration

Go to **ML Operations** > **Configurations** and click the **Create** button.

![image](img/Picture12.png)

Enter the configuration name and choose the workflow with the following parameters:

```json
"name": "ollama",
"scenario_id": "ollama",
"executable_id": "ollama",
```

Then click **Next** > **Review and Create**.
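The same configuration can also be created programmatically through the AI API instead of the Launchpad UI. The sketch below only assembles the request body; the `/v2/lm/configurations` endpoint and its camelCase field names are our reading of the AI API and should be checked against the official reference before use.

```python
import json

def build_configuration_body(name, scenario_id, executable_id):
    """Assemble a request body for POST /v2/lm/configurations (AI API,
    camelCase field names assumed)."""
    return {
        "name": name,
        "scenarioId": scenario_id,
        "executableId": executable_id,
        "parameterBindings": [],       # this workflow takes no parameters
        "inputArtifactBindings": [],   # and no input artifacts
    }

body = build_configuration_body("ollama", "ollama", "ollama")
print(json.dumps(body, indent=2))

# To actually create it (placeholders, not executed here):
# requests.post(f"{AI_API_URL}/v2/lm/configurations", headers=headers, json=body)
```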
### Deploying Ollama on SAP AI Core

In the configuration, click **Create Deployment**. A screen will appear.

Set the duration to standard and click the **Review** button.

![image](img/Picture13.png)

Once you create the deployment, wait for its status to change to RUNNING.

![image](img/Picture14.png)

Once the deployment is running, you can access the LLMs served by Ollama.
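With the deployment RUNNING, every call in the rest of this tutorial goes through the `/v1` prefix that the nginx reverse proxy inside the image exposes. A minimal sketch of deriving that base URL (the deployment URL shown is a hypothetical placeholder):

```python
def inference_base_url(deployment_url):
    """All tutorial calls use the /v1 prefix exposed by the image's
    reverse proxy, e.g. <deployment_url>/v1/api/pull."""
    return deployment_url.rstrip("/") + "/v1"

# Hypothetical deployment URL, copied from the deployment's detail page:
print(inference_base_url(
    "https://api.ai.example.ondemand.com/v2/inference/deployments/dxxxxxxxx"
))
```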
### Pulling Gemma into the Ollama deployment

We now need to pull Gemma into our Ollama pod before we can run inference against the model. To do this, we use the SAP AI API to invoke Ollama's pull-model endpoint.

[OPTION BEGIN [Postman]]

Set up your SAP AI Core auth credentials:
![image](img/setup_auth_creds.png)

Add the resource group to the headers:
![image](img/setup-resource-group.png)

Make the call to pull the model into the pod:
```json
{
  "name": "gemma:2b"
}
```

![image](img/pulling-model.png)

Once the model is pulled, we can check the list of models deployed under the Ollama deployment as follows:
![image](img/check-deployment.png)

[OPTION END]
[OPTION BEGIN [Jupyter Notebook]]

**NOTE** - Before executing the following code block, update the URL to the deployment URL of your model, and replace `<ai-resource-group>` and `<TOKEN>` with your values.

```python
import requests
import json

url = "https://api.ai.prasfodeuonly.aws.ml.hana.ondemand.com/v2/inference/deployments/d78749e2ab8c3/v1/api/pull"

payload = json.dumps({
    "model": "gemma:2b"
})
headers = {
    'AI-Resource-Group': '<ai-resource-group>',
    'Content-Type': 'application/json',
    'Authorization': 'Bearer <TOKEN>'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)
```
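Ollama's pull endpoint streams its progress back as newline-delimited JSON, one status object per line, so `response.text` above is a sequence of JSON documents rather than a single one. A small sketch for collecting those status messages (the sample input below mimics the shape of the stream and is not captured from a real run):

```python
import json

def pull_statuses(ndjson_text):
    """Extract the 'status' field from each line of the newline-delimited
    JSON that ollama's pull endpoint streams back."""
    statuses = []
    for line in ndjson_text.strip().splitlines():
        if not line.strip():
            continue
        doc = json.loads(line)
        if "status" in doc:
            statuses.append(doc["status"])
    return statuses

sample = '{"status":"pulling manifest"}\n{"status":"success"}'
print(pull_statuses(sample))
```

A final `"success"` status indicates the model is ready to serve.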
```python
# Check the model list
endpoint = f"{inference_base_url}/api/tags"
print(endpoint)

response = requests.get(endpoint, headers=headers)
print('Result:', response.text)
```

```python
completion_api_endpoint = f"{inference_base_url}/api/generate"

# Test ollama's completion API
json_data = {
    "model": model,
    "prompt": "What color is the sky at different times of the day? Respond in JSON",
    "format": "json",  # JSON mode
    "stream": False    # Streaming or not
}

response = requests.post(url=completion_api_endpoint, headers=headers, json=json_data)
print('Result:', response.text)
```

[OPTION END]
### Inferencing Gemma

[OPTION BEGIN [Postman]]

```json
{
  "model": "gemma:2b",
  "prompt": "What color is the sky at different times of the day? Respond in JSON",
  "format": "json",
  "stream": false
}
```

![image](img/infrence.png)

[OPTION END]
[OPTION BEGIN [Jupyter Notebook]]

```python
completion_api_endpoint = f"{inference_base_url}/api/generate"
chat_api_endpoint = f"{inference_base_url}/api/chat"
openai_chat_api_endpoint = f"{deployment_url}/v1/chat/completions"

# Test ollama's completion API
json_data = {
    "model": model,
    "prompt": "What color is the sky at different times of the day? Respond in JSON",
    "format": "json",  # JSON mode
    "stream": False    # Streaming or not
}

response = requests.post(url=completion_api_endpoint, headers=headers, json=json_data)
print('Result:', response.text)
```

[OPTION END]
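The `openai_chat_api_endpoint` defined above maps to the `/v1/chat/` route that the nginx config in the image proxies to Ollama's OpenAI-compatible chat API. As a sketch of how a request body for that route could be assembled (the helper name is our own, not part of any SDK):

```python
def build_chat_payload(model, user_prompt, system_prompt=None, stream=False):
    """Assemble a request body for ollama's OpenAI-compatible
    /v1/chat/completions route (proxied via /v1/chat/ in the image)."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return {"model": model, "messages": messages, "stream": stream}

payload = build_chat_payload("gemma:2b", "Why is the sky blue?")
print(payload)

# With the headers from the pull step (not executed here):
# response = requests.post(openai_chat_api_endpoint, headers=headers, json=payload)
# print('Result:', response.text)
```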