|
| 1 | +--- |
| 2 | +parser: v2 |
| 3 | +auto_validation: true |
| 4 | +time: 45 |
| 5 | +primary_tag: software-product>sap-business-technology-platform |
| 6 | +tags: [ tutorial>beginner, topic>artificial-intelligence, topic>machine-learning, software-product>sap-business-technology-platform ] |
| 7 | +author_name: Smita Naik |
| 8 | +author_profile: https://github.com/I321506 |
| 9 | +--- |
| 10 | + |
| 11 | +# Multimodal Response Assistant Chatbot Using SAP AI Core |
| 12 | + |
| 13 | +## Prerequisites |
| 14 | +1. **BTP Account** |
| 15 | + Set up your SAP Business Technology Platform (BTP) account. |
| 16 | + [Create a BTP Account](https://developers.sap.com/group.btp-setup.html) |
| 17 | +2. **For SAP Developers or Employees** |
| 18 | + Internal SAP stakeholders should refer to the following documentation: |
| 19 | + [How to create BTP Account For Internal SAP Employee](https://me.sap.com/notes/3493139), |
| 20 | + [SAP AI Core Internal Documentation](https://help.sap.com/docs/sap-ai-core) |
| 21 | +3. **For External Developers, Customers, or Partners** |
| 22 | + Follow this tutorial to set up your environment and entitlements: |
| 23 | + [External Developer Setup Tutorial](https://developers.sap.com/tutorials/btp-cockpit-entitlements.html), [SAP AI Core External Documentation](https://help.sap.com/docs/sap-ai-core?version=CLOUD) |
| 24 | +4. **Create BTP Instance and Service Key for SAP AI Core** |
| 25 | + Follow the steps to create an instance and generate a service key for SAP AI Core: [Create Service Key and Instance](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/create-service-key?version=CLOUD) |
| 26 | +5. **AI Core Setup Guide** |
| 27 | + Step-by-step guide to set up and get started with SAP AI Core: |
| 28 | + [AI Core Setup Tutorial](https://developers.sap.com/tutorials/ai-core-setup.html) |
| 29 | +6. An Extended SAP AI Core service plan is required, as the Generative AI Hub is not available in the Free or Standard tiers. For more details, refer to [SAP AI Core Service Plans](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/service-plans?version=CLOUD) |
| 30 | + |
| 31 | + |
| 32 | +### PREREAD |
| 33 | + |
| 34 | +#### What We’re Building |
| 35 | +A **web-based intelligent chatbot** capable of interacting with users via **text**, **audio**, **images**, and **video**. It returns **context-aware responses** using a **multimodal AI model**. |
| 36 | + |
| 37 | +#### Why Build This? |
| 38 | +Multimodal assistants are the future of enterprise automation. With varied input types and conversational memory, such assistants: |
| 39 | + |
| 40 | +- Increase productivity |
| 41 | +- Provide rapid technical support |
| 42 | +- Aid in education and training |
| 43 | +- Analyze rich media content |
| 44 | + |
| 45 | +#### What We’re Using |
| 46 | +- **SAP Generative AI Hub** for model management |
| 47 | +- **Gemini 2.0 Flash** for multimodal LLM capabilities |
| 48 | +- **Streamlit** for the user interface |
| 49 | +- **Python** backend |
| 50 | +- **Media processing** using Pillow and base64 |
| 51 | +- **Environment handling** with `python-dotenv` |
| 52 | + |
| 53 | +#### Demo Video |
| 54 | +> **Watch the walkthrough:** |
| 55 | +- You can find the demo video of the application [here](https://video.sap.com/media/t/1_4nixy23y). |
| 56 | + |
| 57 | +--- |
| 58 | + |
| 59 | +### Clone the GitHub Repository and Install Dependencies |
| 60 | + |
| 61 | +Clone the project from the repository: |
| 62 | + |
| 63 | +```bash |
| 64 | +git clone <your-repo-url> |
| 65 | +``` |
| 66 | + |
| 67 | +> [GitHub Link](https://github.tools.sap/MLF/aicore-examples.git) |
| 68 | +
|
| 69 | + |
| 70 | + |
| 71 | +#### 📦 Install Requirements |
| 72 | +In **VS Code**, open the cloned folder and install dependencies: |
| 73 | + |
| 74 | +```bash |
| 75 | +pip install -r requirements.txt |
| 76 | +``` |
| 77 | + |
| 78 | +### Application Environment Setup |
| 79 | + |
| 80 | +#### Deployment Steps |
| 81 | +To deploy **Gemini 2.0 Flash**, follow the steps from the official SAP tutorial: |
| 82 | + |
| 83 | +- **Follow Step 3 and Step 4** from this guide: [Create Deployment for a Generative AI Model](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/create-deployment-for-generative-ai-model-in-sap-ai-core) |
| 84 | + |
| 85 | +- To get the latest **model name** and **version information**, refer to this SAP Note: [3437766 - Generative AI Hub Models](https://me.sap.com/notes/3437766) |
| 86 | + |
| 87 | +Once deployed, note down the following: |
| 88 | + - **Model Name** (e.g., `gemini-2.0-flash`) |
| 89 | + - **Deployment ID** |
| 90 | +- Open **main.py**, go to the configuration section, and replace the placeholders for the model name and deployment ID with your actual values. |
| 91 | + |
| 92 | + |
| 93 | + |
| 94 | +#### Add Your Credentials |
| 95 | + |
| 96 | +In the project root, create a `.env` file: |
| 97 | + |
| 98 | +```env |
| 99 | +AICORE_AUTH_URL="your-authentication-url" |
| 100 | +AICORE_CLIENT_ID="your-client-id" |
| 101 | +AICORE_CLIENT_SECRET="your-client-secret" |
| 102 | +AICORE_BASE_URL="your-base-api-url" |
| 103 | +AICORE_RESOURCE_GROUP="your-resource-group" |
| 104 | +``` |
| 105 | +> **Note:** Make sure to use the same SAP AI Core instance and resource group where you deployed the model. This ensures the application launches and connects successfully. |
| 106 | +
|
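
Before launching, it can help to confirm that every key in the `.env` file has a real value. The following is a minimal sketch, not code from the repository: `missing_credentials` is a hypothetical helper, and it assumes the variables are already in the process environment (for example after calling `load_dotenv()` from `python-dotenv`) or are passed in as a dictionary.

```python
import os
from typing import List, Mapping, Optional

# The five keys listed in the .env file above.
REQUIRED_KEYS = (
    "AICORE_AUTH_URL",
    "AICORE_CLIENT_ID",
    "AICORE_CLIENT_SECRET",
    "AICORE_BASE_URL",
    "AICORE_RESOURCE_GROUP",
)


def missing_credentials(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return required keys that are unset, empty, or still a `your-...` placeholder.

    Hypothetical helper for illustration; the application itself may
    validate its configuration differently.
    """
    env = os.environ if env is None else env
    missing = []
    for key in REQUIRED_KEYS:
        value = env.get(key, "").strip()
        if not value or value.startswith("your-"):
            missing.append(key)
    return missing
```

Calling `missing_credentials()` with no arguments checks the current environment. An empty list means all five values are present and no longer hold the template placeholders.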
| 107 | +#### Launch the App |
| 108 | + |
| 109 | +In VS Code, open the terminal and run the following command: |
| 110 | + |
| 111 | +```bash |
| 112 | +streamlit run main.py |
| 113 | +``` |
| 114 | + |
| 115 | + |
| 116 | + |
| 117 | + |
| 118 | +### Video Input – Demo and Analysis |
| 119 | + |
| 120 | +#### ➕ Upload Video |
| 121 | +- Choose a short video file in **MP4** or **MOV** format |
| 122 | +- After uploading, you can ask questions such as: |
| 123 | + - *"Please provide a concise summary of the events taking place in this video"* |
| 124 | + |
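
The stack listed earlier uses base64 for media processing. One common way to hand an uploaded file to a multimodal model is as base64 text paired with a MIME type. The following is a minimal sketch under that assumption: `encode_media` and `MIME_TYPES` are hypothetical names, the returned dictionary shape is illustrative only, and the actual `main.py` may prepare media differently.

```python
import base64
from pathlib import Path

# File formats named in this tutorial, mapped to their usual MIME types.
MIME_TYPES = {
    ".mp4": "video/mp4",
    ".mov": "video/quicktime",
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}


def encode_media(filename: str, data: bytes) -> dict:
    """Return the MIME type and base64 text for an uploaded file.

    Hypothetical helper for illustration; raises ValueError for
    formats the tutorial does not mention.
    """
    suffix = Path(filename).suffix.lower()
    if suffix not in MIME_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return {
        "mime_type": MIME_TYPES[suffix],
        "data": base64.b64encode(data).decode("ascii"),
    }
```

The extension lookup is case-insensitive, so `clip.MP4` and `clip.mp4` are treated the same way. An explicit table is used instead of guessing the type so that unsupported uploads fail early with a clear message.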
| 125 | + |
| 126 | + |
| 127 | +#### 📊 Output Observation |
| 128 | + |
| 129 | + |
| 130 | + |
| 131 | +--- |
| 132 | + |
| 133 | +### Audio Input – Demo and Analysis |
| 134 | + |
| 135 | +#### ➕ Upload or Record Audio |
| 136 | +- Upload an audio file in WAV or MP3 format, or record live. |
| 137 | +- Analyze transcription or tone |
| 138 | +- You can ask questions like: |
| 139 | + - *"What was said?"* |
| 140 | + - *"Detect the language."* |
| 141 | + |
| 142 | +#### 📊 Output Observation |
| 143 | + |
| 144 | + |
| 145 | + |
| 146 | +--- |
| 147 | + |
| 148 | +### Image Input – Demo and Analysis |
| 149 | + |
| 150 | +Upload an image in formats like JPG or PNG for analysis. |
| 151 | + |
| 152 | +#### ➕ Upload Image |
| 153 | +- Drag & drop or select an image file |
| 154 | +- Ask questions like: |
| 155 | +  - *"Describe what is happening in this image and identify any notable branding or sponsorships visible."* |
| 156 | + |
| 157 | + |
| 158 | + |
| 159 | +#### 📊 Output Observation |
| 160 | + |
| 161 | + |
| 162 | + |
| 163 | +--- |
| 164 | + |
| 165 | +### Summary & Use Cases |
| 166 | + |
| 167 | +#### Summary |
| 168 | +The assistant delivers accurate, context-aware responses across media types. It’s an enterprise-ready, scalable solution leveraging SAP’s AI infrastructure. |
| 169 | + |
| 170 | +#### Real-world Use Cases |
| 171 | +| Scenario | Value Proposition | |
| 172 | +|----------------------|---------------------------------------------------------------------| |
| 173 | +| 🎓 Education | AI tutors analyzing diagrams and lecture recordings | |
| 174 | +| 🛠️ Tech Support | Use screenshots, logs, and voice memos for intelligent resolution | |
| 175 | +| 📈 Business Insights | Analyze media submissions for trends and decisions | |
| 176 | +| 🎤 Voice Assistants | Voice queries in workflows, meeting minutes, etc. | |
| 177 | + |
| 178 | +--- |
| 179 | + |
| 186 | +> Final Thoughts: This project demonstrates how multimodal AI can significantly enhance user interactions. By allowing inputs via text, audio, image, and video, and delivering intelligent, contextual responses, the assistant offers a flexible, enterprise-ready solution. It integrates seamlessly with SAP’s AI infrastructure and Google’s powerful models, enabling: |
| 187 | + |
| 188 | +> - Smarter user interfaces |
| 189 | + |
| 190 | +> - Scalable support systems |
| 191 | + |
| 192 | +> - Rich media understanding |
| 193 | + |
| 194 | +> Whether for education, support, or analytics, this assistant provides a practical gateway into the future of digital enterprise interaction. |
| 195 | +
|