Python Adapter

Create custom integrations by uploading a Python script that implements your model logic. This approach gives you complete flexibility to test any AI application.

Use Cases

  • Proprietary models or custom-built AI systems
  • Applications with unique preprocessing or postprocessing requirements
  • Models accessed through non-standard authentication or invocation methods
  • Local inference pipelines

Configuration

| Field | Required | Description |
|---|---|---|
| Application Name | Yes | Unique identifier (e.g., `my-python-adapter-app`) |
| Application Template File | Yes | Python file (max 5 MB) |
| Input Modalities | Yes | Text, Image, Video |
| Output Modalities | Yes | Text, Image, Video |

Template Structure

Your Python script must implement a chat function with this signature:

```python
def chat(chats):
    """
    Process chat messages and return a response.

    Args:
        chats: List of message dictionaries with 'role' and 'content' keys.
            Example: [{"role": "user", "content": "Hello"}]

    Returns:
        str: The model's response text.
    """
    # Your implementation here
    pass
```

Example Implementation

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Step 1: Load the Llama model and tokenizer.
# Llama 3 requires the Auto* classes (LlamaTokenizer only supports Llama 1/2),
# and the Instruct variant is used because the pipeline receives chat messages.
model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Initialize the Hugging Face pipeline with the model and tokenizer
chatbot = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Don't change the name of the function or the function signature
def chat(chats):
    """
    Generates a response from the language model for a list of chat messages.

    Parameters:
        chats (list): A list of dictionaries representing a conversation.
            Each dictionary contains:
            - "role": Either "user" or "assistant".
            - "content": A string (text message) or a list of dictionaries
              for multimodal input.
            The last entry should be a "user" message awaiting a response.

    Returns:
        str: The response from the language model.
    """
    # Step 2: Generate the model's response
    response = chatbot(chats, max_new_tokens=512, num_return_sequences=1)

    # Step 3: Extract and return the generated text. Given chat-format input,
    # 'generated_text' is the full conversation; the last entry is the newly
    # generated assistant message.
    return response[0]["generated_text"][-1]["content"].strip()
```
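A quick smoke test you can run locally before uploading (the `__main__` guard is illustrative only and is not required by the adapter):

```python
if __name__ == "__main__":
    # Single-turn conversation, matching the expected input format
    print(chat([{"role": "user", "content": "What is the capital of France?"}]))
```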

Handling Different Modalities

Input Processing

The chats parameter follows OpenAI's chat format, where multimodal content is represented as a list of dictionaries within the content field.

```python
# Simple text message
{
    "role": "user",
    "content": "What is the capital of France?"
}
```
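A multimodal message carries a list of typed items in the content field; media arrive as Base64-encoded data URIs (the payload below is truncated for brevity):

```python
# Mixed text and image message
{
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {
            "type": "image_url",
            "image_url": {"url": "data:image/jpeg;base64,/9j/4AAQ..."}
        }
    ]
}
```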

Processing Mixed Input Types

```python
def chat(chats):
    """
    Handle different input modalities in the chat function.
    """
    try:
        # Process each message to handle different modalities
        for message in chats:
            if isinstance(message.get("content"), list):
                # Handle multimodal content
                for content_item in message["content"]:
                    # Handle video input
                    if content_item.get("type") == "video_url" and "url" in content_item.get("video_url", {}):
                        video_url = content_item["video_url"]["url"]
                        if video_url.startswith("data:video/"):
                            base64_data = video_url.split(",", 1)[1]
                            # Process video data as needed for your model

                    # Handle image input
                    elif content_item.get("type") == "image_url" and "url" in content_item.get("image_url", {}):
                        image_url = content_item["image_url"]["url"]
                        if image_url.startswith("data:image/"):
                            base64_data = image_url.split(",", 1)[1]
                            # Process image data as needed for your model

                    # Handle text input
                    elif content_item.get("type") == "text":
                        text_content = content_item.get("text", "")
                        # Process text as needed

        # Your model processing logic here; assign its output before returning
        generated_response = "..."  # placeholder: replace with your model's output
        return generated_response

    except Exception as e:
        return f"Error processing input: {str(e)}"
```

Output Processing

The return value from your chat function depends on your model's output modalities:

```python
# Simply return a string containing the generated text.
return "The capital of France is Paris."
```

Important Notes

  • Input Format: All media inputs (images and videos) are provided as Base64-encoded data URIs following the OpenAI chat format
  • Output Format: For media outputs, return only the Base64-encoded string without the data URI prefix (e.g., without data:image/jpeg;base64,)
  • Error Handling: Always implement proper error handling and cleanup for temporary files
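For media outputs, the sketch below illustrates the prefix rule above, assuming your model produces raw bytes; the helper name is hypothetical:

```python
import base64

def encode_media_for_output(media_bytes):
    # Hypothetical helper: return only the Base64 payload,
    # without a "data:image/jpeg;base64," style prefix.
    return base64.b64encode(media_bytes).decode("utf-8")
```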

Best Practices

  1. Error Handling: Wrap API calls in try-except blocks (see the sketch after this list)
  2. Timeouts: Set reasonable timeouts for external requests
  3. Dependencies: Use standard libraries when possible (requests, json, etc.)
  4. Security: Never hardcode credentials in your script
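A minimal sketch combining these practices around a remote model call; the endpoint and environment variable names are assumptions for illustration:

```python
import os
import requests

def call_model_api(payload):
    # Read the hypothetical endpoint and credential from the environment
    # rather than hardcoding them in the script (practice 4)
    api_url = os.environ["MODEL_API_URL"]
    api_key = os.environ["MODEL_API_KEY"]

    try:
        # Set a reasonable timeout for the external request (practice 2)
        resp = requests.post(
            api_url,
            json=payload,
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()
    except requests.RequestException as e:
        # Wrap the API call in try-except (practice 1)
        return {"error": str(e)}
```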

Setup Steps

  1. Navigate to AI Applications → New Application
  2. Select Custom Applications tab
  3. Click Python Adapter
  4. Enter your Application Name
  5. Upload your Python script (.py file)
  6. Select Input Modalities (Text, Image, Video)
  7. Select Output Modalities (Text, Image, Video)
  8. Review and submit