# Python Adapter

Create custom integrations by uploading a Python script that implements your model logic. This approach gives you complete flexibility to test any AI application.
## Use Cases
- Proprietary models or custom-built AI systems
- Applications with unique preprocessing or postprocessing requirements
- Models accessed through non-standard authentication or invocation methods
- Local inference pipelines
## Configuration
| Field | Required | Description |
|---|---|---|
| Application Name | Yes | Unique identifier (e.g., my-python-adapter-app) |
| Application Template File | Yes | Python file (max 5MB) |
| Input Modalities | Yes | Text, Image, Video |
| Output Modalities | Yes | Text, Image, Video |
## Template Structure

Your Python script must implement a chat function with this signature:

```python
def chat(chats):
    """
    Process chat messages and return a response.

    Args:
        chats: List of message dictionaries with 'role' and 'content' keys.
            Example: [{"role": "user", "content": "Hello"}]

    Returns:
        str: The model's response text.
    """
    # Your implementation here
    pass
```
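For reference, the simplest valid template just echoes the last user message back. This is a sketch for verifying the upload flow end to end; replace the echo with your own model logic:

```python
def chat(chats):
    """Echo the most recent user message back as the response."""
    # The last entry in `chats` is the message awaiting a reply.
    content = chats[-1]["content"]
    # Content may be a plain string or a list of typed parts (multimodal).
    if isinstance(content, list):
        content = " ".join(p.get("text", "") for p in content if p.get("type") == "text")
    return f"You said: {content}"
```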
## Example Implementation

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Step 1: Load the Llama model and tokenizer
model_name = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Initialize the Hugging Face pipeline with the model and tokenizer
chatbot = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Don't change the name of the function or the function signature
def chat(chats):
    """
    Generates a response from the language model for a given list of chat messages.

    Parameters:
        chats (list): A list of dictionaries representing a conversation. Each dictionary contains:
            - "role": Either "user" or "assistant".
            - "content": A string (text message) or a list of dictionaries for multimodal input.
            The last entry is the user message awaiting a response.

    Returns:
        str: The response from the language model.
    """
    # Step 2: Generate the model's response
    response = chatbot(chats, max_new_tokens=1000, num_return_sequences=1)

    # Step 3: Extract and return the generated text. With chat-format input,
    # `generated_text` is the full conversation ending with the new assistant turn.
    generated_text = response[0]["generated_text"]
    if isinstance(generated_text, list):
        return generated_text[-1]["content"].strip()
    return generated_text.split("Assistant:")[-1].strip()
```
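Before uploading, it is worth exercising the function locally with a hand-built message list. A quick smoke test, assuming the model weights are available on your machine:

```python
if __name__ == "__main__":
    # Single-turn smoke test of the required entry point.
    messages = [{"role": "user", "content": "What is the capital of France?"}]
    print(chat(messages))
```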
## Handling Different Modalities

### Input Processing

The chats parameter follows OpenAI's chat format, where multimodal content is represented as a list of dictionaries within the content field (a decoding helper is sketched after the examples below).
**Text Input**

```python
# Simple text message
{
    "role": "user",
    "content": "What is the capital of France?"
}
```

**Image Input**

```python
# Image content (Base64 encoded)
{
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,{base64_image}"}}
    ]
}
```

**Video Input**

```python
# Video content (Base64 encoded)
{
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this video"},
        {"type": "video_url", "video_url": {"url": "data:video/mp4;base64,{base64_video}"}}
    ]
}
```
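When your model needs the raw media bytes, split the data URI at the first comma and decode the Base64 payload. A minimal sketch (the decode_data_uri helper name is ours, not part of the platform API):

```python
import base64

def decode_data_uri(uri):
    """Return the raw bytes from a data URI such as 'data:image/jpeg;base64,...'."""
    # Everything after the first comma is the Base64 payload.
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not payload:
        raise ValueError("Not a Base64 data URI")
    return base64.b64decode(payload)

# Example: image_bytes = decode_data_uri(content_item["image_url"]["url"])
```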
### Processing Mixed Input Types

```python
def chat(chats):
    """
    Handle different input modalities in the chat function.
    """
    try:
        generated_response = ""  # Replace with your model's output
        # Process each message to handle different modalities
        for message in chats:
            if "content" in message and isinstance(message["content"], list):
                # Handle multimodal content
                for content_item in message["content"]:
                    # Handle video input
                    if content_item.get("type") == "video_url" and "url" in content_item.get("video_url", {}):
                        video_url = content_item["video_url"]["url"]
                        if video_url.startswith("data:video/"):
                            base64_data = video_url.split(",", 1)[1]
                            # Process video data as needed for your model
                    # Handle image input
                    elif content_item.get("type") == "image_url" and "url" in content_item.get("image_url", {}):
                        image_url = content_item["image_url"]["url"]
                        if image_url.startswith("data:image/"):
                            base64_data = image_url.split(",", 1)[1]
                            # Process image data as needed for your model
                    # Handle text input
                    elif content_item.get("type") == "text":
                        text_content = content_item.get("text", "")
                        # Process text as needed

        # Your model processing logic here
        return generated_response
    except Exception as e:
        return f"Error processing input: {str(e)}"
```
### Output Processing

The return value from your chat function depends on your model's output modalities:
**Text Output**

```python
# Simply return a string containing the generated text.
return "The capital of France is Paris."
```
**Image Output**

```python
# Return the image as a Base64 encoded string (without the data URI prefix).
# `client` is assumed to be an already-initialized image-generation API client.
def chat(chats):
    try:
        prompt = chats[-1]["content"]
        # Generate image using your image generation model
        response = client.images.generate(
            prompt=prompt,
            model="black-forest-labs/FLUX.1-schnell",
            width=1024,
            height=768,
            steps=4,
            n=1,
            response_format="b64_json",
        )
        # Return the Base64 encoded image.
        # The frontend will prepend "data:image/jpeg;base64," as needed.
        return response.data[0].b64_json
    except Exception as e:
        return f"Error during image generation: {str(e)}"
```
**Video Output**

```python
# Return the video as a Base64 encoded string (without the data URI prefix).
import base64
import os
import uuid

from diffusers.utils import export_to_video

# `pipe` is assumed to be an already-loaded diffusers text-to-video pipeline.
def chat(chats):
    try:
        prompt = chats[-1]["content"]
        # Generate video using your video generation model
        video_frames = pipe(
            prompt=prompt,
            width=1024,
            height=576,
            num_frames=49,
            num_inference_steps=50,
        ).frames[0]

        # Export video to a temporary file
        temp_dir = "temp_videos"
        os.makedirs(temp_dir, exist_ok=True)
        temp_filename = os.path.join(temp_dir, f"video_{uuid.uuid4()}.mp4")
        export_to_video(video_frames, temp_filename, fps=7)

        # Read the video file and encode to Base64
        with open(temp_filename, "rb") as video_file:
            video_bytes = video_file.read()
            video_base64 = base64.b64encode(video_bytes).decode("utf-8")

        # Clean up the temporary file
        os.remove(temp_filename)

        # Return the raw Base64 string
        return video_base64
    except Exception as e:
        return f"Error during video generation: {str(e)}"
```
## Important Notes

- Input Format: All media inputs (images and videos) are provided as Base64-encoded data URIs following the OpenAI chat format.
- Output Format: For media outputs, return only the Base64-encoded string without the data URI prefix (e.g., without data:image/jpeg;base64,).
- Error Handling: Always implement proper error handling and cleanup for temporary files.
## Best Practices

- Error Handling: Wrap API calls in try-except blocks
- Timeouts: Set reasonable timeouts for external requests (see the sketch after this list)
- Dependencies: Use widely available libraries when possible (requests, json, etc.)
- Security: Never hardcode credentials in your script; read them from environment variables instead
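For example, a chat function that calls an external HTTP endpoint might apply all four practices at once. A sketch: the MODEL_API_URL and MODEL_API_KEY environment variables and the response shape are assumptions for illustration, not platform requirements:

```python
import os

import requests

def chat(chats):
    # Read credentials from the environment rather than hardcoding them.
    api_url = os.environ["MODEL_API_URL"]   # assumed env var
    api_key = os.environ["MODEL_API_KEY"]   # assumed env var
    try:
        # Bound the request time so a stalled endpoint can't hang the adapter.
        resp = requests.post(
            api_url,
            headers={"Authorization": f"Bearer {api_key}"},
            json={"messages": chats},
            timeout=30,
        )
        resp.raise_for_status()
        # The response shape is an assumption about this hypothetical endpoint.
        return resp.json()["response"]
    except requests.RequestException as e:
        return f"Error calling model endpoint: {str(e)}"
```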
## Setup Steps

1. Navigate to AI Applications → New Application
2. Select the Custom Applications tab
3. Click Python Adapter
4. Enter your Application Name
5. Upload your Python script (.py file)
6. Select Input Modalities (Text, Image, Video)
7. Select Output Modalities (Text, Image, Video)
8. Review and submit