Chat Completion API
The chat completion API allows you to interact with GitHub's AI models through a conversational interface.
Simple Example
using OpenGithubModelsApi
# Create a client (using a placeholder token for documentation)
client = GithubModelsClient("ghp_example_token")
# Create a message history
messages = [
    Message(role="system", content="You are a helpful assistant"),
    Message(role="user", content="What is Julia programming language?")
]
# Create an inference request
request = InferenceRequest(
    model="openai/gpt-4.1", # Example model ID
    messages=messages,
    temperature=0.7
)
# Get a response (returns just the content by default)
response = create_chat_completion(client, request)
println(response)
Request Parameters
Required Parameters
- model: ID of the specific model to use (format: {publisher}/{model_name})
- messages: Array of message objects with role and content
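The {publisher}/{model_name} format can be checked locally before a request is built. The following is a minimal sketch in plain Julia; `is_valid_model_id` is a hypothetical helper and not part of OpenGithubModelsApi, and the rule it applies (exactly one slash, non-empty text on both sides, no whitespace) is an assumption about what the format allows.

```julia
# Hypothetical helper, not part of OpenGithubModelsApi.
# Assumed rule: exactly one slash, with non-empty, whitespace-free text
# on both sides, matching the {publisher}/{model_name} shape.
is_valid_model_id(id::AbstractString) = occursin(r"^[^/\s]+/[^/\s]+$", id)

println(is_valid_model_id("openai/gpt-4.1"))  # true
println(is_valid_model_id("gpt-4.1"))         # false (no publisher)
```

A check like this only confirms the shape of the ID; whether the model actually exists is still decided by the API.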
Valid message roles are:
- "system": For system instructions
- "user": For user messages
- "assistant": For assistant responses
- "developer": For developer-specific messages
Advanced Usage
Getting Full Response Details
By default, create_chat_completion returns just the content string. To get the full response object, use verbose=true:
full_response = create_chat_completion(client, request, verbose=true)
Controlling Response Length and Creativity
request = InferenceRequest(
    model="openai/gpt-4.1",
    messages=messages,
    temperature=0.3, # More deterministic (0.0-1.0)
    max_tokens=100, # Limit response length
    top_p=0.9 # Alternative to temperature
)
Error Handling
Common errors include:
- Invalid model ID format
- Invalid message roles
- Parameter validation errors (e.g., temperature outside 0-1 range)
- API rate limits
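Of the errors listed above, rate limits are the ones most likely to succeed on a later attempt. One way to handle them is to retry with exponential backoff, sketched below in plain Julia. `with_retry` and its keyword arguments are hypothetical and not part of OpenGithubModelsApi. The exception types the package throws are not described here, so the decision about which errors are worth retrying is left to a caller-supplied `should_retry` predicate.

```julia
# Hypothetical helper, not part of OpenGithubModelsApi: call `f`, retrying
# with exponential backoff when it throws an error that `should_retry` accepts.
function with_retry(f; attempts::Int=3, base_delay::Real=1.0,
                    should_retry=err -> true)
    for attempt in 1:attempts
        try
            return f()
        catch err
            # Give up on the last attempt or on errors that will not
            # improve with time (for example, an invalid model ID).
            (attempt == attempts || !should_retry(err)) && rethrow()
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2^(attempt - 1))
        end
    end
end

# Usage, assuming `client` and `request` are defined as in the examples above:
# response = with_retry(() -> create_chat_completion(client, request))
```

The default predicate retries every error, which is rarely what you want in production; passing a predicate that recognizes only rate-limit failures avoids repeating requests that can never succeed.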
Warning
Streaming responses are not supported by this package. Setting stream=true will result in an error.
Best Practices
- Always include a system message to guide the model's behavior
- Keep conversation history within the model's context window
- Adjust temperature based on your use case (lower for factual responses, higher for creative tasks)
- Validate inputs before making API calls
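Keeping the history within the context window usually means dropping the oldest turns while preserving the system message. The sketch below shows one way to do that in plain Julia. `estimate_tokens` and `trim_history` are hypothetical helpers, not part of OpenGithubModelsApi; messages are represented as NamedTuples so the block runs without the package, and the figure of about four characters per token is a rough assumption rather than a real tokenizer.

```julia
# Hypothetical helpers, not part of OpenGithubModelsApi.
# Rough assumption: about 4 characters per token (not a real tokenizer).
estimate_tokens(text::AbstractString) = cld(length(text), 4)

# Keep every system message, then as many of the most recent remaining
# messages as fit within `budget` estimated tokens.
function trim_history(messages, budget::Integer)
    system = filter(m -> m.role == "system", messages)
    rest = filter(m -> m.role != "system", messages)
    used = sum(m -> estimate_tokens(m.content), system; init=0)
    kept = similar(rest, 0)
    for m in Iterators.reverse(rest)
        cost = estimate_tokens(m.content)
        used + cost > budget && break   # older messages no longer fit
        pushfirst!(kept, m)             # preserve chronological order
        used += cost
    end
    return vcat(system, kept)
end

history = [
    (role="system", content="You are a helpful assistant"),
    (role="user", content="a"^40),
    (role="assistant", content="b"^40),
    (role="user", content="c"^40),
]
trimmed = trim_history(history, 30)
println(length(trimmed))  # 3: the system message plus the two newest turns
```

The same approach should carry over to the package's Message objects if they expose role and content fields, which is an assumption based on the constructor calls shown earlier.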