Configure GitLab to use self-hosted models
- Tier: Premium, Ultimate
- Offering: GitLab Self-Managed
Version history
- Introduced in GitLab 17.1 with a flag named `ai_custom_model`. Disabled by default.
- Enabled on GitLab Self-Managed in GitLab 17.6.
- Changed to require GitLab Duo add-on in GitLab 17.6 and later.
- Feature flag `ai_custom_model` removed in GitLab 17.8.
- Ability to set AI Gateway URL using UI added in GitLab 17.9.
- Generally available in GitLab 17.9.
- Changed to include Premium in GitLab 18.0.
Prerequisites:
- Upgrade GitLab to version 17.9 or later.
- You must be an administrator.
To configure your GitLab instance to access self-hosted models in your infrastructure:
- Configure your GitLab instance to access the AI Gateway.
- In GitLab 18.4 and later, configure your GitLab instance to access the GitLab Duo Agent Platform service.
- Add self-hosted models to your GitLab instance.
- Select a self-hosted model for a feature.
Configure access to the local AI Gateway
To configure access between your GitLab instance and your local AI Gateway:
- In the upper-right corner, select Admin.
- In the left sidebar, select GitLab Duo.
- Select Change configuration.
- Under Local AI Gateway URL, enter your AI Gateway URL.
- Select Save changes.
Note
If your AI Gateway URL points to a local network or private IP address (for example, 172.31.x.x or internal hostnames like ip-172-xx-xx-xx.region.compute.internal), GitLab might block the request for security reasons. To allow requests to this address, add the address to the IP allowlist.
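As a minimal sketch of why such URLs get blocked, the following checks whether a gateway URL uses a literal private IP address. The helper name is hypothetical and the logic is illustrative only; GitLab applies its own resolution and allowlist rules.

```python
# Sketch: detect whether an AI Gateway URL points at a literal private IP,
# which GitLab may block unless the address is on the IP allowlist.
# The function name and logic are illustrative, not part of GitLab.
import ipaddress
from urllib.parse import urlparse

def is_private_gateway_url(url: str) -> bool:
    """Return True if the URL's host is a literal private IP address."""
    host = urlparse(url).hostname or ""
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # Hostname rather than a literal IP; GitLab resolves hostnames
        # before applying the allowlist, which this sketch does not attempt.
        return False

print(is_private_gateway_url("http://172.31.4.10:5052"))      # True: 172.16.0.0/12
print(is_private_gateway_url("https://gateway.example.com"))  # False: hostname
```

Internal hostnames such as `ip-172-xx-xx-xx.region.compute.internal` resolve to private addresses too, so they need the same allowlist entry even though this sketch would not flag them.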
Configure timeout for the AI Gateway
Version history
- Introduced in GitLab 18.7.
To conserve resources and prevent long-running queries, configure the timeout for GitLab requests to the AI Gateway when waiting for model responses. Use longer timeouts for self-hosted models with large context windows or complex queries.
You can configure a timeout between 60 and 600 seconds (10 minutes). If you don't set the timeout, GitLab uses the default timeout of 60 seconds.
To configure the AI Gateway timeout:
- In the upper-right corner, select Admin.
- In the left sidebar, select GitLab Duo.
- Select Change configuration.
- Under AI Gateway request timeout, enter the timeout value in seconds (between 60 and 600).
- Select Save changes.
Determine the timeout value
The timeout value depends on your specific deployment and use case.
To determine the timeout value:
- Start with the default timeout of 60 seconds and monitor for timeout errors.
- Monitor your logs for `A1000` timeout errors. If these errors occur frequently, consider increasing the timeout.
- Consider your use case. Larger prompts, complex code generation tasks, or processing large design documents might require longer timeouts.
- Consider your infrastructure. Model performance depends on available GPU resources, network latency between the AI Gateway and model endpoint, and the model's processing capabilities.
- Increase incrementally. If you experience timeouts, increase the value gradually (for example, by 30-60 seconds) and monitor the results.
For more information on troubleshooting timeout errors, see Error A1000.
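The incremental approach above can be sketched as a small helper that raises the timeout in steps while keeping it inside the allowed range. The function is illustrative only; GitLab itself just accepts the final value you enter in the Admin area.

```python
# Sketch: raise the AI Gateway timeout incrementally after repeated A1000
# timeout errors, staying inside the allowed 60-600 second range.
# Illustrative only; not a GitLab API.

MIN_TIMEOUT, MAX_TIMEOUT = 60, 600  # allowed range in seconds

def next_timeout(current: int, step: int = 30) -> int:
    """Return the next timeout value to try, increased by `step` and clamped."""
    return max(MIN_TIMEOUT, min(current + step, MAX_TIMEOUT))

print(next_timeout(60))   # 90
print(next_timeout(580))  # 600 (clamped to the maximum)
```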
Configure access to the GitLab Duo Agent Platform
Version history
- Introduced in GitLab 18.4 as an experiment with a feature flag named `self_hosted_agent_platform`. Disabled by default.
- Changed from experiment to beta in GitLab 18.5.
- Enabled in GitLab 18.7.
- Generally available in GitLab 18.8.
- Feature flag `self_hosted_agent_platform` removed in GitLab 18.9.
- On GitLab 18.7 and 18.8, this feature is beta for customers with an online license. To use this feature, you must turn on self-hosted beta models and features.
Prerequisites:
- If your instance has an offline license, you must have the GitLab Duo Agent Platform Self-Hosted add-on.
To access the Agent Platform service from your GitLab instance:
- In the upper-right corner, select Admin.
- In the left sidebar, select GitLab Duo.
- Select Change configuration.
- Under Local URL for the GitLab Duo Agent Platform service, enter the URL for the local Agent Platform service.
- The URL is typically the same as the Local AI Gateway URL but on gRPC port :50052.
- Do not include a URL prefix such as `http://` or `https://`.
- If you have set up SSL with an NGINX reverse proxy as recommended, or use the Helm chart with Ingress enabled, do not specify a port. The NGINX Ingress handles port forwarding.
- Optional. If your local GitLab Duo Agent Platform endpoint uses TLS, under Security, select the Use secure connection (TLS) for GitLab Duo Agent Platform service checkbox.
- Select Save changes.
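The URL rules above can be sketched as a small validation helper. The checks and hostname are illustrative only; GitLab performs its own validation when you save the configuration.

```python
# Sketch: validate a value for "Local URL for the GitLab Duo Agent Platform
# service" against the documented rules: no http:// or https:// prefix,
# gRPC port :50052 unless NGINX Ingress handles port forwarding.
# Illustrative only; not GitLab's actual validation code.
from typing import List

def check_agent_platform_url(value: str, behind_ingress: bool = False) -> List[str]:
    """Return a list of problems with the entered value (empty if none)."""
    problems = []
    if value.startswith(("http://", "https://")):
        problems.append("Do not include a http:// or https:// prefix.")
    has_port = ":" in value
    if behind_ingress and has_port:
        problems.append("Do not specify a port when NGINX Ingress handles forwarding.")
    if not behind_ingress and not value.endswith(":50052"):
        problems.append("Expected the gRPC port :50052.")
    return problems

print(check_agent_platform_url("ai-gateway.example.internal:50052"))
# []
print(check_agent_platform_url("http://ai-gateway.example.internal:50052"))
# ['Do not include a http:// or https:// prefix.']
```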
Add a self-hosted model
You must add a self-hosted model to your GitLab instance to use it with GitLab Duo features.
To add a self-hosted model:
- In the upper-right corner, select Admin.
- In the left sidebar, select GitLab Duo.
- Select Configure models for GitLab Duo.
  - If Configure models for GitLab Duo is not available, synchronize your subscription after purchase:
    - In the left sidebar, select Subscription.
    - In Subscription details, to the right of Last sync, select synchronize subscription ({retry}).
- Select Add self-hosted model.
- Complete the fields:
  - Deployment name: Enter a name to uniquely identify the model deployment, for example, `Mixtral-8x7B-it-v0.1 on GCP`.
  - Model family: Select the model family the deployment belongs to. You can select either a supported or compatible model.
  - Endpoint: Enter the URL where the model is hosted.
  - API key: Optional. Add an API key if you need one to access the model.
  - Model identifier: Enter the model identifier based on your deployment method. The model identifier should match the following format:

    | Deployment method | Format | Example |
    |---|---|---|
    | vLLM | `custom_openai/<name of the model served through vLLM>` | `custom_openai/Mixtral-8x7B-Instruct-v0.1` |
    | Amazon Bedrock | `bedrock/<model ID of the model>` | `bedrock/mistral.mixtral-8x7b-instruct-v0:1` |
    | Google Vertex AI | `vertex_ai/<model ID of the model>` | `vertex_ai/claude-sonnet-4-6@default` |
    | Anthropic | `anthropic/<model ID of the model>` | `anthropic/claude-opus-4-6` |
    | OpenAI | `openai/<model ID of the model>` | `openai/gpt-5` |
    | Azure OpenAI | `azure/<model ID of the model>` | `azure/gpt-35-turbo` |
- Select Add self-hosted model.
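The format table maps each deployment method to a fixed prefix, which can be sketched as follows. The dictionary is a plain illustration of the documented prefixes, not a GitLab API.

```python
# Sketch: build a model identifier from the deployment method and model ID,
# following the documented "<prefix>/<model ID>" format.
# Illustrative only; not part of GitLab.

PREFIXES = {
    "vLLM": "custom_openai",
    "Amazon Bedrock": "bedrock",
    "Google Vertex AI": "vertex_ai",
    "Anthropic": "anthropic",
    "OpenAI": "openai",
    "Azure OpenAI": "azure",
}

def model_identifier(deployment_method: str, model_id: str) -> str:
    """Return '<prefix>/<model_id>' for a known deployment method."""
    prefix = PREFIXES[deployment_method]  # raises KeyError for unknown methods
    return f"{prefix}/{model_id}"

print(model_identifier("vLLM", "Mixtral-8x7B-Instruct-v0.1"))
# custom_openai/Mixtral-8x7B-Instruct-v0.1
print(model_identifier("Amazon Bedrock", "mistral.mixtral-8x7b-instruct-v0:1"))
# bedrock/mistral.mixtral-8x7b-instruct-v0:1
```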
Set the model identifier for Amazon Bedrock models
To set a model identifier for an Amazon Bedrock model:
- Set your `AWS_REGION` in your AI Gateway Docker configuration, and ensure you have access to models in that region.
- Add the region prefix to the model's inference profile ID for cross-region inferencing.
- Use the `bedrock/` prefix for the model identifier.

For example, for the Anthropic Claude 4.0 model in the Tokyo region:

- The `AWS_REGION` is `ap-northeast-1`.
- The cross-region inferencing prefix is `apac.`.
- The model identifier is `bedrock/apac.anthropic.claude-sonnet-4-20250514-v1:0`.

Some regions are not supported by cross-region inferencing. For these regions, do not specify a region prefix in the model identifier. For example:

- The `AWS_REGION` is `eu-west-2`.
- The model identifier is `anthropic.claude-sonnet-4-5-20250929-v1:0`.
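The examples above can be sketched as a helper that builds the inference profile ID from the region. The region-to-prefix mapping below is illustrative and deliberately incomplete; check the AWS cross-region inference documentation for your region's actual prefix.

```python
# Sketch: add the cross-region inferencing prefix to a Bedrock model ID
# based on AWS_REGION, following the examples above.
# The mapping is illustrative and incomplete; not a GitLab or AWS API.

CROSS_REGION_PREFIX = {
    "ap-northeast-1": "apac",  # Tokyo, from the example above
    # "eu-west-2" (London) is intentionally absent: no cross-region prefix.
}

def inference_profile_id(region: str, model_id: str) -> str:
    """Prefix the model ID for regions that support cross-region inferencing."""
    prefix = CROSS_REGION_PREFIX.get(region)
    return f"{prefix}.{model_id}" if prefix else model_id

print(inference_profile_id("ap-northeast-1", "anthropic.claude-sonnet-4-20250514-v1:0"))
# apac.anthropic.claude-sonnet-4-20250514-v1:0
print(inference_profile_id("eu-west-2", "anthropic.claude-sonnet-4-5-20250929-v1:0"))
# anthropic.claude-sonnet-4-5-20250929-v1:0 (no prefix)
```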
Turn on self-hosted beta models and features
Note
Turning on self-hosted beta models and features means that you also accept the GitLab Testing Agreement.
To enable self-hosted beta models and features:
- In the upper-right corner, select Admin.
- In the left sidebar, select GitLab Duo.
- Select Change configuration.
- Under Self-hosted beta models and features, select the Use beta models and features in GitLab Duo Self-Hosted checkbox.
- Select Save changes.
Configure GitLab Duo features to use self-hosted models
View configured features
- In the upper-right corner, select Admin.
- In the left sidebar, select GitLab Duo.
- Select Configure models for GitLab Duo.
  - If Configure models for GitLab Duo is not available, synchronize your subscription after purchase:
    - In the left sidebar, select Subscription.
    - In Subscription details, to the right of Last sync, select synchronize subscription ({retry}).
- Select the AI-native features tab.
Select a self-hosted model for a feature
To select a self-hosted model:
- In the upper-right corner, select Admin.
- In the left sidebar, select GitLab Duo.
- Select Configure models for GitLab Duo.
- Select the AI-native features tab.
- For the feature you want to select a self-hosted model for, select the model from the dropdown list.
Note
If you don't specify a model for a GitLab Duo Chat sub-feature, it automatically uses the model configured for General Chat. This ensures all Chat functionality works without requiring individual model selection for each sub-feature.
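The fallback behavior described in the note can be sketched as follows. The data structure and function are illustrative only; GitLab stores these assignments internally.

```python
# Sketch: how the General Chat fallback behaves conceptually. A sub-feature
# without its own model assignment uses the model configured for General Chat.
# Illustrative only; not GitLab's actual resolution code.
from typing import Dict, Optional

def resolve_chat_model(assignments: Dict[str, str], sub_feature: str) -> Optional[str]:
    """Return the model for a Chat sub-feature, falling back to General Chat."""
    return assignments.get(sub_feature) or assignments.get("General Chat")

models = {"General Chat": "Mixtral-8x7B-it-v0.1 on GCP"}
print(resolve_chat_model(models, "Code Explanation"))
# Mixtral-8x7B-it-v0.1 on GCP (falls back to the General Chat model)
```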
Select a GitLab-managed model for a feature
Version history
- Introduced in GitLab 18.3 as a beta with a feature flag named `ai_self_hosted_vendored_features`. Disabled by default.
- Enabled by default in GitLab 18.7.
- Generally available in GitLab 18.9. Feature flag `ai_self_hosted_vendored_features` removed.
You can select a GitLab-managed model for a feature, even if you use a self-hosted AI Gateway and self-hosted models.
- In the upper-right corner, select Admin.
- In the left sidebar, select GitLab Duo.
- Select Configure models for GitLab Duo.
- Select the AI-native features tab.
- For the feature and sub-feature you want to configure, from the dropdown list, select GitLab-managed model.
Turn off GitLab Duo features
GitLab Duo features remain turned on even if you have not chosen a model for a feature.
To turn off a GitLab Duo feature:
- In the upper-right corner, select Admin.
- In the left sidebar, select GitLab Duo.
- Select Configure models for GitLab Duo.
- Select the AI-native features tab.
- For the feature you want to turn off, from the dropdown list, select Disabled.
Self-host the GitLab documentation
If your setup prevents you from accessing the GitLab documentation at
docs.gitlab.com, you can self-host the documentation.
For more information, see Host the GitLab product documentation.