Deploying an LLM
Certain Gretel Models can utilize an online LLM (Large Language Model) to improve functionality. This guide will walk you through the steps to deploy an LLM in your Gretel Hybrid environment.
Prerequisites
Ensure you have completed the general prerequisites for deploying Gretel Hybrid, found in the Deployment guide.
You'll need to have already installed Gretel Hybrid.
This guide will utilize `helm` to install a chart within your Kubernetes cluster.
Apply the helm chart

The Gretel Inference LLM chart is available in the Gretel Helm repository.
To add the repository to your local `helm` installation, run the following commands:
```
helm repo add gretel https://gretel-blueprints-pub.s3.us-west-2.amazonaws.com/helm-charts/stable/
helm repo update
```
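To confirm the repository was added correctly, you can search it for the chart used below; the exact chart version listed will vary:

```
# List charts available in the newly added gretel repository.
helm search repo gretel
```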
Create a `values.yml` file:
```yaml
gretelConfig:
  # This should match the secret ref that was created as part of Gretel Hybrid
  apiKeySecretRef: "gretel-api-key"

gretelLLMConfig:
  modelName: "mistral-7b"
  # Ensure the tolerations allow the LLM pods to run on GPU nodes.
  # For example, these tolerations will allow the pod to run if you
  # used our terraform modules to create your cluster.
  tolerations:
    - effect: NoSchedule
      key: gretel-worker
      operator: Equal
      value: gpu-model
    - effect: NoSchedule
      key: nvidia.com/gpu
      operator: Exists
```
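If you created your cluster some other way, you can inspect the taints on your GPU nodes and adjust the tolerations above to match. One quick way to list each node's taints:

```
# Print each node name followed by its taints (if any).
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'
```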
Ensure your `kubectl` context is set to the correct cluster where you're already running Gretel Hybrid. Then apply the chart to your Kubernetes cluster:
```
helm upgrade --namespace gretel-hybrid \
  --install gretel-inference-llm gretel/gretel-inference-llm \
  --values values.yml
```
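You can confirm the release installed by listing Helm releases in the namespace; a successful install shows `gretel-inference-llm` with a `deployed` status:

```
# Show Helm releases in the gretel-hybrid namespace.
helm --namespace gretel-hybrid list
```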
Give the pod a few minutes to spin up, then ensure that it is in a healthy state:
```
kubectl --namespace gretel-hybrid get pods -l app.kubernetes.io/name=gretel-inference-llm
```
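If you prefer to block until the pod is ready, or need to troubleshoot a pod that is not becoming healthy, these standard `kubectl` commands can help; the timeout value here is an arbitrary example:

```
# Wait up to 5 minutes for the pod to report Ready.
kubectl --namespace gretel-hybrid wait --for=condition=Ready pod \
  -l app.kubernetes.io/name=gretel-inference-llm --timeout=300s

# Inspect logs if the pod fails to start.
kubectl --namespace gretel-hybrid logs -l app.kubernetes.io/name=gretel-inference-llm
```

A common failure mode is the pod staying in `Pending` because no GPU node satisfies the tolerations; `kubectl describe pod` will show the scheduling events in that case.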
Usage
Transform v2 can utilize the Gretel Inference LLM service for classification. For an example of how to configure a hybrid Transform v2 job to use classification, see the Transform v2 guide.