Release Notes
AI 2.0.0
New and Optimized Features
NPU Support
The new NPU Operator adds NPU hardware management, simplifying the setup required to use NPUs and enhancing hardware acceleration.
Queueing & Admission Control
By introducing the Alauda Build of Kueue, queue management and admission control are implemented, optimizing task scheduling and resource allocation.
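Conceptually, quota-based admission control means jobs wait in a queue and are admitted only while cluster quota can cover their resource request. The following is a minimal, self-contained sketch of that idea in Python — it does not use Kueue's actual API, and the single "GPU" quota is an illustrative assumption:

```python
from collections import deque

class AdmissionQueue:
    """FIFO admission over a single GPU quota (illustrative only)."""

    def __init__(self, gpu_quota):
        self.gpu_quota = gpu_quota
        self.gpu_in_use = 0
        self.pending = deque()   # jobs waiting for quota
        self.admitted = []       # jobs currently running

    def submit(self, name, gpus):
        self.pending.append((name, gpus))
        self._admit()

    def finish(self, name):
        # Release the finished job's quota, then try to admit waiters.
        for job in self.admitted:
            if job[0] == name:
                self.admitted.remove(job)
                self.gpu_in_use -= job[1]
                break
        self._admit()

    def _admit(self):
        # Admit pending jobs in order while quota allows (FIFO, no preemption).
        while self.pending and self.gpu_in_use + self.pending[0][1] <= self.gpu_quota:
            name, gpus = self.pending.popleft()
            self.gpu_in_use += gpus
            self.admitted.append((name, gpus))

q = AdmissionQueue(gpu_quota=8)
q.submit("train-a", 6)   # admitted immediately
q.submit("train-b", 4)   # queued: only 2 GPUs free
q.finish("train-a")      # frees quota, so train-b is admitted
```

Real queueing systems layer priorities, preemption, and multi-resource quotas on top of this basic admit-or-wait loop.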
Model Registry
The integration of Kubeflow Model Registry enhances model management and version control, simplifying the process of model registration and storage.
Distributed Inference
With the addition of llm-d, distributed inference is now supported, improving the performance and resource efficiency of large-scale inference tasks.
Leader-Worker Set Support
Through the introduction of the Alauda Build of LeaderWorkerSet, distributed training task management is supported, simplifying task distribution and coordination in a Leader-Worker model.
AI Gateway
The new Alauda Build of Envoy AI Gateway optimizes traffic management and security, providing enhanced AI service proxy capabilities.
Trainer Orchestration
By supporting Kubeflow Trainer v2, task scheduling and management for model training have been enhanced, enabling more flexible and efficient training workflows.
Pipelines Orchestration
Integration with Kubeflow Pipelines allows for more efficient pipeline orchestration and task management, improving automation across workflows.
Vector Storage
The introduction of Milvus provides an efficient vector storage solution, supporting large-scale vector data storage and fast retrieval.
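At its core, vector retrieval is nearest-neighbor search over embeddings. The following is a minimal, self-contained sketch of top-k retrieval by cosine similarity — it is not Milvus's actual API, and the brute-force scan is what real vector databases replace with approximate indexes (e.g. HNSW, IVF) at scale:

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(store, query, k=2):
    # store: list of (id, embedding) pairs; rank all entries by similarity
    # to the query and return the ids of the k closest matches.
    ranked = sorted(store, key=lambda item: cosine(item[1], query), reverse=True)
    return [item[0] for item in ranked[:k]]

store = [
    ("doc-a", [1.0, 0.0, 0.0]),
    ("doc-b", [0.9, 0.1, 0.0]),
    ("doc-c", [0.0, 1.0, 0.0]),
]
top_k(store, [1.0, 0.05, 0.0])  # doc-a and doc-b rank highest
```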
NVIDIA DRA
By introducing the NVIDIA DRA Driver for GPUs, GPU resource management and scheduling have been optimized, increasing GPU utilization and performance.
Node Feature Discovery
The Alauda Build of Node Feature Discovery enables automatic discovery of hardware features and node labeling, improving flexibility in node resource management.
Llama Stack Runtime
The introduction of Llama Stack enhances AI agent capabilities by providing an efficient runtime environment for supporting distributed AI tasks.
Generative AI
By introducing KServe's generative AI capabilities, model deployment and inference have been optimized for generative AI workloads.
Low-Code Builder Integration
By updating the Dify version and providing deployable Charts, the low-code application building process has been simplified, further enhancing AI application development efficiency.
Deprecated Features
UI Fine-Tuning & Pretraining Deprecation
This feature is deprecated because it lacks general applicability, is difficult to scale horizontally, and is not a mainstream industry approach to model training and fine-tuning. Notebook-based model fine-tuning and training is recommended instead.
Secret Manage Deprecation
The Secret Manage feature is deprecated because it no longer has a valid use case: manual GitLab integration is no longer required.
Fixed Issues
- When updating the inference service resource YAML through the page, the volumeMount field was missing, which could cause the inference service to fail to start properly.
- In older versions, GraphQL queries (sent as POST by default) were incorrectly intercepted by the gateway layer and checked for create permission. Requests to the /api/graphql endpoint are now treated as reads by the RBAC interceptor, so users with read-only roles can access page content that relies on GraphQL data.
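The GraphQL fix above comes down to how an HTTP request is mapped to a permission verb. The following is a hypothetical sketch of that mapping (the function name and verb strings are illustrative, not the actual interceptor code): a naive method-to-verb table misclassifies POST-based GraphQL queries as writes, so the GraphQL path is special-cased as a read.

```python
def required_permission(method, path):
    # GraphQL queries arrive as POST but only read data, so the
    # /api/graphql path is treated as a read regardless of method.
    if path == "/api/graphql":
        return "get"
    # Naive default mapping from HTTP method to RBAC verb.
    return {"GET": "get", "POST": "create",
            "PUT": "update", "DELETE": "delete"}.get(method, "get")

required_permission("POST", "/api/graphql")  # "get" — read-only roles pass
required_permission("POST", "/api/models")   # "create" — still a write
```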
Known Issues
- After deleting a model, the list page does not reflect the deletion immediately; the deleted model briefly remains in the list. Temporary solution: manually refresh the page.
- Modifying library_name by directly editing the README file in GitLab does not synchronize the model type change to the page.
Temporary solution: modify library_name through the UI instead of editing it directly in GitLab.