# Network-Aware Scheduling
## Table of Contents
<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Use cases / Topologies](#use-cases--topologies)
- [1 - Spark/Database applications running in Data centers or small scale cluster topologies](#1---sparkdatabase-applications-running-in-data-centers-or-small-scale-cluster-topologies)
- [2 - Cloud2Edge application running on a multi-region geo-distributed cluster](#2---cloud2edge-application-running-on-a-multi-region-geo-distributed-cluster)
- [Proposal - Design & Implementation Details](#proposal---design--implementation-details)
- [Overview of the System Design](#overview-of-the-system-design)
- [Application Group CRD](#application-group-crd)
- [Network Topology CRD](#network-topology-crd)
- [The inclusion of bandwidth in the scheduling process](#the-inclusion-of-bandwidth-in-the-scheduling-process)
- [Bandwidth Requests via extended resources](#bandwidth-requests-via-extended-resources)
- [Bandwidth Limitations via the Bandwidth CNI plugin](#bandwidth-limitations-via-the-bandwidth-cni-plugin)
- [The Network-aware scheduling Plugins](#the-network-aware-scheduling-plugins)
- [Description of the `TopologicalSort` plugin](#description-of-the-topologicalsort-plugin)
- [Description of the `NetworkOverhead` plugin](#description-of-the-networkoverhead-plugin)
- [Known limitations](#known-limitations)
- [Test plans](#test-plans)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
- [Scalability](#scalability)
- [Troubleshooting](#troubleshooting)
- [Graduation criteria](#graduation-criteria)
- [Implementation history](#implementation-history)
<!-- /toc -->
# Summary
This proposal introduces an end-to-end solution that models a cluster's network latency and
topology as weights, and leverages that information to better schedule latency- and bandwidth-sensitive workloads.
# Motivation
Many applications are latency-sensitive, demanding lower latency between microservices in the application.
Scheduling policies that aim to reduce costs or increase resource efficiency are not enough for applications
where end-to-end latency becomes a primary objective.
Applications such as the Internet of Things (IoT), multi-tier web services, and video streaming services
would benefit the most from network-aware scheduling policies, which consider latency and bandwidth
in addition to the default resources (e.g., CPU and memory) used by the scheduler.
Users encounter latency issues frequently when using multi-tier applications.
These applications usually include tens to hundreds of microservices with complex interdependencies.
Distance from servers is usually the primary culprit.
The best strategy is to reduce the latency between chained microservices in the same application,
in line with prior work on [Service Function Chaining](https://www.sciencedirect.com/science/article/pii/S1084804516301989) (SFC).
In addition, bandwidth plays an essential role for applications with high volumes of data transfer
among microservices. For example, multiple replicas in a database application may require frequent
copies to ensure data consistency. [Spark jobs](https://spark.apache.org/) may transfer data frequently
between map and reduce nodes. Insufficient network capacity on nodes leads to increasing delays or packet
drops, which degrade the Quality of Service (QoS) of applications.
We propose two **Network-Aware Scheduling Plugins** for Kubernetes that focus on delivering low latency to end-users
and ensuring bandwidth reservations in pod scheduling.
This work significantly extends the previous work open-sourced [here](https://github.com/jpedro1992/sfc-controller)
that implements a latency-aware scheduler extender based on the [scheduler extender](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/scheduler_extender.md) design.
## Goals
- Define microservice dependencies in an Application via custom resources (**AppGroup CRD**).
- Describe the network topology for the underlying cluster via weights between regions (`topology.kubernetes.io/region`) and zones (`topology.kubernetes.io/zone`) via custom resources (**NetworkTopology CRD**).
- Make existing scheduler plugins aware of network bandwidth by advertising the nodes' (physical) bandwidth capacity as [extended resources](https://kubernetes.io/docs/tasks/administer-cluster/extended-resource-node/).
- Provide a **QueueSort** plugin [TopologicalSort](https://en.wikipedia.org/wiki/Topological_sorting), which orders pods to be scheduled in an **AppGroup** based on their dependencies.
- Provide **network-aware Filter & Score** plugins to filter out nodes based
on microservice dependencies defined in **AppGroup** and score nodes with lower network costs (described in **NetworkTopology**) higher to achieve latency-aware scheduling.
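For illustration, once a node's bandwidth capacity has been advertised as an extended resource, a pod can reserve a share of it alongside CPU and memory. This is a sketch only: the resource name `example.com/bandwidth` and the pod/image names are placeholders, not values mandated by this proposal.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: adservice
spec:
  containers:
    - name: adservice
      image: registry.example.com/adservice:latest  # placeholder image
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
          example.com/bandwidth: "250"  # placeholder extended-resource name
        limits:
          example.com/bandwidth: "250"  # extended resources require limits == requests
```

Note that Kubernetes requires extended resources to be requested in integer amounts and with `limits` equal to `requests`.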
## Non-Goals
- Descheduling due to unexpected outcomes is not addressed in this proposal.
- Conflicts between the plugins in this proposal and other plugins are not studied here.
Users are welcome to combine the plugins in this proposal with other plugins (e.g., `RequestedToCapacityRatio`,
`BalancedAllocation`). However, a higher weight must be given to our plugins to ensure that placements
with low network costs are preferred.
## Use cases / Topologies
### 1 - Spark/Database applications running in Data centers or small scale cluster topologies
Network-aware scheduling examines the infrastructure topology,
so network latency and bandwidth between nodes are considered while making scheduling decisions.
Data centers with fat-tree topology or cluster topology can benefit from our network-aware framework,
as network conditions (i.e., network latency, available bandwidth) between nodes can vary according to their
locations in the infrastructure.
<p align="center"><img src="figs/cluster.png" title="Cluster Topology" width="600" class="center"/></p>
<p align="center"><img src="figs/data_center.png" title="DC Topology" width="600" class="center"/></p>
Deploying microservices on different sets of nodes will impact the application's response time.
For specific applications, latency and bandwidth requirements can be critical.
For example, in a [Redis cluster](https://redis.io/topics/cluster-tutorial),
master nodes need to synchronize data with slave nodes frequently; that is, there are dependencies between
the masters and the slaves. High latency or low bandwidth between masters and slaves can lead to slow CRUD operations.
<p align="center"><img src="figs/redis.png" title="Redis app" width="600" class="center"/></p>
### 2 - Cloud2Edge application running on a multi-region geo-distributed cluster
Multi-region, geo-distributed scenarios benefit the most from our framework and network-aware plugins.
<p align="center"><img src="figs/multi_region.png" title="MultiRegion Topology" width="600" class="center"/></p>
High latency is a big concern in these topologies, especially for IoT applications
(e.g., [Eclipse Hono](https://github.com/eclipse/hono), [Eclipse Cloud2Edge](https://www.eclipse.org/packages/packages/cloud2edge/)).
For example, in the Cloud2Edge platform, there are several dependencies among the APIs and the MQTT brokers that devices connect to:
<p align="center"><img src="figs/cloud2edge.png" title="Cloud2Edge" width="600" class="center"/></p>
# Proposal - Design & Implementation Details
## Overview of the System Design
The proposal introduces two [Custom Resources (CRs)](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
defined as Custom Resource Definitions (CRDs):
- **AppGroup CRD**: abstracts the service topology to maintain application microservice dependencies.
- **NetworkTopology CRD**: abstracts the network infrastructure to establish network weights between regions and zones in the cluster.
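As a sketch, the two CRs might look as follows. The API group/version and field names below are placeholders chosen for illustration; the actual schemas are defined by the CRDs in this proposal.

```yaml
# Illustrative only: group/version and field names are placeholders.
apiVersion: scheduling.example.com/v1alpha1
kind: AppGroup
metadata:
  name: online-boutique
spec:
  numMembers: 3
  workloads:
    - name: p1            # frontend
      dependencies: [p2]  # p1 establishes connections to p2
    - name: p2            # recommendation service
      dependencies: [p3]
    - name: p3            # database
---
apiVersion: scheduling.example.com/v1alpha1
kind: NetworkTopology
metadata:
  name: cluster-topology
spec:
  weights:
    - name: UserDefined
      # Lower cost = better network conditions between the two origins.
      regionCosts:
        - origin: us-west-1
          destination: us-east-1
          cost: 20
      zoneCosts:
        - origin: z1
          destination: z2
          cost: 5
```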
Thus,