# GPU Load Balancer (GLB)
GLB is a tool for evenly distributing N jobs across M GPUs on a single machine.
Its core assumption is that resource consumption for a job can only be determined at runtime.
# Main features
- **Minimal API** with almost no changes required to existing code.
Decorate functions with `@glb.job`.
That's it!
- **Naturally composable** with other workflow and job orchestration tools.
# Installation
- Clone the repository and run `pip install -e .` from its root.
# Usage
Decorate functions that require load balancing:
```python
import glb

@glb.job()
def job():
    import torch

    x = torch.randn(1000, device='cuda:0')

if __name__ == '__main__':
    job()
```
If you have a long-running job that is not suitable for quick profiling, you may optionally define a lighter-weight version that will be profiled instead.
It is up to you to ensure that this lightweight version accurately reflects the main job.
```python
import glb

@glb.job()
def long_job(sleep: int):
    import torch
    import time

    x = torch.randn(1000, device='cuda:0')
    time.sleep(sleep)

@long_job.profile
def short_job(*args):
    import torch

    x = torch.randn(1000, device='cuda:0')

if __name__ == '__main__':
    # short_job will run first to gather the resource profile.
    # Then long_job will run automatically.
    # The arguments passed to long_job here are also passed to short_job.
    long_job(10)
```
## Combined server/client invocation
In your terminal, run:
```shell
$ glb-run "python job.py"
```
This command runs `glb-start` and then executes `python job.py`, in that order.
You can also pass multiple commands:
```shell
$ glb-run "python job.py" "python job.py"
```
## Separate server/client invocation
You can instead start the server yourself by running the following in a separate terminal:
```shell
$ glb-start
____ _ ____ ____ ____ _ _ __ __ _
/ ___| | | __ ) _ / ___| _ \| | | | | \/ | __ _ ___| |_ ___ _ __
| | _| | | _ \ (_) | | _| |_) | | | | _____ | |\/| |/ _` / __| __/ _ \ '__|
| |_| | |___| |_) | _ | |_| | __/| |_| | |_____| | | | | (_| \__ \ || __/ |
\____|_____|____/ (_) \____|_| \___/ |_| |_|\__,_|___/\__\___|_|
[11/12/21 20:16:35] INFO Managing GPUs {0, 1, 2, 3} gpu_monitors.py:38
```
and then, in the original terminal, run:
```shell
$ python job.py
```
Note that no `CUDA_VISIBLE_DEVICES` flag was used here.
If you want to restrict the set of GPUs that your jobs can run on, simply launch the server with `CUDA_VISIBLE_DEVICES` restricted:
```shell
$ CUDA_VISIBLE_DEVICES=0,1 glb-start
____ _ ____ ____ ____ _ _ __ __ _
/ ___| | | __ ) _ / ___| _ \| | | | | \/ | __ _ ___| |_ ___ _ __
| | _| | | _ \ (_) | | _| |_) | | | | _____ | |\/| |/ _` / __| __/ _ \ '__|
| |_| | |___| |_) | _ | |_| | __/| |_| | |_____| | | | | (_| \__ \ || __/ |
\____|_____|____/ (_) \____|_| \___/ |_| |_|\__,_|___/\__\___|_|
[11/12/21 20:16:35] INFO Managing GPUs {0, 1} gpu_monitors.py:38
```
The load balancer will do the rest of the work to make sure that `job.py` will only run on a GPU in the set `{0, 1}`.
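If you want to confirm which physical device a given run landed on, you can inspect the environment from inside the job. The sketch below assumes GLB passes its assignment to the job process through the standard `CUDA_VISIBLE_DEVICES` variable; treat that as an assumption about internals, not documented behavior.
```python
import glb

@glb.job()
def job():
    import os
    import torch

    # Assumption: GLB hands the job its GPU assignment via
    # CUDA_VISIBLE_DEVICES, so 'cuda:0' below maps to that physical device.
    # With the server launched as above, this should print 0 or 1.
    print('Assigned GPU(s):', os.environ.get('CUDA_VISIBLE_DEVICES'))
    x = torch.randn(1000, device='cuda:0')

if __name__ == '__main__':
    job()
```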
# Examples
Working examples are maintained in the `examples/` directory.
Consider starting with `examples/mnist`, which will give you a sense of how GLB composes naturally with other workflow and job orchestration tools.
# More details
The most important thing to know is that GLB assumes the GPU resources required by a job are not known a priori, whether due to dynamic computation graphs that cannot be analyzed before runtime or due to other complications.
This assumption leads to a couple of practical constraints and footguns.
- When you attempt to run a job that GLB has not encountered before, GLB will profile it in a blocking manner, meaning that any jobs of the same type that you attempt to run in the interim will wait until completion of the first.
However, this ***does not*** mean that an unknown job of a different type will also be blocked.
Rather, GLB can build profiles of multiple job types simultaneously, as the sketch below illustrates.
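Concretely (the job names and workloads here are illustrative, and a running `glb-start` server is assumed), two previously unseen job types launched at the same time can both be profiled at once:
```python
import multiprocessing as mp

import glb

@glb.job()
def render():
    import torch

    x = torch.randn(1000, device='cuda:0')

@glb.job()
def train():
    import torch

    x = torch.randn(2000, device='cuda:0')

if __name__ == '__main__':
    # render and train are distinct job types, so GLB can profile both
    # at the same time. A second concurrent render() would instead wait
    # for the first render profile to complete.
    procs = [mp.Process(target=render), mp.Process(target=train)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```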
## Potential Footguns
- GLB only caches profiles for **jobs that succeed**.
If you have a large number of jobs that all fail at some midway point, they will run one at a time, because GLB will never recognize any of them as a known job type.
- You can explicitly tell GLB to use one job to build a profile for another.
If you use this feature, it is **up to you** to ensure that this other job accurately reflects the resource usage of the first (see the sketch after this list).
- If the secondary job overestimates resource consumption, you will not be able to maximize throughput.
- If the secondary job underestimates resource consumption, jobs may fail unexpectedly due to CUDA out of memory errors.
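For instance, here is a sketch of what *not* to do; the workload sizes are hypothetical:
```python
import glb

@glb.job()
def train():
    import torch

    # Hypothetical workload with a large peak allocation (~4 GB).
    x = torch.randn(1_000_000_000, device='cuda:0')

@train.profile
def bad_profile(*args):
    import torch

    # Anti-pattern: this allocates far less than train(), so GLB will
    # underestimate train's footprint, potentially co-scheduling too
    # many jobs on one GPU and triggering CUDA out-of-memory errors.
    x = torch.randn(1000, device='cuda:0')
```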
## Job types
Currently, job types are a loosely defined concept for GLB.
It is assumed that jobs that share the same function signature are of the same type.
This means that a job type is sufficiently specified by a descriptor string of the form
```python
"tests/gpu_job_test.py::gpu_job"
```
This descriptor string is automatically built when you use the `@glb.job` decorator.
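For intuition, a descriptor of this shape could be derived from a function's source file and name; the sketch below is illustrative only and may differ from GLB's actual construction:
```python
import inspect
import os

def descriptor(fn) -> str:
    # Illustrative only: build a "path::name" string like the one above
    # from the function's source file and its name.
    path = os.path.relpath(inspect.getsourcefile(fn))
    return f"{path}::{fn.__name__}"
```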