# Turbofan POC: Predictive Maintenance of Turbofan Engines using Federated Learning
This repository contains a proof of concept (POC) for preventing machine outages by using federated learning to continuously improve predictions of the remaining lifetime of aircraft gas turbine engines.
![Engine Crash](https://media.giphy.com/media/4OGPHOwyp6MO4/giphy.gif)
For the engine emulation the "Turbofan Engine Degradation Simulation Data Set" from NASA [1] is used. :rocket: The implementation is based on
[PySyft](https://github.com/OpenMined/PySyft), an awesome library for encrypted, privacy-preserving machine learning.
## The Use Case
In this proof of concept the goal is to maintain aircraft gas turbine engines before they fail, as failures of these engines are very expensive for the operating company. To achieve this we predict the remaining useful life (RUL) of each engine and switch it into maintenance a few cycles before we expect a failure to happen.
This task is complicated by the fact that the use case takes the perspective of the engine manufacturer, who sells the engines and has no direct access to their operating data, as the operating companies consider this data confidential. The manufacturer still wants to offer the described failure early-warning system.
Some data from internal turbofan engines is available to the manufacturer and will be used to train an initial machine learning model. All engines on the market are extended with a software component that reads in the engine's sensor measurements, predicts the RUL using this model and reacts to a low RUL by initiating maintenance. During maintenance the theoretical moment of failure would normally be estimated by the maintenance staff; in this proof of concept the engine is set to maintenance mode and the emulation data continues to run so the moment of failure can be determined. Complete data series up to a failure are then used to regularly re-train the model and improve prediction quality over time.
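To make the per-cycle logic concrete, here is a minimal sketch of the decision an engine node makes on every cycle. The function names and the threshold value are illustrative placeholders, not the actual engine node API:
```
RUL_THRESHOLD = 10  # illustrative: enter maintenance this many cycles before predicted failure

def predict_rul(sensor_values):
    """Stub standing in for a prediction with the current model served in the grid."""
    return 42.0

def on_engine_cycle(sensor_values):
    """One emulated engine cycle: predict the RUL and decide whether to
    switch the engine into maintenance mode."""
    if predict_rul(sensor_values) <= RUL_THRESHOLD:
        return "maintenance"  # keep replaying emulation data to find the true failure moment
    return "running"
```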
## The Data
![Simplified diagram of turbofan engine [2]](images/engine_diagram.png)
<sup>*Simplified diagram of turbofan engine [2]*</sup>
The NASA dataset contains data on engine degradation that was simulated using C-MAPSS (Commercial Modular
Aero-Propulsion System Simulation). Four different sets were simulated under different combinations of operational
conditions and fault modes. Each set includes operational settings and sensor measurements (temperature, pressure, fan
speed, etc.) for several engines for every cycle of their lifetime. For more information on the data see [2].
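Each raw C-MAPSS file is whitespace-separated text with 26 columns: the engine unit number, the cycle counter, three operational settings and 21 sensor measurements. A minimal pandas sketch for loading one training set (assuming the file has been downloaded into the working directory):
```
import pandas as pd

# Column layout of the raw C-MAPSS files: unit id, cycle counter,
# three operational settings and 21 sensor channels.
columns = (
    ["unit", "cycle"]
    + [f"op_setting_{i}" for i in range(1, 4)]
    + [f"sensor_{i}" for i in range(1, 22)]
)

# The files contain no header row and are separated by whitespace.
train = pd.read_csv("train_FD001.txt", sep=r"\s+", header=None, names=columns)
print(train.head())
```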
## Prerequisites
To emulate turbofan engines that can continue to run after a failure / maintenance, we combine multiple engine data series from the dataset into one set for each of our engine nodes. These series are then replayed by the engine nodes in sequence.
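The idea behind this combination, sketched in pandas (an illustration of the approach, not the actual preprocessing code):
```
import pandas as pd

def combine_series(frames):
    """Chain several run-to-failure series into one continuous stream.

    Each frame holds one engine's data ordered by cycle; the combined
    frame gets a single monotonically increasing cycle counter so an
    engine node can replay it as one engine surviving multiple failures.
    """
    offset, parts = 0, []
    for frame in frames:
        part = frame.copy()
        part["cycle"] += offset
        offset = int(part["cycle"].max())
        parts.append(part)
    return pd.concat(parts, ignore_index=True)
```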
To prepare the data for our POC it needs to be downloaded and split. We will work with the set "FD001", containing 100 engines for training and 100 engines for validation/testing. The training data is split into one subset for initial training (5 engine series) and one subset for each of our 5 engine nodes (19 engine series each). The test data is split into one subset for cross-validation (50 engine series) and one subset for evaluation (50 engine series).
The `data_preprocessor` script handles all of this for us: `engine_percentage_initial` and `engine_percentage_val` control the share of engines reserved for initial training and for validation, and `worker_count` sets the number of engine node subsets. Ensure you have all requirements installed, then run:
```
python data_preprocessor.py --turbofan_dataset_id=FD001 --engine_percentage_initial=5 --engine_percentage_val=50 --worker_count=5
```
## Data Analysis
Now the project officially begins! :rocket: The first step is to analyse the initial data we have available centrally as the manufacturer, to learn more about the data itself. See the [data analysis notebook](notebooks/data_analysis.ipynb).
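For a quick first impression without opening the notebook, one of the most informative checks is the lifetime distribution, i.e. how many cycles each engine survives. Reusing the `train` frame from the loading sketch above:
```
# Each training series runs until failure, so an engine's lifetime
# is simply its maximum cycle count.
lifetimes = train.groupby("unit")["cycle"].max()
print(lifetimes.describe())  # count, mean, min/max of cycles until failure
```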
## Initial Training
The next step is to prepare the data for training and to design a model. Then an initial model is trained, evaluated
and saved into the model directory. See the [initial training notebook](notebooks/initial_training.ipynb).
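To give a rough idea of what such a model can look like, here is a minimal PyTorch sketch of a fully connected regressor mapping a window of sensor readings to a single RUL estimate. The architecture, window size and layer sizes are assumptions for illustration, not necessarily what the notebook uses:
```
import torch
from torch import nn

class RULRegressor(nn.Module):
    """Small fully connected network: a flattened window of cycles
    (3 operational settings + 21 sensors = 24 channels) goes in,
    a single RUL estimate in cycles comes out."""

    def __init__(self, n_features=24, window=30):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features * window, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, x):
        return self.net(x.flatten(start_dim=1))

model = RULRegressor()
dummy = torch.randn(8, 30, 24)  # batch of 8 windows: 30 cycles x 24 channels
print(model(dummy).shape)       # torch.Size([8, 1])
```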
## Start the Engines
Let's start the engines! :cyclone: There is a full setup prepared using Docker in the `docker-compose.yml`. It contains
- a container for **jupyter notebooks**
- a [PyGrid](https://github.com/OpenMined/PyGrid) **grid gateway**
- 5 **engines**
- a **federated trainer**
Each engine container consists of a custom engine node and a
[PyGrid](https://github.com/OpenMined/PyGrid) grid node. The **engine node** reads in the sensor data, controls the engine state and predicts the RUL using the current model in the grid. The **federated trainer** regularly checks the grid for enough new data and then starts a new federated learning round. After the round is finished, the new model is served to the grid to be used directly by the engine nodes. Start the whole setup with:
```
docker-compose up -d
```
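Conceptually, the federated trainer runs a scheduling loop like the following skeleton. The helper functions are stubs standing in for the real PyGrid interactions, and the constants mirror the trainer parameters explained below:
```
import time

NEW_DATA_THRESHOLD = 250  # minimum amount of new data before a training round
SCHEDULER_INTERVAL = 10   # seconds between checks for new data

def count_new_samples(grid):
    """Stub: ask the grid how much new run-to-failure data the engines collected."""
    return 0

def train_and_serve(grid):
    """Stub: run one federated learning round and serve the new model to the grid."""

def trainer_loop(grid):
    while True:
        if count_new_samples(grid) >= NEW_DATA_THRESHOLD:
            train_and_serve(grid)
        time.sleep(SCHEDULER_INTERVAL)
```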
The engine nodes expose an interface showing the engine's state, stats and sensor values: **localhost:800[1-5]**. You can also check out the interfaces of the grid nodes: **localhost:300[1-5]**.
![Engine Node](images/engine_node.jpg)
Also check out the logs of the federated trainer to see the federated training in action:
```
docker logs -f trainer
```
## Pimp the Engines
There are a lot of parameters in the `docker-compose.yml` and for serious results you need to adjust some of them. Here
are the most important ones explained:
- **CYCLE_LENGTH (engine)**: The number of seconds a single engine cycle takes. Decrease it to speed up the engine emulation.
- **NEW_DATA_THRESHOLD (trainer)**: The federated trainer waits for this amount of new data before starting a new training round. Increase it to prevent training rounds with too little data.
- **EPOCHS (trainer)**: The number of epochs the federated trainer uses for training.
## Bonus: Local Setup
If you want to run the POC locally without Docker, no problem: you can start all the nodes manually on your machine. Ensure you have all requirements installed, then begin by launching the grid gateway. Check out the
[PyGrid](https://github.com/OpenMined/PyGrid) repository, go into the gateway directory and start the gateway like this:
```
python gateway.py --start_local_db --port=5000
```
Next we need to start the grid nodes from the `./engine/grid_node` directory. You can also use the grid node from your PyGrid repository, but check the current parameters of the version you checked out. Execute the following command for every engine you want in your setup, changing the id and port each time:
```
python websocket_app.py --start_local_db --id=worker1 --port=3000 --gateway_url=http://localhost:5000
```
Now the whole grid setup is up and waiting for commands, so let's continue by launching the engines from the `./engine/engine_node` directory:
```
python turbofan_worker.py --id=engine1 --port=8000 --grid_node_address=localhost:3000 --grid_gateway_address=localhost:5000 --data_dir=../../data --dataset_id=1 --cycle_length=1
```
Launch as many engines as you have grid nodes, connecting each engine to its grid node via the `grid_node_address` parameter.
The engines start running right away; you can verify this on their web interface (http://localhost:8000). The only piece still missing is the federated trainer. You can start it like this:
```
python federated_trainer.py --grid_gateway_address=localhost:5000 --new_data_threshold=250 --scheduler_interval=10 --epochs=70 --data_dir=../data --model_dir=../models
```
The setup is now ready and running! :tada:
### Building docker images
If you want to build your own Docker images with code changes you can use the `build-docker-images.sh` script. It uses a base image from Docker Hub with the main dependencies to build an image for the engine, an image for the federated trainer and an image for the Jupyter notebook environment.
## Join the PySyft Community!
Join the rapidly growing [OpenMined](https://www.openmined.org/) community on [Slack](http://slack.openmined.org)!