Jun 4, 2023
Parameter selection
May 31, 2023
Python framework (auto-tuner) requirements:
- tunable parameters and the config files they belong to
- the command that runs one experiment (assuming the auto-tuner has already written all tunable
parameters into the configs)
- output directory path, if any
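These three requirements could be captured in a single experiment description that the auto-tuner consumes. The following is only a sketch: the names `TunableParam` and `ExperimentSpec`, all field names, and the example parameters and paths are hypothetical placeholders, not the project's actual settings.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional, Sequence, Union


@dataclass
class TunableParam:
    """One tunable parameter and the config file it belongs to (hypothetical schema)."""
    name: str
    config_file: Path
    # Candidate values to try; how they are chosen is left to the tuner.
    candidates: Sequence[Union[int, float, str]]


@dataclass
class ExperimentSpec:
    """Everything the auto-tuner needs to run one experiment."""
    params: List[TunableParam]
    # Command that runs one experiment, assuming the tuner has already
    # written all tunable parameters into their config files.
    run_command: List[str]
    # Output directory, if the experiment produces one.
    output_dir: Optional[Path] = None

    def config_files(self) -> List[Path]:
        """Distinct config files touched by the parameters, in first-seen order."""
        seen: List[Path] = []
        for p in self.params:
            if p.config_file not in seen:
                seen.append(p.config_file)
        return seen


# Placeholder values for illustration only.
spec = ExperimentSpec(
    params=[
        TunableParam("example.param.a", Path("conf/a.conf"), [1, 2, 4]),
        TunableParam("example.param.b", Path("conf/a.conf"), ["x", "y"]),
        TunableParam("example.param.c", Path("conf/b.conf"), [0.5, 1.0]),
    ],
    run_command=["bash", "run_experiment.sh"],
)
```

Grouping parameters by config file lets the tuner rewrite each file once per experiment instead of once per parameter.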
May 24, 2023
Commands and paths:
HIBENCH_REPORT_PATH=~/HiBench/report/hibench.report
run_hibench=~/HiBench/bin/run_all.sh
WORDCOUNT_CONF_PATH=~/HiBench/conf/workloads/micro/wordcount.conf
HIBENCH_CONF_PATH=~/HiBench/conf/hibench.conf
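With these paths, one tuning step amounts to writing a value into a config file, running the script, and reading the report. Below is a minimal sketch of the first step only. It assumes the config files hold one whitespace-separated `key value` pair per line with `#` comments; the helper `set_conf_value` is hypothetical and not part of HiBench.

```python
from pathlib import Path


def set_conf_value(conf_path, key, value):
    """Set `key` to `value` in a whitespace-separated key/value config file.

    Assumes one `key value` pair per line and `#` for comments. If the key
    is already present its line is replaced; otherwise a line is appended.
    """
    path = Path(conf_path).expanduser()
    lines = path.read_text().splitlines() if path.exists() else []
    new_line = f"{key} {value}"
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # keep blank lines and comments untouched
        if stripped.split(None, 1)[0] == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    path.write_text("\n".join(lines) + "\n")
```

For example, `set_conf_value("~/HiBench/conf/hibench.conf", "hibench.scale.profile", "tiny")` would update the scale profile, if that is a property being tuned, before the run script is invoked (for instance with `subprocess.run`).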
May 11, 2023
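There are currently two options:
1. Use a simulator, which allows YARN parameters to be modified. The parameter files to modify are
in .sh format. A framework has to be set up, and the Hadoop synthetic load generator is used to
produce simulated data.
a. Pros
i. Runs on my own computer, where the system is familiar and comfortable to work with
ii. Already have a basic understanding of Hadoop
b. Cons
i. Only simulates the YARN component
2. Set up Hadoop on Alibaba Cloud and Huawei Cloud
a. Pros
i. Can run real data, so the resulting experiments are more rigorous
b. Cons
i. No graphical interface; needs further familiarization
ii. Unfamiliar with using the framework in a non-graphical environment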
Preliminary work:
1. ML platform that has the functionalities of
a. generating configuration files and executing simulation jobs;
b. analyzing metrics from simulation outputs;
c. training and evaluating ML models;
d. sampling configurations uniformly or based on ML models.
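Item (d) can be illustrated with a small sketch. It draws configurations uniformly from a discrete search space and, as one possible reading of "based on ML models", draws a pool of uniform candidates and keeps those a trained model predicts to be best. The search space, the function names, and the `predict_cost` callable are all hypothetical stand-ins.

```python
import random


def sample_uniform(space, rng):
    """Draw one configuration, picking each parameter's value uniformly."""
    return {name: rng.choice(values) for name, values in space.items()}


def sample_model_guided(space, predict_cost, rng, pool_size=50, keep=5):
    """Draw `pool_size` uniform candidates and keep the `keep` with the
    lowest predicted cost. `predict_cost` stands in for a trained model."""
    pool = [sample_uniform(space, rng) for _ in range(pool_size)]
    return sorted(pool, key=predict_cost)[:keep]


# Hypothetical search space; parameter names and values are placeholders.
space = {
    "param.memory_mb": [1024, 2048, 4096],
    "param.vcores": [1, 2, 4],
}
rng = random.Random(0)  # fixed seed so runs are repeatable


def toy_cost(cfg):
    """Toy stand-in for a model: pretends more resources means lower cost."""
    return -(cfg["param.memory_mb"] / 1024 + cfg["param.vcores"])


uniform_cfg = sample_uniform(space, rng)
best_cfgs = sample_model_guided(space, toy_cost, rng)
```

Filtering a uniform pool through the model keeps the two sampling modes on the same code path, so the platform can fall back to pure uniform sampling before any model has been trained.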
Experiment design for baseline method:
1. Categorize jobs into large / mid / small sizes, mixed with JSON input files synthesized in Hadoop.
Parameters for each category are specified.