# Stegosaurus
## A steganography tool for embedding payloads within Python bytecode.
Stegosaurus is a [steganography tool](https://en.wikipedia.org/wiki/Steganography)
that allows embedding arbitrary payloads in
Python bytecode (pyc or pyo) files. The embedding process does not alter the
runtime behavior or file size of the carrier file and typically results in a low
encoding density. The payload is dispersed throughout the bytecode so tools like
```strings``` will not show the actual payload. Python's ```dis``` module will
return the same results for bytecode before and after Stegosaurus is used to embed
a payload. At this time, no prior work or detection methods are known for this type
of payload delivery.
Stegosaurus requires Python 3.6 or later.
#### Usage
$ python3 -m stegosaurus -h
usage: stegosaurus.py [-h] [-p PAYLOAD] [-r] [-s] [-v] [-x] carrier

positional arguments:
  carrier               Carrier py, pyc or pyo file

optional arguments:
  -h, --help            show this help message and exit
  -p PAYLOAD, --payload PAYLOAD
                        Embed payload in carrier file
  -r, --report          Report max available payload size carrier supports
  -s, --side-by-side    Do not overwrite carrier file, install side by side
                        instead.
  -v, --verbose         Increase verbosity once per use
  -x, --extract         Extract payload from carrier file
#### Example
Assume we wish to embed a payload in the bytecode of the following Python script, named example.py:
"""Example carrier file to embed our payload in.
"""

import math

def fibV1(n):
    if n == 0 or n == 1:
        return n
    return fibV1(n - 1) + fibV1(n - 2)

def fibV2(n):
    if n == 0 or n == 1:
        return n
    return int(((1 + math.sqrt(5))**n - (1 - math.sqrt(5))**n) / (2**n * math.sqrt(5)))

def main():
    result1 = fibV1(12)
    result2 = fibV2(12)
    print(result1)
    print(result2)

if __name__ == "__main__":
    main()
The first step is to use Stegosaurus to see how many bytes our payload can contain without
changing the size of the carrier file.
$ python3 -m stegosaurus example.py -r
Carrier can support a payload of 20 bytes
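As a rough illustration of where a figure like this could come from, the sketch below counts the instructions that take no argument in a compiled module, including the nested code objects of its functions. It assumes the two-byte instruction layout of Python 3.6 and later. `count_no_arg_instructions` is a hypothetical helper, not part of Stegosaurus, and its result depends on the interpreter version, so it need not match the number reported above.

```python
import opcode
import types


def count_no_arg_instructions(code):
    """Count instructions without an argument in a code object and its children.

    Hypothetical helper; assumes the 2-byte instruction layout of CPython 3.6+.
    """
    raw = code.co_code
    # Even offsets hold opcodes; odd offsets hold the argument byte.
    total = sum(1 for op in raw[0::2] if op < opcode.HAVE_ARGUMENT)
    # Functions and classes are stored as nested code objects in co_consts.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            total += count_no_arg_instructions(const)
    return total


source = "def add(a, b):\n    return a + b\n"
module_code = compile(source, "<sketch>", "exec")
print(count_no_arg_instructions(module_code))
```

Each counted instruction has one argument byte that the interpreter ignores, so the count serves as an estimate of how many payload bytes fit without growing the file.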
We can now safely embed a payload of up to 20 bytes. To help show the before and after, the
```-s``` option can be used to install the carrier file side by side with the untouched
bytecode:
$ python3 -m stegosaurus example.py -s --payload "root pwd: 5+3g05aW"
Payload embedded in carrier
Looking on disk, both the carrier file and original bytecode file have the same size:
$ ls -l __pycache__/example.cpython-36*
-rw-r--r-- 1 jherron staff 743 Mar 10 00:58 __pycache__/example.cpython-36-stegosaurus.pyc
-rw-r--r-- 1 jherron staff 743 Mar 10 00:58 __pycache__/example.cpython-36.pyc
_Note: If the ```-s``` option had been omitted, the original bytecode would have been overwritten._
The payload can be extracted by passing the ```-x``` option to Stegosaurus:
$ python3 -m stegosaurus __pycache__/example.cpython-36-stegosaurus.pyc -x
Extracted payload: root pwd: 5+3g05aW
The payload does not have to be an ASCII string; shellcode is also supported:
$ python3 -m stegosaurus example.py -s --payload "\xeb\x2a\x5e\x89\x76"
Payload embedded in carrier
$ python3 -m stegosaurus __pycache__/example.cpython-36-stegosaurus.pyc -x
Extracted payload: \xeb\x2a\x5e\x89\x76
To show that the runtime behavior of the Python code remains unchanged after Stegosaurus embeds the
payload:
$ python3 example.py
144
144
$ python3 __pycache__/example.cpython-36.pyc
144
144
$ python3 __pycache__/example.cpython-36-stegosaurus.pyc
144
144
Output of ```strings``` after Stegosaurus embeds the payload (notice the payload is
not shown):
$ python3 -m stegosaurus example.py -s --payload "PAYLOAD_IS_HERE"
Payload embedded in carrier
$ strings __pycache__/example.cpython-36-stegosaurus.pyc
.Example carrier file to embed our payload in.
fibV1)
example.pyr
math
sqrt)
fibV2
print)
result1
result2r
main
__main__)
__doc__r
__name__r
<module>
$ python3 -m stegosaurus __pycache__/example.cpython-36-stegosaurus.pyc -x
Extracted payload: PAYLOAD_IS_HERE
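The effect can be reproduced with a minimal sketch of a `strings`-like scan. It assumes the default `strings` behavior of reporting runs of at least four printable characters, and it uses a synthetic carrier made only of `BINARY_ADD` instructions (opcode 23 in CPython 3.6) rather than a real bytecode file. `printable_runs` is a hypothetical helper written for this illustration.

```python
import re

BINARY_ADD = 0x17  # opcode number in CPython 3.6; takes no argument


def printable_runs(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


payload = b"PAYLOAD_IS_HERE"

# Payload stored contiguously after a few instructions: easy to spot.
contiguous = bytes([BINARY_ADD, 0]) * 4 + payload

# Payload stored one byte at a time in the argument slot of each instruction.
dispersed = b"".join(bytes([BINARY_ADD, b]) for b in payload)

print(printable_runs(contiguous))  # [b'PAYLOAD_IS_HERE']
print(printable_runs(dispersed))   # []
```

Because every payload byte is separated from the next by at least one opcode byte, the scan never sees a printable run long enough to report.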
Sample output of Python's ```dis``` module, which shows no difference before and after
Stegosaurus embeds its payload:
Before:
20 LOAD_GLOBAL 0 (int)
22 LOAD_CONST 2 (1)
24 LOAD_GLOBAL 1 (math)
26 LOAD_ATTR 2 (sqrt)
28 LOAD_CONST 3 (5)
30 CALL_FUNCTION 1
32 BINARY_ADD
34 LOAD_FAST 0 (n)
36 BINARY_POWER
38 LOAD_CONST 2 (1)
40 LOAD_GLOBAL 1 (math)
42 LOAD_ATTR 2 (sqrt)
44 LOAD_CONST 3 (5)
46 CALL_FUNCTION 1
48 BINARY_SUBTRACT
50 LOAD_FAST 0 (n)
52 BINARY_POWER
54 BINARY_SUBTRACT
56 LOAD_CONST 4 (2)
After:
20 LOAD_GLOBAL 0 (int)
22 LOAD_CONST 2 (1)
24 LOAD_GLOBAL 1 (math)
26 LOAD_ATTR 2 (sqrt)
28 LOAD_CONST 3 (5)
30 CALL_FUNCTION 1
32 BINARY_ADD
34 LOAD_FAST 0 (n)
36 BINARY_POWER
38 LOAD_CONST 2 (1)
40 LOAD_GLOBAL 1 (math)
42 LOAD_ATTR 2 (sqrt)
44 LOAD_CONST 3 (5)
46 CALL_FUNCTION 1
48 BINARY_SUBTRACT
50 LOAD_FAST 0 (n)
52 BINARY_POWER
54 BINARY_SUBTRACT
56 LOAD_CONST 4 (2)
#### Using Stegosaurus
Payloads, delivery and receipt methods are entirely up to the user. Stegosaurus only
provides the means to embed and extract payloads from a given Python bytecode file.
Because the file size is left intact, only a relatively small number of bytes can be used to
deliver the payload. This may require spreading larger payloads across multiple bytecode
files, which has some advantages such as:
* Delivering a payload in pieces over time
* Portions of the payload can be spread over multiple locations and joined when needed
* A single portion being compromised does not divulge the whole payload
* Thwarting detection of the entire payload by spreading it across multiple seemingly unrelated files
The means to spread large payloads across multiple Python bytecode files is not supported
at this moment, see TODOs.
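Until that exists, the splitting can be done by hand before embedding. The following is a minimal sketch, assuming the capacity of each carrier has already been obtained with ```-r``` and that the receiver knows the order of the carriers. `split_payload` and `join_payload` are hypothetical helpers, not part of Stegosaurus.

```python
def split_payload(payload, capacities):
    """Split payload into one chunk per carrier, none larger than its capacity."""
    if len(payload) > sum(capacities):
        raise ValueError("payload does not fit in the given carriers")
    chunks, start = [], 0
    for capacity in capacities:
        chunks.append(payload[start:start + capacity])
        start += capacity
    return chunks


def join_payload(chunks):
    """Reassemble the chunks in carrier order."""
    return b"".join(chunks)


capacities = [20, 8, 12]  # example capacities, in bytes, of three carriers
payload = b"a payload longer than twenty bytes"
chunks = split_payload(payload, capacities)

print([len(chunk) for chunk in chunks])  # [20, 8, 6]
print(join_payload(chunks) == payload)   # True
```

Each chunk would then be embedded in its own carrier with ```-p``` and recovered with ```-x``` before joining.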
#### How Stegosaurus Works
In order to embed a payload without increasing the file size, dead zones need to be identified
within the bytecode. A dead zone is defined as any byte that, if changed, will not affect the
behavior of the Python script. Python 3.6 introduced dead zones that are easy to exploit. First,
though, a little history to set the stage.
Python's reference interpreter, CPython, has two types of opcodes: those with arguments and
those without. In Python <= 3.5, instructions in the bytecode occupied either 1 or 3 bytes,
depending on whether the opcode took an argument. In Python 3.6 this was changed so that
all instructions occupy two bytes. Those without arguments simply set the second byte to zero
and it is ignored during execution. This means that for each instruction in the bytecode that
does not take an argument, Stegosaurus can safely insert one byte of the payload.
Some examples of opcodes that do not take an argument:
BINARY_SUBTRACT
INPLACE_ADD
RETURN_VALUE
GET_ITER
YIELD_VALUE
IMPORT_STAR
END_FINALLY
NOP
...
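The idea can be sketched on raw instruction bytes. The example below is an illustration under stated assumptions, not Stegosaurus's actual implementation: it uses the CPython 3.6 opcode numbers and `HAVE_ARGUMENT` threshold (90), a hand-assembled instruction sequence, and a made-up convention in which unused dead zones are zeroed and trailing zero bytes are stripped on extraction. The names `find_dead_zones`, `embed` and `extract` are hypothetical.

```python
HAVE_ARGUMENT = 90  # CPython 3.6: opcodes below this value take no argument


def find_dead_zones(wordcode, have_argument=HAVE_ARGUMENT):
    """Return offsets of argument bytes that belong to no-argument opcodes."""
    return [i + 1 for i in range(0, len(wordcode), 2)
            if wordcode[i] < have_argument]


def embed(wordcode, payload):
    """Write payload bytes into the dead zones; the length never changes."""
    zones = find_dead_zones(wordcode)
    if len(payload) > len(zones):
        raise ValueError("payload larger than carrier capacity")
    out = bytearray(wordcode)
    # Assumption of this sketch: unused dead zones are zeroed.
    padded = payload.ljust(len(zones), b"\x00")
    for offset, byte in zip(zones, padded):
        out[offset] = byte
    return bytes(out)


def extract(wordcode):
    """Read the dead zones back and drop the zero padding."""
    zones = find_dead_zones(wordcode)
    return bytes(wordcode[i] for i in zones).rstrip(b"\x00")


# Hand-assembled CPython 3.6 instructions for: return n + 5 + n - 3
carrier = bytes([
    124, 0,  # LOAD_FAST 0
    100, 1,  # LOAD_CONST 1
    23, 0,   # BINARY_ADD
    124, 0,  # LOAD_FAST 0
    23, 0,   # BINARY_ADD
    100, 2,  # LOAD_CONST 2
    24, 0,   # BINARY_SUBTRACT
    83, 0,   # RETURN_VALUE
])
stego = embed(carrier, b"hi!")

print(find_dead_zones(carrier))     # [5, 9, 13, 15]
print(len(stego) == len(carrier))   # True
print(extract(stego))               # b'hi!'
```

Only the argument bytes of no-argument instructions differ between `carrier` and `stego`; every opcode and every real argument is untouched. A payload that itself ends in zero bytes would be truncated by this padding convention, so a real tool would need a different way to mark the payload's end.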
To see an example of the changes in the bytecode, consider the following Python snippet:
def test(n):
    return n + 5 + n - 3
Using ```dis```