Target Independent Opportunities:
//===---------------------------------------------------------------------===//
We should recognize various "overflow detection" idioms and translate them into
llvm.uadd.with.overflow and similar intrinsics. Here is a multiply idiom:
  unsigned int mul(unsigned int a,unsigned int b) {
    if ((unsigned long long)a*b>0xffffffff)
      exit(0);
    return a*b;
  }
The legalization code for mul-with-overflow needs to be made more robust before
this can be implemented though.
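
As a rough illustration of what the recognized form would compute, the sketch
below pairs the widening-multiply idiom with the GCC/Clang builtin
__builtin_mul_overflow. The function names are invented for this example, and
a 32-bit unsigned int is assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* The idiom from the note: widen to 64 bits, then compare with 2^32 - 1.
   Assumes a 32-bit unsigned int. */
bool mul_overflows_idiom(unsigned int a, unsigned int b) {
  return (unsigned long long)a * b > 0xffffffffULL;
}

/* The same check through the GCC/Clang builtin; *res gets the wrapped product. */
bool mul_overflows_builtin(unsigned int a, unsigned int b, unsigned int *res) {
  return __builtin_mul_overflow(a, b, res);
}

/* Both checks agree on either side of the 2^32 boundary. */
void check_mul_overflow(void) {
  unsigned int r;
  /* 65535 * 65537 == 2^32 - 1: the largest product that still fits. */
  assert(!mul_overflows_idiom(65535u, 65537u));
  assert(!mul_overflows_builtin(65535u, 65537u, &r) && r == 0xffffffffu);
  /* 65536 * 65536 == 2^32: overflows and wraps to 0. */
  assert(mul_overflows_idiom(65536u, 65536u));
  assert(mul_overflows_builtin(65536u, 65536u, &r) && r == 0u);
}
```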
//===---------------------------------------------------------------------===//
Get the C front-end to expand hypot(x,y) -> llvm.sqrt(x*x+y*y) when errno and
precision don't matter (-ffast-math). Misc/mandel will like this. :) This isn't
safe in general, even on darwin. See the libm implementation of hypot for
examples (which special case when x/y are exactly zero to get signed zeros etc
right).
//===---------------------------------------------------------------------===//
On targets with expensive 64-bit multiply, we could LSR this:
  for (i = ...; ++i) {
    x = 1ULL << i;
  }

into:

  long long tmp = 1;
  for (i = ...; ++i, tmp+=tmp)
    x = tmp;

This would be a win on ppc32, but not x86 or ppc64.
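
A minimal sketch of the two loop shapes, assuming the values are stored to an
array so the result is observable (the array and the function names are not in
the note):

```c
#include <assert.h>
#include <stdint.h>

/* Direct form: one 64-bit variable shift per iteration. */
void fill_shift(uint64_t *x, unsigned n) {
  for (unsigned i = 0; i < n; ++i)
    x[i] = 1ULL << i;
}

/* Strength-reduced form: the shift becomes a running 64-bit add. */
void fill_doubling(uint64_t *x, unsigned n) {
  uint64_t tmp = 1;
  for (unsigned i = 0; i < n; ++i, tmp += tmp)
    x[i] = tmp;
}

/* Both forms agree for every shift amount below 64. */
void check_fill(void) {
  uint64_t a[64], b[64];
  fill_shift(a, 64);
  fill_doubling(b, 64);
  for (unsigned i = 0; i < 64; ++i)
    assert(a[i] == b[i]);
}
```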
//===---------------------------------------------------------------------===//
Shrink: (setlt (loadi32 P), 0) -> (setlt (loadi8 Phi), 0)
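
The sign of a 32-bit value is decided by its most significant byte alone, which
is why the narrower load is enough. A small sketch, with invented function
names and a runtime byte-order probe standing in for the target's endianness:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sign test on the full 32-bit value. */
int is_negative_i32(const int32_t *p) { return *p < 0; }

/* Sign test that reads only the most significant byte of *p. */
int is_negative_hi8(const int32_t *p) {
  const uint16_t probe = 1;
  unsigned char first, b;
  memcpy(&first, &probe, 1);
  /* Little-endian hosts keep the high byte last, big-endian hosts first. */
  size_t hi = first ? sizeof(int32_t) - 1 : 0;
  memcpy(&b, (const unsigned char *)p + hi, 1);
  return (b & 0x80) != 0;
}

/* The byte-sized test matches the full-width test on boundary values. */
void check_shrink(void) {
  const int32_t samples[] = {INT32_MIN, -1, 0, 1, 0x00800000, INT32_MAX};
  for (size_t k = 0; k < sizeof samples / sizeof samples[0]; ++k)
    assert(is_negative_i32(&samples[k]) == is_negative_hi8(&samples[k]));
}
```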
//===---------------------------------------------------------------------===//
Reassociate should turn things like:
int factorial(int X) {
return X*X*X*X*X*X*X*X;
}
into llvm.powi calls, allowing the code generator to produce balanced
multiplication trees.
First, the intrinsic needs to be extended to support integers, and second the
code generator needs to be enhanced to lower these to multiplication trees.
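
For reference, this is the shape of the balanced tree for the eighth power,
written by hand. The function names are invented here, and unsigned arithmetic
is used so that wraparound stays well defined:

```c
#include <assert.h>
#include <stdint.h>

/* Left-to-right chain, as in the source expression: seven multiplies. */
uint32_t pow8_chain(uint32_t x) { return x * x * x * x * x * x * x * x; }

/* Balanced tree: three multiplies by repeated squaring. */
uint32_t pow8_tree(uint32_t x) {
  uint32_t x2 = x * x;
  uint32_t x4 = x2 * x2;
  return x4 * x4;
}

/* The two forms agree, including when the product wraps modulo 2^32. */
void check_pow8(void) {
  assert(pow8_tree(3u) == 6561u);
  assert(pow8_tree(10u) == 100000000u);
  assert(pow8_tree(0xdeadbeefu) == pow8_chain(0xdeadbeefu));
}
```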
//===---------------------------------------------------------------------===//
Interesting? testcase for add/shift/mul reassoc:
int bar(int x, int y) {
return x*x*x+y+x*x*x*x*x*y*y*y*y;
}
int foo(int z, int n) {
return bar(z, n) + bar(2*z, 2*n);
}
This is blocked on not handling X*X*X -> powi(X, 3) (see note above). The issue
is that we end up getting t = 2*X s = t*t and don't turn this into 4*X*X,
which is the same number of multiplies and is canonical, because the 2*X has
multiple uses. Here's a simple example:
define i32 @test15(i32 %X1) {
%B = mul i32 %X1, 47 ; X1*47
%C = mul i32 %B, %B
ret i32 %C
}
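
In C terms, the rewrite for @test15 would fold the two constant factors into
one. A sketch with invented names, using unsigned arithmetic so that both forms
wrap identically:

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors @test15: B = X*47, C = B*B (two multiplies, B has two uses). */
uint32_t square_of_product(uint32_t x) {
  uint32_t b = x * 47u;
  return b * b;
}

/* Canonical form: 47*47 == 2209 is folded, still two multiplies. */
uint32_t canonical_form(uint32_t x) { return x * x * 2209u; }

/* Multiplication modulo 2^32 is associative and commutative, so they agree. */
void check_test15(void) {
  assert(square_of_product(1u) == 2209u);
  assert(square_of_product(3u) == canonical_form(3u));
  assert(square_of_product(0x89abcdefu) == canonical_form(0x89abcdefu));
}
```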
//===---------------------------------------------------------------------===//
Reassociate should handle the example in GCC PR16157:
extern int a0, a1, a2, a3, a4; extern int b0, b1, b2, b3, b4;
void f () { /* this can be optimized to four additions... */
b4 = a4 + a3 + a2 + a1 + a0;
b3 = a3 + a2 + a1 + a0;
b2 = a2 + a1 + a0;
b1 = a1 + a0;
}
This requires reassociating to forms of expressions that are already available,
something that reassoc doesn't think about yet.
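
The target form reuses each partial sum. A sketch with the globals replaced by
arrays so it can be checked (that change, the function names, and the switch to
unsigned arithmetic are assumptions made for the example):

```c
#include <assert.h>

/* As written in the PR: each sum is rebuilt from scratch, ten additions. */
void sums_naive(const unsigned a[5], unsigned b[5]) {
  b[4] = a[4] + a[3] + a[2] + a[1] + a[0];
  b[3] = a[3] + a[2] + a[1] + a[0];
  b[2] = a[2] + a[1] + a[0];
  b[1] = a[1] + a[0];
}

/* Reassociated so every sum extends the previous one: four additions. */
void sums_reused(const unsigned a[5], unsigned b[5]) {
  b[1] = a[1] + a[0];
  b[2] = a[2] + b[1];
  b[3] = a[3] + b[2];
  b[4] = a[4] + b[3];
}

/* Both versions store the same values in b[1] through b[4]. */
void check_sums(void) {
  const unsigned a[5] = {1u, 20u, 300u, 4000u, 50000u};
  unsigned x[5] = {0}, y[5] = {0};
  sums_naive(a, x);
  sums_reused(a, y);
  for (int k = 1; k <= 4; ++k)
    assert(x[k] == y[k]);
  assert(y[4] == 54321u);
}
```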
//===---------------------------------------------------------------------===//
These two functions should generate the same code on big-endian systems:
int g(int *j,int *l) { return memcmp(j,l,4); }
int h(int *j, int *l) { return *j - *l; }
This could be done in SelectionDAGISel.cpp, along with other special cases,
for 1,2,4,8 bytes.
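
The underlying observation is that memcmp orders buffers byte by byte from the
lowest address, which is the same order as comparing the words read
big-endian. The sketch below is host-independent: it assembles the big-endian
words by hand and uses an unsigned three-way compare rather than the
subtraction in h, so only the sign of the result is matched. All names are
invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assemble four bytes into a 32-bit value, most significant byte first. */
uint32_t load_be32(const void *p) {
  const unsigned char *b = p;
  return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
         ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Three-way compare of two big-endian words: -1, 0 or 1. */
int cmp4_word(const void *j, const void *l) {
  uint32_t x = load_be32(j), y = load_be32(l);
  return (x > y) - (x < y);
}

/* memcmp only specifies the sign of its result, so reduce to -1, 0 or 1. */
int sign_of(int v) { return (v > 0) - (v < 0); }

/* The word compare and memcmp agree in sign on these samples. */
void check_cmp4(void) {
  const unsigned char a[4] = {0x00, 0x00, 0x01, 0xff};
  const unsigned char b[4] = {0x00, 0x00, 0x02, 0x00};
  const unsigned char c[4] = {0xff, 0x00, 0x00, 0x00};
  assert(cmp4_word(a, b) == sign_of(memcmp(a, b, 4)));
  assert(cmp4_word(b, a) == sign_of(memcmp(b, a, 4)));
  assert(cmp4_word(c, a) == sign_of(memcmp(c, a, 4)));
  assert(cmp4_word(a, a) == 0);
}
```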
//===---------------------------------------------------------------------===//
It would be nice to revert this patch:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20060213/031986.html
And teach the dag combiner enough to simplify the code expanded before
legalize. It seems plausible that this knowledge would let it simplify other
stuff too.
//===---------------------------------------------------------------------===//
For vector types, DataLayout.cpp::getTypeInfo() returns an alignment equal to
the type size. This works but can be overly conservative, since the alignment
of specific vector types is target dependent.
//===---------------------------------------------------------------------===//
We should produce an unaligned load from code like this:
v4sf example(float *P) {
return (v4sf){P[0], P[1], P[2], P[3] };
}
//===---------------------------------------------------------------------===//
Add support for conditional increments, and other related patterns. Instead
of:
movl 136(%esp), %eax
cmpl $0, %eax
je LBB16_2 #cond_next
LBB16_1: #cond_true
incl _foo
LBB16_2: #cond_next
emit:
movl _foo, %eax
cmpl $1, %edi
sbbl $-1, %eax
movl %eax, _foo
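
Read as C, the cmpl/sbbl pair sets the carry flag exactly when the condition
value is zero and then computes foo + 1 - carry. A sketch of both forms, with
invented function names and the global passed as a parameter:

```c
#include <assert.h>
#include <limits.h>

/* Branchy form: increment only when cond is nonzero. */
unsigned inc_branchy(unsigned foo, unsigned cond) {
  if (cond != 0)
    ++foo;
  return foo;
}

/* Branch-free form mirroring cmpl $1 / sbbl $-1:
   carry = (cond < 1), result = foo - (-1) - carry. */
unsigned inc_branchless(unsigned foo, unsigned cond) {
  unsigned carry = cond < 1u;
  return foo + 1u - carry;
}

/* Both forms agree, including at the wraparound boundary. */
void check_cond_inc(void) {
  assert(inc_branchless(5u, 0u) == 5u);
  assert(inc_branchless(5u, 7u) == 6u);
  assert(inc_branchless(UINT_MAX, 1u) == inc_branchy(UINT_MAX, 1u));
  assert(inc_branchless(UINT_MAX, 0u) == inc_branchy(UINT_MAX, 0u));
}
```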
//===---------------------------------------------------------------------===//
Combine: a = sin(x), b = cos(x) into a,b = sincos(x).
Expand these to calls of sin/cos and stores:
double sincos(double x, double *sin, double *cos);
float sincosf(float x, float *sin, float *cos);
long double sincosl(long double x, long double *sin, long double *cos);
Doing so could allow SROA of the destination pointers. See also:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17687
This is now easily doable with MRVs. We could even make an intrinsic for this
if anyone cared enough about sincos.
//===---------------------------------------------------------------------===//
quantum_sigma_x in 462.libquantum contains the following loop:
  for(i=0; i<reg->size; i++)
    {
      /* Flip the target bit of each basis state */
      reg->node[i].state ^= ((MAX_UNSIGNED) 1 << target);
    }
Where MAX_UNSIGNED/state is a 64-bit int. On a 32-bit platform it would be just
so cool to turn it into something like:
  long long Res = ((MAX_UNSIGNED) 1 << target);
  if (target < 32) {
    for(i=0; i<reg->size; i++)
      reg->node[i].state ^= Res & 0xFFFFFFFFULL;
  } else {
    for(i=0; i<reg->size; i++)
      reg->node[i].state ^= Res & 0xFFFFFFFF00000000ULL;
  }
... which would only do one 32-bit XOR per loop iteration instead of two.
It would also be nice to recognize that reg->size doesn't alias reg->node[i],
but this requires TBAA.
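
A self-contained sketch of the unswitched loop, checked against the simple
one. The node_t type is a stand-in for libquantum's node structure, and the
function names are invented:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t state; } node_t; /* stand-in for the real node type */

/* Original shape: a full 64-bit XOR in every iteration. */
void flip_simple(node_t *node, size_t size, unsigned target) {
  for (size_t i = 0; i < size; i++)
    node[i].state ^= (uint64_t)1 << target;
}

/* Unswitched on target < 32: in each loop one 32-bit half of the mask is
   known to be zero, so a 32-bit target needs only one XOR per iteration. */
void flip_unswitched(node_t *node, size_t size, unsigned target) {
  uint64_t res = (uint64_t)1 << target;
  if (target < 32) {
    for (size_t i = 0; i < size; i++)
      node[i].state ^= res & 0xFFFFFFFFULL;
  } else {
    for (size_t i = 0; i < size; i++)
      node[i].state ^= res & 0xFFFFFFFF00000000ULL;
  }
}

/* Both loops produce the same states for bits in either half. */
void check_flip(void) {
  const unsigned targets[] = {0u, 31u, 32u, 63u};
  for (size_t t = 0; t < 4; ++t) {
    node_t a[3] = {{0u}, {0x0123456789abcdefULL}, {UINT64_MAX}};
    node_t b[3] = {{0u}, {0x0123456789abcdefULL}, {UINT64_MAX}};
    flip_simple(a, 3, targets[t]);
    flip_unswitched(b, 3, targets[t]);
    for (size_t i = 0; i < 3; ++i)
      assert(a[i].state == b[i].state);
  }
}
```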
//===---------------------------------------------------------------------===//
This isn't recognized as bswap by instcombine (yes, it really is bswap):
unsigned long reverse(unsigned v) {
unsigned t;
t = v ^ ((v << 16) | (v >> 16));
t &= ~0xff0000;
v = (v << 24) | (v >> 8);
return v ^ (t >> 8);
}
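
To see that the function really is a byte swap, it can be compared with a
plain shift-and-mask reference. The sketch uses uint32_t throughout and
invented names:

```c
#include <assert.h>
#include <stdint.h>

/* The idiom from the note, on a fixed 32-bit type. */
uint32_t reverse_idiom(uint32_t v) {
  uint32_t t = v ^ ((v << 16) | (v >> 16));
  t &= ~(uint32_t)0xff0000;
  v = (v << 24) | (v >> 8);
  return v ^ (t >> 8);
}

/* Straightforward byte swap used as the reference. */
uint32_t bswap32_ref(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0xff00u) | ((v << 8) & 0xff0000u) | (v << 24);
}

/* The idiom matches the reference on a few distinct byte patterns. */
void check_reverse(void) {
  assert(reverse_idiom(0x11223344u) == 0x44332211u);
  assert(reverse_idiom(0xdeadbeefu) == bswap32_ref(0xdeadbeefu));
  assert(reverse_idiom(0x000000ffu) == bswap32_ref(0x000000ffu));
}
```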
//===---------------------------------------------------------------------===//
[LOOP DELETION]
We don't delete this output-free loop, because trip count analysis doesn't
realize that it is finite (if it were infinite, it would be undefined). Not
having this blocks Loop Idiom from matching strlen and friends.
  void foo(char *C) {
    int x = 0;
    while (*C)
      ++x,++C;
  }
//===---------------------------------------------------------------------===//
[LOOP RECOGNITION]
These idioms should be recognized as popcount (see PR1488):
  unsigned countbits_slow(unsigned v) {
    unsigned c;
    for (c = 0; v; v >>= 1)
      c += v & 1;
    return c;
  }

  unsigned int popcount(unsigned int input) {
    unsigned int count = 0;
    for (unsigned int i = 0; i < 4 * 8; i++)
      count += (input >> i) & 1;
    return count;
  }

This should be recognized as CLZ: https://github.com/llvm/llvm-project/issues/64167
  unsigned clz_a(unsigned a) {
    int i;
    for (i=0;i<32;i++)
      if (a & (1u<<(31-i)))
        return i;
    return 32;
  }
This sort of thing should be added to the loop idiom pass.
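
What the loops compute can be checked against the GCC/Clang builtins
__builtin_popcount and __builtin_clz. The sketch assumes a 32-bit unsigned and
uses invented function names. Note that the clz loop returns 32 for a zero
input, where __builtin_clz is undefined, so a recognizer would have to keep
that case.

```c
#include <assert.h>

/* Bit-at-a-time population count. */
unsigned countbits_loop(unsigned v) {
  unsigned c;
  for (c = 0; v; v >>= 1)
    c += v & 1;
  return c;
}

/* Scan from the top bit down; returns 32 when no bit is set. */
unsigned clz_loop(unsigned a) {
  for (unsigned i = 0; i < 32; i++)
    if (a & (1u << (31 - i)))
      return i;
  return 32;
}

/* The loops match the builtins on nonzero inputs; zero is handled apart. */
void check_bit_loops(void) {
  const unsigned samples[] = {1u, 2u, 0x80000000u, 0x00f0f000u, 0xffffffffu};
  for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; ++k) {
    unsigned x = samples[k];
    assert(countbits_loop(x) == (unsigned)__builtin_popcount(x));
    assert(clz_loop(x) == (unsigned)__builtin_clz(x));
  }
  assert(countbits_loop(0u) == 0u);
  assert(clz_loop(0u) == 32u);
}
```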
//===---------------------------------------------------------------------===//
These should turn into single 16-bit (unaligned?) loads on little/big endian
processors.
unsigned short read_16_le(const unsigned char *adr) {
return adr[0] | (adr[1] << 8);
}
unsigned short read_16_be(const unsigned char *adr) {
return (adr[0] << 8) | adr[1];
}
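
The single load these should become can be written with memcpy, which stays
well defined for unaligned addresses. In the sketch below, read_16_native and
host_is_little_endian are invented helpers; the check reads from an odd offset
and compares against whichever of the two functions matches the host byte
order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

uint16_t read_16_le(const unsigned char *adr) {
  return (uint16_t)(adr[0] | (adr[1] << 8));
}

uint16_t read_16_be(const unsigned char *adr) {
  return (uint16_t)((adr[0] << 8) | adr[1]);
}

/* One 16-bit load in host byte order; alignment is not required. */
uint16_t read_16_native(const unsigned char *adr) {
  uint16_t v;
  memcpy(&v, adr, sizeof v);
  return v;
}

/* Runtime probe: is the low-order byte stored first? */
int host_is_little_endian(void) {
  const uint16_t probe = 1;
  unsigned char first;
  memcpy(&first, &probe, 1);
  return first == 1;
}

/* The native load equals the byte-assembled read for the host's byte order. */
void check_read_16(void) {
  const unsigned char buf[3] = {0x00, 0x12, 0x34};
  const unsigned char *p = buf + 1; /* odd offset within the buffer */
  assert(read_16_le(p) == 0x3412);
  assert(read_16_be(p) == 0x1234);
  assert(read_16_native(p) ==
         (host_is_little_endian() ? read_16_le(p) : read_16_be(p)));
}
```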
//===---------------------------------------------------------------------===//
-instcombine should handle this transform:
icmp pred (sdiv X / C1 ), C2
when X, C1, and C2 are unsigned. Similarly for udiv and signed operands.
Currently InstCombine avoids this transform but will do it when the signs of
the operands and