x264 Ported to a DSP: Program Resources - CSDN Library

85 files in total

h: 25

c: 25

obj: 23

Rating: 4 stars · Uploaded 2010-08-26 00:18:27 · 1.51MB RAR

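Porting x264 to a DSP: In-Depth Analysis and Practice

x264 is an efficient, high-quality H.264 video encoder library that is widely used in video processing and transmission. Its core combines motion-compensated prediction, an integer DCT-like transform, quantization, and entropy coding; a well-optimized implementation substantially reduces the storage and bandwidth that video needs while preserving quality. This article describes how to port x264 to a digital signal processor (DSP), specifically the Texas Instruments DM642.

The DM642 is a high-performance fixed-point DSP in the TMS320C64x family. It has no hardware floating-point unit, but its VLIW core and packed-data instructions suit the integer-heavy work of video encoding. To run x264 on the DM642, the x264 source must be adapted to the DSP's instruction set and memory model. This usually involves instruction-level optimization: using packed (SIMD-style) instructions, reducing memory accesses, and restructuring loops so that the compiler can software-pipeline them.

CCS 3.3 (Code Composer Studio) is TI's integrated development environment for writing, debugging, and optimizing applications for TI DSPs. In CCS, create a new project, import the x264 sources, and configure the compiler options for the DM642 target. Compilation may surface incompatible data types, memory-allocation problems, and similar issues, each of which has to be fixed with the DM642's characteristics in mind.

The x264 encoding library is the core of the whole system. It contains every key step of the encoding pipeline: intra prediction, motion estimation, quantization, entropy coding, and so on. During the port, each of these functions must be verified to run correctly on the DSP. Motion estimation, for example, is dominated by integer SAD/SATD computations, which map well onto the C64x packed-data instructions, whereas any floating-point arithmetic in the source falls back to slow software emulation on the DM642 and is worth converting to fixed point. Quantization may involve bit-width conversions, so the precision and overflow behavior of fixed-point arithmetic need attention.

Once the encoding library works, an interface is needed so that the upper-layer application can call x264. The interface should be simple to use while keeping real-time behavior and efficiency in mind. For example, it can expose one function that accepts a raw video frame and returns the encoded bitstream. Error handling also needs to be designed so that the encoder exits gracefully under abnormal conditions.

For performance optimization, the CCS profiling tools can locate the bottlenecks so that they can be improved one by one. Improvements may involve task scheduling, memory management (for example, keeping hot data in on-chip memory and moving frames with EDMA), and restructuring code to exploit the parallelism of the DM642's eight functional units. The DM642 has a single DSP core, so the gains come from instruction-level parallelism rather than from multiple cores.

In a real application, interaction with other hardware components, such as video input/output ports and external storage, may also need to be considered. These must be adapted to the specific hardware platform so that data flows smoothly.

Porting x264 to a DSP is complex work that spans compiler technology, program optimization, and system integration. With careful design and optimization, however, this encoder can be brought to embedded environments, where it can give a video processing system good performance at low power.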
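The precision and overflow concern raised for quantization can be illustrated with a small fixed-point sketch. This is not code from the archive: `sat16` and `quant_coef` are hypothetical helper names, and the multiply-add-shift form with a multiplier `mf`, a rounding `bias`, and a right `shift` is only the general shape commonly used for H.264 quantization, assumed here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate a 32-bit value to the int16_t range instead of letting it wrap. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical fixed-point quantizer (illustrative, not the archive's code):
 *   level = sign(coef) * ((|coef| * mf + bias) >> shift)
 * The product of a 16-bit magnitude and a 16-bit multiplier needs 32 bits,
 * so the intermediate is widened before the shift and the result is
 * saturated back to 16 bits. */
static int16_t quant_coef(int16_t coef, uint16_t mf, uint32_t bias, unsigned shift)
{
    uint32_t mag;
    uint32_t level;
    int32_t  result;

    assert(shift < 32);
    assert(bias <= ((uint32_t)1 << shift));   /* bias is at most one step */

    /* Take the magnitude in 32 bits so that -32768 does not overflow. */
    mag = (uint32_t)(coef < 0 ? -(int32_t)coef : (int32_t)coef);

    /* mag <= 2^15 and mf < 2^16, so mag * mf + bias fits in 32 bits. */
    level = (mag * (uint32_t)mf + bias) >> shift;

    /* Clamp before converting back to a signed type. */
    if (level > 32768u)
        level = 32768u;

    result = coef < 0 ? -(int32_t)level : (int32_t)level;
    return sat16(result);
}
```

For instance, `quant_coef(100, 13107, 10922, 15)` evaluates `(100 * 13107 + 10922) >> 15`, which is 40, while extreme inputs saturate at the 16-bit limits rather than wrapping.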

x264lib_origin.rar (85 files)

x264lib_origin

eval.c 6KB

pixel.c 19KB

bs.h 9KB

predict.h 3KB

dct.h 5KB

encoder_set.h 1KB

frame.h 5KB

cpu.h 2KB

cpu.c 5KB

getopt.h 6KB

encoder_macroblock.c 29KB

set.c 9KB

macroblock.h 11KB

me.c 32KB

ratecontrol.h 2KB

dct.c 18KB

rdo.c 19KB

x264lib_origin.paf2 12KB

config.h 182B

frame.c 33KB

common.c 29KB

macroblock.c 57KB

common.h 19KB

csp.c 14KB

mc.h 3KB

cavlc.c 26KB

quant.c 6KB

Debug

predict.obj 99KB

frame.obj 134KB

quant.obj 55KB

analyse.obj 1.02MB

encoder_set.obj 84KB

getopt.obj 28KB

macroblock.obj 176KB

csp.obj 65KB

eval.obj 27KB

encoder.obj 193KB

ratecontrol.obj 186KB

common.obj 112KB

set.obj 73KB

encoder_cabac.obj 122KB

dct.obj 93KB

x264lib_origin.lib 1.03MB

mdate.obj 6KB

encoder_macroblock.obj 108KB

cabac.obj 56KB

mc.obj 94KB

pixel.obj 117KB

cpu.obj 41KB

cavlc.obj 106KB

me.obj 251KB

mc.c 16KB

cabac.c 44KB

cabac.h 2KB

cc_build_Debug.log 28KB

x264.h 16KB

me.h 2KB

predict.c 28KB

analyse.h 1KB

x264lib_origin.sbl 9KB

vlc.h 30KB

encoder_macroblock.h 3KB

analyse.c 98KB

encoder_cabac.c 37KB

clip1.h 3KB

osdep.h 4KB

pixel.h 4KB

encoder.c 67KB

encoder_set.c 20KB

x264lib_origin.pjt 1KB

stdint.h 6KB

x264lib_origin.CS_

SYMBOL.CDX 218KB

SYMBOL.FPT 342KB

FILE.FPT 3KB

FILE.DBF 3KB

SYMBOL.DBF 174KB

FILE.CDX 3KB

set.h 5KB

ratecontrol.c 60KB

Debug.lkf 145B

mdate.c 2KB

csp.h 1KB

getopt.c 13KB

quant.h 2KB

slicetype.c 19KB

/*****************************************************************************
 * analyse.c: h264 encoder library
 *****************************************************************************
 * Copyright (C) 2003 x264 project
 * $Id: analyse.c,v 1.1 2004/06/03 19:27:08 fenrir Exp $
 *
 * Authors: Laurent Aimar <fenrir@via.ecp.fr>
 *          Loren Merritt <lorenm@u.washington.edu>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111, USA.
 *****************************************************************************/

#include <stdio.h>
#include <string.h>
#include <math.h>
#include <limits.h>
#ifndef _MSC_VER
//#include <unistd.h>
#endif

#include "common.h"
#include "encoder_macroblock.h"
#include "me.h"
#include "ratecontrol.h"
#include "analyse.h"
#include "rdo.c"

typedef struct
{
    /* 16x16 */
    int i_ref;
    int i_rd16x16;
    x264_me_t me16x16;

    /* 8x8 */
    int i_cost8x8;
    int mvc[32][5][2]; /* [ref][0] is 16x16 mv, [ref][1..4] are 8x8 mv from partition [0..3] */
    x264_me_t me8x8[4];

    /* Sub 4x4 */
    int i_cost4x4[4]; /* cost per 8x8 partition */
    x264_me_t me4x4[4][4];

    /* Sub 8x4 */
    int i_cost8x4[4]; /* cost per 8x8 partition */
    x264_me_t me8x4[4][2];

    /* Sub 4x8 */
    int i_cost4x8[4]; /* cost per 8x8 partition */
    x264_me_t me4x8[4][2];

    /* 16x8 */
    int i_cost16x8;
    x264_me_t me16x8[2];

    /* 8x16 */
    int i_cost8x16;
    x264_me_t me8x16[2];

} x264_mb_analysis_list_t;

typedef struct
{
    /* conduct the analysis using this lamda and QP */
    int i_lambda;
    int i_lambda2;
    int i_qp;
    int16_t *p_cost_mv;
    int b_mbrd;

    /* I: Intra part */
    /* Take some shortcuts in intra search if intra is deemed unlikely */
    int b_fast_intra;
    int b_try_pskip;

    /* Luma part */
    int i_satd_i16x16;
    int i_satd_i16x16_dir[7];
    int i_predict16x16;

    int i_satd_i8x8;
    int i_satd_i8x8_dir[12][4];
    int i_predict8x8[4];

    int i_satd_i4x4;
    int i_predict4x4[16];

    /* Chroma part */
    int i_satd_i8x8chroma;
    int i_satd_i8x8chroma_dir[4];
    int i_predict8x8chroma;

    /* II: Inter part P/B frame */
    x264_mb_analysis_list_t l0;
    x264_mb_analysis_list_t l1;

    int i_cost16x16bi; /* used the same ref and mv as l0 and l1 (at least for now) */
    int i_cost16x16direct;
    int i_cost8x8bi;
    int i_cost8x8direct[4];
    int i_cost16x8bi;
    int i_cost8x16bi;
    int i_rd16x16bi;
    int i_rd16x16direct;
    int i_rd16x8bi;
    int i_rd8x16bi;
    int i_rd8x8bi;

    int i_mb_partition16x8[2]; /* mb_partition_e */
    int i_mb_partition8x16[2];
    int i_mb_type16x8; /* mb_class_e */
    int i_mb_type8x16;

    int b_direct_available;

} x264_mb_analysis_t;

/* lambda = pow(2,qp/6-2) */
static const int i_qp0_cost_table[52] = {
     1, 1, 1, 1, 1, 1, 1, 1,  /*  0-7 */
     1, 1, 1, 1,              /*  8-11 */
     1, 1, 1, 1, 2, 2, 2, 2,  /* 12-19 */
     3, 3, 3, 4, 4, 4, 5, 6,  /* 20-27 */
     6, 7, 8, 9,10,11,13,14,  /* 28-35 */
    16,18,20,23,25,29,32,36,  /* 36-43 */
    40,45,51,57,64,72,81,91   /* 44-51 */
};

/* pow(lambda,2) * .9 */
static const int i_qp0_cost2_table[52] = {
       1,   1,   1,   1,   1,   1, /*  0-5  */
       1,   1,   1,   1,   1,   1, /*  6-11 */
       1,   1,   1,   2,   2,   3, /* 12-17 */
       4,   5,   6,   7,   9,  11, /* 18-23 */
      14,  18,  23,  29,  36,  46, /* 24-29 */
      58,  73,  91, 115, 145, 183, /* 30-35 */
     230, 290, 366, 461, 581, 731, /* 36-41 */
     922,1161,1463,1843,2322,2926, /* 42-47 */
    3686,4645,5852,7373
};

/* TODO: calculate CABAC costs */
static const int i_mb_b_cost_table[X264_MBTYPE_MAX] = {
    9, 9, 9, 9, 0, 0, 0, 1, 3, 7, 7, 7, 3, 7, 7, 7, 5, 9, 0
};
static const int i_mb_b16x8_cost_table[17] = {
    0, 0, 0, 0, 0, 0, 0, 0, 5, 7, 7, 7, 5, 7, 9, 9, 9
};
static const int i_sub_mb_b_cost_table[13] = {
    7, 5, 5, 3, 7, 5, 7, 3, 7, 7, 7, 5, 1
};
static const int i_sub_mb_p_cost_table[4] = {
    5, 3, 3, 1
};

static void x264_analyse_update_cache( x264_t *h, x264_mb_analysis_t *a );

/* initialize an array of lambda*nbits for all possible mvs */
static void x264_mb_analyse_load_costs( x264_t *h, x264_mb_analysis_t *a )
{
    static int16_t *p_cost_mv[52];

    if( !p_cost_mv[a->i_qp] )
    {
        /* could be faster, but isn't called many times */
        /* factor of 4 from qpel, 2 from sign, and 2 because mv can be opposite from mvp */
        int i;
        p_cost_mv[a->i_qp] = x264_malloc( (4*4*2048 + 1) * sizeof(int16_t) );
        p_cost_mv[a->i_qp] += 2*4*2048;
        for( i = 0; i <= 2*4*2048; i++ )
        {
            p_cost_mv[a->i_qp][-i] =
            p_cost_mv[a->i_qp][i]  = a->i_lambda * bs_size_se( i );
        }
    }

    a->p_cost_mv = p_cost_mv[a->i_qp];
}

static void x264_mb_analyse_init( x264_t *h, x264_mb_analysis_t *a, int i_qp )
{
    memset( a, 0, sizeof( x264_mb_analysis_t ) );

    /* conduct the analysis using this lamda and QP */
    a->i_qp = h->mb.i_qp = i_qp;
    h->mb.i_chroma_qp = i_chroma_qp_table[x264_clip3( i_qp + h->pps->i_chroma_qp_index_offset, 0, 51 )];
    a->i_lambda = i_qp0_cost_table[i_qp];
    a->i_lambda2 = i_qp0_cost2_table[i_qp];
    a->b_mbrd = h->param.analyse.i_subpel_refine >= 6 &&
                ( h->sh.i_type != SLICE_TYPE_B || h->param.analyse.b_bframe_rdo );

    h->mb.i_me_method = h->param.analyse.i_me_method;
    h->mb.i_subpel_refine = h->param.analyse.i_subpel_refine;
    h->mb.b_chroma_me = h->param.analyse.b_chroma_me && h->sh.i_type == SLICE_TYPE_P
                        && h->mb.i_subpel_refine >= 5;
    h->mb.b_trellis = h->param.analyse.i_trellis > 1 && a->b_mbrd;
    h->mb.b_transform_8x8 = 0;
    h->mb.b_noise_reduction = 0;

    /* I: Intra part */
    a->i_satd_i16x16 =
    a->i_satd_i8x8   =
    a->i_satd_i4x4   =
    a->i_satd_i8x8chroma = COST_MAX;

    a->b_fast_intra = 0;

    /* II: Inter part P/B frame */
    if( h->sh.i_type != SLICE_TYPE_I )
    {
        int i, j;
        int i_fmv_range = 4 * h->param.analyse.i_mv_range;
        // limit motion search to a slightly smaller range than the theoretical limit,
        // since the search may go a few iterations past its given range
        int i_fpel_border = 5; // umh unconditional radius
        int i_spel_border = 8; // 1.5 for subpel_satd, 1.5 for subpel_rd, 2 for bime, round up

        /* Calculate max allowed MV range */
#define CLIP_FMV(mv) x264_clip3( mv, -i_fmv_range, i_fmv_range )
        h->mb.mv_min[0] = 4*( -16*h->mb.i_mb_x - 24 );
        h->mb.mv_max[0] = 4*( 16*( h->sps->i_mb_width - h->mb.i_mb_x - 1 ) + 24 );
        h->mb.mv_min_spel[0] = CLIP_FMV( h->mb.mv_min[0] );
        h->mb.mv_max_spel[0] = CLIP_FMV( h->mb.mv_max[0] );
        h->mb.mv_min_fpel[0] = (h->mb.mv_min_spel[0]>>2) + i_fpel_border;
        h->mb.mv_max_fpel[0] = (h->mb.mv_max_spel[0]>>2) - i_fpel_border;
        if( h->mb.i_mb_x == 0)
        {
            int mb_y = h->mb.i_mb_y >> h->sh.b_mbaff;
            int mb_height = h->sps->i_mb_height >> h->sh.b_mbaff;
            int thread_mvy_range = i_fmv_range;

            if( h->param.i_threads > 1 )
            {
                int pix_y = (h->mb.i_mb_y | h->mb.b_interlaced) * 16;
                int thresh = pix_y + h->param.analyse.i_mv_range_thread;
                for( i = (h->sh.i_type == SLICE_TYPE_B);
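The x264_mb_analyse_load_costs routine in the preview builds a motion-vector cost table that can be indexed with negative offsets by advancing the allocated pointer to the middle of the buffer. The standalone sketch below shows that technique outside the encoder. The names `se_bits`, `mv_cost_table_new`, and `mv_cost_table_free` are hypothetical; `se_bits` is a stand-in that assumes bs_size_se() returns the bit length of a signed Exp-Golomb code, which should be checked against bs.h in the archive.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Half-width of the table, matching the 2*4*2048 used in the preview. */
#define MV_RANGE (2 * 4 * 2048)

/* Bit length of a signed Exp-Golomb code se(v).
 * Stand-in for bs_size_se(); its exact behavior in bs.h is an assumption. */
static int se_bits(int v)
{
    /* H.264 maps v > 0 to 2v-1 and v <= 0 to -2v, then codes it as ue(v). */
    unsigned code_num = v > 0 ? 2u * (unsigned)v - 1u : 2u * (unsigned)(-v);
    unsigned tmp = code_num + 1u;
    int len = 0;

    while (tmp >>= 1)        /* len = floor(log2(code_num + 1)) */
        len++;
    return 2 * len + 1;      /* prefix zeros + marker bit + suffix bits */
}

/* Allocate a cost table and return a pointer to its CENTER element, so that
 * table[-MV_RANGE] .. table[MV_RANGE] are all valid. Returns NULL if the
 * allocation fails. */
static int16_t *mv_cost_table_new(int lambda)
{
    int16_t *base;
    int16_t *center;
    int i;

    /* 91 is the largest lambda in i_qp0_cost_table; 91 * 31 fits in int16_t. */
    assert(lambda > 0 && lambda <= 91);

    base = malloc((2 * MV_RANGE + 1) * sizeof *base);
    if (!base)
        return NULL;

    center = base + MV_RANGE;
    for (i = 0; i <= MV_RANGE; i++)
        center[-i] = center[i] = (int16_t)(lambda * se_bits(i));
    return center;
}

/* Undo the pointer offset before handing the block back to free(). */
static void mv_cost_table_free(int16_t *center)
{
    if (center)
        free(center - MV_RANGE);
}
```

With this layout, a caller can look up the cost of a motion-vector difference `d` as `table[d]` without branching on its sign. Unlike the preview, which caches one table per QP for the lifetime of the process, this sketch pairs the allocation with an explicit free, which may matter on a DSP with a small heap.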

WheatField

2012-09-10

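Compilation speed is fine, and it built for me under CCS 3.3 as well. It is rather cumbersome to use, though, as a look at the x264.h header makes clear; it would be best if it were simplified further.
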
cecilio_gqx

2014-07-09

Why couldn't I get it to compile?

njtl0925

2013-05-20

It compiled smoothly in CCS, thanks. I just don't know which decoder people usually use for x264-encoded streams.

mayerf

2013-07-23

Works in CCS. Good material.

lianggucas

2014-09-02

I still don't know how to call it; a top-level calling program would make this better.