AKConv: Convolutional Kernel with Arbitrary Sampled Shapes and
Arbitrary Number of Parameters
Abstract. Neural networks based on convolutional operations have achieved remarkable results in the field of deep learning, but standard convolutional operations suffer from two inherent flaws. On the one hand, the convolution operation is confined to a local window, cannot capture information from other locations, and its sampled shape is fixed. On the other hand, the size of the convolutional kernel is fixed to k × k, a rigid square, and the number of parameters grows quadratically with the kernel size. It is obvious that the shape and size of targets vary across datasets and locations. Convolutional kernels with fixed sampled shapes and square forms do not adapt well to changing targets. In response to these problems, this work explores Alterable Kernel Convolution (AKConv), which gives the convolution kernel an arbitrary number of parameters and arbitrary sampled shapes, providing richer options for the trade-off between network overhead and performance. In AKConv, we define initial positions for convolutional kernels of arbitrary size by means of a new coordinate generation algorithm. To adapt to target variations, we introduce offsets to adjust the sampled shape at each position. Moreover, we explore the effect on the network of using AKConv with the same size but different initial sampled shapes. AKConv completes efficient feature extraction through irregular convolutional operations and opens up more exploration options for convolutional sampled shapes. Object detection experiments on the representative datasets COCO2017, VOC 7+12, and VisDrone-DET2021 fully demonstrate the advantages of AKConv. AKConv can be used as a plug-and-play operation to replace standard convolutional operations and improve network performance. The code for the relevant tasks can be found at https://github.com/CV-ZhangXin/AKConv.
1 Introduction
Convolutional Neural Networks (CNNs), such as ResNet [1], DenseNet [2], and YOLO [3], have demonstrated excellent performance in various applications and have led technological progress in many aspects of modern society. They have become indispensable, from image recognition in self-driving cars [4] and medical image analysis [5] to intelligent surveillance [6] and personalized recommendation systems [7]. These successful network models rely heavily on convolutional operations, which efficiently extract local features in images while keeping model complexity manageable.
Despite the fact that CNNs have achieved many successes in classification [8], object
detection [9], semantic segmentation [10], etc., they still have some limitations. One of the most
notable limitations concerns the choice of convolutional sampled shape and size. Standard convolution operations rely on square kernels with fixed sampling locations, such as 1 × 1, 3 × 3, 5 × 5, and 7 × 7. These sampling positions are not deformable and cannot change dynamically in response to changes in object shape. Deformable Conv [11, 12] enhances network performance by using offsets to flexibly adjust the sampled shape of the convolution kernel, adapting it to changes in the target. For instance, the works in [13, 14, 15] utilized it to align features. Zhao et al. [16] improved the effectiveness of dead-fish detection by adding it to YOLOv4 [17]. Yang et al. [18] improved YOLOv8 [19] for cattle detection by adding it to the backbone. Li et al. [20] introduced Deformable Conv into deep image compression tasks [21, 22] to obtain content-adaptive receptive fields.
Although the studies mentioned above have demonstrated the superior benefits of Deformable Conv, it is still not flexible enough: the convolution kernel remains limited to a choice among regular kernel sizes, and the number of kernel parameters in both standard convolution and Deformable Conv grows quadratically with kernel size, which is not a hardware-friendly growth pattern. Therefore, after careful analysis of standard convolution and Deformable Conv, we propose Alterable Kernel Convolution (AKConv). Unlike standard regular convolution, AKConv is a novel convolutional operation that can extract features using efficient convolution kernels with any number of parameters (1, 2, 3, 4, 5, 6, 7, ...), which neither standard convolution nor Deformable Conv implements. AKConv can easily replace the standard convolutional operations in a network to improve performance. Importantly, AKConv allows the number of convolutional parameters to scale linearly up or down, which is friendly to hardware environments: it can serve lightweight models as a way to reduce parameter count and computational overhead, and it offers more options for improving performance with large kernels when resources are sufficient. Fig. 1 shows that the parameter count of a regular convolutional kernel grows quadratically with kernel size, while AKConv grows only linearly; for example, with C_in input and C_out output channels, a regular k × k kernel uses C_in · C_out · k² weights, whereas an AKConv kernel with N sampled points uses C_in · C_out · N. Compared to quadratic growth, AKConv grows gently and provides more options for the choice of convolution kernel. Furthermore, its idea can be extended to specific domains: special sampled shapes can be created for convolution operations according to prior knowledge and then adapted dynamically and automatically to changes in target shape via offsets. Object detection experiments on the representative datasets VOC [23], COCO2017 [24], and VisDrone-DET2021 [25] fully demonstrate the advantages of AKConv. In summary, our contributions are as follows:
1. For convolutional kernels of different sizes, we propose an algorithm that generates initial sampled coordinates for kernels of arbitrary size.
2. To adapt to variations of the target, we adjust the sampling positions of the irregular convolutional kernel using the learned offsets.
3. Compared to regular convolution kernels, the proposed AKConv realizes feature extraction with irregular convolution kernels, providing kernels with arbitrary sampled shapes and sizes for a variety of changing targets and making up for the shortcomings of regular convolution.
2 Related works
In recent years, many works have considered and analyzed standard convolutional
operations from different perspectives, and then designed novel convolutional operations to
improve network performance.
Li et al. [26] argued that sharing convolutional kernel parameters across all spatial locations limits the modeling capability at different spatial positions and fails to capture spatially long-range relationships; moreover, using a different convolution kernel for each output channel is not actually efficient. To address these shortcomings, they proposed the Involution operator, which inverts the design characteristics of the convolution operation to improve network performance. Qi et al. [27] proposed DSConv based on Deformable Conv: the offsets learned in Deformable Conv are unconstrained, which can cause the model to lose a small fraction of fine structural features and poses a great challenge for segmenting elongated tubular structures, and DSConv is designed to address this. Zhang et al. [28] understood the spatial attention mechanism from a new perspective, asserting that it essentially solves the parameter-sharing problem of convolutional operations. However, some spatial attention mechanisms, such as CBAM [29] and CA [30], do not completely solve the problem of parameter sharing for large-size convolutions; therefore, they proposed RFAConv. Chen et al. [31] proposed Dynamic Conv. Unlike using a single convolutional kernel per layer, Dynamic Conv dynamically aggregates multiple parallel convolution kernels weighted by attention, providing greater representational power. Tan et al. [32] argued that kernel size is often neglected in CNNs, which may affect the accuracy and efficiency of the network, and that using depthwise convolution alone does not exploit the full potential of convolutional networks. Therefore, they proposed MixConv, which naturally mixes multiple kernel sizes in a single convolution to improve network performance.
Although these methods improve the performance of convolutional operations, they are still limited to regular convolutional operations and do not allow multiple variations of the convolutional sampled shape. In contrast, our proposed AKConv can efficiently extract features using a convolutional kernel with an arbitrary number of parameters and arbitrary sampled shapes.
3 Methods
3.1 Define the initial sampling position
Convolutional neural networks are built on the convolution operation, which localizes features at the corresponding positions by means of a regular sampling grid. In [11, 33, 34], the regular sampling grid for the 3 × 3 convolution operation is given. Let R denote the sampling grid; then R is written as

R = {(−1, −1), (−1, 0), (−1, 1), (0, −1), (0, 0), (0, 1), (1, −1), (1, 0), (1, 1)}.
However, this sampling grid is regular, while AKConv targets irregularly shaped convolutional kernels. Therefore, to give irregular convolutional kernels a sampling grid, we design an algorithm for convolutions of arbitrary size that generates the initial sampled coordinates Pn of the convolutional kernel. First, we generate a regular sampling grid, then create an irregular grid for the remaining sampled points, and finally stitch the two to form the overall sampling grid. The pseudo code is given in Algorithm 1.
Fig. 2 shows the initial sampled coordinates generated for convolutions of arbitrary size. The sampling grid of a regular convolution is centered at the (0, 0) point, but an irregular convolution has no center for many sizes; therefore, to adapt to the kernel size in use, the algorithm sets the top-left (0, 0) point as the sampling origin.
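Algorithm 1 itself is not reproduced in this copy, so below is a minimal Python sketch of the coordinate generation described above, assuming the top-left (0, 0) origin; the function name generate_initial_coords and the square-root split between the regular block and the stitched-on remainder row are our assumptions, not the authors' exact pseudo code.

import math
import torch

def generate_initial_coords(num_param: int) -> torch.Tensor:
    # Generate initial sampled coordinates Pn for a kernel with num_param
    # sampled points; the origin is the top-left (0, 0) point.
    # Returns a (num_param, 2) tensor of (row, col) positions.
    base = round(math.sqrt(num_param))          # width of the regular block
    rows, rem = divmod(num_param, base)         # full rows, leftover points
    coords = [(r, c) for r in range(rows) for c in range(base)]
    coords += [(rows, c) for c in range(rem)]   # irregular remainder, stitched on
    return torch.tensor(coords, dtype=torch.float32)

# A 5-parameter kernel: a 2 × 2 regular block plus one extra point.
print(generate_initial_coords(5))
# points: (0, 0), (0, 1), (1, 0), (1, 1), (2, 0)

For sizes such as 5, 7, or 13 this yields a non-square sampling set, which is exactly the case a regular square grid cannot express.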
After defining the initial coordinates Pn for the irregular convolution, the corresponding convolution operation at position P0 can be defined as

Conv(P0) = Σ_{Pn ∈ R} w(Pn) · x(P0 + Pn),

where w denotes the convolutional parameters and x denotes the input feature map. However, such irregular convolution operations cannot be realized directly, because irregular sampled coordinates cannot be matched to convolution operations of the corresponding size, e.g., convolutions of sizes 5, 7, and 13. Cleverly, our proposed AKConv realizes this.
3.2 Alterable convolutional operation
It is obvious that the standard convolutional sampling positions are fixed, so the convolution can only extract local information from the current window and cannot capture information from other positions. Deformable Conv learns offsets through convolutional operations to adjust the sampling grid of the initial regular pattern, which compensates for the shortcomings of the convolution operation to a certain extent. However, standard convolution and Deformable Conv both use regular sampling grids that do not allow kernels with an arbitrary number of parameters. Moreover, as the kernel size increases, their number of parameters tends to grow quadratically, which is not friendly to the hardware environment. Therefore, we propose the novel Alterable Kernel Convolution (AKConv). Fig. 3 illustrates the overall structure of an AKConv of size 5.
Similar to Deformable Conv, in AKConv the offsets for the corresponding kernel are first obtained by a convolutional operation; they have dimensions (B, 2N, H, W), where N is the size (number of sampled points) of the convolution kernel. Taking Fig. 3 as an example, N = 5. The modified coordinates are then obtained by adding the learned offsets to the initial sampled coordinates.
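The remainder of this section is truncated in this copy. As a rough illustration of the step just described, the following PyTorch sketch predicts offsets with a convolution, shifts the initial sampled coordinates, resamples the features, and aggregates the N samples per position; the class name AKConvSketch, the nearest-neighbor resampling (the paper interpolates fractional coordinates), and the (N, 1) strided aggregation are our assumptions, not the authors' exact implementation.

import math
import torch
import torch.nn as nn

class AKConvSketch(nn.Module):
    # Minimal sketch of an AKConv-style layer; not the authors' exact code.
    def __init__(self, in_ch, out_ch, num_param):
        super().__init__()
        self.num_param = num_param
        # Offset branch: predicts 2N offsets per position, shape (B, 2N, H, W).
        self.p_conv = nn.Conv2d(in_ch, 2 * num_param, kernel_size=3, padding=1)
        # The N samples per position are stacked along height afterwards,
        # so an (N, 1) kernel with stride (N, 1) aggregates them per position.
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=(num_param, 1),
                              stride=(num_param, 1))
        # Initial sampled coordinates Pn (same generator as the Sec. 3.1 sketch).
        base = round(math.sqrt(num_param))
        rows, rem = divmod(num_param, base)
        pts = [(r, c) for r in range(rows) for c in range(base)]
        pts += [(rows, c) for c in range(rem)]
        self.register_buffer("p_n", torch.tensor(pts, dtype=torch.float32))

    def forward(self, x):
        B, C, H, W = x.shape
        N = self.num_param
        offset = self.p_conv(x).view(B, N, 2, H, W)
        gy, gx = torch.meshgrid(torch.arange(H, device=x.device),
                                torch.arange(W, device=x.device), indexing="ij")
        base = torch.stack((gy, gx), dim=0).float()             # (2, H, W)
        # Sampling position = base grid + initial coordinate + learned offset.
        p = base + self.p_n.view(1, N, 2, 1, 1) + offset        # (B, N, 2, H, W)
        # Nearest-neighbor resampling for brevity; the paper interpolates.
        py = p[:, :, 0].round().long().clamp(0, H - 1)          # (B, N, H, W)
        px = p[:, :, 1].round().long().clamp(0, W - 1)
        idx = (py * W + px).view(B, 1, N * H * W).expand(B, C, N * H * W)
        sampled = x.flatten(2).gather(2, idx).view(B, C, N, H, W)
        # Stack the N samples along height, then reduce with the (N, 1) conv.
        sampled = sampled.permute(0, 1, 3, 2, 4).reshape(B, C, H * N, W)
        return self.conv(sampled)                               # (B, out_ch, H, W)

For example, AKConvSketch(64, 128, num_param=5) predicts 2 × 5 = 10 offset channels per position, matching the size-5 example of Fig. 3, and returns a feature map with the same spatial size as its input.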