《電子技術(shù)應(yīng)用》
A lightweight image super-resolution method based on a hybrid CNN-Transformer architecture
網(wǎng)絡(luò)安全與數(shù)據(jù)治理
Lin Chenghao, Wu Lijun
School of Physics and Information Engineering, Fuzhou University
Abstract: To address the high computational cost typically incurred by image super-resolution models built on hybrid architectures, this paper proposes STSR (Swin Transformer based Single Image Super Resolution), a lightweight image super-resolution network with a hybrid CNN-Transformer architecture. First, a Feature Enhancement Block (FEB) with parallel feature extraction is proposed, in which a Convolutional Neural Network (CNN) and a lightweight Transformer network extract features from the input image in parallel before the extracted features are fused. Second, a Dynamic Adjustment (DA) module is designed that lets the network dynamically adjust its output according to the input image, reducing the network's dependence on irrelevant information. Finally, the network is evaluated on benchmark datasets, and the experimental results show that STSR still maintains good reconstruction quality while reducing the number of model parameters.
CLC number: TP391; Document code: A; DOI: 10.19358/j.issn.2097-1788.2024.03.005
Citation format: Lin Chenghao, Wu Lijun. A lightweight image super-resolution method based on a hybrid CNN-Transformer architecture[J]. 網(wǎng)絡(luò)安全與數(shù)據(jù)治理, 2024, 43(3): 27-33.
Key words: image super-resolution; lightweight; Convolutional Neural Network; Transformer
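
To make the FEB and DA descriptions in the abstract more concrete, the following is a minimal PyTorch sketch of the parallel extract-then-fuse idea and of a content-dependent gate. It is an illustration under simplifying assumptions: a single plain multi-head attention layer stands in for the lightweight Swin Transformer branch, and all class names, channel widths, and fusion/gating choices are hypothetical rather than taken from the published STSR implementation.

import torch
import torch.nn as nn


class FeatureEnhancementBlock(nn.Module):
    """Hypothetical FEB: a CNN branch and a Transformer-style branch process the
    same feature map in parallel, and their outputs are fused by a 1x1 convolution."""

    def __init__(self, channels=64, num_heads=4):
        super().__init__()
        # Local branch: plain convolutions capture local texture.
        self.cnn_branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Global branch: one multi-head self-attention layer over flattened pixels
        # stands in for the lightweight Transformer branch described in the paper.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Fusion: concatenate both branches and mix them with a 1x1 convolution.
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        local_feat = self.cnn_branch(x)
        tokens = self.norm(x.flatten(2).transpose(1, 2))        # (B, H*W, C)
        global_feat, _ = self.attn(tokens, tokens, tokens)
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local_feat, global_feat], dim=1)) + x


class DynamicAdjustment(nn.Module):
    """Hypothetical DA: a gate predicted from the input itself re-weights the
    feature channels, so the output adapts to the content of the input image."""

    def __init__(self, channels=64, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)


if __name__ == "__main__":
    feat = torch.randn(1, 64, 48, 48)                   # a toy 48x48 feature map
    out = DynamicAdjustment()(FeatureEnhancementBlock()(feat))
    print(out.shape)                                    # torch.Size([1, 64, 48, 48])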

Introduction

Image super-resolution (SR) is a widely studied computer vision task whose goal is to reconstruct a high-quality high-resolution (HR) image from a low-resolution (LR) image [1]. Because reconstructing a high-quality HR image is an ill-posed problem, the task is highly challenging [2]. With the rise of deep learning, many methods based on convolutional neural networks (CNNs) have been introduced for image super-resolution [3-6]. SRCNN [3] was the first to apply a convolutional neural network to this task: it learns a feature representation of the image and progressively extracts higher-level features by stacking convolutional layers, producing reconstructions of relatively high quality. In subsequent work, Kaiming He et al. proposed the residual structure ResNet [5]; its skip connections allow gradients to propagate across layers, alleviating the vanishing-gradient problem and letting models retain good performance even at considerable depth. Bee Lim et al. also adopted residual structures in EDSR [6]. EDSR is essentially an improved version of SRResNet [7]: it removes the BN layers of the conventional residual network and uses the saved capacity to enlarge the model and strengthen its expressive power. RCAN [8] proposed a deep residual network based on a Residual-in-Residual (RIR) structure and a channel attention (CA) mechanism. Although these models achieved good results at the time, they are all essentially CNN-based: the convolution kernel size limits the spatial range that can be covered, so long-range dependencies cannot be captured. This means they extract only local features and cannot obtain global information, which is unfavorable for recovering texture details and degrades reconstruction quality [5].
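As a concrete reference for the residual designs discussed above, the following is a minimal PyTorch sketch of an EDSR-style residual block: a conv-ReLU-conv body with an identity skip connection and, unlike the original ResNet block, no BatchNorm layers. The channel width and the optional residual scaling factor are illustrative assumptions, not the exact configuration of any of the cited models.

import torch
import torch.nn as nn


class ResidualBlockNoBN(nn.Module):
    """EDSR-style residual block: conv-ReLU-conv plus a skip connection, without BN."""

    def __init__(self, channels=64, res_scale=1.0):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.res_scale = res_scale

    def forward(self, x):
        # The identity skip connection lets gradients flow around the convolutions,
        # which is what keeps very deep super-resolution networks trainable.
        return x + self.res_scale * self.body(x)


if __name__ == "__main__":
    feat = torch.randn(1, 64, 32, 32)
    print(ResidualBlockNoBN()(feat).shape)              # torch.Size([1, 64, 32, 32])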


For the full text of this article, please download it from:

http://theprogrammingfactory.com/resource/share/2000005931


Author information:

Lin Chenghao, Wu Lijun

School of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, Fujian, China



This content is original to the AET website; reproduction without authorization is prohibited.