
Query:

Scholar name: Zhang Jing

Unpaved road segmentation of UAV imagery via a global vision transformer with dilated cross window self-attention for dynamic map SCIE
Journal article | 2024 | VISUAL COMPUTER

Abstract :

Road segmentation is a fundamental task for dynamic maps in unmanned aerial vehicle (UAV) path navigation. In unplanned, unknown, and even damaged areas, there are usually unpaved roads with blurred edges, deformations, and occlusions, which make unpaved road segmentation particularly challenging for dynamic map construction. Our major contributions are: (1) Inspired by dilated convolution, we propose dilated cross window self-attention (DCWin-Attention), composed of a dilated cross window mechanism and a pixel regional module, to model long-range horizontal and vertical road dependencies for unpaved roads with deformation and blurred edges. (2) A shifted cross window mechanism is introduced, coupled with DCWin-Attention, to reduce the influence of occluded roads in UAV imagery. In detail, the global vision transformer (GVT) backbone is constructed using DCWin-Attention blocks to obtain multilevel deep features with global dependency. (3) The unpaved road is segmented with a confidence map generated by fusing deep features of different levels in a unified perceptual parsing network. We verify our method on the self-established BJUT-URD dataset and the public DeepGlobe dataset, on which it achieves the highest IoU of 67.72% and 52.67%, respectively, at practical inference speeds of 2.7 and 2.8 FPS, demonstrating its effectiveness and superiority in unpaved road segmentation. Our code is available at https://github.com/BJUT-AIVBD/GVT-URS.
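The dilated cross window mechanism described above can be sketched as an index partition: attention is restricted to every `dilation`-th row (or column), so a window spans the full image extent horizontally or vertically. A minimal numpy sketch under that simplifying assumption (the paper's pixel regional module and shifted windows are omitted):

```python
import numpy as np

def dilated_cross_windows(H, W, dilation):
    """Partition an HxW grid into dilated horizontal and vertical window
    index sets: one window collects every `dilation`-th row (or column),
    so in-window attention spans the whole image in that direction.
    Hypothetical simplification of DCWin-Attention, not the paper's code."""
    idx = np.arange(H * W).reshape(H, W)
    # horizontal windows: rows sharing the same (row % dilation) attend together
    horizontal = [idx[r::dilation, :].ravel() for r in range(dilation)]
    # vertical windows: columns sharing the same (col % dilation)
    vertical = [idx[:, c::dilation].ravel() for c in range(dilation)]
    return horizontal, vertical

h, v = dilated_cross_windows(4, 6, dilation=2)
# each pixel index appears exactly once in each partition
assert sorted(np.concatenate(h).tolist()) == list(range(24))
assert sorted(np.concatenate(v).tolist()) == list(range(24))
```

Attention would then be computed independently within each returned index group, which is how long-range row/column dependencies stay cheap.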

Keyword :

Unpaved road segmentation; Dynamic map; UAV imagery; Global vision transformer; DCWin-attention

Cite:

Li, Wensheng, Zhang, Jing, Li, Jiafeng, Zhuo, Li. Unpaved road segmentation of UAV imagery via a global vision transformer with dilated cross window self-attention for dynamic map. VISUAL COMPUTER, 2024.
LCMA-Net: A light cross-modal attention network for streamer re-identification in live video SCIE
Journal article | 2024, 249 | COMPUTER VISION AND IMAGE UNDERSTANDING

Abstract :

With the rapid expansion of the we-media industry, streamers have increasingly incorporated inappropriate content into live videos to attract traffic and pursue profit. Blacklisted streamers often forge their identities or switch platforms to continue streaming, causing significant harm to the online environment. Consequently, streamer re-identification (re-ID) has become of paramount importance. Streamer biometrics in live videos exhibit multimodal characteristics, including voiceprints, faces, and spatiotemporal information, which complement each other. Therefore, we propose a light cross-modal attention network (LCMA-Net) for streamer re-ID in live videos. First, the voiceprint, face, and spatiotemporal features of the streamer are extracted by RawNetSA, Pi-Net, and STDA-ResNeXt3D, respectively. We then design a light cross-modal pooling attention (LCMPA) module, which, combined with a multilayer perceptron (MLP), aligns and concatenates the different modality features into multimodal features within LCMA-Net. Finally, the streamer is re-identified by measuring the similarity between these multimodal features. Five experiments were conducted on the StreamerReID dataset, and the results demonstrate that the proposed method achieves competitive performance. The dataset and code are available at https://github.com/BJUT-AIVBD/LCMA-Net.
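The pooling-attention fusion step can be illustrated with a toy sketch: each modality feature map is average-pooled, projected to a shared dimension, reweighted, and concatenated. Everything below (random projections standing in for learned MLP weights, mean-score attention) is a hypothetical simplification, not the paper's implementation:

```python
import numpy as np

def lcmpa_fuse(voice, face, st, proj_dim=64, rng=np.random.default_rng(0)):
    """Toy light cross-modal pooling attention: pool each modality feature
    (T x C arrays of different sizes) to a vector, project to a shared
    space, softmax-reweight, and concatenate into one multimodal embedding.
    Random projections are stand-ins for learned weights."""
    feats = []
    for f in (voice, face, st):
        v = f.mean(axis=0)                            # global average pooling
        W = rng.standard_normal((v.size, proj_dim)) / np.sqrt(v.size)
        feats.append(v @ W)                           # align to shared space
    scores = np.array([f.mean() for f in feats])      # crude attention scores
    w = np.exp(scores - scores.max()); w /= w.sum()   # softmax weights
    return np.concatenate([wi * fi for wi, fi in zip(w, feats)])

fused = lcmpa_fuse(np.ones((10, 128)), np.ones((5, 256)), np.ones((8, 32)))
assert fused.shape == (192,)   # 3 modalities x 64 shared dims
```

Re-identification would then reduce to a similarity measure (e.g. cosine) between such fused embeddings.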

Keyword :

Live video; Light cross-modal attention network; Re-identification; Light cross-modal pooling attention; Streamer

Cite:

Yao, Jiacheng, Zhang, Jing, Zhang, Hui, Zhuo, Li. LCMA-Net: A light cross-modal attention network for streamer re-identification in live video. COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249.
HDUD-Net: heterogeneous decoupling unsupervised dehaze network SCIE
Journal article | 2024, 36(6), 2695-2711 | NEURAL COMPUTING & APPLICATIONS

Abstract :

Haze reduces the imaging effectiveness of outdoor vision systems and significantly degrades image quality; hence, haze removal has been the focus of many studies. In recent years, decoupled representation learning has been applied in image processing; however, existing decoupled networks lack a specific design for information with different characteristics and thus fail to achieve satisfactory results in dehazing tasks. This study proposes a heterogeneous decoupling unsupervised dehazing network (HDUD-Net). Heterogeneous modules learn the content and haze information of images individually to separate them effectively. To address the information loss that occurs when extracting content from hazy images with complex noise, this study proposes a bi-branch multi-hierarchical feature fusion module. Additionally, it proposes a style-feature contrastive learning method that generates positive and negative sample queues and constructs a contrastive loss to enhance decoupling performance. Extensive experiments confirm that the proposed algorithm achieves better performance on objective metrics and more realistic visual effects compared with state-of-the-art single-image dehazing algorithms.
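Contrastive learning over positive and negative feature queues typically takes an InfoNCE-like form. A generic numpy sketch under that assumption (queue construction and the style encoder itself are omitted):

```python
import numpy as np

def style_contrastive_loss(anchor, positives, negatives, tau=0.07):
    """InfoNCE-style loss over positive/negative feature queues: pull the
    anchor toward positives and away from negatives in cosine-similarity
    space. A generic sketch, not the paper's exact formulation."""
    def cos(a, b):
        return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.array([cos(anchor, p) for p in positives]) / tau
    neg = np.array([cos(anchor, n) for n in negatives]) / tau
    logits = np.concatenate([pos, neg])
    m = logits.max()  # stabilize the exponentials
    # -log( sum(exp(pos)) / sum(exp(all)) )
    return -np.log(np.exp(pos - m).sum() / np.exp(logits - m).sum())

a = np.array([1.0, 0.0])
loss_good = style_contrastive_loss(a, [np.array([0.9, 0.1])], [np.array([-1.0, 0.0])])
loss_bad = style_contrastive_loss(a, [np.array([-1.0, 0.0])], [np.array([0.9, 0.1])])
assert loss_good < loss_bad  # loss drops when positives actually match the anchor
```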

Keyword :

Unsupervised learning; Single image dehazing; Image restoration; Image enhancement

Cite:

Li, Jiafeng, Kuang, Lingyan, Jin, Jiaqi, Zhuo, Li, Zhang, Jing. HDUD-Net: heterogeneous decoupling unsupervised dehaze network. NEURAL COMPUTING & APPLICATIONS, 2024, 36(6): 2695-2711.
Cell cluster detection of thyroid FNAB-WSI via deformable convolution with frequency channel attention SCIE
Journal article | 2024, 100 | BIOMEDICAL SIGNAL PROCESSING AND CONTROL

Abstract :

Background and Objective: Cell cluster detection of thyroid fine needle aspiration biopsy (FNAB) in whole-slide images (WSIs) is significant for improving the diagnostic efficiency and accuracy of thyroid cancer. Given the ultra-high resolution, small object size, and sparse irregularity of cell clusters in thyroid FNAB-WSI, we propose a cell cluster detection method via deformable convolution with frequency channel attention (FCA). Methods: Firstly, an adaptive data augmentation (ADA) module classifies the patch images obtained by sliding-window cropping and activates different augmentation operations, alleviating the easy loss of cell cluster objects. A ResNeXt101 backbone combined with a feature pyramid network (FPN) extracts multi-scale cell cluster features, to which we add deformable convolution (DCNv2) and FCA for feature refinement. Finally, an improved Sparse R-CNN model with sparse learnable proposals detects cell clusters in thyroid FNAB-WSI. Results: The dataset contains approximately 6020 patch images, with 3612 for training, 1204 for validation, and 1204 for testing. Experimental results demonstrate that our method achieves the highest average detection accuracy of 95.4% on the self-built thyroid FNAB-WSI dataset, 2.9% higher than the state of the art. Since feature extraction dominates the model's computational cost, yielding a moderate 12 FPS, model acceleration remains future work. Overall, our cell cluster detection method has a positive impact on the efficiency and accuracy of thyroid cancer diagnosis. Significance: The proposed method can be applied as a fast and accurate computer-aided method for thyroid cancer diagnosis in clinical practice.
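The sliding-window cropping that produces the patch images can be sketched in a few lines; the patch and stride values below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def crop_patches(wsi, patch=512, stride=256):
    """Crop a whole-slide image into overlapping patches with a sliding
    window. A stride smaller than the patch size gives overlap, so small
    cell clusters are less likely to be cut off at patch borders."""
    H, W = wsi.shape[:2]
    coords = [(y, x)
              for y in range(0, max(H - patch, 0) + 1, stride)
              for x in range(0, max(W - patch, 0) + 1, stride)]
    return [wsi[y:y + patch, x:x + patch] for y, x in coords], coords

patches, coords = crop_patches(np.zeros((1024, 1024)), patch=512, stride=256)
assert len(patches) == 9 and all(p.shape == (512, 512) for p in patches)
```

Each patch would then be routed through the ADA module's augmentation before detection.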

Keyword :

Deformable convolution; Frequency channel attention; Thyroid FNAB-WSI; Sparse R-CNN; Cell cluster detection

Cite:

Sun, Meng, Zhang, Jing, Zhao, Shimei, Li, Xiaoguang, Zhuo, Li. Cell cluster detection of thyroid FNAB-WSI via deformable convolution with frequency channel attention. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 100.
When zero-padding position encoding encounters linear space reduction attention: an efficient semantic segmentation Transformer of remote sensing images SCIE
Journal article | 2024, 45(2), 609-633 | INTERNATIONAL JOURNAL OF REMOTE SENSING
WoS CC Cited Count: 1

Abstract :

Semantic segmentation of remote sensing images (RSIs) is of great significance for obtaining geospatial object information. Transformers achieve promising results, but multi-head self-attention (MSA) is computationally expensive. We propose an efficient semantic segmentation Transformer (ESST) for RSIs that combines zero-padding position encoding with linear space reduction attention (LSRA). First, to capture coarse-to-fine features of an RSI, zero-padding position encoding is introduced by adding overlapping patch embedding (OPE) layers and convolutional feed-forward networks (CFFN) to improve the local continuity of features. Then, we replace MSA with LSRA to extract multi-level features while reducing the computational cost of the encoder. Finally, we design a lightweight all multi-layer perceptron (all-MLP) head decoder that easily aggregates multi-level features into multi-scale features for semantic segmentation. Experimental results demonstrate that our method achieves a favorable trade-off between accuracy and speed for semantic segmentation of RSIs on the Potsdam and Vaihingen datasets.
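The cost saving in space-reduction attention comes from shrinking the key/value token set before attention, as in pyramid vision transformers, which the LSRA here resembles. A single-head numpy sketch with identity projections standing in for learned weights:

```python
import numpy as np

def lsra(x, H, W, r=4):
    """Sketch of space-reduction attention: keys/values are average-pooled
    by a factor r per spatial axis before attention, cutting the score
    matrix from N x N to N x (N / r^2). Single head, identity projections;
    an illustration of the idea, not the paper's LSRA implementation."""
    N, C = x.shape                        # N = H * W tokens
    kv = x.reshape(H // r, r, W // r, r, C).mean(axis=(1, 3)).reshape(-1, C)
    q, k, v = x, kv, kv                   # identity projections for the sketch
    attn = q @ k.T / np.sqrt(C)           # (N, N / r^2) scores instead of (N, N)
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v, attn.shape

out, score_shape = lsra(np.ones((64, 8)), H=8, W=8, r=4)
assert score_shape == (64, 4) and out.shape == (64, 8)
```

With r = 4, the 64 x 64 score matrix shrinks to 64 x 4, which is where the near-linear cost in the token count comes from.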

Keyword :

Semantic segmentation; All-MLP; Remote sensing images; Transformer; Linear space reduction attention; Zero-padding position encoding

Cite:

Yan, Yi, Zhang, Jing, Wu, Xinjia, Li, Jiafeng, Zhuo, Li. When zero-padding position encoding encounters linear space reduction attention: an efficient semantic segmentation Transformer of remote sensing images. INTERNATIONAL JOURNAL OF REMOTE SENSING, 2024, 45(2): 609-633.
Low-light image enhancement method, apparatus, electronic device, and storage medium incoPat
Patent | 2023-01-09 | CN202310028246.9

Abstract :

The present invention provides a low-light image enhancement method, apparatus, electronic device, and storage medium. The method includes: acquiring a low-light image to be enhanced; feeding the low-light image into an image decomposition network to obtain a first reflectance component map and a first illumination component map; feeding the first reflectance component map and the first illumination component map into a reflectance adjustment network to obtain a second reflectance component map, and feeding the first illumination component map into an illumination adjustment network to obtain a second illumination component map; and obtaining the enhanced image from the second reflectance component map and the second illumination component map. The method uses the image decomposition network to accurately decompose the low-light image, and uses the reflectance and illumination adjustment networks to refine the decomposed components in a coarse-to-fine manner, thereby effectively improving the accuracy of the enhanced image.
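The recombination step follows the Retinex model, in which an image is the element-wise product of reflectance and illumination. A minimal numpy sketch, with placeholder functions standing in for the patent's decomposition and adjustment networks:

```python
import numpy as np

def retinex_enhance(low, reflectance_fn, illumination_fn):
    """Retinex-style enhancement: decompose a low-light image (H x W x 3,
    values in [0, 1]) into reflectance and illumination, adjust each, then
    recombine by element-wise product. The decomposition below (per-pixel
    channel max as illumination) and the two adjustment functions are
    placeholders for the patent's learned networks."""
    illum = low.max(axis=-1, keepdims=True).clip(1e-3, None)
    refl = low / illum
    return (reflectance_fn(refl) * illumination_fn(illum)).clip(0.0, 1.0)

img = np.full((4, 4, 3), 0.1)
out = retinex_enhance(img, lambda r: r, lambda l: l ** 0.5)  # gamma-lift illumination
assert out.shape == (4, 4, 3) and (out >= img - 1e-9).all()  # image brightened
```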

Cite:

李嘉锋, 郝帅, 况玲艳, 张菁, 卓力. Low-light image enhancement method, apparatus, electronic device, and storage medium: CN202310028246.9 [P]. 2023-01-09.
Efficient Fine-Grained Object Recognition in High-Resolution Remote Sensing Images From Knowledge Distillation to Filter Grafting SCIE
Journal article | 2023, 61 | IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING

Abstract :

With the development of high-resolution remote sensing images (HR-RSIs) and the escalating demand for intelligent analysis, fine-grained recognition of geospatial objects has become a more practical and challenging task. Although deep learning-based object recognition has achieved superior performance, it is difficult to apply directly to fine-grained object recognition (FGOR) in HR-RSIs given the small size of geospatial objects. We propose an efficient fine-grained object recognition method for HR-RSIs that proceeds from knowledge distillation (KD) to filter grafting. Specifically, fine-grained object recognition consists of two stages: Stage 1 utilizes an oriented region convolutional neural network (oriented R-CNN) to accurately locate and preliminarily classify geospatial objects, while also serving as a teacher network that guides the student network's learning of fine-grained object recognition; in Stage 2, we design a coarse-to-fine object recognition network (CF-ORNet) as the second teacher network, which realizes fine-grained recognition through feature learning and category correction. We then derive a lightweight model from the two teacher networks via knowledge distillation and filter grafting to achieve efficient fine-grained object recognition. Experimental results on the Vehicle Detection in Aerial Imagery (VEDAI) and HR Ship Collection 2016 (HRSC2016) datasets show competitive performance.
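The distillation component can be illustrated with the standard temperature-softened KL term; this is a generic sketch (the two-teacher setup and filter grafting are omitted):

```python
import numpy as np

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Standard knowledge-distillation term: KL divergence between
    temperature-softened teacher and student distributions, scaled by T^2
    so gradients stay comparable across temperatures. A generic sketch,
    not this paper's full objective."""
    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()
    p = softmax(teacher_logits / T)   # softened teacher distribution
    q = softmax(student_logits / T)   # softened student distribution
    return T * T * float((p * np.log(p / q)).sum())

t = np.array([3.0, 1.0, 0.2])
assert distillation_loss(t, t) < 1e-9          # matching logits -> zero loss
assert distillation_loss(np.array([0.2, 1.0, 3.0]), t) > 0.0
```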

Keyword :

Knowledge distillation; High-resolution remote sensing image (HR-RSI); Coarse-to-fine object recognition network (CF-ORNet); Filter grafting; Fine-grained object recognition (FGOR)

Cite:

Wang, Liuqian, Zhang, Jing, Tian, Jimiao, Li, Jiafeng, Zhuo, Li, Tian, Qi. Efficient Fine-Grained Object Recognition in High-Resolution Remote Sensing Images From Knowledge Distillation to Filter Grafting. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61.
BARRN: A Blind Image Compression Artifact Reduction Network for Industrial IoT Systems SCIE
Journal article | 2023, 19(9), 9479-9490 | IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS

Abstract :

Most industrial Internet of Things (IoT) devices reduce captured image size using high-ratio joint photographic experts group (JPEG) compression to save storage space and transmission bandwidth. However, the resulting compression artifacts considerably affect the accuracy of subsequent tasks, and most artifact reduction algorithms do not consider the limited storage and computing power of edge devices. In this study, a blind artifact reduction recurrent network (BARRN), which can reduce compression artifacts when the quality factors are unknown, is proposed. First, a structure based on recurrent convolution is designed for the specific requirements of industrial IoT image acquisition devices; the network can be scaled according to system resource constraints. Second, a more efficient convolution group, capable of adaptively processing different degradation levels, is proposed to make optimal use of the limited computational resources. The experimental results demonstrate that the proposed BARRN can meet the needs of industrial systems with high computational efficiency.

Keyword :

Artifact reduction (AR); Image restoration; Blind JPEG compression; Recurrent convolution; Industrial Internet of Things (IoT)

Cite:

Li, Jiafeng, Liu, Xiaoyu, Gao, Yuqi, Zhuo, Li, Zhang, Jing. BARRN: A Blind Image Compression Artifact Reduction Network for Industrial IoT Systems. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2023, 19(9): 9479-9490.
Object Fusion Tracking for RGB-T Images via Channel Swapping and Modal Mutual Attention SCIE
Journal article | 2023, 23(19), 22930-22943 | IEEE SENSORS JOURNAL

Abstract :

RGB-thermal (RGB-T) dual-modal imaging significantly broadens the observation dimensions of a vision system. However, effectively harnessing the inherent advantages of different spectral bands and establishing fusion solutions tightly coupled with end tasks remains highly challenging. This article proposes a modality fusion approach that combines channel swapping and cross-modal attention for RGB-T tracking, exploring a hierarchical fusion method adapted to deep features at different abstraction levels. For low-level features, cross-modal information is introduced to increase the diversity of unimodal data by swapping feature channels at low computational cost. To exploit the semantic representation of high-level deep features and the heterogeneous information in multimodal data, a fusion structure based on modal mutual attention is designed, which effectively enhances the RGB-T fusion feature representation by integrating modal self-attention and cross-modal attention. Experimental results on public datasets show that the proposed algorithm is effective and computationally efficient, obtaining state-of-the-art tracking performance with real-time processing.
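The low-level channel-swapping step is cheap because it only permutes memory: a fraction of feature channels is exchanged between the two modalities. A numpy sketch, where the swap ratio and the choice of leading channels are illustrative assumptions:

```python
import numpy as np

def swap_channels(rgb_feat, thermal_feat, ratio=0.5):
    """Exchange the first `ratio` fraction of channels between RGB and
    thermal feature maps (C x H x W), injecting cross-modal cues at almost
    no compute cost. Which channels move, and the ratio, are illustrative
    choices, not the paper's learned policy."""
    c = int(rgb_feat.shape[0] * ratio)
    rgb_out, th_out = rgb_feat.copy(), thermal_feat.copy()
    rgb_out[:c], th_out[:c] = thermal_feat[:c], rgb_feat[:c]
    return rgb_out, th_out

r, t = np.zeros((8, 4, 4)), np.ones((8, 4, 4))
r2, t2 = swap_channels(r, t)
assert r2[:4].all() and not t2[:4].any()   # first half exchanged
assert not r2[4:].any() and t2[4:].all()   # second half untouched
```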

Keyword :

Modal mutual attention; RGB-thermal (RGB-T) tracking; Multimodal fusion; Object fusion tracking; Channel swapping

Cite:

Luan, Tian, Zhang, Hui, Li, Jiafeng, Zhang, Jing, Zhuo, Li. Object Fusion Tracking for RGB-T Images via Channel Swapping and Modal Mutual Attention. IEEE SENSORS JOURNAL, 2023, 23(19): 22930-22943.
Cascade Transformer Decoder Based Occluded Pedestrian Detection With Dynamic Deformable Convolution and Gaussian Projection Channel Attention Mechanism SCIE
Journal article | 2023, 25, 1529-1537 | IEEE TRANSACTIONS ON MULTIMEDIA

Abstract :

Occluded pedestrian detection is very challenging in computer vision because pedestrians are frequently occluded by various obstacles or other persons, especially in crowded scenarios. In this article, an occluded pedestrian detection method is proposed under the basic DEtection TRansformer (DETR) framework. Firstly, Dynamic Deformable Convolution (DyDC) and a Gaussian Projection Channel Attention (GPCA) mechanism are proposed and embedded into the low and high layers of ResNet50, respectively, to improve the representation capability of features. Secondly, a Cascade Transformer Decoder (CTD) is proposed to generate high-score queries, avoiding the influence of low-score queries in the decoder stage and further improving detection accuracy. The proposed method is verified on three challenging datasets: CrowdHuman, WiderPerson, and TJU-DHD-pedestrian. The experimental results show that, compared with state-of-the-art methods, it obtains superior detection performance.

Keyword :

Cascade transformer decoder; Feature extraction; Occluded pedestrian detection; Object detection; Convolution; Dynamic deformable convolution; Task analysis; Gaussian projection channel attention mechanism; Decoding; Transformers; Kernel

Cite:

Ma, Chunjie, Zhuo, Li, Li, Jiafeng, Zhang, Yutong, Zhang, Jing. Cascade Transformer Decoder Based Occluded Pedestrian Detection With Dynamic Deformable Convolution and Gaussian Projection Channel Attention Mechanism. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25: 1529-1537.
Address: BJUT Library (100 Pingleyuan, Chaoyang District, Beijing 100124, China). Contact: 010-67392185
Copyright: BJUT Library. Technical support: Beijing Aegean Software Co., Ltd.