Your search:
Scholar name: Yin Baocai (尹宝才)
Abstract:
Inductive Knowledge Graph Completion (KGC) poses challenges due to the absence of emerging entities during training. Current methods utilize Graph Neural Networks (GNNs) to learn and propagate entity representations, achieving notable performance. However, these approaches primarily focus on chain-based logical rules, limiting their ability to capture the rich semantics of knowledge graphs. To address this challenge, we propose Generating Graph-based Rules for Enhancing Logical Reasoning (GRELR), a novel framework that leverages graph-based rules for enhanced reasoning. GRELR formulates graph-based rules by extracting relevant subgraphs and fuses them to construct comprehensive relation representations. This approach, combined with subgraph reasoning, significantly improves inference capabilities and showcases the potential of graph-based rules in inductive KGC. To demonstrate the effectiveness of the GRELR framework, we conduct experiments on three benchmark datasets, and our approach achieves state-of-the-art performance.
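To make the idea of graph-based rules more concrete, the sketch below shows a minimal, assumption-laden version of the two steps the abstract names: extracting an enclosing subgraph around a candidate entity pair and fusing the relations it contains into a single relation representation. The function names, the k-hop intersection definition of the subgraph, and the mean-pooling fusion are illustrative assumptions, not the GRELR implementation.

```python
# Minimal sketch: enclosing-subgraph extraction and relation-representation fusion
# for inductive KGC. All names and the fusion rule are illustrative assumptions.
from collections import defaultdict
import torch

def build_adjacency(triples):
    """Map each entity to its (relation, neighbour) edges, treated as undirected."""
    adj = defaultdict(list)
    for h, r, t in triples:
        adj[h].append((r, t))
        adj[t].append((r, h))
    return adj

def k_hop_entities(adj, seed, k):
    """Collect entities within k hops of a seed entity."""
    frontier, visited = {seed}, {seed}
    for _ in range(k):
        frontier = {t for e in frontier for _, t in adj[e]} - visited
        visited |= frontier
    return visited

def extract_enclosing_subgraph(triples, head, tail, k=2):
    """Keep triples whose endpoints lie in both k-hop neighbourhoods (the enclosing subgraph)."""
    adj = build_adjacency(triples)
    keep = k_hop_entities(adj, head, k) & k_hop_entities(adj, tail, k)
    return [(h, r, t) for h, r, t in triples if h in keep and t in keep]

def fuse_relation_representation(subgraph, rel_emb):
    """Fuse the relation embeddings appearing in the subgraph (mean pooling as a stand-in)."""
    rels = torch.stack([rel_emb[r] for _, r, _ in subgraph])
    return rels.mean(dim=0)

# toy usage
triples = [("a", "worksFor", "b"), ("b", "locatedIn", "c"), ("a", "livesIn", "c")]
rel_emb = {r: torch.randn(8) for r in {"worksFor", "locatedIn", "livesIn"}}
sg = extract_enclosing_subgraph(triples, "a", "c", k=2)
print(len(sg), fuse_relation_representation(sg, rel_emb).shape)
```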
Keywords:
Graph-based Rules; Knowledge Graphs; Inductive Knowledge Graph Completion; Subgraph reasoning
Citation:
GB/T 7714 | Sun, Kai, Jiang, Huajie, Hu, Yongli, et al. Generating Graph-Based Rules for Enhancing Logical Reasoning [J]. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873: 143-156.
MLA | Sun, Kai, et al. "Generating Graph-Based Rules for Enhancing Logical Reasoning." ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024 14873 (2024): 143-156.
APA | Sun, Kai, Jiang, Huajie, Hu, Yongli, Yin, Baocai. Generating Graph-Based Rules for Enhancing Logical Reasoning. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873, 143-156.
Abstract:
With the widespread adoption of deep learning, the performance of Visual Question Answering (VQA) tasks has seen significant improvements. Nonetheless, this progress has unveiled significant challenges concerning their credibility, primarily due to their susceptibility to linguistic biases. Such biases can result in considerable declines in performance when faced with out-of-distribution scenarios. Therefore, various debiasing methods have been developed to reduce the impact of linguistic biases, among which causal theory-based methods have attracted great attention due to their theoretical underpinnings and superior performance. However, traditional debiased causal strategies typically remove biases through simple subtraction, which neglects fine-grained bias information and results in incomplete debiasing. To tackle this issue, we propose a fine-grained debiasing method named VQA-PDF, which utilizes the features of the base model to guide the identification of biased features, purifying the debiased features and aiding the base learning process. The method shows significant improvements on the VQA-CP v2, VQA v2, and VQA-CE datasets.
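A minimal sketch of the fine-grained debiasing idea, assuming a per-dimension gate learned from the base model's features decides how much of a question-only bias branch to remove, instead of a global subtraction. The module name and tensor shapes are illustrative assumptions, not the VQA-PDF architecture.

```python
# Sketch: base-feature-guided, per-dimension bias removal (fine-grained debiasing).
import torch
import torch.nn as nn

class PurifiedDebias(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim * 2, dim), nn.Sigmoid())

    def forward(self, fused_feat, bias_feat):
        # fused_feat: base VQA features (image + question); bias_feat: question-only branch
        g = self.gate(torch.cat([fused_feat, bias_feat], dim=-1))  # per-dimension gate in [0, 1]
        return fused_feat - g * bias_feat                          # remove only the gated bias

model = PurifiedDebias(dim=16)
fused, bias = torch.randn(4, 16), torch.randn(4, 16)
print(model(fused, bias).shape)  # torch.Size([4, 16])
```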
Keywords:
Visual Question Answering; Language Bias; Causal Strategy
Citation:
GB/T 7714 | Bi, Yandong, Jiang, Huajie, Liu, Jing, et al. VQA-PDF: Purifying Debiased Features for Robust Visual Question Answering Task [J]. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873: 264-277.
MLA | Bi, Yandong, et al. "VQA-PDF: Purifying Debiased Features for Robust Visual Question Answering Task." ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024 14873 (2024): 264-277.
APA | Bi, Yandong, Jiang, Huajie, Liu, Jing, Liu, Mengting, Hu, Yongli, Yin, Baocai. VQA-PDF: Purifying Debiased Features for Robust Visual Question Answering Task. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873, 264-277.
Abstract:
Referring Image Segmentation (RIS) is an essential topic in visual language understanding that aims to segment the target instance in an image referred to by a language description. Conventional RIS methods rely on expensive manual annotations of the (image-text-mask) triplet, with text annotations being the most formidable to acquire. To eliminate this heavy dependence on human annotations, we propose a novel RIS method, Referring Image Segmentation without Text Annotations (WoTA), which substitutes textual annotations by generating pseudo-queries from visual information. Specifically, we design a novel training-testing scheme that introduces a Pseudo-Query Generation Scheme (PQGS) in the training phase, which relies on the pre-trained cross-modal knowledge in CLIP to generate pseudo-queries related to global and local visual information. In the testing phase, the CLIP text encoder is applied directly to the test statements to generate real query language features. Extensive experiments on several benchmark datasets demonstrate the advantage of the proposed WoTA over several zero-shot baselines for the task and even over a weakly supervised referring image segmentation method.
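The pseudo-query idea can be sketched with an off-the-shelf CLIP checkpoint: score a small vocabulary of candidate phrases against the whole image (global) and a cropped region (local), then keep the best-matching phrases as a stand-in text query. The Hugging Face CLIP wrapper, the toy vocabulary, and the crop strategy below are assumptions for illustration; the paper's PQGS is more elaborate.

```python
# Sketch: CLIP-based pseudo-query generation from global and local visual evidence.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
vocab = ["a dog", "a person", "a red car", "a bicycle"]  # assumed candidate phrases

def pseudo_query(image: Image.Image, region_box, top_k=2):
    crop = image.crop(region_box)  # local view of the target region
    inputs = processor(text=vocab, images=[image, crop], return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # (2, len(vocab)) image-text similarities
    scores = logits.mean(dim=0)                    # fuse global and local evidence
    best = scores.topk(top_k).indices.tolist()
    return " ".join(vocab[i] for i in best)

# usage: pseudo_query(Image.open("example.jpg"), region_box=(10, 10, 120, 120))
```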
Keywords:
Without Text Annotation; Pseudo-Query; Referring Image Segmentation
Citation:
GB/T 7714 | Liu, Jing, Jiang, Huajie, Bi, Yandong, et al. Referring Image Segmentation Without Text Annotations [J]. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873: 278-293.
MLA | Liu, Jing, et al. "Referring Image Segmentation Without Text Annotations." ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024 14873 (2024): 278-293.
APA | Liu, Jing, Jiang, Huajie, Bi, Yandong, Hu, Yongli, Yin, Baocai. Referring Image Segmentation Without Text Annotations. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873, 278-293.
Abstract:
Long Document Classification (LDC) has attracted great attention in Natural Language Processing and achieved considerable progress owing to large-scale pre-trained language models. Even so, as a problem distinct from traditional text classification, LDC is far from settled. Long documents, such as news stories and articles, generally contain thousands of words with complex structures. Moreover, compared with flat text, long documents usually contain multi-modal content such as images, which provides rich information that has not yet been utilized for classification. In this article, we propose a novel cross-modal method for long document classification, in which multiple granularity feature shifting networks integrate the multi-scale text and visual features of long documents adaptively. Additionally, a multi-modal collaborative pooling block is proposed to eliminate redundant fine-grained text features and simultaneously reduce computational complexity. To verify the effectiveness of the proposed model, we conduct experiments on the Food101 dataset and two constructed multi-modal long document datasets. The experimental results show that the proposed cross-modal method outperforms single-modal text methods as well as state-of-the-art multi-modal baselines.
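As a rough illustration of multi-granularity cross-modal fusion with a collaborative pooling step, the sketch below lets visual features attend to fine-grained text tokens, prunes the lowest-scoring tokens to cut cost, and gates between the pooled fine- and coarse-grained views. Module names, dimensions, and the keep ratio are illustrative assumptions, not the paper's network.

```python
# Sketch: cross-modal fusion at two text granularities with attention-guided token pruning.
import torch
import torch.nn as nn

class GranularityFusion(nn.Module):
    def __init__(self, dim, keep_ratio=0.5):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(dim * 2, dim), nn.Sigmoid())
        self.keep_ratio = keep_ratio

    def forward(self, fine_text, coarse_text, visual):
        # collaborative pooling: keep the text tokens the visual query attends to most
        _, attn_w = self.attn(visual, fine_text, fine_text)        # weights: (B, Nv, Nt)
        scores = attn_w.mean(dim=1)                                # importance per text token
        k = max(1, int(self.keep_ratio * fine_text.size(1)))
        idx = scores.topk(k, dim=1).indices
        kept = torch.gather(fine_text, 1, idx.unsqueeze(-1).expand(-1, -1, fine_text.size(-1)))
        fused_fine, _ = self.attn(visual, kept, kept)              # cross-modal fusion (fine)
        pooled = torch.cat([fused_fine.mean(1), coarse_text.mean(1)], dim=-1)
        g = self.gate(pooled)                                      # gate between granularities
        return g * fused_fine.mean(1) + (1 - g) * coarse_text.mean(1)

m = GranularityFusion(dim=32)
out = m(torch.randn(2, 100, 32), torch.randn(2, 8, 32), torch.randn(2, 5, 32))
print(out.shape)  # torch.Size([2, 32])
```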
Keywords:
Long document classification; multi-modal collaborative pooling; cross-modal multi-granularity interactive fusion
Citation:
GB/T 7714 | Liu, Tengfei, Hu, Yongli, Gao, Junbin, et al. Cross-modal Multiple Granularity Interactive Fusion Network for Long Document Classification [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2024, 18 (4).
MLA | Liu, Tengfei, et al. "Cross-modal Multiple Granularity Interactive Fusion Network for Long Document Classification." ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA 18.4 (2024).
APA | Liu, Tengfei, Hu, Yongli, Gao, Junbin, Sun, Yanfeng, Yin, Baocai. Cross-modal Multiple Granularity Interactive Fusion Network for Long Document Classification. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2024, 18 (4).
Abstract:
Vehicle behavior analysis has gradually developed by utilizing trajectories and motion features to characterize on-road behavior. However, existing methods analyze the behavior of each vehicle individually, ignoring the interaction between vehicles. According to the theory of interactive cognition, vehicle-to-vehicle interaction is an indispensable feature for future autonomous driving, just as interaction is universally required in traditional driving. Therefore, we place vehicle behavior analysis in the context of vehicle interaction scenes: the self-vehicle observes the behavior category and degree of the other vehicle it is about to interact with, predicts whether that vehicle will pass through the intersection first or later, and then decides whether to pass through or wait. Inspired by interactive cognition, we develop a general framework of Structured Vehicle Behavior Analysis (StruVBA) and derive a new model of Structured Fully Convolutional Networks (StruFCN). Moreover, both Intersection over Union (IoU) and False Negative Rate (FNR) are adopted to measure the similarity between the predicted behavior degree and the ground truth. Experimental results illustrate that the proposed method achieves higher prediction accuracy than most existing methods, while predicting vehicle behavior with richer visual meaning. In addition, it provides an example of modeling the interaction between vehicles as well as a verification of interactive cognition theory.
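The two evaluation measures mentioned above can be sketched directly; the binary-mask framing below is an assumption for illustration, since the paper's structured behavior-degree labels may be richer.

```python
# Sketch: IoU and FNR between a predicted behavior-degree mask and its ground truth.
import numpy as np

def iou_and_fnr(pred: np.ndarray, gt: np.ndarray):
    """pred, gt: boolean arrays of the same shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / union if union else 1.0        # identical empty masks count as a match
    fn = np.logical_and(~pred, gt).sum()         # ground-truth pixels missed by the prediction
    fnr = fn / gt.sum() if gt.sum() else 0.0
    return iou, fnr

pred = np.zeros((4, 4), dtype=bool); pred[:2, :2] = True
gt = np.zeros((4, 4), dtype=bool); gt[:3, :2] = True
print(iou_and_fnr(pred, gt))  # (0.666..., 0.333...)
```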
Keywords:
Cognition; vehicle-to-vehicle interaction; Structured vehicle behavior analysis; Analytical models; Roads; Junctions; Vehicular ad hoc networks; structured fully convolutional networks; structured label; Trajectory; interactive cognition; Turning
Citation:
GB/T 7714 | Mou, Luntian, Xie, Haitao, Mao, Shasha, et al. Image-Based Structured Vehicle Behavior Analysis Inspired by Interactive Cognition [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26: 9121-9134.
MLA | Mou, Luntian, et al. "Image-Based Structured Vehicle Behavior Analysis Inspired by Interactive Cognition." IEEE TRANSACTIONS ON MULTIMEDIA 26 (2024): 9121-9134.
APA | Mou, Luntian, Xie, Haitao, Mao, Shasha, Yan, Dandan, Ma, Nan, Yin, Baocai, et al. Image-Based Structured Vehicle Behavior Analysis Inspired by Interactive Cognition. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26, 9121-9134.
Abstract:
Although object detection algorithms based on deep learning have been widely used in many scenarios, they face challenges under degraded conditions such as low light. A conventional solution is to use image enhancement as a separate pre-processing module to improve the quality of degraded images. However, this two-step approach makes it difficult to unify the goals of enhancement and detection; that is, low-light enhancement operations are not always helpful for subsequent object detection. Recently, some works have tried to integrate enhancement and detection in an end-to-end network, but they still suffer from complex network structures, training convergence problems, and the demand for reference images. To address the above problems, a plug-and-play image enhancement model is proposed in this paper, namely the low-light image enhancement (LLIE) model, which can be easily embedded into off-the-shelf object detection methods in an end-to-end manner. LLIE is composed of a parameter estimation module and an image processing module. The former learns to regress lighting enhancement parameters according to the feedback of the detection network, and the latter enhances degraded images adaptively to support the subsequent detection model under low-light conditions. Extensive object detection experiments on several low-light image datasets show that the performance of the detector is significantly improved when LLIE is integrated.
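A minimal sketch of the plug-and-play idea, assuming the parameter estimation module is a tiny CNN that regresses a gain and a gamma per image and the image processing module applies them differentiably, so a downstream detector's loss can update the estimator end-to-end. The parameter ranges and network size are illustrative assumptions.

```python
# Sketch: differentiable low-light enhancement whose parameters are regressed per image.
import torch
import torch.nn as nn

class LowLightEnhancer(nn.Module):
    def __init__(self):
        super().__init__()
        self.estimator = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 2),      # -> [gain_logit, gamma_logit]
        )

    def forward(self, img):        # img in [0, 1], shape (B, 3, H, W)
        p = self.estimator(img)
        gain = 1.0 + torch.sigmoid(p[:, 0]).view(-1, 1, 1, 1)         # brighten by up to 2x
        gamma = 0.4 + 0.6 * torch.sigmoid(p[:, 1]).view(-1, 1, 1, 1)  # gamma in (0.4, 1.0)
        return (img.clamp(min=1e-6) ** gamma * gain).clamp(0.0, 1.0)

enhancer = LowLightEnhancer()
dark = torch.rand(2, 3, 64, 64) * 0.2
bright = enhancer(dark)            # feed `bright` to an off-the-shelf detector; its loss
print(bright.mean() > dark.mean()) # back-propagates through the enhancer
```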
Keywords:
Plug-and-play; End-to-End; Low-light image enhancement; Object detection
Citation:
GB/T 7714 | Yuan, Jiaojiao, Hu, Yongli, Sun, Yanfeng, et al. A plug-and-play image enhancement model for end-to-end object detection in low-light condition [J]. MULTIMEDIA SYSTEMS, 2024, 30 (1).
MLA | Yuan, Jiaojiao, et al. "A plug-and-play image enhancement model for end-to-end object detection in low-light condition." MULTIMEDIA SYSTEMS 30.1 (2024).
APA | Yuan, Jiaojiao, Hu, Yongli, Sun, Yanfeng, Wang, Boyue, Yin, Baocai. A plug-and-play image enhancement model for end-to-end object detection in low-light condition. MULTIMEDIA SYSTEMS, 2024, 30 (1).
Abstract:
The goal of mixed-modality clustering, which differs from typical multi-modality/view clustering, is to divide samples derived from various modalities into several clusters. This task has to solve two critical semantic gap problems: i) how to generate the missing modalities without pairwise-modality data; and ii) how to align the representations of heterogeneous modalities. To tackle these problems, this paper proposes a novel mixed-modality clustering model that integrates missing-modality generation and heterogeneous modality alignment into a unified framework. During the missing-modality generation process, a bidirectional mapping is established between different modalities, enabling the generation of preliminary representations for the missing modality using information from the other modality. Intra-modality bipartite graphs are then constructed to help generate better missing-modality representations by weighted aggregation over existing intra-modality neighbors. In this way, a pairwise-modality representation can be obtained for each sample. In the heterogeneous modality alignment process, each modality is modelled as a graph to capture the global structure among intra-modality samples and is aligned against the heterogeneous modality representations through an adaptive heterogeneous graph matching module. Experimental results on three public datasets show the effectiveness of the proposed model compared to multiple state-of-the-art multi-modality/view clustering methods.
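The missing-modality generation step can be sketched as follows, assuming a linear bidirectional mapping produces a preliminary cross-modal representation that is then refined by weighted aggregation over intra-modality nearest neighbours (a simple stand-in for the bipartite-graph step). The mapping, neighbourhood size, and blending weights are illustrative assumptions, not the paper's model.

```python
# Sketch: generate a missing-modality representation and refine it with intra-modality neighbours.
import torch
import torch.nn.functional as F

def generate_missing(x_a, map_a2b, existing_b, k=3):
    """x_a: (n, d_a) observed-modality features; existing_b: (m, d_b) real features of the
    other modality from samples that do have it."""
    prelim = x_a @ map_a2b                                               # preliminary cross-modal mapping
    sim = F.normalize(prelim, dim=1) @ F.normalize(existing_b, dim=1).T  # (n, m) cosine similarity
    w, idx = sim.topk(k, dim=1)                                          # k nearest real neighbours
    w = F.softmax(w, dim=1)                                              # aggregation weights
    refined = (w.unsqueeze(-1) * existing_b[idx]).sum(dim=1)
    return 0.5 * prelim + 0.5 * refined                                  # blend mapped and aggregated views

x_a = torch.randn(5, 8)             # e.g. image-modality features
map_a2b = torch.randn(8, 6)         # assumed learned mapping a -> b
existing_b = torch.randn(20, 6)     # real text-modality features from other samples
print(generate_missing(x_a, map_a2b, existing_b).shape)  # torch.Size([5, 6])
```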
Keywords:
multi-view clustering; Web sites; adaptive graph structure learning; Data models; Bipartite graph; Semantics; Correlation; heterogeneous graph matching; Task analysis; Feature extraction; Mixed-modality clustering
Citation:
GB/T 7714 | He, Xiaxia, Wang, Boyue, Gao, Junbin, et al. Mixed-Modality Clustering via Generative Graph Structure Matching [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (12): 8773-8786.
MLA | He, Xiaxia, et al. "Mixed-Modality Clustering via Generative Graph Structure Matching." IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 36.12 (2024): 8773-8786.
APA | He, Xiaxia, Wang, Boyue, Gao, Junbin, Wang, Qianqian, Hu, Yongli, Yin, Baocai. Mixed-Modality Clustering via Generative Graph Structure Matching. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (12), 8773-8786.
Abstract:
A better knowledge-based visual question answering (KBVQA) model needs to rely on visual features, question features, and related external knowledge to solve open visual question answering tasks. Although existing knowledge-based visual question answering works have achieved notable results, the following challenges remain: 1) Visual feature information is seriously lacking. An image is worth a thousand words, and relying only on converted salient text makes it difficult to express the rich information of the original image. 2) The external knowledge acquired is not comprehensive enough, lacking knowledge retrieved directly from visual feature information. To address these challenges, we propose a Visual Information-Guided knowledge-based visual question answering (VIG) model, which fully exploits visual feature information. Specifically: 1) We introduce multi-granularity visual information that comprehensively characterizes visual features. 2) We consider not only the knowledge retrieved through text information but also the knowledge retrieved directly from visual feature information. Finally, we feed the visual features and the multiple pieces of retrieved textual knowledge into an encoder-decoder module to generate an answer. We perform extensive experiments on the OKVQA dataset and achieve a state-of-the-art accuracy of 60.27%.
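A minimal sketch of the two retrieval routes the abstract distinguishes: knowledge retrieved by text (question) similarity and knowledge retrieved directly by visual similarity, both concatenated into the input of a generator. The toy embeddings, fact list, and prompt format are assumptions; the paper feeds such an input to an encoder-decoder module to generate the answer.

```python
# Sketch: dual retrieval (text-guided and vision-guided) feeding a generator input.
import torch
import torch.nn.functional as F

def retrieve(query_emb, fact_embs, facts, k=2):
    sim = F.normalize(query_emb, dim=0) @ F.normalize(fact_embs, dim=1).T
    return [facts[i] for i in sim.topk(k).indices.tolist()]

facts = ["bananas are yellow", "fire trucks are red", "zebras have stripes"]
fact_embs = torch.randn(3, 16)      # stand-in knowledge-base embeddings
question_emb = torch.randn(16)      # stand-in question embedding
image_emb = torch.randn(16)         # stand-in multi-granularity visual embedding

text_knowledge = retrieve(question_emb, fact_embs, facts)
visual_knowledge = retrieve(image_emb, fact_embs, facts)   # the route retrieved directly from vision
generator_input = (
    "question: what color is the truck? "
    f"caption: a truck on a street. knowledge: {'; '.join(text_knowledge + visual_knowledge)}"
)
print(generator_input)  # fed to an encoder-decoder model to generate the answer
```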
Keywords:
Visual Information-Guided; External Knowledge; Knowledge-Based VQA
Citation:
GB/T 7714 | Liu, Heng, Wang, Boyue, Sun, Yanfeng, et al. VIG: Visual Information-Guided Knowledge-Based Visual Question Answering [J]. PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024: 1086-1091.
MLA | Liu, Heng, et al. "VIG: Visual Information-Guided Knowledge-Based Visual Question Answering." PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024 (2024): 1086-1091.
APA | Liu, Heng, Wang, Boyue, Sun, Yanfeng, Li, Xiaoyan, Hu, Yongli, Yin, Baocai. VIG: Visual Information-Guided Knowledge-Based Visual Question Answering. PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, 1086-1091.
Abstract:
Being capable of extracting more information than 2-D convolutional neural networks (CNNs), 3-D CNNs have been playing a vital role in video analysis tasks like human action recognition, but their massive operations hinder real-time execution on edge devices with constrained computation and memory resources. Although various model compression techniques have been applied to accelerate 2-D CNNs, there have been few efforts to investigate hardware-friendly pruning of 3-D CNNs and acceleration on customizable edge platforms like FPGAs. This work starts by proposing a kernel group row-column (KGRC) weight sparsity pattern, which is fine-grained to achieve high pruning ratios with negligible accuracy loss, and balanced across kernel groups to achieve high computation parallelism on hardware. The reweighted pruning algorithm for this sparsity is then presented and performed on 3-D CNNs, followed by quantization under different precisions. Along with model compression, FPGA-based accelerators with four modes are designed in support of the kernel group sparsity in multiple dimensions. The co-design framework of the pruning algorithm and the accelerator is tested on two representative 3-D CNNs, namely C3D and R(2+1)D, with the Xilinx ZCU102 FPGA platform for action recognition. The experimental results indicate that the accelerator implementation with the KGRC sparsity and 8-bit quantization achieves a good balance between speedup and model accuracy, leading to acceleration ratios of 4.12x for C3D and 3.85x for R(2+1)D compared with the 16-bit baseline designs supporting only dense models.
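One plausible reading of the balanced KGRC pattern is sketched below: output channels are split into kernel groups, and within each group whole kernel rows and columns are kept by aggregate magnitude, with every group keeping the same number of positions so the hardware workload stays balanced. The grouping, scoring, and keep ratio are assumptions, not the paper's reweighted pruning algorithm.

```python
# Sketch: a balanced, group-wise row/column mask for a 3-D convolution weight tensor.
import torch

def kgrc_style_mask(weight, group_size=4, keep_ratio=0.5):
    """weight: 3-D conv weight of shape (out_c, in_c, kt, kh, kw)."""
    out_c, in_c, kt, kh, kw = weight.shape
    mask = torch.zeros_like(weight)
    for g in range(0, out_c, group_size):
        w_g = weight[g:g + group_size]
        row_score = w_g.abs().sum(dim=(0, 1, 2, 4))        # score per kernel row (kh)
        col_score = w_g.abs().sum(dim=(0, 1, 2, 3))        # score per kernel column (kw)
        keep_rows = row_score.topk(max(1, int(keep_ratio * kh))).indices
        keep_cols = col_score.topk(max(1, int(keep_ratio * kw))).indices
        m = torch.zeros(kh, kw)
        m[keep_rows, :] = 1.0                              # keep the selected whole rows
        m[:, keep_cols] = 1.0                              # and whole columns
        mask[g:g + group_size] = m                         # same count kept in every group
    return mask

w = torch.randn(8, 3, 3, 3, 3)
m = kgrc_style_mask(w)
print(m.mean().item())  # fraction of weights kept; the pruned model uses w * m
```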
Keywords:
Quantization (signal); edge device inference; Parallel processing; model compression; Convolutional neural networks; Kernel; Computational modeling; Field programmable gate arrays; Three-dimensional displays; FPGA; 3-D convolutional neural network (CNN); weight pruning
Citation:
GB/T 7714 | Sun, Mengshu, Xu, Kaidi, Lin, Xue, et al. Hardware-Friendly 3-D CNN Acceleration With Balanced Kernel Group Sparsity [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43 (10): 3027-3040.
MLA | Sun, Mengshu, et al. "Hardware-Friendly 3-D CNN Acceleration With Balanced Kernel Group Sparsity." IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 43.10 (2024): 3027-3040.
APA | Sun, Mengshu, Xu, Kaidi, Lin, Xue, Hu, Yongli, Yin, Baocai. Hardware-Friendly 3-D CNN Acceleration With Balanced Kernel Group Sparsity. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43 (10), 3027-3040.
Abstract:
Graph Neural Networks (GNNs) have emerged as a dominant tool for effectively learning from graph data, leveraging their remarkable learning capabilities. However, many GNN-based techniques assume complete and accurate graph relations. Unfortunately, this assumption often diverges from reality, as real-world scenarios frequently exhibit missing and erroneous edges within graphs. Consequently, GNNs that rely solely on the original graph structure inevitably lead to suboptimal results. To address this challenge, we propose a novel approach known as Multi-graph fusion and Virtual node enhanced Graph Neural Networks (MVGNN). Initially, we introduce an adaptive graph that complements the original and feature graphs, bridging their gaps, capturing missing edges, and refining the graph structure. Subsequently, we merge the original, feature, and adaptive graphs by applying attention mechanisms. In addition, MVGNN strategically designs virtual nodes, which act as auxiliary elements that change the propagation mode across low-weight edges and further enhance the robustness of the model. The proposed MVGNN is evaluated on six benchmark datasets, demonstrating its superiority over existing state-of-the-art classification methods.
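A minimal sketch of the fusion idea, assuming the original graph, a kNN feature graph, and an adaptive graph are combined with softmax attention weights and then used for one normalized propagation step, with a single virtual node linked to every sample. The softmax fusion and the single virtual node are illustrative assumptions, not the MVGNN design.

```python
# Sketch: fuse original / feature / adaptive graphs, add a virtual node, propagate once.
import torch
import torch.nn.functional as F

def knn_graph(x, k=3):
    sim = F.normalize(x, dim=1) @ F.normalize(x, dim=1).T
    idx = sim.topk(k + 1, dim=1).indices[:, 1:]        # drop self-similarity
    a = torch.zeros_like(sim)
    a.scatter_(1, idx, 1.0)
    return ((a + a.T) > 0).float()

n, d = 6, 8
x = torch.randn(n, d)
a_orig = (torch.rand(n, n) > 0.7).float()              # stand-in original graph
a_feat = knn_graph(x)                                  # feature graph from attributes
a_adapt = torch.sigmoid(torch.randn(n, n))             # adaptive graph (learnable in the real model)
alpha = F.softmax(torch.randn(3), dim=0)               # attention weights over the three graphs
a_fused = alpha[0] * a_orig + alpha[1] * a_feat + alpha[2] * a_adapt

# append one virtual node linked to all real nodes to reroute propagation
a_full = torch.ones(n + 1, n + 1)
a_full[:n, :n] = a_fused
x_full = torch.cat([x, x.mean(dim=0, keepdim=True)], dim=0)

deg = a_full.sum(dim=1, keepdim=True).clamp(min=1e-6)
h = (a_full / deg) @ x_full                            # one normalized propagation step
print(h.shape)  # torch.Size([7, 8])
```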
Keywords:
Graph Convolutional Networks; Robustness; Virtual nodes; Classification
Citation:
GB/T 7714 | Yang, Yachao, Sun, Yanfeng, Guo, Jipeng, et al. Multi-graph Fusion and Virtual Node Enhanced Graph Neural Networks [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT V, 2024, 15020: 190-201.
MLA | Yang, Yachao, et al. "Multi-graph Fusion and Virtual Node Enhanced Graph Neural Networks." ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT V 15020 (2024): 190-201.
APA | Yang, Yachao, Sun, Yanfeng, Guo, Jipeng, Wang, Shaofan, Yin, Baocai. Multi-graph Fusion and Virtual Node Enhanced Graph Neural Networks. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT V, 2024, 15020, 190-201.