Abstract:
In recent years, text detection in natural scenes has attracted increasing attention because of its many real-world applications. Most existing methods detect only horizontal or nearly horizontal text and rely on complicated multi-step pipelines. When a neural network is used to detect text in an image, ambiguous or small words are easily missed because of the many pooling operations. Therefore, this paper proposes an end-to-end trainable neural network for detecting multi-oriented text lines or words in natural scene images. The network fuses multi-level features and is guided by deep supervision during training, so that richer hierarchical representations can be learned automatically. It makes two kinds of predictions, text/non-text classification and location regression, so multi-oriented words or text lines can be located directly, without unnecessary intermediate steps. Experimental results on the ICDAR 2015 and MSRA-TD500 datasets show that the proposed method outperforms state-of-the-art methods by a noticeable margin in F-score.
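As an illustration of how deep supervision can combine with the two prediction types named in the abstract, the following is a minimal sketch only, not the paper's actual training objective. The abstract does not state which loss functions or weights are used, so everything here is an assumption: binary cross-entropy for the text/non-text classification, smooth L1 for the location regression, and the hypothetical names `output_loss`, `deeply_supervised_loss`, `reg_weight`, and `side_weight`. Each intermediate (side) output and the fused output are scored against the same ground truth, and the losses are summed.

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted text probability p and label y (0 or 1)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def smooth_l1(pred, target):
    """Smooth L1 distance between one predicted and one target offset."""
    d = abs(pred - target)
    return 0.5 * d * d if d < 1.0 else d - 0.5

def output_loss(scores, offsets, labels, gt_offsets, reg_weight=1.0):
    """Loss of a single output (assumed form): classification averaged over all
    pixels, regression averaged over text pixels only."""
    cls = sum(bce(p, y) for p, y in zip(scores, labels)) / len(labels)
    pos = [i for i, y in enumerate(labels) if y == 1]
    if not pos:
        return cls
    reg = sum(
        smooth_l1(a, b)
        for i in pos
        for a, b in zip(offsets[i], gt_offsets[i])
    ) / len(pos)
    return cls + reg_weight * reg

def deeply_supervised_loss(side_outputs, fused_output, labels, gt_offsets,
                           side_weight=1.0):
    """Fused-output loss plus weighted losses of every side output.

    side_outputs is a list of (scores, offsets) pairs, one per supervised level;
    fused_output is a single (scores, offsets) pair.
    """
    side = sum(output_loss(s, o, labels, gt_offsets) for s, o in side_outputs)
    fused_scores, fused_offsets = fused_output
    return output_loss(fused_scores, fused_offsets, labels, gt_offsets) + side_weight * side
```

In this sketch every supervised level receives its own gradient signal, which is the general idea behind deep supervision; the equal weighting of levels is a simplification chosen for the example.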