Indexed in:
Abstract:
Text-line segmentation is an important task in historical Tibetan document recognition. Historical Tibetan document images often contain characters that touch or overlap between consecutive text lines, which makes text-line segmentation difficult. In this paper, we present a text-line segmentation method based on baseline detection. The initial baseline position of each line is obtained by template matching, a pruning algorithm, and a morphological closing operation. The baseline is then estimated by dynamic tracing over the pixel points of each line, using the context information between pixel points. Overlapping or touching areas are cut at the position where the stroke width is minimal. Finally, text lines are extracted based on the estimated baselines and the cut positions of the touching areas. The proposed algorithm has been evaluated on a dataset of historical Tibetan document images, and the experimental results show that the method is effective.
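The cutting step can be illustrated with a minimal sketch. The abstract does not say how stroke width is measured, so the code below assumes a binarized crop of one touching area (1 = ink, 0 = background) and takes the stroke width of a row to be its ink-pixel count. The function names `min_width_cut_row` and `cut_region` are hypothetical and are not taken from the paper.

```python
def min_width_cut_row(region):
    """Return the index of the non-empty row with the fewest ink pixels.

    region: list of rows, each a list of 0/1 values (1 = ink).
    Assumption: row-wise ink count stands in for stroke width.
    Returns None if the region contains no ink at all.
    """
    best_row, best_width = None, None
    for y, row in enumerate(region):
        width = sum(row)
        if width == 0:
            continue  # nothing to cut in an empty row
        if best_width is None or width < best_width:
            best_row, best_width = y, width
    return best_row


def cut_region(region, row_index):
    """Return a copy of region with the chosen row cleared to background."""
    return [[0] * len(row) if y == row_index else list(row)
            for y, row in enumerate(region)]


# Toy touching area: a descender from the upper line narrows to one pixel
# before merging into a glyph on the lower line.
touching = [
    [1, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
]
cut_row = min_width_cut_row(touching)
separated = cut_region(touching, cut_row)
```

In this toy input the narrowest row is the one-pixel bridge in the middle, so clearing it disconnects the upper and lower components. A real implementation would likely restrict the search to the band between two estimated baselines.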
Keywords:
Corresponding author information: