Indexed in:
Abstract:
Training a BP (back-propagation) network on large volumes of text is time-consuming. Building on an analysis of the open-source Hadoop distributed computing platform and of parallel training methods for BP networks, we design a data-parallel BP network text categorization model on Hadoop using the MapReduce programming model. The model uses batch training: each node sums the training errors of its samples, and the network weights are adjusted once the accumulated error has been obtained. Text categorization is also performed in parallel. Running on the Hadoop platform improves the training speed of the BP network and the efficiency of text categorization, while achieving good categorization performance. © 2013 IEEE.
Keywords:
Corresponding author information:
Email address:
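
The batch, data-parallel training scheme summarized in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation and uses no Hadoop API: the function names (init_weights, map_shard, reduce_pair, train_epoch), the one-hidden-layer sigmoid network, the squared-error loss, and the learning rate are all assumptions made for illustration. Each call to map_shard stands in for a mapper working on its own data shard, reduce_pair stands in for the reducer that sums the partial results, and the weights are updated once per epoch from the accumulated totals.

```python
import math
import random
from functools import reduce


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_in, n_hidden, seed=0):
    """Hypothetical helper: small random weights, with one extra bias column per layer."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def forward(weights, x):
    """Return the hidden activations and the scalar output for one sample."""
    w_hidden, w_out = weights
    xb = x + [1.0]  # input plus bias term
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden + [1.0])))
    return hidden, out


def map_shard(weights, shard):
    """Map step: accumulate gradients and squared error over one data shard."""
    w_hidden, w_out = weights
    g_hidden = [[0.0] * len(row) for row in w_hidden]
    g_out = [0.0] * len(w_out)
    err = 0.0
    for x, target in shard:
        hidden, out = forward(weights, x)
        xb, hb = x + [1.0], hidden + [1.0]
        delta_out = (out - target) * out * (1.0 - out)
        for j, h in enumerate(hb):
            g_out[j] += delta_out * h
        for j, h in enumerate(hidden):
            delta_h = delta_out * w_out[j] * h * (1.0 - h)
            for i, v in enumerate(xb):
                g_hidden[j][i] += delta_h * v
        err += 0.5 * (out - target) ** 2
    return g_hidden, g_out, err


def reduce_pair(a, b):
    """Reduce step: element-wise sum of two partial (gradients, error) results."""
    g_hidden = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    g_out = [p + q for p, q in zip(a[1], b[1])]
    return g_hidden, g_out, a[2] + b[2]


def train_epoch(weights, shards, lr=0.3):
    """One batch epoch: map over shards, reduce, then a single weight update."""
    partials = [map_shard(weights, s) for s in shards]  # one mapper per shard
    g_hidden, g_out, err = reduce(reduce_pair, partials)
    w_hidden, w_out = weights
    new_hidden = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(w_hidden, g_hidden)]
    new_out = [w - lr * g for w, g in zip(w_out, g_out)]
    return (new_hidden, new_out), err
```

Because the per-shard gradients are summed before the update, splitting the data across more shards leaves the update unchanged up to floating-point rounding. That property is what allows the map step to be distributed across nodes in a batch scheme of this kind.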