End-to-End Music Note Recognition Based on Deep Learning

Author：

Huang, Zhiqing | Jia, Xiang | Guo, Yifan | Zhang, Jing

Indexed by：

EI CSCD

Abstract：

Optical music recognition (OMR) is an important technology in music information retrieval, and note recognition is the key part of music score recognition. To address the low accuracy of note recognition and the cumbersome multi-step processing of music score images, an end-to-end note recognition model based on deep learning was designed. The model uses a deep convolutional neural network that takes the whole score image as input and directly outputs the duration and pitch of each note. In data preprocessing, the music images and the corresponding label data required for model training were obtained by parsing MusicXML files. Each label was a vector composed of note pitch, note duration, and note coordinates, so by learning the label vector the model turned note recognition into detection and classification tasks. Data augmentation methods such as added noise and random cropping were applied to increase the diversity of the data, which made the trained model more robust. In the model design, an end-to-end object detection model based on the darknet53 backbone network and feature fusion was built to recognize the notes. The deep neural network darknet53 was used to extract feature maps from the music image so that the notes on the feature map had a sufficiently large receptive field. An upper-layer feature map of the network was then concatenated with this feature map, and the resulting feature fusion gave the notes more distinct features and texture, allowing the model to detect small objects such as notes. The model adopted multi-task learning, jointly learning the pitch and duration classification tasks and the note coordinate task, which improved its generalization ability. Finally, the model was tested on a test set generated by MuseScore, where it achieved a duration accuracy of 0.96 and a pitch accuracy of 0.98.
© 2020, Editorial Board of Journal of Tianjin University (Science and Technology). All rights reserved.
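
The label vector and the multi-task objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class vocabularies (PITCHES, DURATIONS), the Note type, the function names, the use of normalized coordinates, and the coord_weight parameter are all assumptions made for illustration, since the abstract does not give the actual label sets or loss formulation. The sketch encodes one note as a vector of pitch class, duration class, and coordinates, then combines two classification losses with a coordinate regression loss.

```python
import math
from dataclasses import dataclass

# Hypothetical class vocabularies (assumption: the real label sets are not
# stated in the abstract).
PITCHES = ["C4", "D4", "E4", "F4", "G4", "A4", "B4", "C5"]
DURATIONS = ["whole", "half", "quarter", "eighth", "16th"]


@dataclass
class Note:
    pitch: str      # pitch name, e.g. as read from a MusicXML <pitch> element
    duration: str   # note type, e.g. as read from a MusicXML <type> element
    x: float        # note position in the rendered image, in pixels
    y: float


def encode_label(note, img_w, img_h):
    """Build a label vector [pitch_index, duration_index, x_norm, y_norm]."""
    return [
        PITCHES.index(note.pitch),
        DURATIONS.index(note.duration),
        note.x / img_w,   # coordinates scaled to [0, 1] (an assumption)
        note.y / img_h,
    ]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the target class."""
    return -math.log(softmax(logits)[target_index])


def multitask_loss(pitch_logits, duration_logits, pred_xy, label, coord_weight=1.0):
    """Sum of pitch and duration classification losses plus a squared-error
    coordinate loss, weighted by the hypothetical coord_weight."""
    pitch_idx, dur_idx, x, y = label
    cls = cross_entropy(pitch_logits, pitch_idx) + cross_entropy(duration_logits, dur_idx)
    reg = (pred_xy[0] - x) ** 2 + (pred_xy[1] - y) ** 2
    return cls + coord_weight * reg


if __name__ == "__main__":
    label = encode_label(Note("E4", "quarter", 128.0, 64.0), img_w=512, img_h=256)
    # An untrained model with uniform scores and perfect coordinates.
    loss = multitask_loss([0.0] * len(PITCHES), [0.0] * len(DURATIONS), (0.25, 0.25), label)
    print(label, loss)
```

Sharing one network across the three outputs means a single forward pass yields pitch, duration, and position together, which is the sense in which the abstract calls the model end-to-end.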

Keyword：

Learning systems; Textures; Deep neural networks; Multi-task learning; Multilayer neural networks; Deep learning; Convolutional neural networks; Feature extraction; Object detection

Author Affiliations：

[ 1 ] [Huang, Zhiqing] Faculty of Information Science, Beijing University of Technology, Beijing 100022, China
[ 2 ] [Jia, Xiang] Faculty of Information Science, Beijing University of Technology, Beijing 100022, China
[ 3 ] [Guo, Yifan] Faculty of Information Science, Beijing University of Technology, Beijing 100022, China
[ 4 ] [Zhang, Jing] Faculty of Information Science, Beijing University of Technology, Beijing 100022, China

Reprint Author's Address：

[Huang, Zhiqing] Faculty of Information Science, Beijing University of Technology, Beijing 100022, China

Email：

zqhuang@bjut.edu.cn

Source ：

Journal of Tianjin University Science and Technology

ISSN： 0493-2137

Year： 2020

Issue： 6

Volume： 53

Page： 653-660

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 8

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

Faculty of Information Technology (信息学部)
