Encoding Multiple Audio Objects Using Intra-Object Sparsity

Author:

Jia, Maoshen | Yang, Ziyu | Bao, Changchun (鲍长春) | Zheng, Xiguang | Ritz, Christian

Indexed by:

EI, Scopus, SCIE

Abstract:

Preserving audio scenes in the form of audio objects has become common in recent years. Object-based audio techniques provide more flexibility for personalized rendering as well as a more accurate audio object trajectory. For encoding and transmitting multiple audio objects in a lossy manner, a new compression framework for multiple simultaneously occurring audio objects is presented in this work. The proposed encoding approach is based on the intra-object sparsity (approximate k-sparsity). After establishing a quantitative measure of approximate k-sparsity, statistical analysis is employed to validate the proposed intra-object sparsity of audio objects. By exploring this intra-object sparsity, multiple simultaneously occurring audio objects are compressed into a mono downmix signal with side information. This downmix signal can be further compressed by legacy audio codecs. Meanwhile, the side information is transmitted in a lossless manner. The objective and subjective evaluations revealed that the proposed compression framework achieved better perceptual quality compared to an existing technique where up to eight audio objects are considered. The subjective evaluations also confirmed that the proposed approach is able to achieve scalable transmission according to the bandwidth while preserving the perceptual quality of both the individual audio objects and the spatial audio scenes.
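
The abstract does not give the sparsity measure or the downmix rule, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' method. It assumes that approximate k-sparsity can be gauged by the fraction of a frame's energy held by its k largest-magnitude coefficients, and that a mono downmix can be formed by keeping, for each bin, the strongest object and recording its index as side information. The function names `energy_in_top_k`, `downmix_dominant`, and `reconstruct` are hypothetical.

```python
def energy_in_top_k(coeffs, k):
    """Fraction of total energy held by the k largest-magnitude coefficients.

    Assumed stand-in for a quantitative measure of approximate k-sparsity.
    """
    energies = sorted((abs(c) ** 2 for c in coeffs), reverse=True)
    total = sum(energies)
    if total == 0:
        return 1.0  # an all-zero frame is trivially sparse
    return sum(energies[:k]) / total


def downmix_dominant(objects):
    """Toy mono downmix: per bin, keep the strongest object.

    Returns the downmix values and, as side information, the index of the
    object each value came from. This selection rule is an assumption.
    """
    downmix, side_info = [], []
    for bins in zip(*objects):
        idx = max(range(len(bins)), key=lambda i: abs(bins[i]))
        downmix.append(bins[idx])
        side_info.append(idx)
    return downmix, side_info


def reconstruct(downmix, side_info, num_objects):
    """Route each downmix value back to the object named in the side info."""
    objects = [[0.0] * len(downmix) for _ in range(num_objects)]
    for n, (value, idx) in enumerate(zip(downmix, side_info)):
        objects[idx][n] = value
    return objects


# Two toy "objects" whose significant coefficients fall in different bins.
obj_a = [4.0, 0.0, 0.1, 0.0]
obj_b = [0.0, 3.0, 0.0, 0.2]
mix, side = downmix_dominant([obj_a, obj_b])
recovered = reconstruct(mix, side, num_objects=2)
```

In this toy case the two objects never share a bin, so the reconstruction is exact; when objects overlap in a bin, the weaker contribution is lost, which is why the sparsity of each object matters for this kind of scheme.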

Keywords:

Audio object coding; multichannel audio compression; sparsity

Author Affiliations:

[1] [Jia, Maoshen] Beijing Univ Technol, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China
[2] [Yang, Ziyu] Beijing Univ Technol, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China
[3] [Bao, Changchun] Beijing Univ Technol, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China
[4] [Zheng, Xiguang] Univ Wollongong, ICT Res Inst, Wollongong, NSW 2500, Australia
[5] [Ritz, Christian] Univ Wollongong, ICT Res Inst, Wollongong, NSW 2500, Australia
[6] [Zheng, Xiguang] Univ Wollongong, Sch Elect Comp & Telecommun, Wollongong, NSW 2500, Australia
[7] [Ritz, Christian] Univ Wollongong, Sch Elect Comp & Telecommun, Wollongong, NSW 2500, Australia

Reprint Author's Address:

[Jia, Maoshen] Beijing Univ Technol, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China

Email:

jiamaoshen@bjut.edu.cn
yangziyu@emails.bjut.edu.cn
chchbao@bjut.edu.cn
xz725@uow.edu.au
critz@uow.edu.au

Related Keywords:

Supervised sparsity preserving projections for face recognition
2011, 3rd International Conference on Digital Image Processing (ICDIP 2011)
A Flexible Sparsity-Aware Accelerator with High Sensitivity and Efficient Operation for Convolutional Neural Networks
2022, CIRCUITS SYSTEMS AND SIGNAL PROCESSING
Multiple Audio Source Separation by Using Intra-Object-Sparsity Encoding Framework
2017, IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC)
A Novel Reconstruction Algorithm for Bioluminescent Tomography Based on Bayesian Compressive Sensing
2016, SPIE Biomedical Applications in Molecular, Structural and Functional Imaging Conference

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

ISSN: 2329-9290

Year: 2015

Issue: 6

Volume: 23

Page: 1082-1095

Impact Factor: 5.400 (JCR@2022)

ESI Discipline: ENGINEERING

ESI HC Threshold: 174

JCR Journal Grade: 2

CAS Journal Grade: 2

Cited Count:

WoS CC Cited Count: 22

SCOPUS Cited Count: 30

ESI Highly Cited Papers on the List: 0

WanFang Cited Count：

Chinese Cited Count：

Affiliated Colleges:

Faculty of Information Technology (信息学部)
