
Author:

Jia, Maoshen | Yang, Ziyu | Bao, Changchun (Scholars: 鲍长春) | Zheng, Xiguang | Ritz, Christian

Indexed by:

EI Scopus SCIE

Abstract:

Preserving audio scenes in the form of audio objects has become common in recent years. Object-based audio techniques provide more flexibility for personalized rendering as well as a more accurate audio object trajectory. For encoding and transmitting multiple audio objects in a lossy manner, a new compression framework for multiple simultaneously occurring audio objects is presented in this work. The proposed encoding approach is based on the intra-object sparsity (approximate k-sparsity). After establishing a quantitative measure of approximate k-sparsity, statistical analysis is employed to validate the proposed intra-object sparsity of audio objects. By exploring this intra-object sparsity, multiple simultaneously occurring audio objects are compressed into a mono downmix signal with side information. This downmix signal can be further compressed by legacy audio codecs. Meanwhile, the side information is transmitted in a lossless manner. The objective and subjective evaluations revealed that the proposed compression framework achieved better perceptual quality compared to an existing technique where up to eight audio objects are considered. The subjective evaluations also confirmed that the proposed approach is able to achieve scalable transmission according to the bandwidth while preserving the perceptual quality of both the individual audio objects and the spatial audio scenes.
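The idea of approximate k-sparsity and of a mono downmix with side information can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes that each object is given as a list of real-valued transform coefficients for one frame, that sparsity is measured as the share of frame energy held by the k largest coefficients, and that the downmix keeps only the strongest object in each bin. The function names `energy_ratio_top_k`, `downmix_dominant`, and `separate` are hypothetical.

```python
def energy_ratio_top_k(coeffs, k):
    """Share of total energy held by the k largest-magnitude coefficients.

    Values close to 1.0 suggest the frame is approximately k-sparse
    (an assumed measure, not necessarily the one used in the paper).
    """
    energies = sorted((abs(c) ** 2 for c in coeffs), reverse=True)
    total = sum(energies)
    if total == 0.0:
        return 1.0  # an all-zero frame is treated as trivially sparse
    return sum(energies[:k]) / total


def downmix_dominant(objects):
    """Build a mono downmix by keeping the strongest object in each bin.

    Returns the downmix coefficients and, as side information, the index
    of the object that each bin was taken from.
    """
    downmix, side_info = [], []
    for b in range(len(objects[0])):
        idx = max(range(len(objects)), key=lambda i: abs(objects[i][b]))
        downmix.append(objects[idx][b])
        side_info.append(idx)
    return downmix, side_info


def separate(downmix, side_info, num_objects):
    """Approximate each object by assigning every bin to its recorded owner."""
    recovered = [[0.0] * len(downmix) for _ in range(num_objects)]
    for b, (value, idx) in enumerate(zip(downmix, side_info)):
        recovered[idx][b] = value
    return recovered


# Two toy "objects" whose energy sits in different bins.
obj_a = [4.0, 0.1, 0.0, 3.0]
obj_b = [0.2, 5.0, 1.0, 0.1]
mix, side = downmix_dominant([obj_a, obj_b])
rec_a, rec_b = separate(mix, side, 2)
```

In this toy case the small coefficients of each object are lost, which is the lossy part of the sketch. The side information (one object index per bin) would be stored exactly, in line with the lossless side-information transmission the abstract mentions.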

Keyword:

Audio object coding; multichannel audio compression; sparsity

Author Community:

  • [ 1 ] [Jia, Maoshen]Beijing Univ Technol, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China
  • [ 2 ] [Yang, Ziyu]Beijing Univ Technol, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China
  • [ 3 ] [Bao, Changchun]Beijing Univ Technol, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China
  • [ 4 ] [Zheng, Xiguang]Univ Wollongong, ICT Res Inst, Wollongong, NSW 2500, Australia
  • [ 5 ] [Ritz, Christian]Univ Wollongong, ICT Res Inst, Wollongong, NSW 2500, Australia
  • [ 6 ] [Zheng, Xiguang]Univ Wollongong, Sch Elect Comp & Telecommun, Wollongong, NSW 2500, Australia
  • [ 7 ] [Ritz, Christian]Univ Wollongong, Sch Elect Comp & Telecommun, Wollongong, NSW 2500, Australia

Reprint Author's Address:

  • [Jia, Maoshen]Beijing Univ Technol, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China


Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

ISSN: 2329-9290

Year: 2015

Issue: 6

Volume: 23

Page: 1082-1095

5.400 (JCR@2022)

ESI Discipline: ENGINEERING;

ESI HC Threshold:174

JCR Journal Grade:2

CAS Journal Grade:2

Cited Count:

WoS CC Cited Count: 22

SCOPUS Cited Count: 30

ESI Highly Cited Papers on the List: 0
