
Principal Investigator

Mark Plumbley

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

Professor of Signal Processing in the Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey. He has investigated a wide range of audio signal processing methods, including automatic music transcription (Abdallah & Plumbley, 2006) and audio source separation (Nesbit et al, 2011), using techniques such as sparse representations (Plumbley et al, 2010) and high-resolution NMF (Badeau & Plumbley, 2014). He led the D-CASE data challenge on Detection and Classification of Acoustic Scenes and Events (Stowell et al, 2015), his work on bird sound classification (Stowell & Plumbley, 2014) was widely featured in news media, and his collaborative article on best practices for scientific computing (Wilson et al, 2014) was the most-read PLoS Biology article of 2014. He is PI on the EPSRC project Musical Audio Repurposing using Source Separation, is the lead academic on the Innovate UK projects Audio Data Exploration and Advanced Smart Microphone with the company Audio Analytic, and has been PI on several other EPSRC grants. Before joining Surrey (Jan 2015), Plumbley was Director of the world-leading Centre for Digital Music (C4DM) at Queen Mary University of London.

Co-Investigators

Philip Jackson

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

Senior Lecturer at CVSSP, where he leads the Machine Audition Lab. His research in audio and speech processing, spanning over 100 academic publications, has contributed to projects (e.g., Columbo, BALTHASAR, DANSA, QESTRAL, UDRC and POSZ) in the acoustics of speech production (Jackson & Shadle, 2001), audio source separation (Liu et al, 2013; Alinaghi et al, 2014), audio-visual processing for speech enhancement and visual speech synthesis, and spatial aspects of subjective sound quality evaluation (Coleman et al, 2014; Conetta et al, 2014). He is Surrey's technical lead on the EPSRC-funded S3A Programme Grant on spatial audio, responsible for object-based audio production.

Wenwu Wang

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

Reader in Signal Processing at CVSSP and Co-Director of its Machine Audition Lab. His research interests include audio source separation (Liu et al, 2013; Alinaghi et al, 2014), blind dereverberation (Jan & Wang, 2012), sparse signal processing (Dai et al, 2012), machine learning (Barnard et al, 2014), and audio-visual signal processing (Kilic et al, 2015). He has over 150 publications in these areas, including two books: Machine Audition: Principles, Algorithms and Systems (2010) and Blind Source Separation: Advances in Theory, Algorithms and Applications (2014). He has been PI and Co-I on several EPSRC projects, including Multimodal Blind Source Separation for Robot Audition (EPSRC EP/H012842/1) and Audio and Video Based Speech Separation of Multiple Moving Sources (EPSRC EP/H050000/1).

Krystian Mikolajczyk

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

Reader in Robot Vision at CVSSP. He has more than 80 publications in top-tier computer vision and machine learning forums, in areas such as feature extraction, object recognition, and tracking, including kernel-based methods (Yan et al, 2011), audio-visual annotation (Yan et al, 2014), and deep canonical correlation analysis (Yan & Mikolajczyk, 2015). He received the Longuet-Higgins Prize in 2014 for his contribution to invariant local image descriptors. He chaired BMVC 2012 and IEEE AVSS 2013. His team regularly places at the top of retrieval challenges, including TRECVid 2008, the Visual Object Classes Challenge 2008 & 2010, and ImageCLEF 2010.

David Frohlich

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

Director of the Digital World Research Centre (DWRC) at Surrey and Professor of Interaction Design. He joined the Centre in 2005 from HP Labs to establish a new research agenda on user-centred innovation in digital media technology. His work explores a variety of new media futures relating to digital storytelling, personal media collections, and community news and arts. David has a longstanding interest in sound through his work on audiophotography as a new media form (Frohlich, 2004; Frohlich & Fennell, 2007). This led to the first published study of the domestic soundscape on the Sonic Interventions project (Oleksik et al, 2008), the Com-Me toolkit (Frohlich et al, 2012) for creating audiophoto narratives on the Community Generated Media project (EP/H042857/1), and the design of audiopaper documents on the Interactive Newsprint and Light Tags projects (EP/I032142/1, EP/K503939/1).

Bill Davies

School of Computing, Science & Engineering
University of Salford, Salford M5 4WT, UK

Expert in soundscapes, room acoustics and perception, with 27 years' experience in acoustic and psychoacoustic research methods. He led the EPSRC Positive Soundscape Project (EP/E011624/1), a large consortium project involving artists, social scientists and acousticians that developed new ways of evaluating soundscapes. Project outputs included new methods to measure soundscape perception, advice for urban planners, a major art exhibition, radio programmes, and a soundscape sequencer toy to help people gain a greater understanding of their aural environment (Davies et al, 2013). He edited a special edition of Applied Acoustics on soundscapes, and sits on ISO TC43/SC1/WG54, which produces standards on soundscape assessment. He led the Defra project NANR200 to advise on UK soundscape policy (Payne et al, 2009). Davies is currently Vice-President (International) of the Institute of Acoustics, and leads work on perception of complex auditory scenes on the EPSRC-funded S3A Programme Grant.

Trevor Cox

School of Computing, Science & Engineering
University of Salford, Salford M5 4WT, UK

Professor of Acoustic Engineering and a past President of the Institute of Acoustics (IOA). He was awarded the IOA's Tyndall Medal in 2004. He has been an investigator on EPSRC projects on room acoustics, signal processing, perception and public engagement, including EP/J013013/1 on the perception and automatic detection of audio recording errors using perceptual testing and blind signal processing methods (Jackson et al, 2014). Cox leads the qualitative and quantitative perceptual work on the EPSRC S3A Programme Grant. He was an EPSRC Senior Media Fellow, has presented 21 science documentaries on BBC Radio, and has authored articles for The Guardian, New Scientist and Sound on Sound. His popular science book Sonic Wonderland was published in 2014. He pioneered psychoacoustic testing as a method for engaging the public, working with BBC R&D and the British Science Association on theme tunes (Davies et al, 2011), a technique that will be exploited in this project.

Postdoctoral Researchers

Oliver Bones

School of Computing, Science & Engineering
University of Salford, Salford M5 4WT, UK

A Research Fellow with research interests in sound perception and auditory science. He completed his PhD with Chris Plack at the University of Manchester, researching the neural basis of individual differences in the perception of musical consonance. He then took a postdoctoral research position with Patrick Wong at the Chinese University of Hong Kong, researching pitch perception in tone-language speakers, before joining the Acoustic Research Centre at the University of Salford as a Research Fellow working with Bill Davies and Trevor Cox.

Qiang Huang

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

A Research Fellow working on the Making Sense of Sound project. He undertook his PhD research in speech recognition and natural language processing at the University of East Anglia. He has worked on several EPSRC projects on speech recognition, information retrieval, sports video analysis and multimodal dialogue systems. He now focuses on multimodal information processing for sound understanding.

Yong Xu

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

Research Software Developer

Christian Kroos

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

Cognitive scientist with a focus on algorithm development. He was awarded an MA and PhD in Phonetics and Speech Communication, with Logic as a minor, by the Ludwig-Maximilian-Universität (Munich, Germany). Over the last two decades he has conducted research in Germany (Ludwig-Maximilian-Universität), Japan (ATR, Kyoto), the USA (Haskins Laboratories, New Haven, CT), Australia (Western Sydney University, Sydney & Curtin University, Perth) and the UK, spanning cognitive science, artificial intelligence, robotics and the arts.

Associated Researchers

Iwona Sobieraj

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

Qiuqiang Kong

Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK

PhD student working on non-speech audio processing using deep learning methods.