Papers
What follows is a list of 495 papers that mention Freesound or use Freesound data for research.
This list is created automatically by finding articles that cite one of the main Freesound
reference papers. Some entries have also been added manually. Papers are sorted by year of publication
and alphabetically by first author surname.
If you have a paper that should be on the list but is missing,
please send us an email at freesound@freesound.org.
2022 (78)
- A. Madhu, S. K.. Envgan: A Gan-Based Augmentation To Improve Environmental Sound Classification. Artificial Intelligence Review (2022).
- A. Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, T. Virtanen. Starss22: A Dataset Of Spatial Recordings Of Real Scenes With Spatiotemporal Annotations Of Sound Events. ArXiv (2022).
- Ahmed Omran, Neil Zeghidour, Zalán Borsos, F. D. C. Quitry, M. Slaney, M. Tagliasacchi. Disentangling Speech From Surroundings In A Neural Audio Codec. ArXiv (2022).
- Alexander Ponomarchuk, Ilya Burenko, Elian Malkin, Ivan Nazarov, Vladimir Kokh, Manvel Avetisian, Leonid Zhukov. Project Achoo: A Practical Model And Application For Covid-19 Detection From Recordings Of Breath, Voice, And Cough. IEEE Journal of Selected Topics in Signal Processing (2022).
- Alison B. Ma, Alexander Lerch. Representation Learning For The Automatic Indexing Of Sound Effects Libraries (2022).
- Ana Elisa Méndez Méndez, M. Cartwright, J. Bello, O. Nov. Eliciting Confidence For Improving Crowdsourced Audio Annotations. Proceedings of the ACM on Human-Computer Interaction (2022).
- Anam Bansal, N. Garg. Environmental Sound Classification: A Descriptive Review Of The Literature. Intelligent Systems with Applications (2022).
- Arsha Nagrani, P. H. Seo, Bryan Seybold, Anja Hauth, Santiago Manén, Chen Sun, C. Schmid. Learning Audio-Video Modalities From Image Captions. ArXiv (2022).
- Benjamin Elizalde, Soham Deshmukh, Mahmoud Al Ismail, Huaming Wang. Clap: Learning Audio Concepts From Natural Language Supervision. ArXiv (2022).
- Byeongil Ko, Hyeonuk Nam, Seong-Hu Kim, Deokki Min, Seung-Deok Choi, Yong-Hwa Park. Data Augmentation And Squeeze-And-Excitation Network On Multiple Dimension For Sound Event Localization And Detection In Real Scenes (2022).
- Calum Heggan, S. Budgett, Timothy M. Hospedales, Mehrdad Yaghoobi. Metaaudio: A Few-Shot Audio Classification Benchmark. ArXiv (2022).
- Chi-Chang Lee, Cheng-Hung Hu, Yu-Chen Lin, Chu-Song Chen, Hsin-Min Wang, Yu Tsao. Nastar: Noise Adaptive Speech Enhancement With Target-Conditional Resampling. ArXiv (2022).
- Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, N. Harada, K. Kashino. Introducing Auxiliary Text Query-Modifier To Content-Based Audio Retrieval. ArXiv (2022).
- Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, N. Harada, K. Kashino. Masked Spectrogram Modeling Using Masked Autoencoders For Learning General-Purpose Audio Representation. ArXiv (2022).
- Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, N. Harada, K. Kashino. Byol For Audio: Exploring Pre-Trained General-Purpose Audio Representations. ArXiv (2022).
- Daniel Lin. Contrastive Feature Learning For Audio Classification (2022).
- E. Guizzo, C. Marinoni, Marco Pennese, Xinlei Ren, Xiguang Zheng, Chen Zhang, B. Masiero, A. Uncini, D. Comminiello. L3Das22 Challenge: Learning 3D Audio Sources In A Real Office Environment (2022).
- Efthymios Tzinis, Yossi Adi, V. Ithapu, Buye Xu, P. Smaragdis, Anurag Kumar. Remixit: Continual Self-Training Of Speech Enhancement Models Via Bootstrapped Remixing. IEEE Journal of Selected Topics in Signal Processing (2022).
- Efthymios Tzinis, Yossi Adi, V. Ithapu, Buye Xu, P. Smaragdis, Anurag Kumar. Remixit: Continual Self-Training Of Speech Enhancement Models Via Bootstrapped Remixing (2022).
- Eleonora Grassucci, Gioia Mancini, Christian Brignone, A. Uncini, D. Comminiello. Dual Quaternion Ambisonics Array For Six-Degree-Of-Freedom Acoustic Representation. ArXiv (2022).
- Enric Gusó, Jordi Pons, Santiago Pascual, J. Serrà. On Loss Functions And Evaluation Metrics For Music Source Separation (2022).
- Francesca Incitti, Federico Urli, L. Snidaro. Beyond Word Embeddings: A Survey. Information Fusion (2022).
- Francesca Ronchini, R. Serizel. A Benchmark Of State-Of-The-Art Sound Event Detection Systems Evaluated On Synthetic Soundscapes. ArXiv (2022).
- Gasser Elbanna, Neil Scheidwasser-Clow, M. Kegler, P. Beckmann, Karl El Hajal, M. Cernak. Byol-S: Learning Self-Supervised Speech Representations By Bootstrapping. ArXiv (2022).
- Gasser Elbanna, Neil Scheidwasser-Clow, M. Kegler, P. Beckmann, Karl El Hajal, M. Cernak. Byol-S: Learning Self-Supervised Speech Representations By Bootstrapping (2022).
- Grant Van Horn, Rui Qian, Kimberly Wilber, Hartwig Adam, Oisin Mac Aodha, S. Belongie. Exploring Fine-Grained Audiovisual Categorization With The Ssw60 Dataset. ArXiv (2022).
- Heinrich Dinkel, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Yujun Wang. Pseudo Strong Labels For Large Scale Weakly Supervised Audio Tagging. ICASSP (2022).
- Helin Wang, Dongchao Yang, Chao Weng, Jia-yi Yu, Yuexian Zou. Improving Target Sound Extraction With Timestamp Information. ArXiv (2022).
- Huang Xie, Samuel Lipping, T. Virtanen. Dcase 2022 Challenge Task 6B: Language-Based Audio Retrieval (2022).
- J. Rulff, Fábio Miranda, Maryam Hosseini, Marcos Lage, M. Cartwright, Graham Dove, J. Bello, Cláudio T. Silva. Urban Rhapsody: Large-Scale Exploration Of Urban Soundscapes. ArXiv (2022).
- Janek Ebbers, R. Serizel, Reinhold Haeb-Umbach. Threshold Independent Evaluation Of Sound Event Detection Scores. ArXiv (2022).
- Jingdong Li, Yuanyuan Zhu, Dawei Luo, Yun Liu, Guohui Cui, Zhaoxia Li. The Pcg-Aiid System For L3Das22 Challenge: Mimo And Miso Convolutional Recurrent Network For Multi Channel Speech Enhancement And Speech Recognition (2022).
- Joseph P. Turian, Jordie Shier, H. Khan, B. Raj, Björn Schuller, C. Steinmetz, C. Malloy, G. Tzanetakis, Gissel Velarde, K. McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse Engel, J. Salamon, P. Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk. Hear: Holistic Evaluation Of Audio Representations (2022).
- Joseph P. Turian, Jordie Shier, H. Khan, B. Raj, Björn Schuller, C. Steinmetz, C. Malloy, G. Tzanetakis, Gissel Velarde, K. McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse Engel, J. Salamon, P. Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk. Hear 2021: Holistic Evaluation Of Audio Representations. ArXiv (2022).
- Jun Shen, M. Khodak, Ameet S. Talwalkar. Efficient Architecture Search For Diverse Tasks. ArXiv (2022).
- Karn Nichakarn Watcharasupat, Kenneth Ooi, Bhan Lam, Trevor Wong, Zhen-Ting Ong, W. Gan. Autonomous In-Situ Soundscape Augmentation Via Joint Selection Of Masker And Gain. ArXiv (2022).
- Karn Nichakarn Watcharasupat, Kenneth Ooi, Bhan Lam, Trevor Wong, Zhen-Ting Ong, W. Gan. Autonomous In-Situ Soundscape Augmentation Via Joint Selection Of Masker And Gain. IEEE Signal Processing Letters (2022).
- Kenneth Ooi, Bhan Lam, J. Hong, Karn Nichakarn Watcharasupat, Zhen-Ting Ong, W. Gan. Singapore Soundscape Site Selection Survey (S5): Identification Of Characteristic Soundscapes Of Singapore Via Weighted K-Means Clustering. Sustainability (2022).
- Kenneth Ooi, Zhen-Ting Ong, Karn Nichakarn Watcharasupat, Bhan Lam, J. Hong, Woon-Seng Gan. Araus: A Large-Scale Dataset And Baseline Models Of Affective Responses To Augmented Urban Soundscapes. ArXiv (2022).
- Kevin Kilgour, Beat Gfeller, Qingqing Huang, A. Jansen, Scott Wisdom, M. Tagliasacchi. Text-Driven Separation Of Arbitrary Sounds. ArXiv (2022).
- Kohei Suzuki, Shoki Sakamoto, T. Taniguchi, H. Kameoka. Speak Like A Dog: Human To Non-Human Creature Voice Conversion (2022).
- Kuan-Po Huang, Yuanbin Fu, Yu Zhang, Hung-yi Lee. Improving Distortion Robustness Of Self-Supervised Speech Processing Tasks With Domain Adaptation. ArXiv (2022).
- Madhurananda Pahar, M. Klopper, Byron Reeve, R. Warren, G. Theron, A. Diacon, T. Niesler. Automatic Tuberculosis And Covid-19 Cough Classification Using Deep Learning. ArXiv (2022).
- Manthan Thakker, S. Eskimez, T. Yoshioka, Huaming Wang. Fast Real-Time Personalized Speech Enhancement: End-To-End Enhancement Network (E3Net) And Knowledge Distillation. ArXiv (2022).
- Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, K. Kinoshita, Yasunori Ohishi, S. Araki. Soundbeam: Target Sound Extraction Conditioned On Sound-Class Labels And Enrollment Clues For Increased Performance And Continuous Learning. ArXiv (2022).
- Mashrur M. Morshed, Ahmad Omar Ahsan, Hasan Mahmud, Md. Kamrul Hasan. Learning Audio Representations With Mlps. ArXiv (2022).
- Michela Cantarini, L. Gabrielli, S. Squartini. Few-Shot Emergency Siren Detection. Sensors (2022).
- Michelle Charette, Elizabeth Lima, Denielle Elliott. Sonic Stories, Sensory Ethnography, And Listening With An Injured Mind. Multimodality & Society (2022).
- Mohammad MohammadAmini, D. Matrouf, J. Bonastre, Sandipana Dowerah, R. Serizel, D. Jouvet. A Comprehensive Exploration Of Noise Robustness And Noise Compensation In Resnet And Tdnn-Based Speaker Recognition Systems (2022).
- Nico M. Schmidt, Jordi Pons, M. Miron. Podcastmix: A Dataset For Separating Music And Speech In Podcasts. ArXiv (2022).
- Nikhil Singh, Guillermo Bernal, D. Savchenko, Elena L. Glassman. A Selective Summary Of Where To Hide A Stolen Elephant: Leaps In Creative Writing With Multimodal Machine Intelligence. IN2WRITING (2022).
- Oleg Rybakov, M. Tagliasacchi, Yunpeng Li, Liyang Jiang, Xia Zhang. Real Time Spectrogram Inversion On Mobile Phone (2022).
- Oleg Rybakov, M. Tagliasacchi, Yunpeng Li, Liyang Jiang, Xia Zhang, Fadi Biadsy. Real Time Spectrogram Inversion On Mobile Phone. ArXiv (2022).
- P. Tremblay, Gerard Roma, Owen Green. Enabling Programmatic Data Mining As Musicking: The Fluid Corpus Manipulation Toolkit. Computer Music Journal (2022).
- Pranay Manocha, Zeyu Jin, A. Finkelstein. Sqapp: No-Reference Speech Quality Assessment Via Pairwise Preference (2022).
- Qingqing Huang, A. Jansen, Joonseok Lee, R. Ganti, Judith Yue Li, D. Ellis. Mulan: A Joint Embedding Of Music Audio And Natural Language (2022).
- Qiu-shi Zhu, J. Zhang, Zitian Zhang, Lirong Dai. Joint Training Of Speech Enhancement And Self-Supervised Model For Noise-Robust Asr. ArXiv (2022).
- Qiu-shi Zhu, Jie Zhang, Zi-qiang Zhang, Ming Wu, Xin Fang, Lirong Dai. A Noise-Robust Self-Supervised Pre-Training Model Based Speech Representation Learning For Automatic Speech Recognition (2022).
- Roberto San Millán-Castillo, L. Martino, E. Morgado, F. Llorente. An Exhaustive Variable Selection Study For Linear Models Of Soundscape Emotions: Rankings And Gibbs Analysis. IEEE/ACM Transactions on Audio, Speech, and Language Processing (2022).
- S. Budgett, Mehrdad Yaghoobi. Metaaudio: A Few-Shot Audio Classification Benchmark (2022).
- Samuel Lipping, Parthasaarathy Sudarsanam, K. Drossos, T. Virtanen. Clotho-Aqa: A Crowdsourced Dataset For Audio Question Answering. ArXiv (2022).
- Sandipana Dowerah, R. Serizel, D. Jouvet, Mohammad Mohammadamini, D. Matrouf. Compensating Noise And Reverberation In Far-Field Multichannel Speaker Verification (2022).
- Sreyan Ghosh, Ashish Seth, S. Umesh. Delores: Decorrelating Latent Spaces For Low-Resource Audio Representation Learning. ArXiv (2022).
- Swapnil Bhosale, Rupayan Chakraborty, S. Kopparapu. Automatic Audio Captioning Using Attention Weighted Event Based Embeddings. ArXiv (2022).
- Tara Vanhatalo, P. Legrand, M. Desainte-Catherine, P. Hanna, Antoine Brusco, Guillaume Pille, Yann Bayle. A Review Of Neural Network-Based Emulation Of Guitar Amplifiers. Applied Sciences (2022).
- Xinhao Mei, Xubo Liu, Mark D. Plumbley, Wenwu Wang. Automated Audio Captioning: An Overview Of Recent Progress And New Challenges (2022).
- Xuenan Xu, Mengyue Wu, K. Yu. A Comprehensive Survey Of Automated Audio Captioning. ArXiv (2022).
- Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang, Shinji Watanabe. Towards Low-Distortion Multi-Channel Speech Enhancement: The Espnet-Se Submission To The L3Das22 Challenge (2022).
- Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhongqiu Wang, Yu Tsao, Y. Qian, Shinji Watanabe. Espnet-Se++: Speech Enhancement For Robust Speech Recognition, Translation, And Understanding. ArXiv (2022).
- Yuan Gong, Jingbo Yu, James R. Glass. Vocalsound: A Dataset For Improving Human Vocal Sounds Recognition. ICASSP (2022).
- Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass. Cmkd: Cnn/Transformer-Based Cross-Model Knowledge Distillation For Audio Classification. ArXiv (2022).
- Yunjung Lee, Hwayeon Joh, Suhyeon Yoo, U. Oh. Accesscomics2: Understanding The User Experience Of An Accessible Comic Book Reader For Blind People With Textual Sound Effects. ACM Transactions on Accessible Computing (2022).
- Zhong-Qiu Wang, G. Wichern, Shinji Watanabe, Jonathan Le Roux. Stft-Domain Neural Speech Enhancement With Very Low Algorithmic Latency. ArXiv (2022).
- Zhongqiu Wang, Shinji Watanabe. Improving Frame-Online Neural Speech Enhancement With Overlapped-Frame Prediction. IEEE Signal Processing Letters (2022).
- Zubayer Islam, M. Abdel-Aty. Deep Convolutional Neural Network For Roadway Incident Surveillance Using Audio Data. ArXiv (2022).
2021 (121)
- Revisiting Transposed Convolutions For Interpreting Raw Waveform Sound Event Recognition Cnns By Sonification (2021).
- A. Aleluia, G. Cabral. Rapid Prototyping: Using Wizard Of Oz To Emulate Machine Learning Features For Interactive Artistic Applications. Anais do XVIII Simpósio Brasileiro de Computação Musical (SBCM 2021) (2021).
- A. Copiaco, C. Ritz, S. Fasciani, N. Abdulaziz. Dasee A Synthetic Database Of Domestic Acoustic Scenes And Events In Dementia Patients Environment. ArXiv (2021).
- A. Correya, Jorge Marcos-Fernández, Luis Joglar-Ongay, Pablo Alonso-Jiménez, X. Serra, D. Bogdanov. Audio And Music Analysis On The Web Using Essentia.Js. Trans. Int. Soc. Music. Inf. Retr. (2021).
- A. Jensenius. Best Versus Good Enough Practices For Open Music Research. Empirical Musicology Review (2021).
- A. Madhu, S. Kumaraswamy. Envgan: Adversarial Synthesis Of Environmental Sounds For Data Augmentation. ArXiv (2021).
- A. P. Mishra, N. S. Harper, J. Schnupp. Exploring The Distribution Of Statistical Feature Parameters For Natural Sound Textures. PloS one (2021).
- A. S. Koepke, Andreea-Maria Oncescu, João F. Henriques, Zeynep Akata, Samuel Albanie. Audio Retrieval With Natural Language Queries: A Benchmark Study. IEEE Transactions on Multimedia (2021).
- Aaron Valero Puche, Sukhan Lee. Caesynth: Real-Time Timbre Interpolation And Pitch Control With Conditional Autoencoders. 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP) (2021).
- Abdulaziz Saleh Ba Wazir, H. A. Karim, Mohd Haris Lye Abdullah, Nouar AlDahoul, Sarina Mansor, M. F. A. Fauzi, John See, Ahmad Syazwan Naim. Design And Implementation Of Fast Spoken Foul Language Recognition With Different End-To-End Deep Neural Network Architectures. Sensors (2021).
- Adrián Barahona-Ríos, Tom Collins. Specsingan: Sound Effect Variation Synthesis Using Single-Image Gans. ArXiv (2021).
- Alexander Ponomarchuk, I. Burenko, Elian Malkin, I. Nazarov, V. Kokh, Manvel Avetisian, L. Zhukov. Project Achoo: A Practical Model And Application For Covid-19 Detection From Recordings Of Breath, Voice, And Cough. IEEE Journal of Selected Topics in Signal Processing (2021).
- Alexander Ponomarchuk, I. Burenko, Elian Malkin, Ivan Nazarov, V. Kokh, Manvel Avetisian, L. Zhukov. Project Achoo: A Practical Model And Application For Covid-19 Detection From Recordings Of Breath, Voice, And Cough. ArXiv (2021).
- Andreea-Maria Oncescu, A. S. Koepke, João F. Henriques, Zeynep Akata, Samuel Albanie. Audio Retrieval With Natural Language Queries. Interspeech 2021 (2021).
- Anis Haron. Tone Color 音色排序的计算分类 (2021).
- Anna Xambó. A Live Coding Session With The Cloud And A Virtual Agent (2021).
- Anna Xambó, Gerard Roma, Sam Roig, Eduard Solaz. Live Coding With The Cloud And A Virtual Agent (2021).
- Archiki Prasad, P. Jyothi, R. Velmurugan. An Investigation Of End-To-End Models For Robust Speech Recognition. ArXiv (2021).
- Ariane Stolfi, D. P. S. D. Novais. Improvisation In Isolation: Quarentena Liv(R)E And Noise Symphony With The Playsound Online Music Making Tool (2021).
- Aswin Sivaraman, Sunwoo Kim, Minje Kim. Personalized Speech Enhancement Through Self-Supervised Data Augmentation And Purification. Interspeech 2021 (2021).
- B. Weck, Xavier Favory, Konstantinos Drossos, X. Serra. Evaluating Off-The-Shelf Machine Listening And Natural Language Models For Automated Audio Captioning. ArXiv (2021).
- Chandan K. A. Reddy, Vishak Gopa, Harishchandra Dubey, Sergiy Matusevych, Ross Cutler, R. Aichner. Musicnet: Compact Convolutional Neural Network For Real-Time Background Music Detection. ArXiv (2021).
- Chandan K.A. Reddy, Vishak Gopa, Harishchandra Dubey, Sergiy Matusevych, Ross Cutler, R. Aichner. Musicnet: Compact Convolutional Neural Network For Real-Time Background Music Detection. ArXiv (2021).
- Chao Xie, Yi-Chiao Wu, Patrick Lumban Tobing, Wen-Chin Huang, Tomoki Toda. Noisy-To-Noisy Voice Conversion Framework With Denoising Model. ArXiv (2021).
- D. Arteaga, J. Pons. Multichannel-Based Learning For Audio Object Extraction. ArXiv (2021).
- D. Arteaga, Jordi Pons. Multichannel-Based Learning For Audio Object Extraction. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2021).
- D. Jain. Protosound: A Personalized And Scalable Sound Recognition System For Deaf And Hard-Of-Hearing Users (2021).
- Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, N. Harada, K. Kashino. Byol For Audio: Self-Supervised Learning For General-Purpose Audio Representation. 2021 International Joint Conference on Neural Networks (IJCNN) (2021).
- Darius Petermann, G. Wichern, Zhong-Qiu Wang, Jonathan Le Roux. The Cocktail Fork Problem: Three-Stem Audio Separation For Real-World Soundtracks. ICASSP (2021).
- Darius Petermann, G. Wichern, Zhong-Qiu Wang, Jonathan Le Roux. The Cocktail Fork Problem: Three-Stem Audio Separation For Real-World Soundtracks. ArXiv (2021).
- Diego De Benito-Gorrón, Daniel Ramos, D. Toledano. A Multi-Resolution Crnn-Based Approach For Semi-Supervised Sound Event Detection In Dcase 2020 Challenge. IEEE Access (2021).
- Diego de Benito-Gorrón, Daniel Ramos, D. Toledano. An Analysis Of Sound Event Detection Under Acoustic Degradation Using Multi-Resolution Systems. IberSPEECH (2021).
- E. Guizzo, C. Marinoni, Marco Pennese, Xinlei Ren, Xiguang Zheng, Chen Zhang, B. Masiero, D. Comminiello. L3Das22 Challenge: Machine Learning For 3D Audio Signal Processing (2021).
- E. Guizzo, Riccardo F. Gramaccioni, Saeid Jamili, C. Marinoni, Edoardo Massaro, Claudia Medaglia, Giuseppe Nachira, Leonardo Nucciarelli, Ludovica Paglialunga, M. Pennese, Sveva Pepe, Enrico Rocchi, A. Uncini, D. Comminiello. L3Das21 Challenge: Machine Learning For 3D Audio Signal Processing. 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP) (2021).
- E. Gómez. Deep Noise Suppression For Real Time Speech Enhancement In A Single Channel Wide Band Scenario (2021).
- Eduardo Fonseca, Andrés Ferraro, Xavier Serra. Improving Sound Event Classification By Increasing Shift Invariance In Convolutional Neural Networks. ArXiv (2021).
- Eduardo Fonseca, Andrés Ferraro, Xavier Serra. Improving Sound Event Classification By Increasing Shift Invariance In Convolutional Neural Networks (2021).
- Efthymios Tzinis, Jonah Casebeer, Zhepei Wang, P. Smaragdis. Separate But Together: Unsupervised Federated Learning For Speech Enhancement From Non-Iid Data. ArXiv (2021).
- Efthymios Tzinis, Yossi Adi, V. Ithapu, Buye Xu, Anurag Kumar. Continual Self-Training With Bootstrapped Remixing For Speech Enhancement. ArXiv (2021).
- Efthymios Tzinis, Yossi Adi, Vamsi K. Ithapu, Buye Xu, Anurag Kumar. Continual Self-Training With Bootstrapped Remixing For Speech Enhancement. ArXiv (2021).
- F. Font. Source: A Freesound Community Music Sampler. Audio Mostly Conference (2021).
- Francesc Lluís, V. Chatziioannou, A. Hofmann. Music Source Separation Conditioned On 3D Point Clouds. ArXiv (2021).
- Francesca Ronchini, R. Serizel, Nicolas Turpault, Samuele Cornell. The Impact Of Non-Target Events In Synthetic Soundscapes For Sound Event Detection. ArXiv (2021).
- Félix Gontier, Vincent Lostanlen, M. Lagrange, N. Fortin, C. Lavandier, J. Petiot. Polyphonic Training Set Synthesis Improves Self-Supervised Urban Sound Classification. The Journal of the Acoustical Society of America (2021).
- Gonzalo Montero, F. Corbera. Generating Sound Palettes For A Freesound Concatenative Synthesizer To Support Creativity (2021).
- Haron Anis, Chee Onn Wong, Soon Hin Hew. Algorithmic Identification Of Tone Color: A Comparison Of Algorithmic Identification And Identification By Survey Respondents. 10th International Conference on Digital and Interactive Arts (2021).
- Hassan Taherian, S. Eskimez, T. Yoshioka, Huaming Wang, Zhuo Chen, Xuedong Huang. One Model To Enhance Them All: Array Geometry Agnostic Multi-Channel Personalized Speech Enhancement. ArXiv (2021).
- Ho-Hsiang Wu, Prem Seetharaman, Kundan Kumar, J. Bello. Wav2Clip: Learning Robust Audio Representations From Clip. ArXiv (2021).
- J. Abeßer. Usm-Sed - A Dataset For Polyphonic Sound Event Detection In Urban Sound Monitoring Scenarios. ArXiv (2021).
- J. Abeßer, Saichand Gourishetti, András Kátai, Tobias Clauss, Prachi Sharma, Judith Liebetrau. Idmt-Traffic: An Open Benchmark Dataset For Acoustic Traffic Monitoring Research. ArXiv (2021).
- Jialu Li, M. Hasegawa-Johnson, Nancy L. McElwain. Analysis Of Acoustic And Voice Quality Features For The Classification Of Infant And Mother Vocalizations. Speech Commun. (2021).
- Joseph P. Turian, Jordie Shier, G. Tzanetakis, K. McNally, Max Henry. One Billion Audio Sounds From Gpu-Enabled Modular Synthesis. ArXiv (2021).
- Juliette Millet, J. King. Inductive Biases, Pretraining And Fine-Tuning Jointly Account For Brain Responses To Speech. ArXiv (2021).
- Jurgen Vandendriessche, Nick Wouters, Bruno da Silva, Mimoun Lamrini, Mohamed Yassin Chkouri, Abdellah Touhafi. Environmental Sound Recognition On Embedded Systems: From Fpgas To Tpus. Electronics (2021).
- Karn Nichakarn Watcharasupat, Thi Ngoc Tho Nguyen, Ngoc Khanh Nguyen, Zhen Jian Lee, Douglas L. Jones, W. Gan. Improving Polyphonic Sound Event Detection On Multichannel Recordings With The Sørensen-Dice Coefficient Loss And Transfer Learning. ArXiv (2021).
- Kenneth Ooi, Karn N. Watcharasupat, Santi Peksi, Furi Andi Karnapi, Zhen-Ting Ong, Danny Chua, Hui-Wen Leow, Li-Long Kwok, Xin-Lei Ng, Zhen-Ann Loh, W. Gan. A Strongly-Labelled Polyphonic Dataset Of Urban Sounds With Spatiotemporal Context. ArXiv (2021).
- Khaled Koutini, Jan Schlüter, Hamid Eghbal-zadeh, G. Widmer. Efficient Training Of Audio Transformers With Patchout. ArXiv (2021).
- Kwanghee Choi, Martin Kersner, Jacob Morton, Buru Chang. Temporal Knowledge Distillation For On-Device Audio Classification. ArXiv (2021).
- Lijian Gao, Qirong Mao, Jingjing Chen, Ming Dong, R. Chinnam, L. Sassatelli, Miguel Fabian Romero-Rondón, Ujjwal Sharma. Reproducibility Companion Paper: On Learning Disentangled Representation For Acoustic Event Detection. ACM Multimedia (2021).
- Léo Cances, E. Labbé, T. Pellegrini. Improving Deep-Learning-Based Semi-Supervised Audio Tagging With Mixup. ArXiv (2021).
- M. Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, K. Kinoshita, S. Araki. Few-Shot Learning Of New Sound Classes For Target Sound Extraction. Interspeech 2021 (2021).
- M. Geravanchizadeh, Sepideh Akhtari Khosroshahi, S. Zakeri. Extraction Of Weighted Saliency Maps In Modelling Bottom-Up Auditory Attention (2021).
- M. Neumann, Ngoc Thang Vu. Investigations On Audiovisual Emotion Recognition In Noisy Conditions. 2021 IEEE Spoken Language Technology Workshop (SLT) (2021).
- Madhurananda Pahar, M. Klopper, Robin Warren, T. Niesler. Covid-19 Detection In Cough, Breath And Speech Using Deep Transfer Learning And Bottleneck Features (2021).
- Madhurananda Pahar, T. Niesler. Deep Transfer Learning Based Covid-19 Detection In Cough, Breath And Speech Using Bottleneck Features (2021).
- Marc C. Green, Mark D. Plumbley. Federated Learning With Highly Imbalanced Audio Data. ArXiv (2021).
- Michael Taenzer, S. Mimilakis, J. Abeßer. Deep Learning-Based Music Instrument Recognition: Exploring Learned Feature Representations (2021).
- Mohammad Mohammadamini, D. Matrouf, J. Bonastre, R. Serizel, Sandipana Dowerah, Denis Jouvet. Compensate Multiple Distortions For Speaker Recognition Systems (2021).
- Motohiro Sunouchi, Masaharu Yoshioka. Proposal Of The Aesthetic Experience-Oriented Evaluation Framework For Field-Recording Sound Retrieval System: Experiments Using Acoustic Feature Signatures Based On Multiscale Fractal Dimension. IVSP (2021).
- Motohiro Sunouchi, Masaharu Yoshioka. Diversity-Robust Acoustic Feature Signatures Based On Multiscale Fractal Dimension For Similarity Search Of Environmental Sounds. IEICE Transactions on Information and Systems (2021).
- Motohiro Sunouchi, Masaharu Yoshioka. Diversity-Robust Acoustic Feature Signatures Based On Multiscale Fractal Dimension For Similarity Search Of Environmental Sounds. ArXiv (2021).
- Muddsair Sharif, Mayur Hotwani, Huseyin Seker, Gero Lückemeyer. Imobilakou: The Role Of Machine Listening To Detect Vehicle Using Sound Acoustics. ICAAI (2021).
- N. Orio, B. D. Carolis, Francesco Liotard. Locate Your Soundscape: Interacting With The Acoustic Environment. Multim. Tools Appl. (2021).
- N. Orio, B. De Carolis, Francesco Liotard. Locate Your Soundscape: Interacting With The Acoustic Environment. Multimedia tools and applications (2021).
- N. Siminski, S. Böhme, M. Herrmann. Bnst And Amygdala Activation To Threat: Effects Of Temporal Predictability And Threat Mode. Behavioural Brain Research (2021).
- N. Singh. The Sound Sketchpad: Expressively Combining Large And Diverse Audio Collections. IUI (2021).
- Nicolas Furnon, R. Serizel, S. Essid, I. Illina. Attention-Based Distributed Speech Enhancement For Unconstrained Microphone Arrays With Varying Number Of Nodes. ArXiv (2021).
- Pablo Zinemanas, Martín Rocamora, M. Miron, F. Font, X. Serra. An Interpretable Deep Learning Model For Automatic Sound Classification (2021).
- Pranay Manocha, Buye Xu, Anurag Kumar. Noresqa - A Framework For Speech Quality Assessment Using Non-Matching References. ArXiv (2021).
- Prateek Verma. Attention Is All You Need? Good Embeddings With Statistics Are Enough Audio Understanding Without Convolutions/Transformers/Berts/Mixers/Attention/Rnns (2021).
- Prateek Verma. Large Scale Audio Understanding Without Transformers/ Convolutions/ Berts/ Mixers/ Attention/ Rnns Or. ArXiv (2021).
- Prateek Verma, J. Berger. Audio Transformers: Transformer Architectures For Large Scale Audio Understanding. Adieu Convolutions. ArXiv (2021).
- Qichen Han, Weiqiang Yuan, Dong Liu, X. Li, Zhen Yang. Automated Audio Captioning With Weakly Supervised Pre-Training And Word Selection Methods. DCASE (2021).
- Qiuying Shi, Jiqing Han. Semantic Feature Extraction Based On Subspace Learning With Temporal Constraints For Acoustic Event Recognition. Digit. Signal Process. (2021).
- Renbo Tu, M. Khodak, Nicholas Roberts, Ameet S. Talwalkar. Nas-Bench-360: Benchmarking Diverse Tasks For Neural Architecture Search. ArXiv (2021).
- Renbo Tu, Nicholas Roberts, M. Khodak, Jun Shen, Frederic Sala, Ameet S. Talwalkar. Nas-Bench-360: Benchmarking Neural Architecture Search On Diverse Tasks (2021).
- Ria Sinha. Digital Assistant For Sound Classification Using Spectral Fingerprinting. International Journal for Research in Applied Science and Engineering Technology (2021).
- Rishabh Garg, Ruohan Gao, Kristen Grauman. Geometry-Aware Multi-Task Learning For Binaural Audio Generation From Video (2021).
- Robert Müller, Steffen Illium, C. Linnhoff-Popien. A Deep And Recurrent Architecture For Primate Vocalization Classification. Interspeech (2021).
- S. Eskimez, Takuya Yoshioka, Huaming Wang, Xiaofei Wang, Zhuo Chen, Xuedong Huang. Personalized Speech Enhancement: New Models And Comprehensive Evaluation. ArXiv (2021).
- S. Eskimez, Xiaofei Wang, Min Tang, Hemin Yang, Zirun Zhu, Zhuo Chen, Huaming Wang, T. Yoshioka. Human Listening And Live Captioning: Multi-Task Training For Speech Enhancement. Interspeech 2021 (2021).
- S. Graetzer, Jon Barker, T. Cox, M. Akeroyd, J. Culling, G. Naylor, Eszter Porter, Rhoddy Viveros Muñoz. Clarity-2021 Challenges: Machine Learning Challenges For Advancing Hearing Aid Processing. Interspeech 2021 (2021).
- Sangwoo Park, David K. Han, Mounya Elhilali. Cross-Referencing Self-Training Network For Sound Event Detection In Audio Mixtures. ArXiv (2021).
- Sarthak Yadav, M. Foster. Gise-51: A Scalable Isolated Sound Events Dataset. ArXiv (2021).
- Seokjin Lee, Minhan Kim, S. Shin, Sooyoung Park, Youngho Jeong. Data-Dependent Feature Extraction Method Based On Non-Negative Matrix Factorization For Weakly Supervised Domestic Sound Event Detection. Applied Sciences (2021).
- Siddharth Gururani, Alexander Lerch. Semi-Supervised Audio Classification With Partially Labeled Data. 2021 IEEE International Symposium on Multimedia (ISM) (2021).
- Steven M. Goodman, Ping Liu, Dhruv Jain, Emma J. McDonnell, Jon Froehlich. Toward User-Driven Sound Recognizer Personalization With People Who Are D/Deaf Or Hard Of Hearing. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. (2021).
- Tiago B. Lacerda, Péricles B. C. Miranda, André Câmara, Ana Paula C. Furtado. Deep Learning And Mel-Spectrograms For Physical Violence Detection In Audio. Anais do XVIII Encontro Nacional de Inteligência Artificial e Computacional (ENIAC 2021) (2021).
- Tony Liu, A. Amirsoleimani, Jianxiong Xu, F. Alibart, Y. Beilliard, S. Ecoffey, Dominique Drouin, R. Genov. Codex: Stochastic Encoding Method To Relax Resistive Crossbar Accelerator Design Requirements. IEEE Transactions on Circuits and Systems II: Express Briefs (2021).
- Turab Iqbal, Yin Cao, A. Bailey, Mark D. Plumbley, Wenwu Wang. Arca23K: An Audio Dataset For Investigating Open-Set Label Noise. DCASE (2021).
- Turab Iqbal, Yin Cao, Andrew Bailey, Mark D. Plumbley, Wenwu Wang. Arca23K: An Audio Dataset For Investigating Open-Set Label Noise. ArXiv (2021).
- Valeria Mordoh, Y. Zigel. Audio Source Separation To Reduce Sleeping Partner Sounds: A Simulation Study. Physiological measurement (2021).
- Vasileios Tsouvalas, Aaqib Saeed, T. Ozcelebi. Federated Self-Training For Semi-Supervised Audio Recognition. ACM Transactions on Embedded Computing Systems (2021).
- Vasileios Tsouvalas, Aaqib Saeed, T. Ozcelebi. Federated Self-Training For Semi-Supervised Audio Recognition. ArXiv (2021).
- W. Kleijn, Andrew Storus, M. Chinen, T. Denton, Felicia S. C. Lim, Alejandro Luebs, J. Skoglund, Hengchin Yeh. Generative Speech Coding With Predictive Variance Regularization. ArXiv (2021).
- Wookey Lee, Jessica Jiwon Seong, Busra Ozlu, B. Shim, Azizbek Marakhimov, Suan Lee. Biosignal Sensors And Deep Learning-Based Speech Recognition: A Review. Sensors (2021).
- Xubo Liu, Turab Iqbal, Jinzheng Zhao, Qiushi Huang, Mark D. Plumbley, Wenwu Wang. Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning. 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP) (2021).
- Y. Campos-Roca. Multidisciplinary Project-Based Learning: Improving Student Motivation For Learning Signal Processing. IEEE Signal Processing Magazine (2021).
- Yanpeng Zhao, Jack Hessel, Youngjae Yu, Ximing Lu, Rowan Zellers, Yejin Choi. Connecting The Dots Between Audio And Text Without Parallel Data Through Visual Knowledge Transfer. ArXiv (2021).
- Yasha Iravantchi, Karan Ahuja, Mayank Goel, Chris Harrison, A. Sample. Privacymic: Utilizing Inaudible Frequencies For Privacy Preserving Daily Activity Recognition. CHI (2021).
- Yu Wang, Nicholas J. Bryan, J. Salamon, M. Cartwright, J. Bello. Who Calls The Shots? Rethinking Few-Shot Learning For Audio. ArXiv (2021).
- Yuan Gong, Yu-An Chung, James R. Glass. Psla: Improving Audio Tagging With Pretraining, Sampling, Labeling, And Aggregation. IEEE/ACM Transactions on Audio, Speech, and Language Processing (2021).
- Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, K. Nakadai. Multichannel Environmental Sound Segmentation. Appl. Intell. (2021).
- Zhong-Qiu Wang, G. Wichern, Jonathan Le Roux. Leveraging Low-Distortion Target Estimates For Improved Speech Enhancement. ArXiv (2021).
- Ziqiang Shi, Liu Liu, Huibin Lin, R. Liu. Hodge And Podge: Hybrid Supervised Sound Event Detection With Multi-Hot Mixmatch And Composition Consistence Training. 2020 28th European Signal Processing Conference (EUSIPCO) (2021).
- Ziyang Chen, Xixi Hu, Andrew Owens. Structure From Silence: Learning Scene Structure From Ambient Sound. ArXiv (2021).
2020 (98)
- A. Correya, D. Bogdanov, Luis Joglar-Ongay, X. Serra. Essentia.Js: A Javascript Library For Music And Audio Analysis On The Web. ISMIR (2020).
- Abdulaziz Saleh Ba Wazir, H. A. Karim, Mohd Haris Lye Abdullah, Sarina Mansor, Nouar AlDahoul, M. Fauzi, John See. Spectrogram-Based Classification Of Spoken Foul Language Using Deep Cnn. 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP) (2020).
- Alessandro Ragano, Emmanouil Benetos, A. Hines. Audio Impairment Recognition Using A Correlation-Based Feature Representation. 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX) (2020).
- Alessandro Ragano, Emmanouil Benetos, Andrew Hines. Audio Impairment Recognition Using A Correlation-Based Feature Representation. 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX) (2020).
- Ambika P. Mishra, N. S. Harper, Jan W. H. Schnupp. Exploring The Distribution Of Statistical Feature Parameters For Natural Sound Textures (2020).
- Andreas Hüwel, K. Adiloglu, Jörg-Hendrik Bach. Hearing Aid Research Data Set For Acoustic Environment Recognition. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- Andrey Guzhov, Federico Raue, J. Hees, Andreas Dengel. Esresnet: Environmental Sound Classification Based On Visual Domain Models. ArXiv (2020).
- António Ramires, F. Font, D. Bogdanov, Jordan B. L. Smith, Yi-Hsuan Yang, Joann Ching, B. Chen, Yueh-Kao Wu, Hsu Wei-Han, X. Serra. The Freesound Loop Dataset And Annotation Tool. ArXiv (2020).
- António Ramires, Gilberto Bernardes, M. Davies, X. Serra. Tiv.Lib: An Open-Source Library For The Tonal Description Of Musical Audio. ArXiv (2020).
- António Ramires, Pritish Chandna, Xavier Favory, E. Gómez, X. Serra. Neural Percussive Synthesis Parameterised By High-Level Timbral Features. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- António Ramires, F. Font, D. Bogdanov, Jordan B. L. Smith, Yi-Hsuan Yang, Joann Ching, Bo-Yu Chen, Yueh-Kao Wu, Hsu Wei-Han, X. Serra. The Freesound Loop Dataset And Annotation Tool. ISMIR (2020).
- Beat Gfeller, Dominik Roblek, M. Tagliasacchi. One-Shot Conditional Audio Filtering Of Arbitrary Sounds. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- Beat Gfeller, Dominik Roblek, M. Tagliasacchi. One-Shot Conditional Audio Filtering Of Arbitrary Sounds. (2020).
- Bowei Hou, Kacper Radzikowski, A. Farid. Fine-Tuning Using Grid Search & Gradient Visualization Technical Report (2020).
- C. Asplund, Takashi Obana, P. Bhatnagar, Xun Quan Koh, Simon T. Perrault. It’s All In The Timing. ACM Trans. Comput. Hum. Interact. (2020).
- Charles Bales, C. John, Hasan Farooq, Usama Masood, Muhammad Nabeel, A. Imran. Can Machine Learning Be Used To Recognize And Diagnose Coughs? 2020 International Conference on e-Health and Bioengineering (EHB) (2020).
- Charles Bales, Charles N. John, H. Farooq, Usama Masood, M. Nabeel, A. Imran. Can Machine Learning Be Used To Recognize And Diagnose Coughs? 2020 International Conference on e-Health and Bioengineering (EHB) (2020).
- Chung-il Kim, Yongjang Cho, Seung-Won Jung, Jehyeok Rew, Eenjun Hwang. Animal Sounds Classification Scheme Based On Multi-Feature Network With Mixed Datasets. KSII Transactions on Internet and Information Systems (2020).
- D. Elliott, Evan Martino, C. Otero, Anthony O. Smith, A. Peter, Benjamin Luchterhand, Eric Lam, S. Leung. Cyber-Physical Analytics: Environmental Sound Classification At The Edge. 2020 IEEE 6th World Forum on Internet of Things (WF-IoT) (2020).
- D. Liang, Wenting Song, E. Thomaz. Characterizing The Effect Of Audio Degradation On Privacy Perception And Inference Performance In Audio-Based Human Activity Recognition. MobileHCI (2020).
- Daiki Takeuchi, Y. Koizumi, Y. Ohishi, N. Harada, Kunio Kashino. Effects Of Word-Frequency Based Pre- And Post- Processings For Audio Captioning. ArXiv (2020).
- Danula Hettiachchi, Zhanna Sarsenbayeva, F. Allison, N. V. Berkel, Tilman Dingler, Gabriele Marini, V. Kostakos, J. Gonçalves. 'Hi! I Am The Crowd Tasker' Crowdsourcing Through Digital Voice Assistants. CHI (2020).
- Dhruv Jain, Hung Q. Ngo, P. Patel, Steven Goodman, Leah Findlater, Jon Froehlich. Soundwatch: Exploring Smartwatch-Based Deep Learning Approaches To Support Sound Awareness For Deaf And Hard Of Hearing Users. ASSETS (2020).
- Dhruv Jain, Kelly Mack, Akli Amrous, Matt Wright, S. Goodman, Leah Findlater, Jon Froehlich. Homesound: An Iterative Field Deployment Of An In-Home Sound Awareness System For Deaf Or Hard Of Hearing Users. CHI (2020).
- E. Fonseca, Diego Ortego, K. McGuinness, N. O'Connor, X. Serra. Unsupervised Contrastive Learning Of Sound Event Representations. ArXiv (2020).
- E. Fonseca, Shawn Hershey, M. Plakal, D. Ellis, A. Jansen, R. C. Moore. Addressing Missing Labels In Large-Scale Sound Event Recognition Using A Teacher-Student Framework With Loss Masking. IEEE Signal Processing Letters (2020).
- E. Fonseca, Xavier Favory, J. Pons, F. Font, X. Serra. Fsd50K: An Open Dataset Of Human-Labeled Sound Events. ArXiv (2020).
- Eduardo Fonseca, Shawn Hershey, M. Plakal, D. Ellis, A. Jansen, R. C. Moore. Addressing Missing Labels In Large-Scale Sound Event Recognition Using A Teacher-Student Framework With Loss Masking. IEEE Signal Processing Letters (2020).
- Eduardo Fonseca, Xavier Favory, Jordi Pons, F. Font, X. Serra. Fsd50K: An Open Dataset Of Human-Labeled Sound Events. IEEE/ACM Transactions on Audio, Speech, and Language Processing (2020).
- Etienne Richan, Jean Rouat. A Proposal And Evaluation Of New Timbre Visualization Methods For Audio Sample Browsers. Personal and Ubiquitous Computing (2020).
- F. Naccari, I. Guarneri, S. Curti, A. Savi. Embedded Acoustic Scene Classification For Low Power Microcontroller Devices. DCASE (2020).
- Fei Jia, Somshubra Majumdar, B. Ginsburg. Marblenet: Deep 1D Time-Channel Separable Convolutional Neural Network For Voice Activity Detection. ArXiv (2020).
- Felicia Lim, W. Kleijn, M. Chinen, J. Skoglund. Robust Low Rate Speech Coding Based On Cloned Networks And Wavenet. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- Francisco Bernardo. Interactive Machine Learning For User-Innovation Toolkits: An Action Design Research Approach (2020).
- G. Lavrentyeva, M. Volkova, A. Avdeeva, S. Novoselov, Artem Gorlanov, Tseren Andzhukaev, A. Ivanov, A. Kozlov. Blind Speech Signal Quality Estimation For Speaker Verification Systems. INTERSPEECH (2020).
- Gabriel Meseguer-Brocal, Alice Cohen-Hadria, Geoffroy Peeters. Creating Dali, A Large Dataset Of Synchronized Audio, Lyrics, And Notes. Trans. Int. Soc. Music. Inf. Retr. (2020).
- H. Xie, T. Virtanen. Zero-Shot Audio Classification Via Semantic Embeddings. (2020).
- Hitham Jleed, M. Bouchard. Open Set Audio Recognition For Multi-Class Classification With Rejection. IEEE Access (2020).
- Honglie Chen, Weidi Xie, A. Vedaldi, Andrew Zisserman. Vggsound: A Large-Scale Audio-Visual Dataset. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- Huang Xie, Tuomas Virtanen. Zero-Shot Audio Classification Via Semantic Embeddings. IEEE/ACM Transactions on Audio, Speech, and Language Processing (2020).
- Hyeong-Seok Choi, Hye-Seong Heo, J. H. Lee, K. Lee. Phase-Aware Single-Stage Speech Denoising And Dereverberation With U-Net. ArXiv (2020).
- Ivo Trowitzsch. Robust Sound Event Detection In Binaural Computational Auditory Scene Analysis (2020).
- J. Balam, Jocelyn Huang, V. Lavrukhin, Slyne Deng, Somshubra Majumdar, B. Ginsburg. Improving Noise Robustness Of An End-To-End Neural Model For Automatic Speech Recognition (2020).
- Jae-Bin Kim, Seongkyu Mun, Myungwoo Oh, Soyeon Choe, Yong-Hyeok Lee, Hyung-Min Park. Overcoming Label Noise In Audio Event Detection Using Sequential Labeling. ArXiv (2020).
- Jiale Yang, Ying Zhang, Yang Hai. Retrieval And Management System For Layer Sound Effect Library (2020).
- Jin Sean Lim. Ensemble Learning Of High Dimension Datasets (2020).
- Jinta Zheng, Shih-Hsuan Hung, Kyle Hiebel, Y. Zhang. Real-Time Rendering Of Decorative Sound Textures For Soundscapes. ACM Trans. Graph. (2020).
- Joann Ching, António Ramires, Y. Yang. Instrument Role Classification: Auto-Tagging For Loop Based Music (2020).
- Joseph P. Turian, M. Henry. I'm Sorry For Your Loss: Spectrally-Based Audio Distances Are Bad At Pitch. ArXiv (2020).
- João Pedro Duarte Galileu. Urban Sound Event Classification For Audio-Based Surveillance Systems (2020).
- K. He, Yu-Han Shen, W. Zhang, J. Liu. Staged Training Strategy And Multi-Activation For Audio Tagging With Noisy And Sparse Multi-Label Data. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- K. Miyazaki, Tatsuya Komatsu, T. Hayashi, Shinji Watanabe, T. Toda, K. Takeda. Weakly-Supervised Sound Event Detection With Self-Attention. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- K. Prinz, A. Flexer. End-To-End Adversarial White Box Attacks On Music Instrument Classification. ArXiv (2020).
- K. Prinz, A. Flexer, G. Widmer. The Impact Of Label Noise On A Music Tagger. ArXiv (2020).
- Kohki Mametani, Xavier Favory, Frederic Font. Learning Sound Representations Using Triplet-Loss (2020).
- Konstantinos Drossos, Samuel Lipping, T. Virtanen. Clotho: An Audio Captioning Dataset. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- L. Delphin-Poulat, R. Nicol, Cyril Plapous, Katell Peron. Comparative Assessment Of Data Augmentation For Semi-Supervised Polyphonic Sound Event Detection. 2020 27th Conference of Open Innovations Association (FRUCT) (2020).
- L. Gao, Kele Xu, H. Wang, Yu-xing Peng. Multi-Representation Knowledge Distillation For Audio Classification. ArXiv (2020).
- L. Turchet, G. Fazekas, M. Lagrange, H. S. Ghadikolaei, C. Fischione. The Internet Of Audio Things: State Of The Art, Vision, And Challenges. IEEE Internet of Things Journal (2020).
- L. Turchet, Jhonny Hueller. Promoting Awareness On Sustainable Behavior Through An Ar-Based Art Gallery. AVR (2020).
- L. Wijayasingha, J. Stankovic. Robustness To Noise For Speech Emotion Classification Using Cnns And Attention Mechanisms (2020).
- L. Zhang, Ziqiang Shi, Jiqing Han. Pyramidal Temporal Pooling With Discriminative Mapping For Audio Classification. IEEE/ACM Transactions on Audio, Speech, and Language Processing (2020).
- Lu Cao, Yu-long Chen, Dandan Huang, Y. Zhang. Investigating Rich Feature Sources For Conceptual Representation Encoding. COGALEX (2020).
- Luca Turchet, Alex Zanetti. Voice-Based Interface For Accessible Soundscape Composition: Composing Soundscapes By Vocally Querying Online Sounds Repositories. Audio Mostly Conference (2020).
- Luca Turchet, J. Pauwels, C. Fischione, György Fazekas. Cloud-Smart Musical Instrument Interactions. ACM Trans. Internet Things (2020).
- M. Tagliasacchi, Y. Li, Karolis Misiunas, Dominik Roblek. Seanet: A Multi-Modal Speech Enhancement Network. INTERSPEECH (2020).
- M. Tagliasacchi, Yunpeng Li, Karolis Misiunas, Dominik Roblek. Seanet: A Multi-Modal Speech Enhancement Network. INTERSPEECH (2020).
- Michael Wand, Jürgen Schmidhuber. Fusion Architectures For Word-Based Audiovisual Speech Recognition. INTERSPEECH (2020).
- Michela Cantarini, L. Serafini, L. Gabrielli, E. Principi, S. Squartini. Emergency Siren Recognition In Urban Scenarios: Synthetic Dataset And Deep Learning Models. ICIC (2020).
- Nicolas Furnon, Romain Serizel, I. Illina, S. Essid. Dnn-Based Mask Estimation For Distributed Speech Enhancement In Spatially Unconstrained Microphone Arrays (2020).
- Nicolas Turpault, Romain Serizel. Training Sound Event Detection On A Heterogeneous Dataset. ArXiv (2020).
- Nicolas Turpault, Romain Serizel, E. Vincent. Limitations Of Weak Labels For Embedding And Tagging. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- Nicolas Turpault, Romain Serizel, Scott T. Wisdom, H. Erdogan, J. Hershey, E. Fonseca, P. Seetharaman, Justin Salamon. Sound Event Detection And Separation: A Benchmark On Desed Synthetic Soundscapes. ArXiv (2020).
- Nicolas Turpault, S. Wisdom, H. Erdogan, J. Hershey, Romain Serizel, E. Fonseca, P. Seetharaman, Justin Salamon. Improving Sound Event Detection In Domestic Environments Using Sound Separation. ArXiv (2020).
- R. Guo, Y. Yang, Johnson Kuang, X. Bin, Dhruv Jain, Steven Goodman, Leah Findlater, Jon Froehlich. Holosound: Combining Speech And Sound Identification For Deaf Or Hard Of Hearing Users On A Head-Mounted Display. ASSETS (2020).
- Romain Serizel, Nicolas Turpault, Ankit Shah, Justin Salamon. Sound Event Detection In Synthetic Domestic Environments. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- S. Barbosa, P. Chen, Alfredo Cuzzocrea, Xiaoyong Du, Orhun Kara, Ting Liu, K. Sivalingam, D. Slezak, T. Washio, Xiaokang Yang, J. Yuan, R. Prates, S. Bernardi, V. Vittorini, Francesco Flammini, R. Nardone, S. Marrone, R. Adler, Daniel Schneider, P. Schleiss, Nicola Nostro, R. Olsen, Amleto Di Salle, P. Masci. Dependable Computing - Edcc 2020 Workshops: Ai4Rails, Dreams, Dsogri, Serene 2020, Munich, Germany, September 7, 2020, Proceedings. EDCC Workshops (2020).
- S. Deshmukh, B. Raj, R. Singh. Multi-Task Learning For Interpretable Weakly Labelled Sound Event Detection. ArXiv (2020).
- S. Veena, M. Nerisai, J. Remya, S. SaiTejah. Challenges And Issues Of Sound Archives For Environmental Sound Classification (2020).
- S. Wisdom, Efthymios Tzinis, H. Erdogan, Ron J. Weiss, K. Wilson, J. Hershey. Unsupervised Sound Separation Using Mixture Invariant Training. NeurIPS (2020).
- S. Wisdom, Efthymios Tzinis, H. Erdogan, Ron J. Weiss, K. Wilson, J. Hershey. Unsupervised Sound Separation Using Mixtures Of Mixtures. ArXiv (2020).
- S. Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron J. Weiss, K. Wilson, J. Hershey. Unsupervised Sound Separation Using Mixture Invariant Training. NeurIPS (2020).
- S. Yoon, Min-Sung Koh, Ha-Jin Yu. Fuzzy Restricted Boltzmann Machine Based Probabilistic Linear Discriminant Analysis For Noise-Robust Text-Dependent Speaker Verification On Short Utterances (2020).
- Sangwook Park, Ashwin Bellur, Sandeep Reddy Kothinti, Masoumeh Heidari Kapourchali, M. Elhilali. Joint Acoustic And Supervised Inference For Sound Event Detection Technical Report (2020).
- Scott T. Wisdom, H. Erdogan, D. Ellis, Romain Serizel, Nicolas Turpault, E. Fonseca, Justin Salamon, P. Seetharaman, J. Hershey. What's All The Fuss About Free Universal Sound Separation Data?. ArXiv (2020).
- Somshubra Majumdar, B. Ginsburg. Matchboxnet: 1D Time-Channel Separable Convolutional Neural Network Architecture For Speech Commands Recognition. INTERSPEECH (2020).
- Somshubra Majumdar, Boris Ginsburg. Matchboxnet: 1D Time-Channel Separable Convolutional Neural Network Architecture For Speech Commands Recognition. INTERSPEECH (2020).
- T. Iqbal, Yin Cao, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang. Learning With Out-Of-Distribution Data For Audio Classification. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020).
- Theodoros Psallidas, Alexander Mitsou, George Pikramenos, E. Spyrou, Theodore Giannakopoulos. Archeo: A Dataset For Sound Event Detection In Areas Of Touristic Interest. 2020 15th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP) (2020).
- Tom Denton, Alejandro Luebs, Felicia S. C. Lim, Andrew Storus, Hengchin Yeh, W. Kleijn, J. Skoglund. Handling Background Noise In Neural Speech Generation. 2020 54th Asilomar Conference on Signals, Systems, and Computers (2020).
- Tom Mudd, Katie Wilkie-McKenna, A. McPherson, M. Wanderley. Embodied Musical Interaction - Body Physiology, Cross Modality, And Sonic Experience (2020).
- Tony Marteau, Sitou Afanou, D. Sodoyer, Sébastien Ambellouis, F. Elbahhar. Audio Events Detection In Noisy Embedded Railway Environments. EDCC Workshops (2020).
- Xavier Favory, F. Font, X. Serra. Search Result Clustering In Collaborative Sound Collections. ICMR (2020).
- Xavier Favory, Konstantinos Drossos, T. Virtanen, X. Serra. Learning Contextual Tag Embeddings For Cross-Modal Alignment Of Audio And Tags. ArXiv (2020).
- Xavier Favory, Konstantinos Drossos, T. Virtanen, X. Serra. Coala: Co-Aligned Autoencoders For Learning Semantically Enriched Audio Representations. ArXiv (2020).
- Y. Koizumi, Ryo Masumura, Kyosuke Nishida, M. Yasuda, S. Saito. A Transformer-Based Audio Captioning Model With Keyword Estimation. INTERSPEECH (2020).
- You-Siang Chen, Zi Jie Lin, Shang-En Li, Chih-Yuan Koh, M. R. Bai, Jen-Tzung Chien, Yi-Wen Liu. Combined Sound Event Detection And Sound Event Separation Networks For Dcase 2020 Task 4 Technical Report (2020).
- Yuma Koizumi, Ryo Masumura, Kyosuke Nishida, Masahiro Yasuda, S. Saito. A Transformer-Based Audio Captioning Model With Keyword Estimation. INTERSPEECH (2020).
2019 (70)
- . Development Of Algorithms For Gunshot Detection (2019).
- A. Kumar, Ankit Shah, A. Hauptmann, B. Raj. Learning Sound Events From Webly Labeled Data. IJCAI (2019).
- A. Salekin, Shabnam Ghaffarzadegan, Zhe Feng, J. Stankovic. A Real-Time Audio Monitoring Framework With Limited Data For Constrained Devices. 2019 15th International Conference on Distributed Computing in Sensor Systems (DCOSS) (2019).
- A. Tanaka. Embodied Musical Interaction - Body Physiology, Cross Modality, And Sonic Experience. New Directions in Music and Human-Computer Interaction (2019).
- António Ramires, X. Serra. Data Augmentation For Instrument Classification Robust To Audio Effects. ArXiv (2019).
- António Ramires, Pritish Chandna, Xavier Favory, Emilia Gómez, X. Serra. Neural Percussive Synthesis Parameterised By High-Level Timbral Features. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2019).
- Ariane Stolfi, A. Milo, M. Barthet. Playsound.Space: Improvising In The Browser With Semantic Sound Objects (2019).
- B. Elizalde, Shuayb Zarar, B. Raj. Cross Modal Audio Search And Retrieval With Joint Embeddings Based On Text And Audio. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2019).
- B. H. D. Koh, W. L. Woo. Multi-View Temporal Ensemble For Classification Of Non-Stationary Signals. IEEE Access (2019).
- B. McFee, J. Kim, M. Cartwright, Justin Salamon, Rachel M. Bittner, J. Bello. Open-Source Practices For Music Signal Processing Research: Recommendations For Transparent, Sustainable, And Reproducible Audio Research. IEEE Signal Processing Magazine (2019).
- B. Silva, Axel W. Happi, An Braeken, A. Touhafi. Evaluation Of Classical Machine Learning Techniques Towards Urban Sound Recognition On Embedded Systems. Applied Sciences (2019).
- B. Zhu, Kele Xu, D. Wang, Mathurin Aché. Detection And Classification Of Acoustic Scenes And Events 2019 Challenge Multi-Label Audio Tagging With Noisy Labels And Variable Length Technical Report (2019).
- Boyang Zhang, Jared Leitner, Samuel Thornton. Audio Recognition Using Mel Spectrograms And Convolution Neural Networks (2019).
- C. Kim, Byeongchang Kim, Hyunmin Lee, Gunhee Kim. Audiocaps: Generating Captions For Audios In The Wild. NAACL (2019).
- Ceren Can. Automatic Discrimination Of Domestic Cat Sounds And Imitations (2019).
- Chenliang Xu. Preprint-Work In Progress (2019).
- D. Liang, E. Thomaz. Audio-Based Activities Of Daily Living (Adl) Recognition With Large-Scale Acoustic Embeddings From Online Videos. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. (2019).
- Dimitra Emmanouilidou, H. Gamper. The Effect Of Room Acoustics On Audio Event Classification (2019).
- E. Fonseca, F. Font, Xavier Serra. Model-Agnostic Approaches To Handling Noisy Labels When Training Sound Event Classifiers. 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2019).
- E. Fonseca, M. Plakal, D. Ellis, F. Font, Xavier Favory, X. Serra. Learning Sound Event Classifiers From Web Audio With Noisy Labels. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2019).
- E. Fonseca, M. Plakal, F. Font, D. Ellis, X. Serra. Audio Tagging With Noisy Labels And Minimal Supervision. ArXiv (2019).
- Eero-Pekka Damskägg, Lauri Juvela, Etienne Thuillier, V. Välimäki. Deep Learning For Tube Amplifier Emulation. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2019).
- Etienne Richan, J. Rouat. A Study Comparing Shape, Colour And Texture As Visual Labels In Audio Sample Browsers. Audio Mostly Conference (2019).
- Evren Kanalici, Gokhan Bilgin. Scattering Wavelet Hash Fingerprints For Musical Audio Recognition (2019).
- F. J. M. Ortega, Sergio I. Giraldo, A. Pérez, R. Ramírez. Phrase-Level Modeling Of Expression In Violin Performances. Front. Psychol. (2019).
- H. Koh, W. L. Woo. Multi-View Temporal Ensemble For Classification Of Non-Stationary Signals (2019).
- H. Xie, T. Virtanen. Zero-Shot Audio Classification Based On Class Label Embeddings. 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2019).
- Haikun Huang, M. Solah, Dingzeyu Li, Lap-Fai Yu. Audible Panorama: Automatic Spatial Audio Generation For Panorama Imagery. CHI (2019).
- Harishchandra Dubey, Dimitra Emmanouilidou, I. Tashev. Cure Dataset: Ladder Networks For Audio Event Classification. 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM) (2019).
- Harsh Purohit, R. Tanabe, K. Ichige, T. Endo, Y. Nikaido, Kaori Suefusa, Y. Kawaguchi. Mimii Dataset: Sound Dataset For Malfunctioning Industrial Machine Investigation And Inspection. ArXiv (2019).
- Ivo Trowitzsch, Jalil Taghia, Youssef Kashef, K. Obermayer. The Nigens General Sound Events Database. ArXiv (2019).
- J. He, Penghao Rao, B. Sun, Lejun Yu. Audio Tagging With Minimal Supervision Based On Mean Teacher For Dcase 2019 Challenge Task 2 Technical Report (2019).
- J. Pons, J. Serrà, X. Serra. Training Neural Audio Classifiers With Few Data. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2019).
- J. Ramírez, M. Flores. Machine Learning For Music Genre: Multifaceted Review And Experimentation With Audioset. Journal of Intelligent Information Systems (2019).
- Jonas Margraf. Master's Thesis: Self-Organizing Maps For Sound Corpus Organization (2019).
- K. Ahmad, N. Conci. How Deep Features Have Improved Event Recognition In Multimedia. ACM Trans. Multim. Comput. Commun. Appl. (2019).
- K. He, Yu-Han Shen, W. Zhang. Multiple Neural Networks With Ensemble Method For Audio Tagging With Noisy Labels And Minimal Supervision (2019).
- K. Prinz, A. Flexer. Weak Multi-Label Audio-Tagging With Class Noise (2019).
- K. Salo. Modular Audio Platform For Youth Engagement In A Museum Context (2019).
- Kele Xu, B. Zhu, Qiuqiang Kong, Haibo Mi, B. Ding, D. Wang, H. Wang. General Audio Tagging With Ensembling Convolutional Neural Network And Statistical Features. The Journal of the Acoustical Society of America (2019).
- Kexin He, Yuhan Shen, W. Zhang. Thuee System For Dcase 2019 Challenge Task 2 Technical Report (2019).
- L. Gao, Haibo Mi, B. Zhu, Da-wei Feng, Yicong Li, Y. Peng. An Adversarial Feature Distillation Method For Audio Classification. IEEE Access (2019).
- L. Gao, Qirong Mao, M. Dong, Y. Jing, R. Chinnam. On Learning Disentangled Representation For Acoustic Event Detection. ACM Multimedia (2019).
- L. Lin, X. Wang, Hong Liu, Yueliang Qian. Guided Learning Convolution System For Dcase 2019 Task 4. ArXiv (2019).
- Lluis Suros. Clustering Of Multiple-Event Online Sound Collections With The Codebook Approach (2019).
- Luca Turchet, M. Barthet. An Ubiquitous Smart Guitar System For Collaborative Musical Practice (2019).
- Léo Cances, T. Pellegrini, Patrice Guyot. Multi-Task Learning And Post Processing Optimization For Sound Event Detection Technical Report (2019).
- M. Cartwright, Ana Elisa Méndez Méndez, J. Cramer, Vincent Lostanlen, G. Dove, Ho-Hsiang Wu, Justin Salamon, Oded Nov, J. Bello. Sonyc Urban Sound Tagging (Sonyc-Ust): A Multilabel Dataset From An Urban Acoustic Sensor Network (2019).
- Masayuki Karasuyama, Masashi Sugiyama. Canonical Dependency Analysis Based On Squared-Loss Mutual Information (2019).
- Md. Rahat-uz-Zaman, Shadmaan Hye, M. Hasan. Audio Future Block Prediction With Conditional Generative Adversarial Network. 2019 3rd International Conference on Electrical, Computer & Telecommunication Engineering (ICECTE) (2019).
- Miles Thorogood. Soundscape Generation Systems (2019).
- Miles Thorogood, Jianyu Fan, P. Pasquier. A Framework For Computer-Assisted Sound Design Systems Supported By Modelling Affective And Perceptual Properties Of Soundscape (2019).
- Nicolas Turpault, R. Serizel, Ankit Shah, Justin Salamon. Sound Event Detection In Domestic Environments With Weakly Labeled Data And Soundscape Synthesis (2019).
- Nicolas Turpault, R. Serizel, E. Vincent. Semi-Supervised Triplet Loss Based Learning Of Ambient Audio Embeddings. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2019).
- O. Akiyama, J. Sato. Dcase 2019 Task 2: Multitask Learning, Semi-Supervised Learning And Model Ensemble With Noisy Data For Audio Tagging (2019).
- Qiuqiang Kong, Yin Cao, T. Iqbal, Y. Xu, W. Wang, Mark D. Plumbley. Cross-Task Learning For Audio Tagging, Sound Event Detection And Spatial Localization: Dcase 2019 Baseline Systems. ArXiv (2019).
- S. A. Shahriyar, M. Akhand, N. Siddique, T. Shimamura. Speech Enhancement Using Convolutional Denoising Autoencoder. 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE) (2019).
- S. Astapov, G. Svirskiy, A. Lavrentyev, Tatyana Prisyach, D. Popov, Dmitriy Ubskiy, Vladimir Kabarov. Acoustic Event Mixing To Multichannel Ami Data For Distant Speech Recognition And Acoustic Event Classification Benchmarking. SPECOM (2019).
- S. Singh, A. Pankajakshan, Emmanouil Benetos. Audio Tagging Using A Linear Noise Modelling Layer (2019).
- Shota Ikawa, Kunio Kashino. Neural Audio Captioning Based On Conditional Sequence-To-Sequence Model (2019).
- Szu-Yu Chou, Kai-Hsiang Cheng, J. Jang, Y. Yang. Learning To Match Transient Sound Events Using Attentional Similarity For Few-Shot Sound Recognition. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2019).
- Tobias Goehring, M. Keshavarzi, R. Carlyon, B. Moore. Using Recurrent Neural Networks To Improve The Perception Of Speech In Non-Stationary Noise By People With Cochlear Implants. The Journal of the Acoustical Society of America (2019).
- W. Wang, F. Seraj, N. Meratnia, P. Havinga. Privacy-Aware Environmental Sound Classification For Indoor Human Activity Recognition. PETRA (2019).
- Wootaek Lim. Specaugment For Sound Event Detection In Domestic Environments Using Ensemble Of Convolutional Recurrent Neural Networks (2019).
- Wootaek Lim, S. Suh, Sooyoung Park, Youngho Jeong. Sound Event Detection In Domestic Environments Using Ensemble Of Convolutional Recurrent Neural Networks Technical Report (2019).
- Xavier Favory, X. Serra. Multi Web Audio Sequencer: Collaborative Music Making. ArXiv (2019).
- Yapeng Tian, Chenliang Xu, Dingzeyu Li. Deep Audio Prior. ArXiv (2019).
- Yuma Koizumi, S. Saito, H. Uematsu, N. Harada, Keisuke Imoto. Toyadmos: A Dataset Of Miniature-Machine Operating Sounds For Anomalous Sound Detection. 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2019).
- Z. Podwinska, B. Fazenda, W. Davies. Testing Spatial Aspects Of Auditory Salience (2019).
- Ziqiang Shi, L. Liu, Huibin Lin, R. Liu, Anyan Shi. Hodgepodge: Sound Event Detection Based On Ensemble Of Semi-Supervised Learning Methods. ArXiv (2019).
2018 (37)
- Andreu Boadas Rabassedas. Study Of The Signal Properties Of Music Genres (2018).
- Aniel Rossi. Event Recognition Of Domestic Sounds Using Semi-Supervised Learning (2018).
- Anna Xambó, G. Roma, Alexander Lerch, M. Barthet, György Fazekas. Live Repurposing Of Sounds: Mir Explorations With Personal And Crowdsourced Databases. NIME (2018).
- Ariane de Souza Stolfi, Miguel Ceriani, Luca Turchet, M. Barthet. Playsound.Space: Inclusive Free Music Improvisations Using Audio Commons. NIME (2018).
- Chris Baume. Semantic Audio Tools For Radio Production (2018).
- E. Fonseca, M. Plakal, F. Font, D. Ellis, Xavier Favory, J. Pons, X. Serra. General-Purpose Tagging Of Freesound Audio With Audioset Labels: Task Description, Dataset, And Baseline. ArXiv (2018).
- F. Viola, A. Stolfi, A. Milo, Miguel Ceriani, M. Barthet, György Fazekas. Playsound.Space: Enhancing A Live Music Performance Tool With Semantic Recommendations. SAAM@ISWC (2018).
- F. Viola, Ariane Stolfi, A. Milo, Miguel Ceriani, M. Barthet, György Fazekas. Playsound.Space: Enhancing A Live Performance Tool With Semantic Recommendations (2018).
- G. Roma, Owen Green, Anna Xambó, P. Tremblay. A Javascript Library For Flexible Visualization Of Audio Descriptors (2018).
- Gabriel Meseguer-Brocal, Alice Cohen-Hadria, Geoffroy Peeters. Dali: A Large Dataset Of Synchronized Audio, Lyrics And Notes, Automatically Created Using Teacher-Student Machine Learning Paradigm. ISMIR (2018).
- Gerard Llorach, G. Grimm, Maartje M. E. Hendrikse, V. Hohmann. Towards Realistic Immersive Audiovisual Simulations For Hearing Research: Capture, Virtual Scenes And Reproduction. AVSU@MM (2018).
- Gierad Laput, K. Ahuja, Mayank Goel, C. Harrison. Ubicoustics: Plug-And-Play Acoustic Activity Recognition. UIST (2018).
- Gierad Laput, Karan Ahuja, Mayank Goel, Chris Harrison. Ubicoustics. Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology (2018).
- Henry Kvinge, Elin Farnell, M. Kirby, C. Peterson. Monitoring The Shape Of Weather, Soundscapes, And Dynamical Systems: A New Statistic For Dimension-Driven Data Analysis On Large Datasets. 2018 IEEE International Conference on Big Data (Big Data) (2018).
- J. Palomaki, Olivia Rhinehart, Michael Tseng. A Case For A Range Of Acceptable Annotations. SAD/CrowdBias@HCOMP (2018).
- Kele Xu, B. Zhu, D. Wang, Yu-xing Peng, H. Wang, Lilun Zhang, B. Li. Meta Learning Based Audio Tagging (2018).
- Kevin Wilkinghoff. General-Purpose Audio Tagging By Ensembling Convolutional Neural Networks Based On Multiple Features (2018).
- L. Turchet, M. Barthet. Jamming With A Smart Mandolin And Freesound-Based Accompaniment. 2018 23rd Conference of Open Innovations Association (FRUCT) (2018).
- Linus Lexfors, Malte Johansson. Audio Representation For Environmental Sound Classification Using Convolutional Neural Networks (2018).
- M. Dorfer, G. Widmer. Training General-Purpose Audio Tagging Networks With Noisy Labels And Iterative Self-Verification (2018).
- M. Mancas, Christian Frisson, Noé Tits, et al. Proceedings Of Enterface 2015 Workshop On Intelligent Interfaces. ArXiv (2018).
- MeMAD Deliverable. Memad Deliverable D2.1 Libraries And Tools For Multimodal Content Analysis (2018).
- Michael Wand, Ngoc Thang Vu, J. Schmidhuber. Investigations On End-To-End Audiovisual Fusion. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2018).
- Naoya Takahashi, Michael Gygli, L. V. Van Gool. Aenet: Learning Deep Audio Features For Video Analysis. IEEE Transactions on Multimedia (2018).
- Philip Tovstogan. Exploring Music Similarity With Acousticbrainz (2018).
- Shota Ikawa, Kunio Kashino. Acoustic Event Search With An Onomatopoeic Query: Measuring Distance Between Onomatopoeic Words And Sounds (2018).
- Sophie Skach, Anna Xambó, L. Turchet, A. Stolfi, R. Stewart, M. Barthet. Embodied Interactions With E-Textiles And The Internet Of Sounds For Performing Arts. Tangible and Embedded Interaction (2018).
- T. Iqbal, Qiuqiang Kong, Mark D. Plumbley, W. Wang. General-Purpose Audio Tagging From Noisy Labels Using Convolutional Neural Networks (2018).
- T. Malon, G. Roman-Jimenez, Patrice Guyot, S. Chambon, V. Charvillat, A. Crouzil, A. Péninou, J. Pinquier, F. Sèdes, C. Sénac. Toulouse Campus Surveillance Dataset: Scenarios, Soundtracks, Synchronized Videos With Overlapping And Disjoint Views. MMSys (2018).
- Thi Ngoc Tho Nguyen, Ngoc Khanh Nguyen, Douglas L. Jones, W. Gan. Dcase 2018 Task 2: Iterative Training, Label Smoothing, And Background Noise Normalization For Audio Event Tagging. DCASE (2018).
- Tian-Xiang Chen, Udit Gupta. Attention-Based Convolutional Neural Network For Audio Event Classification With Feature Transfer Learning (2018).
- Turab Iqbal, Qiuqiang Kong, Mark D. Plumbley. Stacked Convolutional Neural Networks For General-Purpose Audio Tagging Technical Report (2018).
- V. Subramanian, Alexander Lerch. Concert Stitch: Organization And Synchronization Of Crowd Sourced Recordings. ISMIR (2018).
- Venkatesh S. Kadandale. Musical Instrument Recognition In Multi-Instrument Audio Contexts (2018).
- Xavier Favory, E. Fonseca, F. Font, X. Serra. Facilitating The Manual Annotation Of Sounds When Using Large Taxonomies. ArXiv (2018).
- Zhicun Xu. Audio Event Classification Using Deep Learning Methods (2018).
- Zhicun Xu, P. Smit, M. Kurimo. The Aalto System Based On Fine-Tuned Audioset Features For Dcase2018 Task2 - General Purpose Audio Tagging (2018).
2017 (17)
- A. C. D. C. Junior. Mobile Technologies For Music Interaction (2017).
- A. Correya. Retrieving Ambiguous Sounds Using Perceptual Timbral Attributes In Audio Production Environments (2017).
- A. Stolfi, M. Barthet, Fábio Goródscy, A. C. D. C. Junior. Open Band: A Platform For Collective Sound Dialogues. Audio Mostly Conference (2017).
- Akito van Troyer. Score Instruments: A New Paradigm Of Musical Instruments To Guide Musical Wonderers (2017).
- Aleksandr Diment, T. Virtanen. Transfer Learning Of Weakly Labelled Audio. 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2017).
- Ashwin K. Vijayakumar, Ramakrishna Vedantam, D. Parikh. Sound-Word2Vec: Learning Word Representations Grounded In Sounds. EMNLP (2017).
- D. Hernández-Leo, Kostantinos Michos, B. Cabrero, Daniel, A. Martínez-Rodríguez, M. Muñoz, Carla Ten Ventura, K. Sharma, Manaswi Mishra, S. Bhardwaj, Adrian A Perez, Giorgos Neokleous, Pantelis Stylianides, Vibhor Bajpai, N. Delgado, Tessy Troes, Meghana Sudhindra, H. Cuesta. Phd Selection: Factors To Take Into Account (2017).
- Douwe Kiela. Deep Embodiment: Grounding Semantics In Perceptual Modalities (2017).
- Douwe Kiela, Stephen Clark. Learning Neural Audio Embeddings For Grounding Semantics In Auditory Perception. J. Artif. Intell. Res. (2017).
- E. Cherny. A Method For Automatic Whoosh Sound Description (2017).
- E. Fonseca, J. Pons, Xavier Favory, F. Font, D. Bogdanov, Andrés Ferraro, S. Oramas, A. Porter, X. Serra. Freesound Datasets: A Platform For The Creation Of Open Audio Datasets. ISMIR (2017).
- Emiel van Miltenburg. Pragmatic Descriptions Of Perceptual Stimuli. EACL (2017).
- Georgios Paraskevopoulos, Giannis Karamanolakis, E. Iosif, A. Pikrakis, A. Potamianos. Sensory-Aware Multimodal Fusion For Word Semantic Similarity Estimation (2017).
- Hernán Ordiales, Matías Lennie Bruno. Sound Recycling From Public Databases: Another Bigdata Approach To Sound Collections. Audio Mostly Conference (2017).
- M. Briani, A. Cuyt, W. Lee. Validated Exponential Analysis For Harmonic Sounds (2017).
- S. R. Park, J. Lee. A Fully Convolutional Neural Network For Speech Enhancement. INTERSPEECH (2017).
- Vincent Lostanlen. Convolutional Operators In The Time-Frequency Domain (2017).
2016 (20)
- Chris Donahue. Extensions To Convolution For Generalized Cross-Synthesis (2016).
- Chris Donahue, T. Erbe, M. Puckette. Extended Convolution Techniques For Cross-Synthesis. ICMC (2016).
- Douwe Kiela. Mmfeat: A Toolkit For Extracting Multi-Modal Features. ACL (2016).
- Elliot Creager. Musical Source Separation By Coherent Frequency Modulation Cues (2016).
- Emiel van Miltenburg, Benjamin Timmermans, Lora Aroyo. The Vu Sound Corpus: Adding More Fine-Grained Annotations To The Freesound Database. LREC (2016).
- Etto L. Salomons, P. Havinga, H. V. Leeuwen. Inferring Human Activity Recognition With Ambient Sound On Wireless Sensor Nodes. Sensors (2016).
- F. Font, T. Brookes, G. Fazekas, M. Guerber, Amaury La Burthe, David Plans, Mark D. Plumbley, Meir Shaashua, W. Wang, X. Serra. Audio Commons: Bringing Creative Commons Audio Content To The Creative Industries (2016).
- F. Font, X. Serra. Tempo Estimation For Music Loops And A Simple Confidence Measure. ISMIR (2016).
- Giannis Karamanolakis, E. Iosif, A. Zlatintsi, A. Pikrakis, A. Potamianos. Audio-Based Distributional Representations Of Meaning Using A Fusion Of Feature Encodings. INTERSPEECH (2016).
- Giuseppe Bandiera, O. Picas, Hiroshi Tokuda, Wataru Hariya, K. Oishi, X. Serra. Good-Sounds.Org: A Framework To Explore Goodness In Instrumental Sounds. ISMIR (2016).
- H. Meutzner, D. Kolossa. A Non-Speech Audio Captcha Based On Acoustic Event Detection And Classification. 2016 24th European Signal Processing Conference (EUSIPCO) (2016).
- J. R. Delgado-Contreras, J. García-Vázquez, R. Brena. Optimizing The Length Of An Environmental Audio Fingerprint For Place Classification. 2016 International Conference on Electronics, Communications and Computers (CONIELECOMP) (2016).
- J. Serrà, Josep Lluís Arcos. Particle Swarm Optimization For Time Series Motif Discovery. Knowl. Based Syst. (2016).
- Long-Van Nguyen-Dinh. Wearable Activity Recognition With Crowdsourced Annotation (2016).
- M. F. Assaneo, J. Sitt, G. Varoquaux, M. Sigman, L. Cohen, M. Trevisan. Exploring The Anatomical Encoding Of Voice With A Mathematical Model Of The Vocal System. NeuroImage (2016).
- Mark D. Plumbley, C. Kroos, J. Bello, G. Richard, D. Ellis, A. Mesaros. Proceedings Of The Detection And Classification Of Acoustic Scenes And Events 2018 Workshop (Dcase2018) (2016).
- Naoya Takahashi, Michael Gygli, B. Pfister, L. Gool. Deep Convolutional Neural Networks And Data Augmentation For Acoustic Event Recognition. INTERSPEECH (2016).
- Naoya Takahashi, Michael Gygli, B. Pfister, L. Gool. Deep Convolutional Neural Networks And Data Augmentation For Acoustic Event Detection (2016).
- S. Parekh, F. Font, X. Serra. Improving Audio Retrieval Through Loudness Profile Categorization. 2016 IEEE International Symposium on Multimedia (ISM) (2016).
- V. Goudarzi, A. Gioti. Engagement And Interaction In Participatory Sound Art (2016).
2015 (19)
- A. Lopopolo, Emiel van Miltenburg. Sound-Based Distributional Models. IWCS (2015).
- Anna Xambó. Tabletop Tangible Interfaces For Music Performance: Design And Evaluation (2015).
- C. Roberts, Matthew Wright, J. Kuchera-Morin. Music Programming In Gibber. ICMC (2015).
- Diego Castán, David Tavarez, Paula Lopez-Otero, J. Franco-Pedroso, H. Delgado, E. Navas, L. Fernández, D. Ramos-Castro, J. Serrano, A. Ortega, E. Lleida. Albayzín-2014 Evaluation: Audio Segmentation And Classification In Broadcast News Domains. EURASIP J. Audio Speech Music. Process. (2015).
- Douwe Kiela, Stephen Clark. Multi- And Cross-Modal Semantics Beyond Vision: Grounding In Auditory Perception. EMNLP (2015).
- F. Font. Tag Recommendation Using Folksonomy Information For Online Sound Sharing Platforms (2015).
- F. Font, J. Serrà, X. Serra. Analysis Of The Impact Of A Tag Recommendation System In A Real-World Folksonomy. TIST (2015).
- G. Roma, X. Serra. Music Performance By Discovering Community Loops (2015).
- G. Roma, X. Serra. Querying Freesound With A Microphone (2015).
- H. Nishino, R. Nakatsu. Computer Music Languages And Systems: The Synergy Between Technology And Creativity (2015).
- Jainesh Doshi, Vishrant Tripathi, O. Desai, Shreyas Mangalgi. Instrument Classification Using Spiking Neural Networks (2015).
- Karol J. Piczak. Esc: Dataset For Environmental Sound Classification. ACM Multimedia (2015).
- Niklas Klügel. Collaborative Music-Making With Interactive Tabletops (2015).
- O. Picas, H. P. Rodriguez, Dara Dabiri, Hiroshi Tokuda, Wataru Hariya, K. Oishi, X. Serra. A Real-Time System For Measuring Sound Goodness In Instrumental Sounds (2015).
- Pablo Villegas. Content-Preserving Reconstruction Of Electronic Music Sessions Using Freely Available Musical Building-Blocks (2015).
- Qingchang Zhu, Z. Chen, Y. Soh. Using Unlabeled Acoustic Data With Locality-Constrained Linear Coding For Energy-Related Activity Recognition In Buildings. 2015 IEEE International Conference on Automation Science and Engineering (CASE) (2015).
- T. Kelkar, Anon Ray, Venkatesh Choppella. Sangeetkosh: An Open Web Platform For Music Education. 2015 IEEE 15th International Conference on Advanced Learning Technologies (2015).
- V. Apopei. Detection Dangerous Events In Environmental Sounds - A Preliminary Evaluation. 2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) (2015).
- Vito Claudio Ostuni, T. D. Noia, E. D. Sciascio, S. Oramas, X. Serra. A Semantic Hybrid Approach For Sound Recommendation. WWW (2015).
2014 (10)
- C. Jacoby. Automatic Urban Sound Classification Using Feature Learning Techniques (2014).
- D. Wolff. Spot The Odd Song Out: Similarity Model Adaptation And Analysis Using Relative Human Ratings (2014).
- F. Font, J. Serrà, X. Serra. Audio Clip Classification Using Social Tags And The Effect Of Tag Expansion. Semantic Audio (2014).
- F. Font, J. Serrà, X. Serra. Class-Based Tag Recommendation And User-Based Evaluation In Online Audio Clip Sharing. Knowl. Based Syst. (2014).
- F. Font, S. Oramas, György Fazekas, X. Serra. Extending Tagging Ontologies With Domain Specific Knowledge. International Semantic Web Conference (2014).
- J. R. Delgado-Contreras, J. García-Vázquez, R. Brena, C. E. Galván-Tejada, J. I. Galván-Tejada. Feature Selection For Place Classification Through Environmental Sounds. EUSPN/ICTH (2014).
- João Paulo Cordeiro. Sound Based Social Networks (2014).
- L. Wyse. Interactive Audio Web Development Workflow. ACM Multimedia (2014).
- Ohad Fried, Zeyu Jin, Reid Oda, A. Finkelstein. Audioquilt: 2D Arrangements Of Audio Samples Using Metric Learning And Kernelized Sorting. NIME (2014).
- Patrice Guyot. Caractérisation Et Reconnaissance De Sons D'Eau Pour Le Suivi Des Activités De La Vie Quotidienne : Une Approche Fondée Sur Le Signal, L'Acoustique Et La Perception (2014).
2013 (7)
- D. Wolff, Tillman Weyde. Learning Music Similarity From Relative User Ratings. Information Retrieval (2013).
- F. Font, J. Serrà, X. Serra. Folksonomy-Based Tag Recommendation For Collaborative Tagging Systems. Int. J. Semantic Web Inf. Syst. (2013).
- Long-Van Nguyen-Dinh, U. Blanke, G. Tröster. Towards Scalable Activity Recognition: Adapting Zero-Effort Crowdsourced Acoustic Models. MUM (2013).
- Miles Thorogood, P. Pasquier. Computationally Created Soundscapes With Audio Metaphor. ICCC (2013).
- Motohiro Sunouchi, Yuzuru Tanaka. Similarity Search Of Freesound Environmental Sound Based On Their Enhanced Multiscale Fractal Dimension (2013).
- Niklas Klügel, G. Groh. Towards Mapping Timbre To Emotional Affect. NIME (2013).
- Patrice Guyot, J. Pinquier, R. André-Obrecht. Water Sound Recognition Based On Physical Models. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (2013).
2012 (10)
- Brandon Mechtley, Andreas Spanias, P. Cook. Shortest Path Techniques For Annotation And Retrieval Of Environmental Sounds. ISMIR (2012).
- F. Font, G. Roma, P. Herrera, X. Serra. Characterization Of The Freesound Online Community. 2012 3rd International Workshop on Cognitive Information Processing (CIP) (2012).
- F. Font, J. Serrà, X. Serra. Folksonomy-Based Tag Recommendation For Online Audio Clip Sharing. ISMIR (2012).
- F. Font, X. Serra. Analysis Of The Folksonomy Of Freesound (2012).
- G. Roma, Anna Xambó, P. Herrera, Robin C. Laney. Factors In Human Recognition Of Timbre Lexicons Generated By Data Clustering (2012).
- G. Roma, P. Herrera, M. Zanin, S. Marín, F. Font, X. Serra. Small World Networks And Creativity In Audio Clip Sharing. Int. J. Soc. Netw. Min. (2012).
- M. Rossi, G. Tröster, O. Amft. Recognizing Daily Life Context Using Web-Collected Audio Data. 2012 16th International Symposium on Wearable Computers (2012).
- M. Sordo, Gopala K. Koduri, Sankalp Gulati, X. Serra. A Musically Aware System For Browsing And Interacting With Audio Music Collections (2012).
- Masayuki Karasuyama, Masashi Sugiyama. Canonical Dependency Analysis Based On Squared-Loss Mutual Information. Neural Networks (2012).
- Miles Thorogood, P. Pasquier, Arne Eigenfeldt. Audio Metaphor: Audio Information Retrieval For Soundscape Composition (2012).
2011 (3)
- J. Janer, G. Roma, S. Kersten. Authoring Augmented Soundscapes With User-Contributed Content (2011).
- J. Janer, S. Kersten, Mattia Schirosa, G. Roma. An Online Platform For Interactive Soundscapes With User-Contributed Audio Content (2011).
- Nuno N. Correia. Av Clash, Online Audiovisual Project: A Case Study Of Evaluation In New Media Art. Advances in Computer Entertainment Technology (2011).
2010 (3)
- G. Roma, J. Janer, S. Kersten, Mattia Schirosa, P. Herrera, X. Serra. Ecological Acoustics Perspective For Content-Based Retrieval Of Environmental Sounds. EURASIP J. Audio Speech Music. Process. (2010).
- G. Roma, P. Herrera. Graph Grammar Representation For Collaborative Sample-Based Music Creation. Audio Mostly Conference (2010).
- G. Roma, P. Herrera. Community Structure In Audio Clip Sharing. 2010 International Conference on Intelligent Networking and Collaborative Systems (2010).
2009 (2)
- Gerard Roma Trepat, Perfecto Herrera-Boyer, X. Serra. Freesound Radio: Supporting Music Creation By Exploration Of A Sound Database (2009).
- M. Magas, Polina Proutskova. A Location-Tracking Interface For Ethnomusicological Collections (2009).