Since the development of sound recording technologies, the palette of sound timbres available for music creation has been extended far beyond traditional musical instruments. The organization and categorization of timbre has been a common endeavor. The availability of large databases of sound clips provides an opportunity for obtaining data-driven timbre categorizations via content-based clustering. In this article we describe an experiment aimed at understanding what factors influence the process of learning a given clustering of sound samples. We clustered a large database of short sound clips and analyzed the success of participants in assigning sounds to the “correct” clusters after listening to a few examples of each. The results of the experiment suggest a number of relevant factors related both to the strategies followed by users and to the quality measures of the clustering solution, which can guide the design of creative applications based on audio clip clustering.

Authors: Gerard Roma, Anna Xambó, Perfecto Herrera, Robin Laney
Published in: Sound and Music Computing Conference (2012)
URL: http://mtg.upf.edu/node/2563