Embeddings, vector representations of features in a high-dimensional space, are growing in popularity across domains spanning natural language processing, biomedical engineering, and geospatial decision making. Embeddings are especially prominent in audio processing, where popular semantic models such as Word2Vec and GloVe aid in the feature identification, classification, and diagnosis of sound data converted to text. Conversely, when phonemes and semantics are of less importance, audio waveforms themselves are embedded using features such as amplitude, frequency, and phase shift.
Unfortunately, little research exists on visual embedding representations of audio waveforms. CCRi's prior research on track image chips suggests that image signatures of geospatial events embed to provide observations just as meaningful as the geocoordinates themselves. Following this example, with an ensemble of Magenta (an open-source Python library for music and image manipulation) and CCRi's image processing software, we will demonstrate to what degree music signatures successfully cluster together in embedding space and whether classifications can be made from these findings.
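To make the clustering idea concrete, here is a minimal sketch of evaluating how well embedding vectors separate into clusters. The random vectors, the two "genre" groups, and all parameters are hypothetical stand-ins for illustration; real music signature embeddings would come from the embedding pipeline described above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Hypothetical stand-in data: random vectors playing the role of
# 128-dimensional music signature embeddings from two groups.
rng = np.random.default_rng(0)
embeddings = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 128)),  # group A
    rng.normal(loc=3.0, scale=0.5, size=(50, 128)),  # group B
])

# Cluster the embeddings, then score how cleanly they separate
# (silhouette near 1.0 means tight, well-separated clusters).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(embeddings)
score = silhouette_score(embeddings, labels)
print("silhouette score:", score)
```

A high silhouette score on real signature embeddings would support the claim that they carry enough structure for downstream classification.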
To sign up, you will need a ticket from Eventbrite for the Applied Machine Learning Conference.