Signal Processing Laboratory

We present a systematic approach to audio mixing based on synchronized User-Generated audio Recordings (UGRs), e.g., audio recordings contributed by users attending the same public event. We discuss the challenges of creating a mixture from such recordings, which arise mainly because each audio stream spans a different portion of the event of interest and has different signal level characteristics. We propose combining the available recordings in two steps: a normalization step and a mixing step. The normalization step defines a time-invariant gain that is specific to each UGR. The mixing step employs a mechanism that reduces the master gain according to the number of inputs active at each time instant. An approach called orthogonal mixing is presented, designed under the assumption that the mixture components are mutually independent. The presented mixing process allows multiple short-duration UGRs to be combined into a longer audio stream of potentially better quality than any one of its constituent parts.},
keywords = {automatic mixing, normalization, user-generated content, user-generated recordings