About the frequency data distribution of Analyzer output in Web Audio

Original link: https://jw1.dev/2022/08/19/a01.html

With the update of iOS 16 beta 5, observant users may have noticed that while the system is playing music, a small spectrum display appears in the upper-right corner of the music controls. For someone like me who loves music, watching it move with the music just makes me feel better. So the question is: can we implement such a feature in the browser? The answer is yes, but there seems to be a small catch.

[Image: the small spectrum display in the iOS 16 music console]


Let's set that concern aside for now and just grab some code from MDN to see the effect.

```html
<div id="app">
  <audio id="music" src="test-2.mp3"></audio>
  <button @click="play">play</button>
  <div style="display: flex; height: 60px; align-items: flex-end; margin-top: 20px">
    <div class="item" v-for="item in fData" :style="{height: item / 255 * 100 + '%'}"></div>
  </div>
</div>

<script src="vue.js"></script>
```
```css
.item {
  width: 6px;
  min-height: 6px;
  background: #333333;
  align-items: flex-end;
  border-radius: 6px;
  margin-right: 3px;
}
```
```js
new Vue({
  el: '#app',
  data: function () {
    return {
      fData: []
    }
  },
  methods: {
    play () {
      let _ = this
      let audio = document.getElementById('music')
      audio.play()

      let audioContext = new AudioContext()
      let audioSrc = audioContext.createMediaElementSource(audio)
      let analyzer = audioContext.createAnalyser()
      analyzer.fftSize = 32

      audioSrc.connect(analyzer)
      analyzer.connect(audioContext.destination)

      let bufferLength = analyzer.frequencyBinCount
      let frequencyData = new Uint8Array(bufferLength)

      setInterval(() => {
        analyzer.getByteFrequencyData(frequencyData)
        _.fData = _.uint8ArrayToArray(frequencyData)
      }, 1000 / 24)
    },
    uint8ArrayToArray (uint8Array) {
      let array = []

      for (let i = 0; i < uint8Array.byteLength; i++) {
        array[i] = uint8Array[i]
      }

      return array
    }
  }
})
```

Result:

With that, a simple spectrum display is done (an old-school sense of achievement), and the principle is quite simple:

  1. Create an audio context ( AudioContext )
  2. Map the <audio> element through the context and assign it to audioSrc
  3. Create an analyzer node ( AnalyserNode )
  4. Connect audioSrc to the analyzer, and connect the analyzer to ctx.destination , which is the user's audio output
  5. Graph the analyzer's data

Now, here comes the real problem: no matter what music I try, my spectrogram always looks like this ⬇️ (and there goes the sense of achievement)

[Image: my spectrogram — low-frequency bars full, high-frequency bars nearly empty]

There may be small differences between tracks, but the effect is basically the same: the low frequencies look full while the high frequencies are almost absent.

At this point, as a "quasi-professional arranger", my intuition was that the spectral data given by the Web Audio API might be linearly distributed. To verify this guess, I made a sine-wave audio file that slowly sweeps from 30 Hz up to 20,000 Hz, increased the analyzer's fftSize , and fed it through the code above to see the output:

Let’s take a look at the output of the same audio in Spectrum Linear mode in Ableton Live:

The two look basically identical, which verifies my conjecture: the spectral data output by the Web Audio API is linearly distributed. What is a linear distribution? It's easy to understand: in a coordinate system, the distance traveled on the X axis from 10 to 20 is the same as the distance from 2000 to 2010. However, for human hearing, a linear distribution is not optimal. Let's first look at some screenshots of professional EQ plug-ins to see how they distribute the spectrum.
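To make "linearly distributed" concrete, here is a small sketch of my own (the helper `binFrequency` is illustrative, not part of the Web Audio API): each element that `getByteFrequencyData` fills in covers an equal-width slice of the spectrum, `sampleRate / fftSize` Hz wide.

```js
// Each FFT bin covers an equal-width frequency range: sampleRate / fftSize Hz.
// binFrequency is a hypothetical helper, not a Web Audio API function.
function binFrequency(binIndex, sampleRate, fftSize) {
  return binIndex * sampleRate / fftSize
}

// With fftSize = 32 at a 48 kHz sample rate, the 16 bins start at:
const sampleRate = 48000
const fftSize = 32
const binStarts = []
for (let i = 0; i < fftSize / 2; i++) {
  binStarts.push(binFrequency(i, sampleRate, fftSize))
}
console.log(binStarts) // 0, 1500, 3000, ..., 22500 — equal 1500 Hz steps
```

So a single 1500 Hz-wide bin holds everything from the sub-bass up to 1.5 kHz, while fifteen more bins cover the rest: exactly why the leftmost bars look "full" and the rest barely move.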

FabFilter Pro Q3:

[Image: FabFilter Pro Q3 spectrum display]

Eiosis Air EQ Premium:

[Image: Eiosis Air EQ Premium spectrum display]

SlateDigital Inf EQ:

[Image: SlateDigital Inf EQ spectrum display]

Ableton Live EQ Eight:

[Image: Ableton Live EQ Eight spectrum display]

As you can see, these frequency axes are basically piecewise logarithmic (the frequency values themselves grow exponentially), and the curves also seem to follow optimizations based on the equal-loudness contours.

We often hear the phrase "exponential growth". A simple example: 10, 100, 1000, 10000 — this sequence grows exponentially. In the spectrograms above, the frequency axis is laid out in basically this way.

We don't need to know much about logarithms. The logarithmic distribution can be explained by extending the linear example above: 10 to 20 and 2000 to 2010 both span 10 units on the X axis, but on a logarithmic axis they occupy different distances. If we treat the spectrum analyzer's visualization panel as a coordinate system following the distribution rule Ableton Live prefers, then on the X axis 10 – 10000 is divided into three equally wide segments, 10 – 100, 100 – 1000, and 1000 – 10000, and each segment is subdivided into 9 logarithmically spaced parts.
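The per-decade subdivision just described can be generated programmatically. This is my own illustration (`logSpace` is a hypothetical helper, not from the post or any library):

```js
// Build a logarithmically spaced frequency axis: each decade (10-100,
// 100-1000, 1000-10000) occupies equal width and is split into 9 parts,
// i.e. the ratio between neighboring points is constant.
function logSpace(start, stop, pointsPerDecade) {
  const points = []
  const decades = Math.log10(stop / start)
  const total = Math.round(decades * pointsPerDecade)
  for (let i = 0; i <= total; i++) {
    points.push(start * Math.pow(10, i / pointsPerDecade))
  }
  return points
}

const axis = logSpace(10, 10000, 9)
// 28 points over three decades; neighbors share the ratio 10^(1/9) ≈ 1.292
```

Spacing the tick marks this way reproduces the axis you see in the EQ screenshots: 100, 1000, and 10000 land at equal distances from each other.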

As for the equal-loudness contour, it is not objective like a logarithm but a subjective concept. In plain terms: sounds at the same level but different frequencies sound differently loud to people. That is, given a 4000 Hz sine wave at 60 dB and a 10000 Hz sine wave at 60 dB, the former will sound louder. This concept also explains why, when you delete everything in a song above 10 kHz (which you could think of as cutting the spectrum in half), you won't feel that the quality of the whole song dropped by 50%; subjectively it may only sound about 10% worse.
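That rough "10%" figure can actually be sanity-checked on a logarithmic axis. This is my own back-of-the-envelope calculation, assuming an audible range of 20 Hz – 20 kHz:

```js
// On a linear axis, 10 kHz - 20 kHz is half of the 0 - 20 kHz span.
// On a logarithmic axis over the audible range 20 Hz - 20 kHz, that same
// band is only a small fraction of the total width.
const low = 20       // lower edge of hearing, Hz
const high = 20000   // upper edge of hearing, Hz
const cut = 10000    // everything above this frequency is removed

const removedFraction =
  Math.log10(high / cut) / Math.log10(high / low)

console.log(removedFraction.toFixed(3)) // ≈ 0.100, i.e. about 10%
```

Half the linear bandwidth turns out to be only the top of the last decade, about a tenth of the logarithmic width, which lines up with the subjective impression.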

Personally, I believe this spectrum distribution pattern exists to compensate for the characteristics of the human ear: it visually simulates hearing's "defects" so that what you see matches what you hear, improving the user experience.


As for how to implement it in the browser, it is actually very simple: we only need to approximate the exponential distribution, without bothering with the per-segment logarithmic subdivision. Demo address: https://jw1.dev/frequency-test/test-2.html
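A minimal sketch of that idea (my own; the demo's actual code may differ, and `logResample` with its parameters is hypothetical): pick exponentially spaced target frequencies, then for each one read the nearest linear FFT bin.

```js
// Resample linearly spaced FFT bins at exponentially spaced frequencies.
// frequencyData is the Uint8Array filled by getByteFrequencyData();
// bars, minHz, sampleRate and fftSize are illustrative parameters.
function logResample(frequencyData, bars, minHz, sampleRate, fftSize) {
  const nyquist = sampleRate / 2
  const binWidth = sampleRate / fftSize
  const out = []
  for (let i = 0; i < bars; i++) {
    // Exponential interpolation from minHz up to the Nyquist frequency
    const freq = minHz * Math.pow(nyquist / minHz, i / (bars - 1))
    const bin = Math.min(
      frequencyData.length - 1,
      Math.round(freq / binWidth)
    )
    out.push(frequencyData[bin])
  }
  return out
}
```

With this mapping, several of the left-hand bars read from the low, closely spaced bins while the high frequencies get compressed together, which is exactly the visual behavior the EQ screenshots above exhibit.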

Now look at the comparison again:

[Images: linear vs. exponential spectrum display comparison]

You can tell at a glance which experience is better.

At this point, a spectrum display that isn't perfect but looks good is complete. I have also raised an issue with the Web Audio open-source project, hoping support will be added in the future to give developers a better experience!

✌️
