Poor resampling quality when using AudioContext sampleRate parameter

cdumez · cdumez · commit dd0dea201ba7 · 2020-11-21T01:45:54.000Z
https://bugs.webkit.org/show_bug.cgi?id=219201 Reviewed by Geoff Garen. Source/WebCore: MultiChannelResampler uses a SincResampler per audio channel. In MultiChannelResampler::process(), it was calling SincResampler::process() for each channel, which would potentially end up calling MultiChannelResampler::ChannelProvider::provideInput() to provide channel data used for resampling. The issue was that MultiChannelResampler::ChannelProvider::provideInput() is implemented in such a way that things will break if provideInput() gets called more than once per channel. When using an AudioContext's sample rate larger than the hardware sample rate, provideInput() was getting called more than once per channel and this resulted in very poor resampling quality. To address the issue, MultiChannelResampler::process() now processes the data in chunks that are small enough to guarantee that MultiChannelResampler::ChannelProvider::provideInput() will never get called more than once per audio channel. The fix is based on the corresponding MultiChannelResampler / SincResampler implementation in Chrome: - https://github.com/chromium/chromium/blob/master/media/base/multi_channel_resampler.cc - https://github.com/chromium/chromium/blob/master/media/base/sinc_resampler.cc Tests: webaudio/audiocontext-large-samplerate.html webaudio/audiocontext-low-samplerate.html * platform/audio/MultiChannelResampler.cpp: (WebCore::MultiChannelResampler::ChannelProvider::setProvider): (WebCore::MultiChannelResampler::ChannelProvider::setCurrentChannel): (WebCore::MultiChannelResampler::process): * platform/audio/MultiChannelResampler.h: * platform/audio/SincResampler.cpp: (WebCore::calculateChunkSize): (WebCore::SincResampler::updateRegions): * platform/audio/SincResampler.h: LayoutTests: Add layout test coverage that would hit assertions in debug. * webaudio/audiocontext-large-samplerate-expected.txt: Added. * webaudio/audiocontext-large-samplerate.html: Added. * webaudio/audiocontext-low-samplerate-expected.txt: Added. * webaudio/audiocontext-low-samplerate.html: Added. Canonical link: https://commits.webkit.org/231850@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@270141 268f45cc-cd09-0410-ab3c-d52691b4dbfc
diff --git a/LayoutTests/ChangeLog b/LayoutTests/ChangeLog
@@ -1,3 +1,17 @@
+2020-11-20  Chris Dumez  <cdumez@apple.com>
+
+        Poor resampling quality when using AudioContext sampleRate parameter
+        https://bugs.webkit.org/show_bug.cgi?id=219201
+
+        Reviewed by Geoff Garen.
+
+        Add layout test coverage that would hit assertions in debug.
+
+        * webaudio/audiocontext-large-samplerate-expected.txt: Added.
+        * webaudio/audiocontext-large-samplerate.html: Added.
+        * webaudio/audiocontext-low-samplerate-expected.txt: Added.
+        * webaudio/audiocontext-low-samplerate.html: Added.
+
 2020-11-20  Chris Dumez  <cdumez@apple.com>
 
         (r267253) [ Mac ] webaudio/AudioParam/audioparam-processing.html is flaky failing
diff --git a/LayoutTests/webaudio/audiocontext-large-samplerate-expected.txt b/LayoutTests/webaudio/audiocontext-large-samplerate-expected.txt
@@ -0,0 +1,11 @@
+Tests that we do not crash when using a very large sample rate
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS context.sampleRate is 384000
+PASS context.state is "running"
+PASS successfullyParsed is true
+
+TEST COMPLETE
+
diff --git a/LayoutTests/webaudio/audiocontext-large-samplerate.html b/LayoutTests/webaudio/audiocontext-large-samplerate.html
@@ -0,0 +1,22 @@
+<!DOCTYPE html>
+<html>
+<script src="../resources/js-test.js"></script>
+<body>
+<script>
+description("Tests that we do not crash when using a very large sample rate");
+jsTestIsAsync = true;
+
+context = new AudioContext({ sampleRate: 384000 });
+shouldBe("context.sampleRate", "384000");
+
+node = new ConstantSourceNode(context, { offset: 0.5 });
+node.connect(context.destination);
+node.start();
+
+setTimeout(() => {
+    shouldBeEqualToString("context.state", "running");
+    finishJSTest();
+}, 100);
+</script>
+</body>
+</html>
diff --git a/LayoutTests/webaudio/audiocontext-low-samplerate-expected.txt b/LayoutTests/webaudio/audiocontext-low-samplerate-expected.txt
@@ -0,0 +1,11 @@
+Tests that we do not crash when using a very low sample rate
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS context.sampleRate is 3000
+PASS context.state is "running"
+PASS successfullyParsed is true
+
+TEST COMPLETE
+
diff --git a/LayoutTests/webaudio/audiocontext-low-samplerate.html b/LayoutTests/webaudio/audiocontext-low-samplerate.html
@@ -0,0 +1,22 @@
+<!DOCTYPE html>
+<html>
+<script src="../resources/js-test.js"></script>
+<body>
+<script>
+description("Tests that we do not crash when using a very low sample rate");
+jsTestIsAsync = true;
+
+context = new AudioContext({ sampleRate: 3000 });
+shouldBe("context.sampleRate", "3000");
+
+node = new ConstantSourceNode(context, { offset: 0.5 });
+node.connect(context.destination);
+node.start();
+
+setTimeout(() => {
+    shouldBeEqualToString("context.state", "running");
+    finishJSTest();
+}, 100);
+</script>
+</body>
+</html>
diff --git a/Source/WebCore/ChangeLog b/Source/WebCore/ChangeLog
@@ -1,3 +1,40 @@
+2020-11-20  Chris Dumez  <cdumez@apple.com>
+
+        Poor resampling quality when using AudioContext sampleRate parameter
+        https://bugs.webkit.org/show_bug.cgi?id=219201
+
+        Reviewed by Geoff Garen.
+
+        MultiChannelResampler uses a SincResampler per audio channel. In MultiChannelResampler::process(),
+        it was calling SincResampler::process() for each channel, which would potentially end up calling
+        MultiChannelResampler::ChannelProvider::provideInput() to provide channel data used for resampling.
+        The issue was that MultiChannelResampler::ChannelProvider::provideInput() is implemented in such
+        a way that things will break if provideInput() gets called more than once per channel. When using
+        an AudioContext's sample rate larger than the hardware sample rate, provideInput() was getting
+        called more than once per channel and this resulted in very poor resampling quality.
+
+        To address the issue, MultiChannelResampler::process() now processes the data in chunks that
+        are small enough to guarantee that MultiChannelResampler::ChannelProvider::provideInput() will
+        never get called more than once per audio channel.
+
+        The fix is based on the corresponding MultiChannelResampler / SincResampler implementation in
+        Chrome:
+        - https://github.com/chromium/chromium/blob/master/media/base/multi_channel_resampler.cc
+        - https://github.com/chromium/chromium/blob/master/media/base/sinc_resampler.cc
+
+        Tests: webaudio/audiocontext-large-samplerate.html
+               webaudio/audiocontext-low-samplerate.html
+
+        * platform/audio/MultiChannelResampler.cpp:
+        (WebCore::MultiChannelResampler::ChannelProvider::setProvider):
+        (WebCore::MultiChannelResampler::ChannelProvider::setCurrentChannel):
+        (WebCore::MultiChannelResampler::process):
+        * platform/audio/MultiChannelResampler.h:
+        * platform/audio/SincResampler.cpp:
+        (WebCore::calculateChunkSize):
+        (WebCore::SincResampler::updateRegions):
+        * platform/audio/SincResampler.h:
+
 2020-11-20  Kate Cheney  <katherine_cheney@apple.com>
 
         PCM: Persist pending ad clicks and attributions so they can survive browser restart
diff --git a/Source/WebCore/platform/audio/MultiChannelResampler.cpp b/Source/WebCore/platform/audio/MultiChannelResampler.cpp
@@ -33,62 +33,59 @@
 #include "MultiChannelResampler.h"
 
 #include "AudioBus.h"
+#include "AudioUtilities.h"
 
 namespace WebCore {
 
 // ChannelProvider provides a single channel of audio data (one channel at a time) for each channel
 // of data provided to us in a multi-channel provider.
-class MultiChannelResampler::ChannelProvider : public AudioSourceProvider {
+class MultiChannelResampler::ChannelProvider {
     WTF_MAKE_FAST_ALLOCATED;
 public:
     explicit ChannelProvider(unsigned numberOfChannels)
-        : m_numberOfChannels(numberOfChannels)
+        : m_multiChannelBus(AudioBus::create(numberOfChannels, AudioUtilities::renderQuantumSize))
     {
     }
 
     void setProvider(AudioSourceProvider* multiChannelProvider)
     {
-        m_currentChannel = 0;
-        m_framesToProcess = 0;
         m_multiChannelProvider = multiChannelProvider;
     }
 
     // provideInput() will be called once for each channel, starting with the first channel.
     // Each time it's called, it will provide the next channel of data.
-    void provideInput(AudioBus* bus, size_t framesToProcess) override
+    void provideInputForChannel(AudioBus* bus, size_t framesToProcess, unsigned channelIndex)
     {
+        ASSERT(channelIndex < m_multiChannelBus->numberOfChannels());
+        ASSERT(framesToProcess <= AudioUtilities::renderQuantumSize);
+        if (framesToProcess > AudioUtilities::renderQuantumSize)
+            return;
+
         bool isBusGood = bus && bus->numberOfChannels() == 1;
         ASSERT(isBusGood);
         if (!isBusGood)
             return;
 
         // Get the data from the multi-channel provider when the first channel asks for it.
         // For subsequent channels, we can just dish out the channel data from that (stored in m_multiChannelBus).
-        if (!m_currentChannel) {
+        if (!channelIndex) {
             m_framesToProcess = framesToProcess;
-            m_multiChannelBus = AudioBus::create(m_numberOfChannels, framesToProcess);
             m_multiChannelProvider->provideInput(m_multiChannelBus.get(), framesToProcess);
         }
 
         // All channels must ask for the same amount. This should always be the case, but let's just make sure.
-        bool isGood = m_multiChannelBus.get() && framesToProcess == m_framesToProcess;
+        bool isGood = framesToProcess == m_framesToProcess;
         ASSERT(isGood);
         if (!isGood)
             return;
 
         // Copy the channel data from what we received from m_multiChannelProvider.
-        ASSERT(m_currentChannel <= m_numberOfChannels);
-        if (m_currentChannel < m_numberOfChannels) {
-            memcpy(bus->channel(0)->mutableData(), m_multiChannelBus->channel(m_currentChannel)->data(), sizeof(float) * framesToProcess);
-            ++m_currentChannel;
-        }
+        memcpy(bus->channel(0)->mutableData(), m_multiChannelBus->channel(channelIndex)->data(), sizeof(float) * framesToProcess);
     }
 
 private:
     AudioSourceProvider* m_multiChannelProvider { nullptr };
     RefPtr<AudioBus> m_multiChannelBus;
-    unsigned m_numberOfChannels { 0 };
-    unsigned m_currentChannel { 0 };
     size_t m_framesToProcess { 0 }; // Used to verify that all channels ask for the same amount.
 };
 
@@ -112,14 +109,25 @@ void MultiChannelResampler::process(AudioSourceProvider* provider, AudioBus* des
     // channelProvider wraps the original multi-channel provider and dishes out one channel at a time.
     m_channelProvider->setProvider(provider);
 
-    for (unsigned channelIndex = 0; channelIndex < m_numberOfChannels; ++channelIndex) {
-        // Depending on the sample-rate scale factor, and the internal buffering used in a SincResampler
-        // kernel, this call to process() will only sometimes call provideInput() on the channelProvider.
-        // However, if it calls provideInput() for the first channel, then it will call it for the remaining
-        // channels, since they all buffer in the same way and are processing the same number of frames.
-        m_kernels[channelIndex]->process(m_channelProvider.get(),
-                                         destination->channel(channelIndex)->mutableData(),
-                                         framesToProcess);
+    // We need to ensure that SincResampler only calls provideInput() once for each channel or it will confuse the logic
+    // inside ChannelProvider. To ensure this, we chunk the number of requested frames into SincResampler::chunkSize()
+    // sized chunks. SincResampler guarantees it will only call provideInput() once once we resample this way.
+    m_outputFramesReady = 0;
+    while (m_outputFramesReady < framesToProcess) {
+        size_t chunkSize = m_kernels[0]->chunkSize();
+        size_t framesThisTime = std::min(framesToProcess - m_outputFramesReady, chunkSize);
+
+        for (unsigned channelIndex = 0; channelIndex < m_numberOfChannels; ++channelIndex) {
+            ASSERT(chunkSize == m_kernels[channelIndex]->chunkSize());
+            bool wasProvideInputCalled = false;
+            m_kernels[channelIndex]->process(destination->channel(channelIndex)->mutableData() + m_outputFramesReady, framesThisTime, [this, channelIndex, &wasProvideInputCalled](AudioBus* bus, size_t framesToProcess) {
+                ASSERT_WITH_MESSAGE(!wasProvideInputCalled, "provideInputForChannel should only be called once");
+                wasProvideInputCalled = true;
+                m_channelProvider->provideInputForChannel(bus, framesToProcess, channelIndex);
+            });
+        }
+
+        m_outputFramesReady += framesThisTime;
     }
 
     m_channelProvider->setProvider(nullptr);
diff --git a/Source/WebCore/platform/audio/MultiChannelResampler.h b/Source/WebCore/platform/audio/MultiChannelResampler.h
@@ -59,6 +59,7 @@ class MultiChannelResampler final {
 
     class ChannelProvider;
     std::unique_ptr<ChannelProvider> m_channelProvider;
+    size_t m_outputFramesReady { 0 };
 };
 
 } // namespace WebCore
diff --git a/Source/WebCore/platform/audio/SincResampler.cpp b/Source/WebCore/platform/audio/SincResampler.cpp
@@ -115,6 +115,11 @@ constexpr unsigned defaultRequestFrames { 512 };
 constexpr unsigned kernelSize { 32 };
 constexpr unsigned numberOfKernelOffsets { 32 };
 
+static size_t calculateChunkSize(unsigned blockSize, double scaleFactor)
+{
+    return blockSize / scaleFactor;
+}
+
 SincResampler::SincResampler(double scaleFactor, Optional<unsigned> requestFrames)
     : m_scaleFactor(scaleFactor)
     , m_kernelStorage(kernelSize * (numberOfKernelOffsets + 1))
@@ -137,6 +142,7 @@ void SincResampler::updateRegions(bool isSecondLoad)
     m_r3 = m_r0 + m_requestFrames - kernelSize;
     m_r4 = m_r0 + m_requestFrames - kernelSize / 2;
     m_blockSize = m_r4 - m_r2;
+    m_chunkSize = calculateChunkSize(m_blockSize, m_scaleFactor);
 
     // m_r1 at the beginning of the buffer.
     ASSERT(m_r1 == m_inputBuffer.data());
@@ -187,10 +193,10 @@ void SincResampler::initializeKernel()
     }
 }
 
-void SincResampler::consumeSource(float* buffer, unsigned numberOfSourceFrames)
+void SincResampler::consumeSource(float* buffer, unsigned numberOfSourceFrames, const Function<void(AudioBus* bus, size_t framesToProcess)>& provideInput)
 {
-    ASSERT(m_sourceProvider);
-    if (!m_sourceProvider)
+    ASSERT(provideInput);
+    if (!provideInput)
         return;
 
     // Wrap the provided buffer by an AudioBus for use by the source provider.
@@ -200,14 +206,14 @@ void SincResampler::consumeSource(float* buffer, unsigned numberOfSourceFrames)
     // FIXME: Find a way to make the following const-correct:
     m_internalBus->setChannelMemory(0, buffer, numberOfSourceFrames);
     
-    m_sourceProvider->provideInput(m_internalBus.get(), numberOfSourceFrames);
+    provideInput(m_internalBus.get(), numberOfSourceFrames);
 }
 
 namespace {
 
-// BufferSourceProvider is an AudioSourceProvider wrapping an in-memory buffer.
+// BufferSourceProvider is an audio source provider wrapping an in-memory buffer.
 
-class BufferSourceProvider : public AudioSourceProvider {
+class BufferSourceProvider {
 public:
     explicit BufferSourceProvider(const float* source, size_t numberOfSourceFrames)
         : m_source(source)
@@ -216,7 +222,7 @@ class BufferSourceProvider : public AudioSourceProvider {
     }
     
     // Consumes samples from the in-memory buffer.
-    void provideInput(AudioBus* bus, size_t framesToProcess) override
+    void provideInput(AudioBus* bus, size_t framesToProcess)
     {
         ASSERT(m_source && bus);
         if (!m_source || !bus)
@@ -253,27 +259,27 @@ void SincResampler::process(const float* source, float* destination, unsigned nu
     
     while (remaining) {
         unsigned framesThisTime = std::min(remaining, m_requestFrames);
-        process(&sourceProvider, destination, framesThisTime);
+        process(destination, framesThisTime, [&sourceProvider](AudioBus* bus, size_t framesToProcess) {
+            sourceProvider.provideInput(bus, framesToProcess);
+        });
         
         destination += framesThisTime;
         remaining -= framesThisTime;
     }
 }
 
-void SincResampler::process(AudioSourceProvider* sourceProvider, float* destination, size_t framesToProcess)
+void SincResampler::process(float* destination, size_t framesToProcess, const Function<void(AudioBus* bus, size_t framesToProcess)>& provideInput)
 {
-    ASSERT(sourceProvider);
-    if (!sourceProvider)
+    ASSERT(provideInput);
+    if (!provideInput)
         return;
-    
-    m_sourceProvider = sourceProvider;
 
     unsigned numberOfDestinationFrames = framesToProcess;
 
     // Step (1)
     // Prime the input buffer at the start of the input stream.
     if (!m_isBufferPrimed) {
-        consumeSource(m_r0, m_requestFrames);
+        consumeSource(m_r0, m_requestFrames, provideInput);
         m_isBufferPrimed = true;
     }
     
@@ -521,7 +527,7 @@ void SincResampler::process(AudioSourceProvider* sourceProvider, float* destinat
 
         // Step (5)
         // Refresh the buffer with more input.
-        consumeSource(m_r0, m_requestFrames);
+        consumeSource(m_r0, m_requestFrames, provideInput);
     }
 }
 
diff --git a/Source/WebCore/platform/audio/SincResampler.h b/Source/WebCore/platform/audio/SincResampler.h
@@ -45,15 +45,17 @@ class SincResampler final {
     // requestFrames controls the size in frames of the buffer requested by each provideInput() call.
     SincResampler(double scaleFactor, Optional<unsigned> requestFrames = WTF::nullopt);
     
+    size_t chunkSize() const { return m_chunkSize; }
+
     // Processes numberOfSourceFrames from source to produce numberOfSourceFrames / scaleFactor frames in destination.
     void process(const float* source, float* destination, unsigned numberOfSourceFrames);
 
-    // Process with input source callback function for streaming applications.
-    void process(AudioSourceProvider*, float* destination, size_t framesToProcess);
+    // Process with provideInput callback function for streaming applications.
+    void process(float* destination, size_t framesToProcess, const Function<void(AudioBus* bus, size_t framesToProcess)>& provideInput);
 
 protected:
     void initializeKernel();
-    void consumeSource(float* buffer, unsigned numberOfSourceFrames);
+    void consumeSource(float* buffer, unsigned numberOfSourceFrames, const Function<void(AudioBus* bus, size_t framesToProcess)>& provideInput);
     void updateRegions(bool isSecondLoad);
     
     double m_scaleFactor;
@@ -72,6 +74,8 @@ class SincResampler final {
     // The number of source frames processed per pass.
     unsigned m_blockSize { 0 };
 
+    size_t m_chunkSize { 0 };
+
     // Source is copied into this buffer for each processing pass.
     AudioFloatArray m_inputBuffer;
 
@@ -85,9 +89,6 @@ class SincResampler final {
 
     const float* m_source { nullptr };
     unsigned m_sourceFramesAvailable { 0 };
-    
-    // m_sourceProvider is used to provide the audio input stream to the resampler.
-    AudioSourceProvider* m_sourceProvider { nullptr };
 
     // The buffer is primed once at the very beginning of processing.
     bool m_isBufferPrimed { false };