Poor resampling quality when using AudioContext sampleRate parameter

cdumez · cdumez · commit 1c7c49a5c555 · 2020-11-22T04:51:13.000Z
https://bugs.webkit.org/show_bug.cgi?id=219201

Reviewed by Geoff Garen.

Source/WebCore:

MultiChannelResampler uses a SincResampler per audio channel. In MultiChannelResampler::process(),
it was calling SincResampler::process() for each channel, which would potentially end up calling
MultiChannelResampler::ChannelProvider::provideInput() to provide channel data used for resampling.
The issue was that MultiChannelResampler::ChannelProvider::provideInput() is implemented in such
a way that things will break if provideInput() gets called more than once per channel. When using
an AudioContext's sample rate larger than the hardware sample rate, provideInput() was getting
called more than once per channel and this resulted in very poor resampling quality.

To address the issue, MultiChannelResampler::process() now processes the data in chunks that
are small enough to guarantee that MultiChannelResampler::ChannelProvider::provideInput() will
never get called more than once per audio channel.

The fix is based on the corresponding MultiChannelResampler / SincResampler implementation in
Chrome:
- https://github.com/chromium/chromium/blob/master/media/base/multi_channel_resampler.cc
- https://github.com/chromium/chromium/blob/master/media/base/sinc_resampler.cc

Tests: webaudio/audiocontext-large-samplerate.html
       webaudio/audiocontext-low-samplerate.html

* platform/audio/MultiChannelResampler.cpp:
(WebCore::MultiChannelResampler::ChannelProvider::setProvider):
(WebCore::MultiChannelResampler::ChannelProvider::setCurrentChannel):
(WebCore::MultiChannelResampler::process):
* platform/audio/MultiChannelResampler.h:
* platform/audio/SincResampler.cpp:
(WebCore::calculateChunkSize):
(WebCore::SincResampler::updateRegions):
* platform/audio/SincResampler.h:

LayoutTests:

Add layout test coverage that would hit assertions in debug.

* webaudio/audiocontext-large-samplerate-expected.txt: Added.
* webaudio/audiocontext-large-samplerate.html: Added.
* webaudio/audiocontext-low-samplerate-expected.txt: Added.
* webaudio/audiocontext-low-samplerate.html: Added.

Canonical link: https://commits.webkit.org/231866@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@270157 268f45cc-cd09-0410-ab3c-d52691b4dbfc
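
The chunking described in the commit message can be illustrated in isolation. The following is a minimal sketch, not the WebKit implementation: calculateChunkSize() mirrors the helper added in SincResampler.cpp, while splitIntoChunks() is a hypothetical helper that only records the frame counts the new loop in MultiChannelResampler::process() would hand to each channel's resampler. The block size is passed in as a plain number here, and the guard against a zero chunk size is an addition of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of output frames that one block of input frames can produce.
// Truncates toward zero, like the helper added by the patch.
static size_t calculateChunkSize(unsigned blockSize, double scaleFactor)
{
    assert(scaleFactor > 0);
    return static_cast<size_t>(blockSize / scaleFactor);
}

// Hypothetical helper: returns the frame count of each loop iteration when a
// request for framesToProcess output frames is split into chunks of at most
// chunkSize frames.
static std::vector<size_t> splitIntoChunks(size_t framesToProcess, size_t chunkSize)
{
    std::vector<size_t> chunks;
    if (!chunkSize)
        return chunks; // Sketch-only guard: a zero chunk size would never make progress.

    size_t outputFramesReady = 0;
    while (outputFramesReady < framesToProcess) {
        size_t framesThisTime = std::min(framesToProcess - outputFramesReady, chunkSize);
        chunks.push_back(framesThisTime);
        outputFramesReady += framesThisTime;
    }
    return chunks;
}
```

For example, with a chunk size of 56 frames, a request for 128 output frames is split into iterations of 56, 56 and 16 frames. Every channel processes the same chunk before the loop advances, so each channel asks for input at most once per iteration.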
diff --git a/LayoutTests/ChangeLog b/LayoutTests/ChangeLog
@@ -1,3 +1,17 @@
+2020-11-21  Chris Dumez  <cdumez@apple.com>
+
+        Poor resampling quality when using AudioContext sampleRate parameter
+        https://bugs.webkit.org/show_bug.cgi?id=219201
+
+        Reviewed by Geoff Garen.
+
+        Add layout test coverage that would hit assertions in debug.
+
+        * webaudio/audiocontext-large-samplerate-expected.txt: Added.
+        * webaudio/audiocontext-large-samplerate.html: Added.
+        * webaudio/audiocontext-low-samplerate-expected.txt: Added.
+        * webaudio/audiocontext-low-samplerate.html: Added.
+
 2020-11-21  Chris Dumez  <cdumez@apple.com>
 
         Unreviewed, reverting r270141.
diff --git a/LayoutTests/webaudio/audiocontext-large-samplerate-expected.txt b/LayoutTests/webaudio/audiocontext-large-samplerate-expected.txt
@@ -0,0 +1,11 @@
+Tests that we do not crash when using a very large sample rate
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS context.sampleRate is 384000
+PASS context.state is "running"
+PASS successfullyParsed is true
+
+TEST COMPLETE
+
diff --git a/LayoutTests/webaudio/audiocontext-large-samplerate.html b/LayoutTests/webaudio/audiocontext-large-samplerate.html
@@ -0,0 +1,22 @@
+<!DOCTYPE html>
+<html>
+<script src="../resources/js-test.js"></script>
+<body>
+<script>
+description("Tests that we do not crash when using a very large sample rate");
+jsTestIsAsync = true;
+
+context = new AudioContext({ sampleRate: 384000 });
+shouldBe("context.sampleRate", "384000");
+
+node = new ConstantSourceNode(context, { offset: 0.5 });
+node.connect(context.destination);
+node.start();
+
+setTimeout(() => {
+    shouldBeEqualToString("context.state", "running");
+    finishJSTest();
+}, 100);
+</script>
+</body>
+</html>
diff --git a/LayoutTests/webaudio/audiocontext-low-samplerate-expected.txt b/LayoutTests/webaudio/audiocontext-low-samplerate-expected.txt
@@ -0,0 +1,11 @@
+Tests that we do not crash when using a very low sample rate
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS context.sampleRate is 3000
+PASS context.state is "running"
+PASS successfullyParsed is true
+
+TEST COMPLETE
+
diff --git a/LayoutTests/webaudio/audiocontext-low-samplerate.html b/LayoutTests/webaudio/audiocontext-low-samplerate.html
@@ -0,0 +1,22 @@
+<!DOCTYPE html>
+<html>
+<script src="../resources/js-test.js"></script>
+<body>
+<script>
+description("Tests that we do not crash when using a very low sample rate");
+jsTestIsAsync = true;
+
+context = new AudioContext({ sampleRate: 3000 });
+shouldBe("context.sampleRate", "3000");
+
+node = new ConstantSourceNode(context, { offset: 0.5 });
+node.connect(context.destination);
+node.start();
+
+setTimeout(() => {
+    shouldBeEqualToString("context.state", "running");
+    finishJSTest();
+}, 100);
+</script>
+</body>
+</html>
diff --git a/Source/WebCore/ChangeLog b/Source/WebCore/ChangeLog
@@ -1,3 +1,40 @@
+2020-11-21  Chris Dumez  <cdumez@apple.com>
+
+        Poor resampling quality when using AudioContext sampleRate parameter
+        https://bugs.webkit.org/show_bug.cgi?id=219201
+
+        Reviewed by Geoff Garen.
+
+        MultiChannelResampler uses a SincResampler per audio channel. In MultiChannelResampler::process(),
+        it was calling SincResampler::process() for each channel, which would potentially end up calling
+        MultiChannelResampler::ChannelProvider::provideInput() to provide channel data used for resampling.
+        The issue was that MultiChannelResampler::ChannelProvider::provideInput() is implemented in such
+        a way that things will break if provideInput() gets called more than once per channel. When using
+        an AudioContext's sample rate larger than the hardware sample rate, provideInput() was getting
+        called more than once per channel and this resulted in very poor resampling quality.
+
+        To address the issue, MultiChannelResampler::process() now processes the data in chunks that
+        are small enough to guarantee that MultiChannelResampler::ChannelProvider::provideInput() will
+        never get called more than once per audio channel.
+
+        The fix is based on the corresponding MultiChannelResampler / SincResampler implementation in
+        Chrome:
+        - https://github.com/chromium/chromium/blob/master/media/base/multi_channel_resampler.cc
+        - https://github.com/chromium/chromium/blob/master/media/base/sinc_resampler.cc
+
+        Tests: webaudio/audiocontext-large-samplerate.html
+               webaudio/audiocontext-low-samplerate.html
+
+        * platform/audio/MultiChannelResampler.cpp:
+        (WebCore::MultiChannelResampler::ChannelProvider::setProvider):
+        (WebCore::MultiChannelResampler::ChannelProvider::setCurrentChannel):
+        (WebCore::MultiChannelResampler::process):
+        * platform/audio/MultiChannelResampler.h:
+        * platform/audio/SincResampler.cpp:
+        (WebCore::calculateChunkSize):
+        (WebCore::SincResampler::updateRegions):
+        * platform/audio/SincResampler.h:
+
 2020-11-21  Simon Fraser  <simon.fraser@apple.com>
 
         Propagate the 'wheelEventGesturesBecomeNonBlocking' setting to the ScrollingTree
diff --git a/Source/WebCore/platform/audio/MultiChannelResampler.cpp b/Source/WebCore/platform/audio/MultiChannelResampler.cpp
@@ -38,63 +38,59 @@ namespace WebCore {
 
 // ChannelProvider provides a single channel of audio data (one channel at a time) for each channel
 // of data provided to us in a multi-channel provider.
-class MultiChannelResampler::ChannelProvider : public AudioSourceProvider {
+class MultiChannelResampler::ChannelProvider {
     WTF_MAKE_FAST_ALLOCATED;
 public:
-    explicit ChannelProvider(unsigned numberOfChannels)
-        : m_numberOfChannels(numberOfChannels)
+    explicit ChannelProvider(unsigned numberOfChannels, unsigned requestFrames)
+        : m_multiChannelBus(AudioBus::create(numberOfChannels, requestFrames))
     {
     }
 
     void setProvider(AudioSourceProvider* multiChannelProvider)
     {
-        m_currentChannel = 0;
-        m_framesToProcess = 0;
         m_multiChannelProvider = multiChannelProvider;
     }
 
     // provideInput() will be called once for each channel, starting with the first channel.
     // Each time it's called, it will provide the next channel of data.
-    void provideInput(AudioBus* bus, size_t framesToProcess) override
+    void provideInputForChannel(AudioBus* bus, size_t framesToProcess, unsigned channelIndex)
     {
+        ASSERT(channelIndex < m_multiChannelBus->numberOfChannels());
+        ASSERT(framesToProcess <= m_multiChannelBus->length());
+        if (framesToProcess > m_multiChannelBus->length())
+            return;
+
         bool isBusGood = bus && bus->numberOfChannels() == 1;
         ASSERT(isBusGood);
         if (!isBusGood)
             return;
 
         // Get the data from the multi-channel provider when the first channel asks for it.
         // For subsequent channels, we can just dish out the channel data from that (stored in m_multiChannelBus).
-        if (!m_currentChannel) {
+        if (!channelIndex) {
             m_framesToProcess = framesToProcess;
-            m_multiChannelBus = AudioBus::create(m_numberOfChannels, framesToProcess);
             m_multiChannelProvider->provideInput(m_multiChannelBus.get(), framesToProcess);
         }
 
         // All channels must ask for the same amount. This should always be the case, but let's just make sure.
-        bool isGood = m_multiChannelBus.get() && framesToProcess == m_framesToProcess;
+        bool isGood = framesToProcess == m_framesToProcess;
         ASSERT(isGood);
         if (!isGood)
             return;
 
         // Copy the channel data from what we received from m_multiChannelProvider.
-        ASSERT(m_currentChannel <= m_numberOfChannels);
-        if (m_currentChannel < m_numberOfChannels) {
-            memcpy(bus->channel(0)->mutableData(), m_multiChannelBus->channel(m_currentChannel)->data(), sizeof(float) * framesToProcess);
-            ++m_currentChannel;
-        }
+        memcpy(bus->channel(0)->mutableData(), m_multiChannelBus->channel(channelIndex)->data(), sizeof(float) * framesToProcess);
     }
 
 private:
     AudioSourceProvider* m_multiChannelProvider { nullptr };
     RefPtr<AudioBus> m_multiChannelBus;
-    unsigned m_numberOfChannels { 0 };
-    unsigned m_currentChannel { 0 };
     size_t m_framesToProcess { 0 }; // Used to verify that all channels ask for the same amount.
 };
 
-MultiChannelResampler::MultiChannelResampler(double scaleFactor, unsigned numberOfChannels, Optional<unsigned> requestFrames)
+MultiChannelResampler::MultiChannelResampler(double scaleFactor, unsigned numberOfChannels, unsigned requestFrames)
     : m_numberOfChannels(numberOfChannels)
-    , m_channelProvider(makeUnique<ChannelProvider>(m_numberOfChannels))
+    , m_channelProvider(makeUnique<ChannelProvider>(m_numberOfChannels, requestFrames))
 {
     // Create each channel's resampler.
     for (unsigned channelIndex = 0; channelIndex < numberOfChannels; ++channelIndex)
@@ -112,14 +108,25 @@ void MultiChannelResampler::process(AudioSourceProvider* provider, AudioBus* des
     // channelProvider wraps the original multi-channel provider and dishes out one channel at a time.
     m_channelProvider->setProvider(provider);
 
-    for (unsigned channelIndex = 0; channelIndex < m_numberOfChannels; ++channelIndex) {
-        // Depending on the sample-rate scale factor, and the internal buffering used in a SincResampler
-        // kernel, this call to process() will only sometimes call provideInput() on the channelProvider.
-        // However, if it calls provideInput() for the first channel, then it will call it for the remaining
-        // channels, since they all buffer in the same way and are processing the same number of frames.
-        m_kernels[channelIndex]->process(m_channelProvider.get(),
-                                         destination->channel(channelIndex)->mutableData(),
-                                         framesToProcess);
+    // We need to ensure that SincResampler only calls provideInput() once for each channel or it will confuse the logic
+    // inside ChannelProvider. To ensure this, we chunk the number of requested frames into SincResampler::chunkSize()
+    // sized chunks. SincResampler guarantees it will only call provideInput() once when we resample this way.
+    m_outputFramesReady = 0;
+    while (m_outputFramesReady < framesToProcess) {
+        size_t chunkSize = m_kernels[0]->chunkSize();
+        size_t framesThisTime = std::min(framesToProcess - m_outputFramesReady, chunkSize);
+
+        for (unsigned channelIndex = 0; channelIndex < m_numberOfChannels; ++channelIndex) {
+            ASSERT(chunkSize == m_kernels[channelIndex]->chunkSize());
+            bool wasProvideInputCalled = false;
+            m_kernels[channelIndex]->process(destination->channel(channelIndex)->mutableData() + m_outputFramesReady, framesThisTime, [this, channelIndex, &wasProvideInputCalled](AudioBus* bus, size_t framesToProcess) {
+                ASSERT_WITH_MESSAGE(!wasProvideInputCalled, "provideInputForChannel should only be called once");
+                wasProvideInputCalled = true;
+                m_channelProvider->provideInputForChannel(bus, framesToProcess, channelIndex);
+            });
+        }
+
+        m_outputFramesReady += framesThisTime;
     }
 
     m_channelProvider->setProvider(nullptr);
diff --git a/Source/WebCore/platform/audio/MultiChannelResampler.h b/Source/WebCore/platform/audio/MultiChannelResampler.h
@@ -41,7 +41,7 @@ class MultiChannelResampler final {
     WTF_MAKE_FAST_ALLOCATED;
 public:   
     // requestFrames controls the size of the buffer in frames when AudioSourceProvider::provideInput() is called.
-    explicit MultiChannelResampler(double scaleFactor, unsigned numberOfChannels, Optional<unsigned> requestFrames = WTF::nullopt);
+    explicit MultiChannelResampler(double scaleFactor, unsigned numberOfChannels, unsigned requestFrames = SincResampler::defaultRequestFrames);
     ~MultiChannelResampler();
 
     // Process given AudioSourceProvider for streaming applications.
@@ -59,6 +59,7 @@ class MultiChannelResampler final {
 
     class ChannelProvider;
     std::unique_ptr<ChannelProvider> m_channelProvider;
+    size_t m_outputFramesReady { 0 };
 };
 
 } // namespace WebCore
diff --git a/Source/WebCore/platform/audio/SincResampler.cpp b/Source/WebCore/platform/audio/SincResampler.cpp
@@ -111,14 +111,18 @@
 
 namespace WebCore {
 
-constexpr unsigned defaultRequestFrames { 512 };
 constexpr unsigned kernelSize { 32 };
 constexpr unsigned numberOfKernelOffsets { 32 };
 
-SincResampler::SincResampler(double scaleFactor, Optional<unsigned> requestFrames)
+static size_t calculateChunkSize(unsigned blockSize, double scaleFactor)
+{
+    return blockSize / scaleFactor;
+}
+
+SincResampler::SincResampler(double scaleFactor, unsigned requestFrames)
     : m_scaleFactor(scaleFactor)
     , m_kernelStorage(kernelSize * (numberOfKernelOffsets + 1))
-    , m_requestFrames(requestFrames.valueOr(defaultRequestFrames))
+    , m_requestFrames(requestFrames)
     , m_inputBuffer(m_requestFrames + kernelSize) // See input buffer layout above.
     , m_r1(m_inputBuffer.data())
     , m_r2(m_inputBuffer.data() + kernelSize / 2)
@@ -137,6 +141,7 @@ void SincResampler::updateRegions(bool isSecondLoad)
     m_r3 = m_r0 + m_requestFrames - kernelSize;
     m_r4 = m_r0 + m_requestFrames - kernelSize / 2;
     m_blockSize = m_r4 - m_r2;
+    m_chunkSize = calculateChunkSize(m_blockSize, m_scaleFactor);
 
     // m_r1 at the beginning of the buffer.
     ASSERT(m_r1 == m_inputBuffer.data());
@@ -187,10 +192,10 @@ void SincResampler::initializeKernel()
     }
 }
 
-void SincResampler::consumeSource(float* buffer, unsigned numberOfSourceFrames)
+void SincResampler::consumeSource(float* buffer, unsigned numberOfSourceFrames, const Function<void(AudioBus* bus, size_t framesToProcess)>& provideInput)
 {
-    ASSERT(m_sourceProvider);
-    if (!m_sourceProvider)
+    ASSERT(provideInput);
+    if (!provideInput)
         return;
 
     // Wrap the provided buffer by an AudioBus for use by the source provider.
@@ -200,14 +205,14 @@ void SincResampler::consumeSource(float* buffer, unsigned numberOfSourceFrames)
     // FIXME: Find a way to make the following const-correct:
     m_internalBus->setChannelMemory(0, buffer, numberOfSourceFrames);
     
-    m_sourceProvider->provideInput(m_internalBus.get(), numberOfSourceFrames);
+    provideInput(m_internalBus.get(), numberOfSourceFrames);
 }
 
 namespace {
 
-// BufferSourceProvider is an AudioSourceProvider wrapping an in-memory buffer.
+// BufferSourceProvider is an audio source provider wrapping an in-memory buffer.
 
-class BufferSourceProvider : public AudioSourceProvider {
+class BufferSourceProvider {
 public:
     explicit BufferSourceProvider(const float* source, size_t numberOfSourceFrames)
         : m_source(source)
@@ -216,7 +221,7 @@ class BufferSourceProvider : public AudioSourceProvider {
     }
     
     // Consumes samples from the in-memory buffer.
-    void provideInput(AudioBus* bus, size_t framesToProcess) override
+    void provideInput(AudioBus* bus, size_t framesToProcess)
     {
         ASSERT(m_source && bus);
         if (!m_source || !bus)
@@ -253,27 +258,27 @@ void SincResampler::process(const float* source, float* destination, unsigned nu
     
     while (remaining) {
         unsigned framesThisTime = std::min(remaining, m_requestFrames);
-        process(&sourceProvider, destination, framesThisTime);
+        process(destination, framesThisTime, [&sourceProvider](AudioBus* bus, size_t framesToProcess) {
+            sourceProvider.provideInput(bus, framesToProcess);
+        });
         
         destination += framesThisTime;
         remaining -= framesThisTime;
     }
 }
 
-void SincResampler::process(AudioSourceProvider* sourceProvider, float* destination, size_t framesToProcess)
+void SincResampler::process(float* destination, size_t framesToProcess, const Function<void(AudioBus* bus, size_t framesToProcess)>& provideInput)
 {
-    ASSERT(sourceProvider);
-    if (!sourceProvider)
+    ASSERT(provideInput);
+    if (!provideInput)
         return;
-    
-    m_sourceProvider = sourceProvider;
 
     unsigned numberOfDestinationFrames = framesToProcess;
 
     // Step (1)
     // Prime the input buffer at the start of the input stream.
     if (!m_isBufferPrimed) {
-        consumeSource(m_r0, m_requestFrames);
+        consumeSource(m_r0, m_requestFrames, provideInput);
         m_isBufferPrimed = true;
     }
     
@@ -521,7 +526,7 @@ void SincResampler::process(AudioSourceProvider* sourceProvider, float* destinat
 
         // Step (5)
         // Refresh the buffer with more input.
-        consumeSource(m_r0, m_requestFrames);
+        consumeSource(m_r0, m_requestFrames, provideInput);
     }
 }
 
diff --git a/Source/WebCore/platform/audio/SincResampler.h b/Source/WebCore/platform/audio/SincResampler.h