ios - 几分钟后 SpeechRecognizer 失败

coder 2024-01-28 原文

我正在开发一个使用 SFSpeechRecognizer 的 iOS 项目，它在开始时运行良好。我说了一些话，它就回应了。但是一两分钟后，它就失败了。它不提供任何认可结果的反馈。我想知道这是否与缓冲区有关，但我不知道如何解决。

我基本上是用SpeechRecognizer的demo来搭建工程的。不同的是我把识别出来的结果一个字一个字的存储在一个数组中。程序会分析数组并响应某些单词，例如“播放”或先前设置的其他一些命令。程序响应命令后，删除该数组元素。

话不多说，代码如下:

识别器，可以看到supportedCommands数组过滤了一些特定的词让程序响应。其他部分与https://developer.apple.com/library/content/samplecode/SpeakToMe/Listings/SpeakToMe_ViewController_swift.html#//apple_ref/doc/uid/TP40017110-SpeakToMe_ViewController_swift-DontLinkElementID_6的demo类似。

class SpeechRecognizer: NSObject, SFSpeechRecognizerDelegate {

    private var speechRecognizer: SFSpeechRecognizer!
    private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest!
    private var recognitionTask: SFSpeechRecognitionTask!
    private let audioEngine = AVAudioEngine()
    private let locale = Locale(identifier: "en-US")

    private var lastSavedString: String = ""
    private let supportedCommands = ["more", "play"]

    var speechInputQueue: [String] = [String]()

    func load() {
        print("load")
        prepareRecognizer(locale: locale)

        authorize()
    }

    func start() {
        print("start")
        if !audioEngine.isRunning {
            try! startRecording()
        }
    }

    func stop() {
        if audioEngine.isRunning {
            audioEngine.stop()
            recognitionRequest?.endAudio()

        }
    }

    private func authorize() {
        SFSpeechRecognizer.requestAuthorization { authStatus in
            OperationQueue.main.addOperation {
                switch authStatus {
                case .authorized:
                    print("Authorized!")
                case .denied:
                    print("Unauthorized!")
                case .restricted:
                    print("Unauthorized!")
                case .notDetermined:
                    print("Unauthorized!")
                }
            }
        }
    }

    private func prepareRecognizer(locale: Locale) {
        speechRecognizer = SFSpeechRecognizer(locale: locale)!
        speechRecognizer.delegate = self
    }

    private func startRecording() throws {

        // Cancel the previous task if it's running.
        if let recognitionTask = recognitionTask {
            recognitionTask.cancel()
            self.recognitionTask = nil
        }

        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord, with: .defaultToSpeaker)
        try audioSession.setMode(AVAudioSessionModeDefault)
        try audioSession.setActive(true, with: .notifyOthersOnDeactivation)

        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()

        let inputNode = audioEngine.inputNode
        guard let recognitionRequest = recognitionRequest else { fatalError("Unable to created a SFSpeechAudioBufferRecognitionRequest object") }

        // Configure request so that results are returned before audio recording is finished
        recognitionRequest.shouldReportPartialResults = true

        // A recognition task represents a speech recognition session.
        // We keep a reference to the task so that it can be cancelled.
        recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
            var isFinal = false

            if let result = result {

                let temp = result.bestTranscription.formattedString.trimmingCharacters(in: CharacterSet.whitespacesAndNewlines).lowercased()
                //print("temp", temp)
                if temp != self.lastSavedString && temp.count > self.lastSavedString.count {

                    var tempSplit = temp.split(separator: " ")
                    var lastSplit = self.lastSavedString.split(separator: " ")
                    while lastSplit.count > 0 {
                        if String(tempSplit[0]) == String(lastSplit[0]) {
                            tempSplit.remove(at: 0)
                            lastSplit.remove(at: 0)
                        }
                        else {
                            break
                        }
                    }

                    for command in tempSplit {
                        if self.supportedCommands.contains(String(command)) {
                            self.speechInputQueue.append(String(command))
                        }
                    }
                    self.lastSavedString = temp

                }
                isFinal = result.isFinal
            }

            if error != nil || isFinal {
                self.audioEngine.stop()
                inputNode.removeTap(onBus: 0)
                self.recognitionRequest = nil
                self.recognitionTask = nil
            }
        }

        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
            self.recognitionRequest?.append(buffer)
        }

        audioEngine.prepare()

        try audioEngine.start()

    }
}

我们如何使用它:

    if self.speechRecognizer.speechInputQueue.count > 0 {
    if self.speechRecognizer.speechInputQueue[0] == "more" {
        print("temp", temp)
        print("content", content)
       // isSpeakingContent = true
        self.textToSpeech(text: content)
    }
    else if self.speechRecognizer.speechInputQueue[0] == "play" {
        print("try to play")
        let soundURL = URL(fileURLWithPath: Bundle.main.path(forResource: "cascade", ofType: "wav")!)

        do {
            audioPlayer = try AVAudioPlayer(contentsOf: soundURL)
        }
        catch {
            print(error)
        }
        audioPlayer.prepareToPlay()
        audioPlayer.play()
    }
    else {
        self.textToSpeech(text: "unrecognized command")
    }
    self.speechRecognizer.speechInputQueue.remove(at: 0)
    print("after :", self.speechRecognizer.speechInputQueue)
}

它响应某些命令并播放一些音频。

Buffer有问题吗？也许识别一两分钟后，缓冲区就满了？识别器只是随着时间的推移而失败。

最佳答案

来自 WWDC 2016 Session 509: Speech Recognition API :

For iOS 10 we're starting with a strict audio duration limit of about one minute which is similar to that of keyboard dictation.

关于ios - 几分钟后 SpeechRecognizer 失败，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/49878238/

SpeechRecognizer ios 34 self recognitionRequest swift speech-recognition speech-to-text sfspeechrecognizer

有关ios - 几分钟后 SpeechRecognizer 失败的更多相关文章

ruby - 即使失败也继续进行多主机测试 - 2
我已经构建了一些serverspec代码来在多个主机上运行一组测试。问题是当任何测试失败时，测试会在当前主机停止。即使测试失败，我也希望它继续在所有主机上运行。Rakefile:namespace:specdotask:all=>hosts.map{|h|'spec:'+h.split('.')[0]}hosts.eachdo|host|begindesc"Runserverspecto#{host}"RSpec::Core::RakeTask.new(host)do|t|ENV['TARGET_HOST']=hostt.pattern="spec/cfengine3/*_spec.r
ruby - 如何验证 IO.copy_stream 是否成功 - 2
这里有一个很好的答案解释了如何在Ruby中下载文件而不将其加载到内存中:https://stackoverflow.com/a/29743394/4852737require'open-uri'download=open('http://example.com/image.png')IO.copy_stream(download,'~/image.png')我如何验证下载文件的IO.copy_stream调用是否真的成功——这意味着下载的文件与我打算下载的文件完全相同，而不是下载一半的损坏文件？documentation说IO.copy_stream返回它复制的字节数，但是当我还没有下
Ruby 文件 IO 定界符？ - 2
我正在尝试解析一个文本文件，该文件每行包含可变数量的单词和数字，如下所示:foo4.500bar3.001.33foobar如何读取由空格而不是换行符分隔的文件？有什么方法可以设置File("file.txt").foreach方法以使用空格而不是换行符作为分隔符？最佳答案接受的答案将slurp文件，这可能是大文本文件的问题。更好的解决方案是IO.foreach.它是惯用的，将按字符流式传输文件:File.foreach(filename,""){|string|putsstring}包含“thisisanexample”结果的
Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting - 2
1.错误信息：Errorresponsefromdaemon:Gethttps://registry-1.docker.io/v2/:net/http:requestcanceledwhilewaitingforconnection(Client.Timeoutexceededwhileawaitingheaders)或者：Errorresponsefromdaemon:Gethttps://registry-1.docker.io/v2/:net/http:TLShandshaketimeout2.报错原因：docker使用的镜像网址默认为国外，下载容易超时，需要修改成国内镜像地址（首先阿里
ruby-on-rails - 创建 ruby 数据库时惰性符号绑定(bind)失败 - 2
我正在尝试在Rails上安装ruby，到目前为止一切都已安装，但是当我尝试使用rakedb:create创建数据库时，我收到一个奇怪的错误:dyld:lazysymbolbindingfailed:Symbolnotfound:_mysql_get_client_infoReferencedfrom:/Library/Ruby/Gems/1.8/gems/mysql2-0.3.11/lib/mysql2/mysql2.bundleExpectedin:flatnamespacedyld:Symbolnotfound:_mysql_get_client_infoReferencedf
ruby - 正则表达式在哪个位置失败？ - 2
我需要一个非常简单的字符串验证器来显示第一个符号与所需格式不对应的位置。我想使用正则表达式，但在这种情况下，我必须找到与表达式相对应的字符串停止的位置，但我找不到可以做到这一点的方法。(这一定是一种相当简单的方法……也许没有？)例如，如果我有正则表达式:/^Q+E+R+$/带字符串:"QQQQEEE2ER"期望的结果应该是7 最佳答案一个想法:你可以做的是标记你的模式并用可选的嵌套捕获组编写它:^(Q+(E+(R+($)?)?)?)?然后你只需要计算你获得的捕获组的数量就可以知道正则表达式引擎在模式中停止的位置，你可以确定匹配结束
ruby - 使用 rbenv 和 ruby-build 构建 ruby 失败，出现 undefined symbol : SSLv2_method - 2
我正在尝试在配备ARMv7处理器的SynologyDS215j上安装ruby2.2.4或2.3.0。我用了optware-ng安装gcc、make、openssl、openssl-dev和zlib。我根据README中的说明安装了rbenv(版本1.0.0-19-g29b4da7)和ruby-build插件。.这些是随optware-ng安装的软件包及其版本binutils-2.25.1-1gcc-5.3.0-6gconv-modules-2.21-3glibc-opt-2.21-4libc-dev-2.21-1libgmp-6.0.0a-1libmpc-1.0.2-1libm
ruby-on-rails - Ruby 的 'open_uri' 是否在读取或失败后可靠地关闭套接字？ - 2
一段时间以来，我一直在使用open_uri下拉ftp路径作为数据源，但突然发现我几乎连续不断地收到“530抱歉，允许的最大客户端数(95)已经连接。”我不确定我的代码是否有问题，或者是否是其他人在访问服务器，不幸的是，我无法真正确定谁有问题。本质上，我正在读取FTPURI:defself.read_uri(uri)beginuri=open(uri).readuri=="Error"?nil:urirescueOpenURI::HTTPErrornilendend我猜我需要在这里添加一些额外的错误处理代码...我想确保我采取一切预防措施来关闭所有连接，这样我的连接就不是问题所在，但是我
ruby - 为什么不能使用类IO的实例方法noecho？ - 2
print"Enteryourpassword:"pass=STDIN.noecho(&:gets)puts"Yourpasswordis#{pass}!"输出:Enteryourpassword:input.rb:2:in`':undefinedmethod`noecho'for#>(NoMethodError) 最佳答案一开始require'io/console'后来的Ruby1.9.3 关于ruby-为什么不能使用类IO的实例方法noecho？，我们在StackOverflow上
ruby-on-rails - Ruby 流量控制 : throw an exception, 返回 nil 还是让它失败？ - 2
我在思考流量控制的最佳实践。我应该走哪条路？1)不要检查任何东西并让程序失败(更清晰的代码，自然的错误消息):defself.fetch(feed_id)feed=Feed.find(feed_id)feed.fetchend2)通过返回nil静默失败(但是，“CleanCode”说，你永远不应该返回null):defself.fetch(feed_id)returnunlessfeed_idfeed=Feed.find(feed_id)returnunlessfeedfeed.fetchend3)抛出异常(因为不按id查找feed是异常的):defself.fetch(feed_id

ios - 几分钟后 SpeechRecognizer 失败

有关ios - 几分钟后 SpeechRecognizer 失败的更多相关文章

随机推荐