The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Nov. 07, 2023

Filed:

Feb. 20, 2019

Applicant:

Robert Bosch GmbH, Stuttgart, DE;

Inventors:

Asif Salekin, Charlottesville, VA (US);

Zhe Feng, Mountain View, CA (US);

Shabnam Ghaffarzadegan, San Mateo, CA (US);

Assignee:

Robert Bosch GmbH, Stuttgart, DE;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G08B 13/16 (2006.01); G06F 16/683 (2019.01); G06N 3/049 (2023.01); G10L 15/26 (2006.01); G10L 25/30 (2013.01); G06N 3/045 (2023.01); G10L 15/16 (2006.01); G10L 25/51 (2013.01); G10L 25/78 (2013.01); G10L 25/81 (2013.01); G10L 25/84 (2013.01); G10L 15/08 (2006.01);
U.S. Cl.
CPC ...
G08B 13/1672 (2013.01); G06F 16/683 (2019.01); G06N 3/045 (2023.01); G06N 3/049 (2013.01); G10L 15/16 (2013.01); G10L 15/26 (2013.01); G10L 25/30 (2013.01); G10L 25/51 (2013.01); G10L 25/78 (2013.01); G10L 25/81 (2013.01); G10L 25/84 (2013.01); G10L 2015/088 (2013.01);
Abstract

A method and system for detecting and localizing a target audio event in an audio clip is disclosed. The method and system utilize a hierarchical approach in which a dilated convolutional neural network is used to detect the presence of the target audio event anywhere in an audio clip based on high-level audio features. If the target audio event is detected somewhere in the audio clip, the method and system further utilize a robust audio vector representation that encodes the inherent state of the audio as well as a learned relationship between the state of the audio and the particular target audio event that was detected in the audio clip. A bi-directional long short-term memory classifier is used to model long-term dependencies and determine the boundaries in time of the target audio event within the audio clip based on the audio vector representations.
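The two-stage flow described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names, the 0.5 threshold, the frame duration, and the stand-in callables `clip_detector` and `frame_classifier` (which take the place of the dilated convolutional network and the bi-directional long short-term memory classifier) are all assumptions made for illustration. The sketch shows only how dilation widens the span of input frames a convolution sees, and how per-frame scores could be turned into time boundaries once a clip-level detector has fired.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = (len(kernel) - 1) * dilation  # distance between first and last kernel tap
    return [
        sum(w * x[t + k * dilation] for k, w in enumerate(kernel))
        for t in range(len(x) - span)
    ]


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input frames seen by one output of a stack of dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def frames_to_segments(
    probs: Sequence[float], threshold: float = 0.5, frame_seconds: float = 1.0
) -> List[Tuple[float, float]]:
    """Convert per-frame event probabilities into (start, end) times in seconds."""
    segments: List[Tuple[float, float]] = []
    start: Optional[int] = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # event begins at this frame
        elif p < threshold and start is not None:
            segments.append((start * frame_seconds, i * frame_seconds))
            start = None  # event ended just before this frame
    if start is not None:  # event runs to the end of the clip
        segments.append((start * frame_seconds, len(probs) * frame_seconds))
    return segments


def detect_and_localize(
    features: Sequence[float],
    clip_detector: Callable[[Sequence[float]], float],           # hypothetical stand-in for the dilated CNN
    frame_classifier: Callable[[Sequence[float]], Sequence[float]],  # hypothetical stand-in for the BiLSTM
    threshold: float = 0.5,
    frame_seconds: float = 1.0,
) -> List[Tuple[float, float]]:
    """Stage 1: clip-level detection. Stage 2 runs only if stage 1 fires."""
    if clip_detector(features) < threshold:
        return []  # no target event anywhere in the clip, skip localization
    return frames_to_segments(frame_classifier(features), threshold, frame_seconds)


if __name__ == "__main__":
    impulse = [0, 0, 1, 0, 0, 0, 0, 0]
    print(dilated_conv1d(impulse, [1, 1], dilation=2))
    print(receptive_field(3, [1, 2, 4]))
    print(detect_and_localize(
        [0.1, 0.8, 0.9, 0.2, 0.7],
        clip_detector=max,                # toy detector: strongest frame score
        frame_classifier=lambda f: f,     # toy classifier: features are already scores
        frame_seconds=0.5,
    ))
```

In this toy usage the features double as frame scores, which a real system would not do. The point is only the control flow: localization is attempted solely for clips that the clip-level detector flags.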