The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Patent No.:

US 12165654 B1

Date of Patent:

Dec. 10, 2024

Filed:

Aug. 31, 2020

Multi-speaker diarization of audio input using a neural network

Applicants:

The Johns Hopkins University, Baltimore, MD (US);

Hitachi, Ltd., Tokyo, JP;

Inventors:

Yusuke Fujita, Tokyo, JP;

Shinji Watanabe, Ellicott City, MD (US);

Naoyuki Kanda, Tokyo, JP;

Shota Horiguchi, Tokyo, JP;

Assignees:

The Johns Hopkins University, Baltimore, MD (US);

Hitachi, Ltd., Tokyo, JP;

Attorney:

Harrity & Harrity, LLP

Primary Examiner:

Seong-Ah A Shin

Int. Cl.

CPC ...

G10L 15/20 (2006.01); G06N 3/045 (2023.01); G10L 15/06 (2013.01); G10L 17/04 (2013.01); G10L 17/18 (2013.01);

U.S. Cl.

CPC ...

G10L 17/18 (2013.01); G06N 3/045 (2023.01); G10L 17/04 (2013.01);

Abstract

An audio analysis platform may receive a portion of an audio input, wherein the audio input corresponds to audio associated with a plurality of speakers. The audio analysis platform may process, using a neural network, the portion of the audio input to determine voice activity of the plurality of speakers during the portion of the audio input, wherein the neural network is trained using reference audio data and reference diarization data corresponding to the reference audio data. The audio analysis platform may determine, based on the neural network being used to process the portion of the audio input, a diarization output associated with the portion of the audio input, wherein the diarization output indicates individual voice activity of the plurality of speakers. The audio analysis platform may provide the diarization output to indicate the individual voice activity of the plurality of speakers during the portion of the audio input.
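The last step of the abstract, turning the network's per-speaker voice activity into a diarization output, can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the neural network has already produced, for each frame, one activity probability per speaker; the function name `activities_to_segments`, the `threshold` and `frame_shift` parameters, and the example values are all hypothetical.

```python
from typing import List, Tuple

# (speaker index, start time in seconds, end time in seconds)
Segment = Tuple[int, float, float]


def activities_to_segments(
    probs: List[List[float]],
    threshold: float = 0.5,   # assumed decision threshold, not from the patent
    frame_shift: float = 0.1  # assumed seconds per frame, not from the patent
) -> List[Segment]:
    """Convert per-frame, per-speaker activity probabilities into segments.

    probs[t][s] is the (hypothetical) network output for speaker s at frame t.
    Each speaker is thresholded independently, so overlapping speech yields
    overlapping segments.
    """
    if not probs:
        return []
    num_speakers = len(probs[0])
    segments: List[Segment] = []
    for spk in range(num_speakers):
        start = None  # frame index where the current active run began
        for t, frame in enumerate(probs):
            active = frame[spk] >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                segments.append((spk, start * frame_shift, t * frame_shift))
                start = None
        if start is not None:
            # The speaker was still active at the end of the audio portion.
            segments.append((spk, start * frame_shift, len(probs) * frame_shift))
    return segments


# Illustrative values only: four frames, two speakers, one-second frames.
example_probs = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.9], [0.1, 0.6]]
example_segments = activities_to_segments(example_probs, frame_shift=1.0)
print(example_segments)
```

In this example both speakers are active during the second frame, so the two resulting segments overlap. This reflects the abstract's point that the output indicates the individual voice activity of each speaker rather than a single speaker label per frame.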
