The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Nov. 03, 2020
Filed:
Apr. 21, 2020
Applicant:
Clinc, Inc., Ann Arbor, MI (US);
Inventors:
Joseph Peper, Ann Arbor, MI (US);
Parker Hill, Ann Arbor, MI (US);
Kevin Leach, Ann Arbor, MI (US);
Sean Stapleton, Ann Arbor, MI (US);
Jonathan K. Kummerfeld, Ann Arbor, MI (US);
Johann Hauswald, Ann Arbor, MI (US);
Michael Laurenzano, Ann Arbor, MI (US);
Lingjia Tang, Ann Arbor, MI (US);
Jason Mars, Ann Arbor, MI (US);
Assignee:
Clinc, Inc., Ann Arbor, MI (US);
Abstract
Systems and methods for synthesizing training data for multi-intent utterance segmentation include identifying a first corpus of utterances comprising a plurality of distinct single-intent in-domain utterances; identifying a second corpus of utterances comprising a plurality of distinct single-intent out-of-domain utterances; identifying a third corpus comprising a plurality of distinct conjunction terms; forming a multi-intent training corpus comprising synthetic multi-intent utterances, wherein forming each distinct multi-intent utterance includes: selecting a first distinct in-domain utterance from the first corpus of utterances; probabilistically selecting one of a first out-of-domain utterance from the second corpus and a second in-domain utterance from the first corpus; probabilistically selecting or not selecting a distinct conjunction term from the third corpus; and forming a synthetic multi-intent utterance including appending the first in-domain utterance with one of the first out-of-domain utterance from the second corpus of utterances and the second in-domain utterance from the first corpus of utterances.
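The procedure in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function name `synthesize_multi_intent`, the probabilities `p_out_of_domain` and `p_conjunction`, the returned record fields, and joining the parts with a single space are all assumptions made for illustration. The abstract does not give probability values, and this sketch does not force the second in-domain utterance to differ from the first.

```python
import random


def synthesize_multi_intent(in_domain, out_of_domain, conjunctions, n,
                            p_out_of_domain=0.3, p_conjunction=0.5, seed=0):
    """Build n synthetic multi-intent utterances (illustrative sketch only).

    in_domain, out_of_domain, conjunctions: the first, second, and third
    corpora named in the abstract. The probability parameters are
    hypothetical; the abstract does not state their values.
    """
    rng = random.Random(seed)  # seeded so the corpus is reproducible
    corpus = []
    for _ in range(n):
        # Step 1: select a first in-domain utterance from the first corpus.
        first = rng.choice(in_domain)

        # Step 2: probabilistically select either an out-of-domain utterance
        # (second corpus) or a second in-domain utterance (first corpus).
        if rng.random() < p_out_of_domain:
            second = rng.choice(out_of_domain)
            second_in_domain = False
        else:
            second = rng.choice(in_domain)
            second_in_domain = True

        # Step 3: probabilistically select, or do not select, a conjunction
        # term from the third corpus.
        conjunction = None
        if rng.random() < p_conjunction:
            conjunction = rng.choice(conjunctions)

        # Step 4: append the second utterance to the first, with the
        # conjunction (if any) in between.
        parts = [first] + ([conjunction] if conjunction else []) + [second]
        corpus.append({
            "text": " ".join(parts),
            "first": first,
            "conjunction": conjunction,
            "second": second,
            "second_in_domain": second_in_domain,
        })
    return corpus


if __name__ == "__main__":
    examples = synthesize_multi_intent(
        in_domain=["check my balance", "transfer ten dollars to savings"],
        out_of_domain=["what is the weather today"],
        conjunctions=["and", "then", "and also"],
        n=5,
    )
    for example in examples:
        print(example["text"])
```

Keeping the component utterances and the conjunction alongside the joined text means the segment boundaries are known for each synthetic example, which is the kind of label a segmentation model would need. The record layout itself is an assumption of this sketch.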