The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
May 07, 2019
Filed:
Jun. 29, 2018
Applicant:
International Business Machines Corporation, Armonk, NY (US);
Inventors:
Md Faisal M. Chowdhury, White Plains, NY (US);
Michael R. Glass, Bayonne, NJ (US);
Alfio M. Gliozzo, Brooklyn, NY (US);
Assignee:
International Business Machines Corporation, Armonk, NY (US);
Abstract
Embodiments provide a system and method for short form and long form detection. Using a language-independent process, the detection system can ingest a corpus of documents, pre-process those documents by tokenizing the documents and performing a part-of-speech analysis, and can filter one or more candidate short forms using one or more filters that select for semantic criteria. Semantic criteria can include the part of speech of a token, whether the token contains more than a pre-determined amount of symbols or digits, whether the token appears too frequently in the corpus of documents, and whether the token has at least one uppercase letter. The detection system can detect short forms independent of case and punctuation, and independent of language-specific metaphone variants.
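The filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the part-of-speech tag set, and the threshold values below are all assumptions chosen for illustration, since the abstract gives no concrete values. The input is assumed to be already tokenized and part-of-speech tagged.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical settings: the abstract names the criteria but not their values.
ALLOWED_POS = {"NOUN", "PROPN"}   # assumed tags that a short form may carry
MAX_SYMBOLS_OR_DIGITS = 3         # assumed limit on non-letter characters


def normalize(token: str) -> str:
    """Build a key that ignores case and punctuation ('I.B.M.' -> 'ibm')."""
    return "".join(ch for ch in token if ch.isalnum()).lower()


def is_candidate_short_form(token: str, pos: str, relative_frequency: float,
                            max_frequency: float) -> bool:
    """Apply the four criteria listed in the abstract to one token."""
    if pos not in ALLOWED_POS:
        return False
    symbols_or_digits = sum(1 for ch in token if not ch.isalpha())
    if symbols_or_digits > MAX_SYMBOLS_OR_DIGITS:
        return False
    if relative_frequency > max_frequency:
        return False
    # Require at least one uppercase letter.
    return any(ch.isupper() for ch in token)


def filter_candidates(tagged_tokens: Iterable[Tuple[str, str]],
                      max_frequency: float = 0.5) -> List[str]:
    """Return normalized candidate short forms in order of first appearance."""
    tagged = list(tagged_tokens)
    counts = Counter(token for token, _ in tagged)
    total = len(tagged)
    candidates: List[str] = []
    for token, pos in tagged:
        frequency = counts[token] / total
        if is_candidate_short_form(token, pos, frequency, max_frequency):
            key = normalize(token)
            if key not in candidates:
                candidates.append(key)
    return candidates


sample = [("IBM", "PROPN"), ("builds", "VERB"), ("the", "DET"),
          ("NLP", "PROPN"), ("system", "NOUN"), ("A1-B2-C3", "NOUN"),
          ("I.B.M.", "PROPN")]
print(filter_candidates(sample))
```

In this sketch, "IBM" and "I.B.M." collapse to the same normalized key, which mirrors the abstract's point that detection is independent of case and punctuation. The language-specific metaphone handling mentioned in the abstract is not modeled here.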