The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Apr. 21, 2020
Filed:
Feb. 07, 2017
Applicant:
FMR LLC, Boston, MA (US);
Inventor:
Aravind Chandramouli, Bangalore, IN;
Assignee:
FMR LLC, Boston, MA (US);
Abstract
Methods and apparatuses are described for analyzing unstructured computer text for domain-specific stopword identification and removal. A computer data store stores unstructured text. A server computing device splits the unstructured text into phrases and generates tokens from the phrases. The server computing device generates a set of bootstrap keywords using the tokens. An artificial intelligence neural network executing on the server computing device generates a stopword training model. The server computing device generates a first set of candidate stopwords using the bootstrap keywords and the stopword training model. The server computing device generates regular expressions using the bootstrap keywords, and generates a second set of candidate stopwords using the regular expressions. The server computing device stores the candidate stopwords in the data store, and removes stopwords from the unstructured text using the data store.
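The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how bootstrap keywords are chosen or how regular expressions are built, so the phrase-frequency threshold (`min_share`), the suffix-based patterns, and all function names below are assumptions made for illustration. The neural-network step that produces the first candidate set is omitted entirely; only the regular-expression path and the final removal are shown.

```python
import re
from collections import Counter


def split_phrases(text):
    """Split unstructured text into phrases on sentence punctuation and newlines."""
    return [p.strip() for p in re.split(r"[.!?\n]+", text) if p.strip()]


def tokenize(phrase):
    """Lowercase a phrase and return its word tokens."""
    return re.findall(r"[a-z']+", phrase.lower())


def bootstrap_keywords(phrases, min_share=0.6):
    """Assumed heuristic: keep tokens that occur in at least min_share of phrases."""
    phrase_freq = Counter()
    for phrase in phrases:
        phrase_freq.update(set(tokenize(phrase)))  # count each token once per phrase
    return {tok for tok, n in phrase_freq.items() if n / len(phrases) >= min_share}


def regex_candidates(phrases, keywords):
    """Assumed heuristic: match simple inflected forms of each bootstrap keyword."""
    patterns = [
        re.compile(r"\b" + re.escape(kw) + r"(?:s|es|ed|ing)\b") for kw in keywords
    ]
    found = set()
    for phrase in phrases:
        for pattern in patterns:
            found.update(pattern.findall(phrase.lower()))
    return found


def remove_stopwords(phrases, stopwords):
    """Drop every stopword token from each phrase."""
    return [
        " ".join(tok for tok in tokenize(phrase) if tok not in stopwords)
        for phrase in phrases
    ]


# Hypothetical sample text standing in for the stored unstructured text.
text = (
    "Customer called about the account. "
    "Customer asked about a refund. "
    "Customers called again about fees."
)
phrases = split_phrases(text)
keywords = bootstrap_keywords(phrases)
candidates = regex_candidates(phrases, keywords)
cleaned = remove_stopwords(phrases, keywords | candidates)
```

With this sample, "customer", "called", and "about" pass the frequency threshold, the suffix patterns additionally pick up "customers", and the cleaned phrases keep only the remaining words. In a fuller system the candidate sets would be stored and reviewed before removal, as the abstract describes storing them in the data store.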