The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Mar. 29, 2022
Filed:
Mar. 22, 2019
Applicant:
Pinterest, Inc., San Francisco, CA (US);
Inventors:
Heath Vinicombe, San Francisco, CA (US);
Yunsong Guo, San Mateo, CA (US);
Yu Liu, Los Altos, CA (US);
Anant Srinivas Subramanian, San Mateo, CA (US);
Assignee:
Pinterest, Inc., San Francisco, CA (US);
Abstract
Systems and methods are set forth for identifying key-words and key-phrases, collectively referred to as key-terms, from a document. A document is accessed and the document is tokenized, each token corresponding to a word or phrase occurring within the document. Term frequencies of the terms of the tokens may be determined and TF-IDF scores may be generated according to the term frequencies. Embedding vectors for the terms of the tokens may be generated and a document embedding vector may be generated according to the embedding vectors of the documents. A similarity score may be determined for each token according to the embedding vector of a token and the document embedding vector. Additionally, an overall score may be determined for each token according to the term of the token, a TF-IDF score, similarity scores, and the like. Terms from the highest scoring tokens are selected as the key-terms for the document.
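The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give formulas, so several choices here are assumptions made for illustration only: tokens are single lowercase words (phrases are omitted), IDF is a smoothed logarithmic form computed over a small reference corpus, the document embedding is the mean of the token embeddings, similarity is cosine similarity, and the overall score is a weighted sum controlled by a hypothetical `alpha` parameter. The function names, the toy `embeddings` table, and the sample texts are likewise hypothetical and are not taken from the patent.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (unigrams only in this sketch)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf_scores(tokens, corpus):
    """Return a TF-IDF score per distinct term, using an assumed smoothed IDF."""
    tf = Counter(tokens)
    n_docs = len(corpus)
    doc_sets = [set(tokenize(d)) for d in corpus]
    scores = {}
    for term, count in tf.items():
        df = sum(1 for s in doc_sets if term in s)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = (count / len(tokens)) * idf
    return scores


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def key_terms(text, corpus, embeddings, top_k=3, alpha=0.5):
    """Rank terms by an assumed blend of normalized TF-IDF and similarity."""
    # Terms without an embedding are skipped in this sketch.
    tokens = [t for t in tokenize(text) if t in embeddings]
    if not tokens:
        return []
    tfidf = tf_idf_scores(tokens, corpus)
    # Assumed document embedding: mean of the token embedding vectors.
    dim = len(embeddings[tokens[0]])
    doc_vec = [sum(embeddings[t][i] for t in tokens) / len(tokens)
               for i in range(dim)]
    max_tfidf = max(tfidf.values())
    overall = {}
    for term, score in tfidf.items():
        sim = cosine(embeddings[term], doc_vec)
        overall[term] = alpha * (score / max_tfidf) + (1 - alpha) * sim
    # Highest overall score first; ties are broken alphabetically.
    ranked = sorted(overall, key=lambda t: (-overall[t], t))
    return ranked[:top_k]


# Toy two-dimensional embeddings and texts, invented for illustration.
embeddings = {
    "recipe": [1.0, 0.0],
    "pasta": [0.9, 0.1],
    "pin": [0.8, 0.2],
    "the": [0.0, 1.0],
}
corpus = [
    "the cat sat on the mat",
    "the dog ate the bone",
    "a recipe for the soup",
]
text = "the recipe pin is a recipe pin for the pasta recipe"
print(key_terms(text, corpus, embeddings, top_k=3))
```

In this toy setup, a common word such as "the" is pushed down both by its low IDF and by its low similarity to the document vector, which is the general effect the combined score is meant to achieve. A production system would presumably use learned embeddings and a tuned scoring function rather than these hand-picked values.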