The patent badge is an abbreviated version of the USPTO patent document. It lists the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPC classifications, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Dec. 20, 2022

Filed:

Nov. 23, 2021

Applicant:

Ne47 Bio, Inc., Hillsborough, NC (US);

Inventors:

Tristan Bepler, Hillsborough, NC (US);

Bonnie Berger Leighton, Newtonville, MA (US);

Assignee:

NE47 Bio, Inc., Hillsborough, NC (US);

Attorney:

Primary Examiner:

Int. Cl.
CPC ...
G01N 33/48 (2006.01); G01N 33/50 (2006.01); G16B 30/10 (2019.01); G06F 16/2453 (2019.01); G16B 40/20 (2019.01); G06N 3/08 (2006.01); G06N 3/04 (2006.01); G06F 16/22 (2019.01);

U.S. Cl.
CPC ...
G16B 30/10 (2019.02); G06F 16/2255 (2019.01); G06F 16/24534 (2019.01); G06N 3/0445 (2013.01); G06N 3/08 (2013.01); G16B 40/20 (2019.02);

Abstract

A method for efficient search of protein sequence databases for proteins that have sequence, structural, and/or functional homology with respect to information derived from a search query. The method involves transforming the protein sequences into vector representations and searching in a vector space. Given a database of protein sequences and a learned embedding model, the embedding model is applied to each amino acid sequence to transform it into a sequence of vector representations. A query sequence is also transformed into a sequence of vector representations, preferably using the same learned embedding model. Once the query has been embedded in this manner, proteins are retrieved from the database based on distance between the query embedding and the protein embeddings contained within the database. Rapid and accurate search of the vector space is carried out using exact search using metric data structures, or approximate search using locality sensitive hashing.
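The pipeline described in the abstract (embed every database sequence, embed the query, retrieve by distance in the vector space) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `embed` function is a fixed amino-acid composition vector standing in for the learned embedding model, each protein is reduced to a single vector rather than a sequence of vectors, the approximate search uses random-hyperplane locality sensitive hashing with one hash table, the exact search is a plain linear scan rather than a metric data structure, and the protein names and sequences are made up.

```python
import math
import random
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed(seq):
    """Hypothetical stand-in for a learned embedding model:
    a unit-normalized amino-acid composition vector (20 dimensions)."""
    counts = [seq.count(a) for a in AMINO_ACIDS]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


def cosine_distance(u, v):
    # Both vectors are unit-normalized, so cosine distance is 1 - dot product.
    return 1.0 - sum(a * b for a, b in zip(u, v))


def exact_search(vectors, query_vec, k=1):
    """Exact nearest neighbours by linear scan over all stored vectors."""
    ranked = sorted(vectors.items(), key=lambda kv: cosine_distance(query_vec, kv[1]))
    return ranked[:k]


class HyperplaneLSH:
    """Approximate search: vectors on the same side of every random
    hyperplane share a bucket, and only that bucket is scanned at query time."""

    def __init__(self, dim, n_bits=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = defaultdict(list)

    def _key(self, vec):
        # One sign bit per hyperplane forms the bucket key.
        return tuple(
            sum(p * x for p, x in zip(plane, vec)) >= 0 for plane in self.planes
        )

    def add(self, name, vec):
        self.buckets[self._key(vec)].append((name, vec))

    def query(self, vec, k=1):
        candidates = self.buckets.get(self._key(vec), [])
        return sorted(candidates, key=lambda nv: cosine_distance(vec, nv[1]))[:k]


# Toy database with made-up names and sequences.
database = {
    "protA": "MKTAYIAKQRQISFVKSHFSRQ",
    "protB": "GSSGSSGSSGSSG",
    "protC": "MKTAYIAKQRQISFVKSHFSRA",
}

vectors = {name: embed(seq) for name, seq in database.items()}
index = HyperplaneLSH(dim=len(AMINO_ACIDS))
for name, vec in vectors.items():
    index.add(name, vec)

query_vec = embed("MKTAYIAKQRQISFVKSHFSRQ")
exact_hits = exact_search(vectors, query_vec, k=2)
approx_hits = index.query(query_vec, k=2)
print([name for name, _ in exact_hits])
print([name for name, _ in approx_hits])
```

The linear scan always considers every protein, while the hashed index only inspects the query's bucket, so it can miss true neighbours that fall on the other side of a hyperplane. Practical LSH setups usually reduce that risk with several hash tables; this sketch keeps one for brevity.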

