The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat PDF format).

Date of Patent:
Nov. 08, 2022

Filed:
Mar. 05, 2018

Applicant:

Yodlee, Inc., Redwood City, CA (US);

Inventors:

Deepak Chandrakant Patil, Dombivli, IN;

Rakesh Kumar Ranjan, Bangalore, IN;

Shibsankar Das, Bangalore, IN;

Siddhartha Saxena, Bangalore, IN;

Om Dadaji Deshmukh, Bangalore, IN;

Assignee:

Yodlee, Inc., San Mateo, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06N 20/00 (2019.01); G06F 16/35 (2019.01);
U.S. Cl.
CPC ...
G06N 20/00 (2019.01); G06F 16/355 (2019.01);
Abstract

Methods, systems, and computer program products for generating a diverse and representative set of samples from a large amount of transaction data are disclosed. A data sampling system receives transaction records. Each transaction record has multiple text segments. The system selects a subset of transaction records that contain the least frequently appearing text segments. The system determines a respective vector representation for each selected transaction record. The system can measure similarity between transaction records based on the vector representations. The system assigns the selected transaction records to multiple clusters based on the vector representations and designated dimensions of importance. The system identifies one or more anchors that include transaction records on boundaries between clusters. The system filters the subset of transaction records by removing transaction records that are close to the anchors. The system then provides the filtered subset as a representative set of samples to a sample consumer.
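
The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the abstract does not specify the vector representation, the clustering algorithm, or any thresholds, so the code assumes whitespace-separated tokens as text segments, bag-of-words count vectors, cosine distance, and a tiny deterministic k-means. The function names and the `margin` and `radius` parameters are hypothetical, the "designated dimensions of importance" are omitted, and keeping the anchors themselves while dropping their near neighbors is one possible reading of the filtering step.

```python
import math
from collections import Counter


def select_rare(records, k):
    """Keep the k records that contain the least frequent text segments."""
    # Document frequency of each segment (assumption: segments are tokens).
    freq = Counter(seg for r in records for seg in set(r.split()))
    # Score each record by its rarest segment; a lower score means rarer.
    def score(r):
        return min((freq[s] for s in r.split()), default=math.inf)
    return sorted(records, key=score)[:k]


def vectorize(records):
    """Bag-of-words count vectors over a shared, sorted vocabulary."""
    vocab = sorted({s for r in records for s in r.split()})
    return [[r.split().count(w) for w in vocab] for r in records]


def cosine_distance(a, b):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb) if na and nb else 1.0


def kmeans(vectors, k, iters=10):
    """Tiny deterministic k-means seeded with the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [
            min(range(len(centroids)),
                key=lambda c: cosine_distance(v, centroids[c]))
            for v in vectors
        ]
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def find_anchors(vectors, centroids, margin=0.1):
    """Indices of records nearly equidistant from their two nearest centroids."""
    anchors = []
    for i, v in enumerate(vectors):
        d = sorted(cosine_distance(v, c) for c in centroids)
        if len(d) > 1 and d[1] - d[0] <= margin:
            anchors.append(i)
    return anchors


def sample(records, n_rare, k, margin=0.1, radius=0.2):
    """Run the full sketch: select, vectorize, cluster, anchor, filter."""
    subset = select_rare(records, n_rare)
    vectors = vectorize(subset)
    _, centroids = kmeans(vectors, k)
    anchors = find_anchors(vectors, centroids, margin)
    kept = []
    for i, v in enumerate(vectors):
        near_anchor = any(
            cosine_distance(v, vectors[a]) <= radius for a in anchors if a != i
        )
        # Assumption: anchors are kept; other records close to them are dropped.
        if i in anchors or not near_anchor:
            kept.append(subset[i])
    return kept
```

A call such as `sample(records, n_rare=4, k=2)` on a list of transaction description strings returns a filtered list drawn from those strings. In practice the vectorizer, clustering method, and thresholds would need to be chosen for the actual data.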

