The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Jul. 14, 2020

Filed:
Jun. 27, 2017

Applicant:
Microsoft Technology Licensing, LLC, Redmond, WA (US);

Inventors:
Rishabh Singh, Kirkland, WA (US);
Jeevana Priya Inala, Cambridge, MA (US);

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 40/18 (2020.01); H04L 29/12 (2006.01); H04L 29/08 (2006.01); G06F 3/0482 (2013.01); G06F 16/335 (2019.01); G06F 16/9535 (2019.01); G06F 16/25 (2019.01);
U.S. Cl.
CPC ...
G06F 40/18 (2020.01); G06F 3/0482 (2013.01); G06F 16/25 (2019.01); G06F 16/337 (2019.01); G06F 16/9535 (2019.01); H04L 61/1552 (2013.01); H04L 67/02 (2013.01); H04L 67/36 (2013.01);
Abstract

Provided are methods and systems for joining semi-structured data from the web with relational data in a spreadsheet table using input-output examples. A first sub-task performed by the system learns a string transformation program to transform input rows of a table to URL strings that correspond to the webpages where the relevant data is present. A second sub-task learns a program in a rich web data extraction language to extract desired data from the webpage given the example extractions. Hierarchical search and input-driven ranking are used to efficiently learn the programs using few input-output examples. The learnt programs are then run on the remaining spreadsheet entries to join desired data from the corresponding web pages.
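
To make the first sub-task concrete, here is a minimal Python sketch of learning a row-to-URL program from input-output examples. It is a toy illustration only, not the patented method: it swaps the hierarchical search and input-driven ranking described in the abstract for a simple greedy scan that replaces any cell value found in the example URL with a column reference. The function names (generalize, learn_url_program, run_program), the example.com URLs, and the sample rows are all hypothetical.

```python
def generalize(row, url):
    """Rewrite one example URL as a tuple of parts.

    Each part is either ("col", index) for a cell copied from the row,
    or ("const", text) for literal text. Assumption: cells appear in the
    URL verbatim, which is far simpler than the patent's string language.
    """
    # Try longer cells first so a short cell cannot split a longer one.
    order = sorted((k for k in range(len(row)) if row[k]),
                   key=lambda k: -len(row[k]))
    parts, literal, pos = [], "", 0
    while pos < len(url):
        hit = next((k for k in order if url.startswith(row[k], pos)), None)
        if hit is None:
            literal += url[pos]
            pos += 1
        else:
            if literal:
                parts.append(("const", literal))
                literal = ""
            parts.append(("col", hit))
            pos += len(row[hit])
    if literal:
        parts.append(("const", literal))
    return tuple(parts)


def learn_url_program(examples):
    """Return the single program consistent with every (row, url) example."""
    candidates = {generalize(row, url) for row, url in examples}
    if len(candidates) != 1:
        raise ValueError("no single program is consistent with all examples")
    return candidates.pop()


def run_program(program, row):
    """Apply a learned program to a new row to produce its URL."""
    return "".join(row[value] if kind == "col" else value
                   for kind, value in program)


# Hypothetical spreadsheet rows paired with the URLs a user supplied.
examples = [
    (("MSFT", "2017"), "https://example.com/quote/MSFT?year=2017"),
    (("AAPL", "2018"), "https://example.com/quote/AAPL?year=2018"),
]
program = learn_url_program(examples)
print(run_program(program, ("GOOG", "2019")))
# prints https://example.com/quote/GOOG?year=2019
```

The learned program is then run on the remaining rows, which mirrors the last step of the abstract. A realistic learner would also need substring, case, and formatting transformations, plus a ranking over the many programs that fit the same few examples.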

