The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Jul. 14, 2020
Filed:
Jun. 27, 2017
Applicant:
Microsoft Technology Licensing, LLC, Redmond, WA (US);
Inventors:
Rishabh Singh, Kirkland, WA (US);
Jeevana Priya Inala, Cambridge, MA (US);
Assignee:
MICROSOFT TECHNOLOGY LICENSING, LLC, Redmond, WA (US);
Abstract
Provided are methods and systems for joining semi-structured data from the web with relational data in a spreadsheet table using input-output examples. A first sub-task performed by the system learns a string transformation program to transform input rows of a table to URL strings that correspond to the webpages where the relevant data is present. A second sub-task learns a program in a rich web data extraction language to extract desired data from the webpage given the example extractions. Hierarchical search and input-driven ranking are used to efficiently learn the programs using few input-output examples. The learnt programs are then run on the remaining spreadsheet entries to join desired data from the corresponding web pages.
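The two sub-tasks in the abstract can be illustrated with a toy sketch. This is not the patented method: the abstract describes hierarchical search, input-driven ranking, and a rich web data extraction language, none of which is reproduced here. The sketch swaps in two much simpler techniques, substituting cell values with placeholders to learn a URL template, and learning left/right string delimiters to extract a value. All function names, column names, URLs, and page contents below are hypothetical, and the pages are in-memory strings rather than fetched webpages.

```python
import os
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]


def learn_url_template(examples: List[Tuple[Row, str]]) -> Optional[str]:
    """Sub-task 1 (toy): learn a URL template consistent with all (row, URL) examples."""
    if not examples:
        return None
    first_row, first_url = examples[0]
    template = first_url
    # Replace longer cell values first so a short value cannot split a longer one.
    # Assumption: URLs contain no literal braces and values do not overlap column names.
    for column, value in sorted(first_row.items(), key=lambda kv: -len(kv[1])):
        if value:
            template = template.replace(value, "{" + column + "}")
    # Accept the template only if it reproduces every example URL.
    for row, url in examples:
        if template.format(**row) != url:
            return None
    return template


def learn_delimiters(examples: List[Tuple[str, str]]) -> Optional[Tuple[str, str]]:
    """Sub-task 2 (toy): learn text that always precedes and follows the desired value."""
    lefts, rights = [], []
    for page, value in examples:
        i = page.find(value)
        if i < 0:
            return None
        lefts.append(page[:i])
        rights.append(page[i + len(value):])
    # Longest common suffix of the left contexts, longest common prefix of the right ones.
    left = os.path.commonprefix([s[::-1] for s in lefts])[::-1]
    right = os.path.commonprefix(rights)
    return (left, right) if left and right else None


def extract(page: str, delimiters: Tuple[str, str]) -> Optional[str]:
    """Return the text between the learned delimiters, or None if they are absent."""
    left, right = delimiters
    start = page.find(left)
    if start < 0:
        return None
    start += len(left)
    end = page.find(right, start)
    return page[start:end] if end >= 0 else None


# Hypothetical spreadsheet rows; the first two have user-provided examples.
rows = [
    {"ticker": "MSFT", "year": "2016"},
    {"ticker": "AAPL", "year": "2017"},
    {"ticker": "GOOG", "year": "2018"},
]
template = learn_url_template([
    (rows[0], "https://example.com/stocks/MSFT?year=2016"),
    (rows[1], "https://example.com/stocks/AAPL?year=2017"),
])
delims = learn_delimiters([
    ("<h1>MSFT</h1><span id='price'>62.14</span>", "62.14"),
    ("<h1>AAPL</h1><span id='price'>169.23</span>", "169.23"),
])
# Run the learned programs on the remaining row.
new_url = template.format(**rows[2])
new_value = extract("<h1>GOOG</h1><span id='price'>1035.61</span>", delims)
```

In this sketch, two examples are enough to fix both the template and the delimiters, after which the remaining row is filled in without further input. A real system would need to search a much larger space of candidate programs and rank them, which is the role the abstract assigns to hierarchical search and input-driven ranking.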