The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The badge covers the following fields: patent number, issue date, filing date, title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The linked full patent document is in Adobe Acrobat (PDF) format.

Date of Patent:

Jun. 10, 2008

Filed:

Aug. 04, 2003

Applicants:

Alexander Franz, Palo Alto, CA (US);

Brian Milch, Berkeley, CA (US);

Eric Jackson, Palo Alto, CA (US);

Jenny Zhou, Sunnyvale, CA (US);

Benjamin Diament, Berkeley, CA (US);

Inventors:

Alexander Franz, Palo Alto, CA (US);

Brian Milch, Berkeley, CA (US);

Eric Jackson, Palo Alto, CA (US);

Jenny Zhou, Sunnyvale, CA (US);

Benjamin Diament, Berkeley, CA (US);

Assignee:

Google Inc., Mountain View, CA (US);

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06F 17/20 (2006.01); G10L 15/00 (2006.01);
U.S. Cl.
CPC ...
Abstract

A system and method for identifying language attributes through probabilistic analysis is described. A set of language classes and a plurality of training documents are defined. Each language class identifies a language and a character set encoding. Occurrences of one or more document properties within each training document are evaluated. For each language class, a probability for the set of document properties conditioned on the occurrence of the language class is calculated. Byte occurrences within each training document are evaluated. For each language class, a probability for the byte occurrences conditioned on the occurrence of the language class is calculated.
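
The byte-occurrence step in the abstract can be illustrated with a minimal sketch. This is not the patented method: it assumes single-byte (unigram) counts, add-one smoothing, and a final "pick the highest-scoring class" step, none of which the abstract specifies. It also leaves out the document-properties probability entirely. The function names and the toy training documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_byte_model(training_docs):
    """Estimate log P(byte | language class) from labelled training documents.

    training_docs: iterable of (language_class, data) pairs, where
    language_class is a (language, encoding) tuple and data is a bytes object.
    Assumption: unigram byte counts with add-one smoothing.
    """
    counts = defaultdict(Counter)
    for language_class, data in training_docs:
        counts[language_class].update(data)  # iterating bytes yields ints 0..255

    model = {}
    for language_class, counter in counts.items():
        total = sum(counter.values()) + 256  # +256 for add-one smoothing
        model[language_class] = [
            math.log((counter[b] + 1) / total) for b in range(256)
        ]
    return model


def log_likelihood(model, language_class, data):
    """Log probability of the byte occurrences conditioned on the class."""
    table = model[language_class]
    return sum(table[b] for b in data)


def most_likely_class(model, data):
    """Assumed decision rule: choose the class with the highest likelihood."""
    return max(model, key=lambda c: log_likelihood(model, c, data))


# Hypothetical toy training set: one document per language class.
training_docs = [
    (("en", "ascii"),
     "the quick brown fox jumps over the lazy dog".encode("ascii")),
    (("de", "iso-8859-1"),
     "über die Brücke läuft der müde Bär".encode("iso-8859-1")),
]
model = train_byte_model(training_docs)
```

Calling `most_likely_class(model, some_bytes)` then returns the `(language, encoding)` pair whose byte distribution best explains the input. Working in log space avoids numeric underflow when many small probabilities are multiplied together.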

