The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.

Patent No.:

US 11126797 B1

Date of Patent:

Sep. 21, 2021

Filed:

Jul. 02, 2019

Toxic vector mapping across languages

Applicant:

Spectrum Labs, Inc., San Francisco, CA (US);

Inventors:

Josh Newman, San Ramon, CA (US);

Yacov Salomon, Danville, CA (US);

Jonathan Thomas Purnell, Natick, MA (US);

Indrajit Haridas, San Francisco, CA (US);

Alexander Greene, East Palo Alto, CA (US);

Assignee:

Spectrum Labs, Inc., San Francisco, CA (US);

Attorney:

Holland & Hart LLP

Primary Examiner:

Paras D Shah

Assistant Examiner:

Darioush Agahi

Int. Cl.

CPC ...

G06F 40/30 (2020.01); G06F 16/33 (2019.01); G06N 7/00 (2006.01); G06N 3/08 (2006.01);

U.S. Cl.

CPC ...

G06F 40/30 (2020.01); G06F 16/3347 (2019.01); G06N 3/08 (2013.01); G06N 7/005 (2013.01);

Abstract

Methods, systems, and devices for language mapping are described. Some machine learning models may be trained to support multiple languages. However, word embedding alignments may be too general to accurately capture the meaning of certain words when mapping different languages into a single reference vector space. To improve the accuracy of vector mapping, a system may implement a supervised learning layer to refine the cross-lingual alignment of particular vectors corresponding to a vocabulary of interest (e.g., toxic language). This supervised learning layer may be trained using a dictionary of toxic words or phrases across the different supported languages in order to learn how to weight an initial vector alignment to more accurately map the meanings behind insults, threats, or other toxic words or phrases between languages. The vector output from this weighted mapping can be sent to supervised models, trained on the reference vector space, to determine toxicity scores.

Find Patent Forward Citations