This project is a wrapper for the PatentsView API.
- PatentsView Glossary provides a description of the variables.
- An API key is required, one can be requested here.
- There needs to be an environmental variable PATENTSVIEW_API_KEY set to the value of your API key.
- The endpoints of the new version of the API are at https://search.patentsview.org, they had last been at https://api.patentsview.org/.
- The API team produced a Swagger UI page for the new version of the API. Your API key can be entered by pressing the Authorize button.
- The new version of the API now returns USPC classifications, but we'll use CPC classifications. After May 2015 the patent office stopped assigning USPCs to utility patents.
- Patent Numbers are alphanumeric (they can include letters)
- PatentsView only includes information about the patent at issue. It does not include changes to patent information after the patent has been issued.
- This means that if the company changes name, it won't be reflected in the patent. Example: if "International Business Machines" renames itself to "IBM", patents issued to "International Business Machines" will still be issued to "International Business Machines" (and not "IBM").
- As an example:
NETFLIX, INC.
has anassignee_key_id
of17594
and anassignee_id
oforg_2lAuxOpAtNMvtTxhuLmX
;NETFLIX.COM, INC.
on the other hand anassignee_key_id
oforg_UNHkzir8tY7NlQrOJKT4
and anassignee_id
of363028
. (This of course assumesNETFLIX, INC.
andNETFLIX.COM, INC.
are the same company, which is highly probable). - The same applies for acquisitions. Example: Company A has patent X; once company B acquires company A, patent X would still show that it is assigned to company A.
- Probably the same thing holds if a company acquires certain patents of another company.
- The patents can be assigned to organizations (as opposed to individuals). This is indicated by the 'assignees.assignee_organization' field returned by the API.
- The assignee organizations (i.e. companies) are distinguished by name. Each organization name is a 'separate' company.
- This means that a patent can be assigned to "IBM", "IBM Inc.", "International Business Machines".
- Different organization names have different
assignee_id
s andassignee_key_id
s (seeNETFLIX
example above).
Create an Microsoft Excel spreadsheet (.xlsx
file) with the following structure:
Firm ID | Firm Name | Alternative names | ||||
---|---|---|---|---|---|---|
ID | Name 1 | Name 2 | Name 3 | Name 4 | ... | Name X |
ID2 | Company 2 Primary Name / Name 1 | Name 2 | Name 3 | Name 4 | ... | Name X |
ID1 | Company 1 Primary Name / Name 1 | Name 2 | Name 3 | Name 4 | ... | Name X |
Here is an Entity Relationship Diagram (ERD) of the database structure.
The sql
folder has some SQL scripts that might come in handy.
As an example, here is SQL query that selects patents between two dates:
SELECT
p.patent_number as "Patent Number",
p.patent_title as "Patent Title",
-- p.company_id as "Company ID",
c.name as "Company Name",
-- p.company_alternate_name_id as "Alternate Name ID",
an.name as "Company Name Listed on Patent",
p.year,
p.grant_date as "Grant Date",
p.cpc_group_id as "CPC Subsections"
FROM
patents as p
JOIN
companies as c
ON
p.company_id = c.id
LEFT JOIN
alternate_company_names as an
ON
p.company_alternate_name_id = an.id
WHERE
p.grant_date > DATE("2006-01-03") AND
p.grant_date < DATE("2010-06-13");
- DbVisualizer was used to generate the graphs
- DB Browser for SQLite was used to look at the data and execute SQL queries