rdfabout.net: Resource Description Framework
U.S. Securities and Exchange Commission Corporate Ownership RDF Data

First posted April 2008, last updated February 2010

NOTE: This project is no longer maintained.

This is a semantic web, RDF, linked-data, and SPARQL interface to U.S. corporate ownership information derived from filings in the U.S. Securities and Exchange Commission's EDGAR database. It is 2.5 million triples in all. Some example dereferenceable URIs are given below for exploring the data in your browser, followed by a box for running SPARQL queries and some example queries. There are three parts to this database.

Part I: Individual Ownership via SEC forms 3, 4, 5

Data from U.S. SEC forms 3, 4, and 5 in the SEC EDGAR database, January 2004-June 2008. These forms contain information on stock ownership of corporations by officers, board members, executives, and large shareholders (10% owners). Large shareholders also include corporate shareholders (i.e. parent companies). A record exists for each point in time at which the interest (i.e. stock ownership) of an individual or corporation changes while they still hold stock. It is thus possible (and likely) that individuals who are no longer in such a relation with a corporation are still listed as such in this data. I know very little about the SEC, so my interpretation of their data is quite fast and loose.

It is 1.8M triples. Source code to download the filings from the SEC and transform the data is browsable here, and you can also download the N3 file with the triples (8MB gzipped).

Clearly not all U.S. corporations are required to make this filing, so the coverage is limited in a way I do not know.

Part II: Subsidiary Information via 10-K Filings via CorpWatch

The CorpWatch API contains subsidiary information scraped from SEC 10-K filings from 2003 to February 2009. I've imported their data as well. It is 2,725,160 triples.

Not all U.S. corporations are required to make this filing. Furthermore, not all corporations mentioned in a filing are themselves required to file with the SEC, so some corporations with subsidiary relations do not have the officer/board/etc. relations described above.

Part III: Links to DBpedia

I've generated some owl:sameAs links to companies listed in DBpedia. It's just 86 triples.

Notable URIs

NOTE: These URLs are no longer dereferenceable.

Here are some notable URIs in the data:

  • News Corp (owner of FOX and other media things): <http://www.rdfabout.com/rdf/usgov/sec/id/cik0001308161>
  • Rupert Murdoch (media mogul behind News Corp): <http://www.rdfabout.com/rdf/usgov/sec/id/cik0001024835>
  • Ford Motor Company: <http://www.rdfabout.com/rdf/usgov/sec/id/cik0000037996>
  • eBay: <http://www.rdfabout.com/rdf/usgov/sec/id/cik0001065088>

Schema

Here's a brief run-down of the schema. The main namespace ("sec:") is <http://www.rdfabout.com/rdf/schema/ussec/>. The sec:cik predicate relates a corporation or individual in the data to its Central Index Key (CIK), a plain literal value which can be found on EDGAR. Everything with a CIK is given a foaf:name. When known, corporations are typed foafcorp:Company and individuals foaf:Person, but the SEC data is light on that.

Corporations that have stock trading symbols have that information in the sec:tradingSymbol property (which gives a plain literal value).
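As a rough illustration of the properties described so far, the query below looks up an entity by its trading symbol and returns its name and CIK. The symbol "F" (Ford's ticker) is only an example, and the query assumes symbols are stored as upper-case plain literals, which the schema notes do not confirm. Since the endpoint is gone, it would have to be run against a local copy of the N3 file.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX sec:  <http://www.rdfabout.com/rdf/schema/ussec/>

# Find the entity whose trading symbol is "F" and report its name and CIK.
# Assumption: the symbol is stored exactly as the plain literal "F".
SELECT ?entity ?name ?cik WHERE {
  ?entity sec:tradingSymbol "F" ;
          foaf:name ?name ;
          sec:cik ?cik .
}
```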

Owners of corporations (either individuals or other corporations) have an address given in the vcard:ADR property, which links to a bnode that in turn has standard vcard properties. The sec:hasRelation property relates owners to "relation" bnodes. Each relation bnode indicates the corporation (sec:corporation), the type of relation (an rdf:type of sec:DirectorRelation, sec:OfficerRelation, or sec:TenPercentOwnerRelation, or none of these if it is another type of relation), and the date the relation was indicated for (dc:date). In addition, a sec:officerTitle predicate on a relation bnode gives the title of the individual owner, such as "President", as a plain literal.
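A sketch of how the relation bnodes might be queried: the following lists everyone recorded with an officer relation to Ford, along with the title and date where present. The OPTIONAL clauses assume that not every relation bnode carries a title or a date. The ORDER BY is there only to make the output easier to read.

```sparql
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc:     <http://purl.org/dc/elements/1.1/>
PREFIX foaf:   <http://xmlns.com/foaf/0.1/>
PREFIX sec:    <http://www.rdfabout.com/rdf/schema/ussec/>
PREFIX seccik: <http://www.rdfabout.com/rdf/usgov/sec/id/>

SELECT ?name ?title ?date WHERE {
  # The owner and one of its relation bnodes
  ?owner foaf:name ?name ;
         sec:hasRelation ?rel .
  # Restrict to officer relations with Ford Motor Company
  ?rel sec:corporation seccik:cik0000037996 ;
       rdf:type sec:OfficerRelation .
  # Title and date may be missing (assumption), so keep them optional
  OPTIONAL { ?rel sec:officerTitle ?title }
  OPTIONAL { ?rel dc:date ?date }
}
ORDER BY ?name ?date
```

Swapping sec:OfficerRelation for sec:DirectorRelation or sec:TenPercentOwnerRelation selects the other relation types. Because a record is created each time an owner's interest changes, the same person may appear on several rows with different dates.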

The sec:hasSubsidiary relation indicates a relationship between a parent company and a subsidiary (but see the notes from CorpWatch about the reliability of this information).

SPARQL Query

NOTE: The SPARQL endpoint is no longer available.

Useful Namespaces

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc:  <http://purl.org/dc/elements/1.1/>
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>

PREFIX sec: <http://www.rdfabout.com/rdf/schema/ussec/>
PREFIX seccik: <http://www.rdfabout.com/rdf/usgov/sec/id/>

Example Queries

Try out these examples:

Find who has an ownership relation of any kind (board member, officer, or large shareholder) with both Ford and eBay:

PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX sec: <http://www.rdfabout.com/rdf/schema/ussec/>
PREFIX seccik: <http://www.rdfabout.com/rdf/usgov/sec/id/>
SELECT ?name WHERE {
    [foaf:name ?name]
        sec:hasRelation [ sec:corporation [foaf:name "FORD MOTOR CO"] ];
        sec:hasRelation [ sec:corporation [foaf:name "EBAY INC"] ].
}

Get a list of all officers/directors/etc. who own shares of any News Corp subsidiary (not including News Corp itself), subject to the data limitations above:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sec: <http://www.rdfabout.com/rdf/schema/ussec/>
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>

SELECT ?name ?corp WHERE {
  <http://www.rdfabout.com/rdf/usgov/sec/id/cik0001308161> sec:hasSubsidiary ?o .
  ?p foaf:name ?name .
  ?p sec:hasRelation [ sec:corporation ?o ] .
  ?o foaf:name ?corp .
}

This site is run by Joshua Tauberer.