
Web Data Commons - RDFa, Microdata, and Microformats Data Sets - December 2014

Source document: http://webdatacommons.org/structureddata/2014-12/stats/stats.html

This document provides statistics about the Web Data Commons RDFa, Microdata and Microformats data sets which have been extracted from the December 2014 release of the Common Crawl.

Attributes and Values
Description
  • In summary, the project found structured data within 620 million HTML pages out of the 2.01 billion pages contained in the crawl (30%). These pages originate from 2.72 million different pay-level domains out of the 15.68 million pay-level domains covered by the crawl (17%). Altogether, the extracted data sets consist of 20.48 billion RDF quads. Instructions for downloading the RDFa, Microdata, and Microformats data sets are given on the "How to get the data" page.
Format
  • text/html
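
The coverage percentages quoted in the description can be recomputed from the rounded counts given there. The following is a minimal sketch, not part of the original statistics page; the variable names and the helper `share_percent` are illustrative assumptions. Because the inputs are rounded, the results may differ slightly from the published percentages (the page share comes out a little above the quoted 30%).

```python
# Rounded counts as quoted in the description (assumed exact for this sketch).
pages_with_data = 620e6      # HTML pages containing structured data
pages_total = 2.01e9         # pages in the December 2014 crawl
plds_with_data = 2.72e6      # pay-level domains with structured data
plds_total = 15.68e6         # pay-level domains covered by the crawl


def share_percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole (hypothetical helper)."""
    return 100.0 * part / whole


page_share = share_percent(pages_with_data, pages_total)
pld_share = share_percent(plds_with_data, plds_total)

print(f"Pages with structured data: {page_share:.1f}%")
print(f"Pay-level domains with structured data: {pld_share:.1f}%")
```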