About the link graphs

Background

These link graphs are extracted from Netarkivet, the Danish Internet Archive. The archive contains web pages harvested from 1998 up to today and currently holds about 28 billion documents in total. The link graphs can also be downloaded in lossless SVG format below.

Technical

The data for the link graphs is extracted using SolrWayback, which is available on GitHub. SolrWayback is both a discovery platform for searching historical web pages and a playback tool for viewing them. It comes with various tools, including the link graph export tool. The link graphs are exported in a CSV format that can be imported into Gephi. In Gephi, the data is processed and the link graph is saved in a lossless SVG format. This SVG file is then further processed with libvips, an image processing tool. Finally, the link graph is displayed using the equally excellent OpenSeadragon. The size of each domain name in the graph is determined by how central the domain is in the link graph (betweenness centrality).
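
To illustrate what betweenness centrality measures, here is a minimal Python sketch that reads a small edge list and scores each domain by how many shortest link paths pass through it. It is not the code used to produce the graphs, where Gephi does this calculation. The column names source and target, the sample domains, and the function names read_edges and betweenness_centrality are assumptions made for this example and may not match the actual SolrWayback export. The sketch uses Brandes' algorithm on an unweighted, directed graph and does not normalize the scores.

```python
import csv
import io
from collections import defaultdict, deque


def read_edges(csv_text):
    """Parse rows with assumed 'source' and 'target' columns into pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["source"], row["target"]) for row in reader]


def betweenness_centrality(edges):
    """Unnormalized betweenness for an unweighted, directed graph (Brandes)."""
    adjacency = defaultdict(set)
    nodes = set()
    for source, target in edges:
        adjacency[source].add(target)
        nodes.update((source, target))

    centrality = dict.fromkeys(nodes, 0.0)
    for start in nodes:
        # Breadth-first search that counts shortest paths from start.
        order = []
        predecessors = defaultdict(list)
        path_count = dict.fromkeys(nodes, 0)
        distance = dict.fromkeys(nodes, -1)
        path_count[start] = 1
        distance[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in adjacency[node]:
                if distance[neighbor] < 0:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_count[neighbor] += path_count[node]
                    predecessors[neighbor].append(node)

        # Walk back from the farthest nodes, crediting nodes on the paths.
        dependency = dict.fromkeys(nodes, 0.0)
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != start:
                centrality[node] += dependency[node]
    return centrality


# Hypothetical sample data: two sites link to a hub, which links to two others.
sample = """source,target
a.dk,hub.dk
b.dk,hub.dk
hub.dk,c.dk
hub.dk,d.dk
"""

scores = betweenness_centrality(read_edges(sample))
for domain in sorted(scores, key=scores.get, reverse=True):
    print(domain, scores[domain])
```

In this sample, hub.dk gets the highest score because every path from a.dk or b.dk to c.dk or d.dk goes through it, so it would be drawn with the largest label.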

Plans

Additional link graphs will be added.

Contact

We welcome feedback and ideas for future development.

Back to the link graphs

Commodore64/Amiga link graph

Wikipedia.dk link graph

KB/Statsbiblioteket link graph

The early internet 1998 to 2003 link graph