« Building Scalable Web Archives » will be presented in the Technical Program on Thursday, May 15, 2014, at the Arsenal Cinema (Berlin).
The Internet Memory Foundation is glad to participate in the Archiving Conference organized by the Society for Imaging Science and Technology (IS&T).
Our presentation, « Building Scalable Web Archives », introduces the Internet Memory Foundation platform, which is based on the foundation's distributed infrastructure, along with the associated tools and workflows that facilitate data management and preservation actions at large scale.
Over the past years, IMF’s main concern has been scalability in crawling, indexing, preserving, and accessing content. To address these issues, the foundation developed its own crawler and built a new infrastructure.
This presentation will outline the difficulty of analyzing content stored in (W)ARC files and the solution applied within the IMF platform. We will also describe our automated quality assurance workflow and the results obtained through our new approach.