Web Archiving for Uninterruptible Web Service (UWS)

Collaborators

  • Edward Fox, Professor, Department of Computer Science, Virginia Tech

External Funding

Students

  • Prashant Chandrasekar, Computer Science MS student, Virginia Tech, May 2014 - Sep 2015. Now an Assistant Professor, Department of Computer Science, University of Mary Washington.

  • Andrej Galad, MS Computer Science, Virginia Tech, Dec 2015 - May 2016. Now at Google.

  • Shivam Maharshi, MS Computer Science, Virginia Tech, Dec 2015 - May 2017. Now at Amazon. I am Shivam's thesis committee member (chaired by Ed Fox).

  • Krati Nayyar, MS student, Computer Science, Virginia Tech, Jan - May 2016. Now at Adobe.

  • Saket Vishwasrao, MS Computer Science, Virginia Tech, May 2016 - May 2017. Now at Uber. I am Saket's thesis committee member (chaired by Ed Fox).

  • Abhinav Kumar, MS Computer Science, Virginia Tech, Aug 2017 - May 2019. Now at Amazon. I am Abhinav's thesis committee member (chaired by Ed Fox).

Synopsis

Preservation for the sake of preservation? This is a popular criticism of digital preservation, cultural heritage, and web archiving. Indeed, the IT industry spends billions of dollars every year on backup and disaster recovery, yet very few organizations are interested in adopting existing systems and tools designed for long-term preservation. Why? Because they see no immediate need for it.

This project bridges the gap. If a website archives its own content, then when the website is overloaded, hacked, or otherwise disrupted, our tool switches to the archived copy to handle incoming requests. Like an uninterruptible power supply (UPS) for electricity, UWS provides practical, albeit temporary, relief while operational problems are being resolved. Digital preservation thus becomes a side effect of operational best practices.
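
The fallback idea can be illustrated with a minimal Python sketch. This is not the UWS implementation: the function name serve_with_fallback, the in-memory dictionary used as the archive, and the choice to treat any 5xx status or connection failure as "disrupted" are all assumptions made for illustration only.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, str]  # (HTTP status code, response body)


def serve_with_fallback(
    path: str,
    origin: Callable[[str], Response],
    archive: Dict[str, str],
) -> Response:
    """Return the live response, or the archived copy if the origin fails.

    Hypothetical helper: `origin` stands in for the live web server and
    `archive` for the site's own web archive.
    """
    try:
        status, body = origin(path)
    except Exception:
        # Assumption: an unreachable origin is treated like a server error.
        status, body = 503, ""

    if status < 500:
        # The origin is healthy. Archive successful responses as a side
        # effect of normal operation, then pass the response through.
        if status == 200:
            archive[path] = body
        return status, body

    # The origin is disrupted: answer from the archive when a copy exists.
    if path in archive:
        return 200, archive[path]
    return status, body
```

In this sketch the archive is filled during normal operation and only consulted when the origin fails, which mirrors the point that preservation becomes a by-product of keeping the service available. A request for a page that was never archived still receives the original error.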

The tool was initially developed in 2015 and subsequently improved over multiple iterations.

Related Publications

  • Zhiwu Xie. Archiving Transactions Towards an Uninterruptible Web Service. In International Internet Preservation Consortium (IIPC) General Assembly, Stanford, CA. Apr 27-May 1, 2015.

  • Zhiwu Xie, Prashant Chandrasekar, and Edward Fox. Using Transactional Web Archives To Handle Server Errors. In Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries. Knoxville, TN. ACM, 2015.

  • Zhiwu Xie, Prashant Chandrasekar and Edward A. Fox. A UWS Case for 200-Style Memento Negotiations, Bulletin of IEEE Technical Committee on Digital Libraries, 11(2) 2015.

  • Shivam Maharshi, Performance Measurement and Analysis of Transactional Web Archiving. MS Thesis, 2017. Virginia Tech. http://hdl.handle.net/10919/78371

  • Saket Vishwasrao, Performance Evaluation of Web Archiving Through In-Memory Page Cache. MS Thesis, 2017. Virginia Tech. http://hdl.handle.net/10919/78371

  • Zhiwu Xie, Krati Nayyar, and Edward A. Fox. Nearline Web Archiving, Bulletin of IEEE Technical Committee on Digital Libraries, 13(1), 2017.

  • Abhinav Kumar and Zhiwu Xie. Acquiring Web Content From In-Memory Cache. In Proceedings of the 18th ACM/IEEE-CS Joint Conference on Digital Libraries, Fort Worth, TX. ACM, 2018.