Obtaining Search Engine Transaction Logs

There are currently six transaction logs available for release. They are:

At the moment, these are the only transaction logs available.

Please email me, Jim Jansen, if you would like access to one or more of the transaction logs. I will place the file(s) on an FTP site for you.

Background on Transaction Logs:

The 1997 Excite logs, both small and large, were offered by Excite.com to attendees at the 1997 Association for Computing Machinery Special Interest Group on Information Retrieval Conference (ACM-SIGIR) in Philadelphia, specifically by Mr. Doug Cutting, who participated in a conference workshop. Initially, five researchers responded to the offer. However, the transaction logs remained publicly available on an Excite FTP server for a couple of years, so several other researchers also downloaded the data and conducted research with it.

In 1999, Excite.com again offered a transaction log to the research community. Mr. Jack Xu provided this transaction log, again on an anonymous ftp server. It was also made available for the Text Retrieval Conferences. Several researchers have investigated aspects of Web searching using this data set.

The 2001 transaction log was obtained from Excite for a temporal study conducted by Drs. Spink, Wolfram, Saracevic, and Jansen. Results of this Web searching trend analysis were published in:
Spink, A., Jansen, B. J., Wolfram, D., and Saracevic, T. 2002. From E-sex to E-commerce: Web Search Changes. IEEE Computer. 35(3), 107-111.

The AlltheWeb and AltaVista logs were obtained through negotiations with the respective search engine companies, with agreements concerning sharing of the data. As of 2006, these agreements have expired. Results of these analyses were published in:
Jansen, B. J., Spink, A., and Pedersen, J. 2005. A Temporal Comparison of AltaVista Web Searching. Journal of the American Society for Information Science and Technology. 56(6), 559-570.
Jansen, B. J. and Spink, A. 2004. An Analysis of Web Searching by European Alltheweb.com Users. Information Processing & Management. 41(2), 361-381.