Information retrieval, text management, ranking, top-k bibliography

2005

 * An Efficient and Versatile Query Engine for TopX Search. Martin Theobald, Ralf Schenkel, Gerhard Weikum. VLDB 2005. (db and ir)
 * Answering Queries from Statistics and Probabilistic Views. Nilesh N. Dalvi, Dan Suciu. VLDB 2005. (db and ir)
 * Approximate Matching of Hierarchical Data Using pq-Grams. Nikolaus Augsten, Michael H. Böhlen, Johann Gamper. VLDB 2005. (text data management)
 * Bootstrapping Semantic Annotations for Content-Rich HTML Documents. Saikat Mukherjee, I. V. Ramakrishnan, Amarjeet Singh. ICDE 2005. (text processing)
 * Estimating arbitrary subset sums with few probes. Noga Alon, Nick G. Duffield, Carsten Lund, Mikkel Thorup. PODS 2005. (databases & information retrieval / data mining)
 * FTW: fast similarity search under the time warping distance. Yasushi Sakurai, Masatoshi Yoshikawa, Christos Faloutsos. PODS 2005. (databases & information retrieval / data mining)
 * Indexing Mixed Types for Approximate Retrieval. Liang Jin, Nick Koudas, Chen Li, Anthony K. H. Tung. VLDB 2005. (db and ir)
 * KLEE: A Framework for Distributed Top-k Query Algorithms. Sebastian Michel, Peter Triantafillou, Gerhard Weikum. VLDB 2005. (db and ir)
 * Modeling and Managing Content Changes in Text Databases. Panagiotis G. Ipeirotis, Alexandros Ntoulas, Junghoo Cho, Luis Gravano. ICDE 2005. (text processing)
 * Shuffling a Stacked Deck: The Case for Partially Randomized Ranking of Search Engine Results. Sandeep Pandey, Sourashis Roy, Christopher Olston, Junghoo Cho, Soumen Chakrabarti. VLDB 2005. (db and ir)
 * Space complexity of hierarchical heavy hitters in multi-dimensional data streams. John Hershberger, Nisheeth Shrivastava, Subhash Suri, Csaba D. Tóth. PODS 2005. (databases & information retrieval / data mining)
 * Text Classification without Labeled Negative Documents. Gabriel Pui Cheong Fung, Jeffrey Xu Yu, Hongjun Lu, Philip S. Yu. ICDE 2005. (text processing)
 * The TEXTURE Benchmark: Measuring Performance of Text Queries on a Relational DBMS. Vuk Ercegovac, David J. DeWitt, Raghu Ramakrishnan. VLDB 2005. (text data management)
 * n-Gram/2L: A Space and Time Efficient Two-Level n-Gram Inverted Index Structure. Min-Soo Kim, Kyu-Young Whang, Jae-Gil Lee, Min-Jae Lee. VLDB 2005. (text data management)

2004

 * Comparing and Aggregating Rankings with Ties. Ronald Fagin, Ravi Kumar, Mohammad Mahdian, D. Sivakumar, Erik Vee. PODS 2004. (ranking)
 * Efficiency-Quality Tradeoffs for Vector Score Aggregation. Pavan Kumar C. Singitham, Mahathi S. Mahabhashyam, Prabhakar Raghavan. VLDB 2004. (top-k ranking)
 * Merging the Results of Approximate Match Operations. Sudipto Guha, Nick Koudas, Amit Marathe, Divesh Srivastava. VLDB 2004. (top-k ranking)
 * On the Integration of Structure Indexes and Inverted Lists. Raghav Kaushik, Rajasekar Krishnamurthy, Jeffrey F. Naughton, Raghu Ramakrishnan. ACM SIGMOD Conference 2004. (text and databases)
 * Top-k Query Evaluation with Probabilistic Guarantees. Martin Theobald, Gerhard Weikum, Ralf Schenkel. VLDB 2004. (top-k ranking)
 * Using Non-Linear Dynamical Systems for Web Searching and Ranking. Panayiotis Tsaparas. PODS 2004. (ranking)
 * When one Sample is not Enough: Improving Text Database Selection Using Shrinkage. Panagiotis G. Ipeirotis, Luis Gravano. ACM SIGMOD Conference 2004. (text and databases)

2003

 * CLUSEQ: Efficient and Effective Sequence Clustering. Jiong Yang, Wei Wang. ICDE 2003. (mining data, text and web)
 * Distance Based Indexing for String Proximity Search. Süleyman Cenk Sahinalp, Murat Tasan, Jai Macker, Z. Meral Özsoyoglu. ICDE 2003. (mining data, text and web)
 * LOCI: Fast Outlier Detection Using the Local Correlation Integral. Spiros Papadimitriou, Hiroyuki Kitagawa, Phillip B. Gibbons, Christos Faloutsos. ICDE 2003. (mining data, text and web)
 * PIX: Exact and Approximate Phrase Matching in XML. Sihem Amer-Yahia, Mary F. Fernández, Divesh Srivastava, Yu Xu. ACM SIGMOD Conference 2003. (demonstrations, text processing)
 * Pushing Aggregate Constraints by Divide-and-Approximate. Ke Wang, Yuelong Jiang, Jeffrey Xu Yu, Guozhu Dong, Jiawei Han. ICDE 2003. (mining data, text and web)
 * QXtract: A Building Block for Efficient Information Extraction from Plain-Text Databases. Eugene Agichtein, Luis Gravano. ACM SIGMOD Conference 2003. (demonstrations, text processing)
 * Querying Text Databases for Efficient Information Extraction. Eugene Agichtein, Luis Gravano. ICDE 2003. (mining data, text and web)
 * SWAT: Hierarchical Stream Summarization in Large Networks. Ahmet Bulut, Ambuj K. Singh. ICDE 2003. (mining data, text and web)

2002

 * Data Mining Meets Performance Evaluation: Fast Algorithms for Modeling Bursty Traffic. Mengzhi Wang, Ngai Hang Chan, Spiros Papadimitriou, Christos Faloutsos, Tara M. Madhyastha. ICDE 2002. (data, text and web mining)
 * Discovering Similar Multidimensional Trajectories. Michail Vlachos, Dimitrios Gunopulos, George Kollios. ICDE 2002. (data, text and web mining)
 * Efficient Evaluation of Queries with Mining Predicates. Surajit Chaudhuri, Vivek R. Narasayya, Sunita Sarawagi. ICDE 2002. (data, text and web mining)
 * Fast Mining of Massive Tabular Data via Approximate Distance Computations. Graham Cormode, Piotr Indyk, Nick Koudas, S. Muthukrishnan. ICDE 2002. (data, text and web mining)
 * Lossy Reduction for Very High Dimensional Data. Chris Jermaine, Edward Omiecinski. ICDE 2002. (data, text and web mining)
 * OSSM: A Segmentation Approach to Optimize Frequency Counting. Carson Kai-Sang Leung, Raymond T. Ng, Heikki Mannila. ICDE 2002. (data, text and web mining)
 * Streaming-Data Algorithms for High-Quality Clustering. Liadan O'Callaghan, Adam Meyerson, Rajeev Motwani, Nina Mishra, Sudipto Guha. ICDE 2002. (data, text and web mining)
 * Towards Meaningful High-Dimensional Nearest Neighbor Search by Human-Computer Interaction. Charu C. Aggarwal. ICDE 2002. (data, text and web mining)
 * δ-Clusters: Capturing Subspace Correlation in a Large Data Set. Jiong Yang, Wei Wang, Haixun Wang, Philip S. Yu. ICDE 2002. (data, text and web mining)

2001

 * Automatic Segmentation of Text into Structured Records. Vinayak R. Borkar, Kaustubh Deshmukh, Sunita Sarawagi. ACM SIGMOD Conference 2001. (text management)
 * DNA-Miner: A System Prototype for Mining DNA Sequences. Jiawei Han, Hasan M. Jamil, Ying Lu, Liangyou Chen, Yaqin Liao, Jian Pei. ACM SIGMOD Conference 2001. (demos, content-based retrieval, web and data mining)
 * DyDa: Data Warehouse Maintenance in Fully Concurrent Environments. Jun Chen, Xin Zhang, Songting Chen, Andreas Koeller, Elke A. Rundensteiner. ACM SIGMOD Conference 2001. (demos, content-based retrieval, web and data mining)
 * Dynamic Content Acceleration: A Caching Solution to Enable Scalable Dynamic Web Page Generation. Anindya Datta, Kaushik Dutta, Krithi Ramamritham, Helen M. Thomas, Debra E. VanderMeer. ACM SIGMOD Conference 2001. (demos, content-based retrieval, web and data mining)
 * Efficient and Effective Metasearch for Text Databases Incorporating Linkages among Documents. Clement T. Yu, Weiyi Meng, Wensheng Wu, King-Lup Liu. ACM SIGMOD Conference 2001. (text management)
 * Lots o' Ticks: Real-Time High Performance Time Series Queries on Billions of Trades and Quotes. Arthur T. Whitney, Dennis Shasha. ACM SIGMOD Conference 2001. (demos, content-based retrieval, web and data mining)
 * PBIR - Perception-Based Image Retrieval. Edward Y. Chang, Tim Cheng, Lihyuarn L. Chang. ACM SIGMOD Conference 2001. (demos, content-based retrieval, web and data mining)
 * RETINA: A REal-time TraffIc NAvigation System. Kam-yiu Lam, Edward Chan, Tei-Wei Kuo, S. W. Ng, Dick Hung. ACM SIGMOD Conference 2001. (demos, content-based retrieval, web and data mining)
 * Spatial Data Management for Computer Aided Design. Hans-Peter Kriegel, Andreas Müller, Marco Pötke, Thomas Seidl. ACM SIGMOD Conference 2001. (demos, content-based retrieval, web and data mining)

1999

 * A Layered Architecture for Querying Dynamic Web Content. Hasan Davulcu, Juliana Freire, Michael Kifer, I. V. Ramakrishnan. ACM SIGMOD Conference 1999. (text and web databases)
 * Automatic Discovery of Language Models for Text Databases. James P. Callan, Margaret E. Connell, Aiqun Du. ACM SIGMOD Conference 1999. (text and web databases)
 * Building Hierarchical Classifiers Using Class Proximity. Ke Wang, Senqiang Zhou, Shiang Chen Liew. VLDB 1999. (document classification and information retrieval)
 * Database Extensions for Complex Forms of Data (Abstract). Samuel DeFazio. ICDE 1999. (industry, document and www information service)
 * Distributed Hypertext Resource Discovery Through Examples. Soumen Chakrabarti, Martin van den Berg, Byron Dom. VLDB 1999. (document classification and information retrieval)
 * Document Warehousing Based on a Multimedia Database System. Hiroshi Ishikawa, Kazumi Kubota, Yasuo Noguchi, Koki Kato, Miyuki Ono, Naomi Yoshizawa, Yasuhiko Kanemasa. ICDE 1999. (industry, document and www information service)
 * Multi-Dimensional Substring Selectivity Estimation. H. V. Jagadish, Olga Kapitskaia, Raymond T. Ng, Divesh Srivastava. VLDB 1999. (document classification and information retrieval)
 * Record-Boundary Discovery in Web Documents. David W. Embley, Y. S. Jiang, Yiu-Kai Ng. ACM SIGMOD Conference 1999. (text and web databases)
 * The ECHO Method: Concurrency Control Method for a Large-Scale Distributed Database. Yukari Shirota, Atsushi Iizawa, Hiroko Mano, Takashi Yano. ICDE 1999. (industry, document and www information service)
 * Using XML in Relational Database Applications (Abstract). Susan Malaika. ICDE 1999. (industry, document and www information service)

1998

 * About Quark Digital Media System. Kamar Aulakh. ACM SIGMOD Conference 1998. (industrial, document systems)
 * Determining Text Databases to Search in the Internet. Weiyi Meng, King-Lup Liu, Clement T. Yu, Xiaodong Wang, Yuhsi Chang, Naphtali Rishe. VLDB 1998. (text and semistructured data)
 * Efficient Searching with Linear Constraints. Pankaj K. Agarwal, Lars Arge, Jeff Erickson, Paolo Giulio Franciosa, Jeffrey Scott Vitter. PODS 1998. (advanced information retrieval)
 * FileNet Integrated Document Management Database Usage and Issues. Daniel S. Whelan. ACM SIGMOD Conference 1998. (industrial, document systems)
 * Incremental Maintenance for Materialized Views over Semistructured Data. Serge Abiteboul, Jason McHugh, Michael Rys, Vasilis Vassalos, Janet L. Wiener. VLDB 1998. (text and semistructured data)
 * Latent Semantic Indexing: A Probabilistic Analysis. Christos H. Papadimitriou, Prabhakar Raghavan, Hisao Tamaki, Santosh Vempala. PODS 1998. (advanced information retrieval)
 * Proximity Search in Databases. Roy Goldman, Narayanan Shivakumar, Suresh Venkatasubramanian, Hector Garcia-Molina. VLDB 1998. (text and semistructured data)

1997

 * A Framework for Implementing Hypothetical Queries. Timothy Griffin, Richard Hull. ACM SIGMOD Conference 1997. (queries and sorting)
 * High-Performance Sorting on Networks of Workstations. Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, David E. Culler, Joseph M. Hellerstein, David A. Patterson. ACM SIGMOD Conference 1997. (queries and sorting)
 * On Saying "Enough Already!" in SQL. Michael J. Carey, Donald Kossmann. ACM SIGMOD Conference 1997. (queries and sorting)

1996

 * Client-Based Logging for High Performance Distributed Architectures. Euthimios Panagos, Alexandros Biliris, H. V. Jagadish, Rajeev Rastogi. ICDE 1996. (distributed information systems)
 * Energy-Efficient Caching for Wireless Mobile Computing. Kun-Lung Wu, Philip S. Yu, Ming-Syan Chen. ICDE 1996. (distributed information systems)
 * Relaxed Index Consistency for a Client-Server Database. Vibby Gottemukkala, Edward Omiecinski, Umakishore Ramachandran. ICDE 1996. (distributed information systems)
 * Search and Ranking Algorithms for Locating Resources on the World Wide Web. Budi Yuwono, Dik Lun Lee. ICDE 1996. (distributed information systems and www)
 * Speculative Data Dissemination and Service to Reduce Server Load, Network Traffic and Service Time in Distributed Information Systems. Azer Bestavros. ICDE 1996. (distributed information systems and www)
 * The Gold Text Indexing Engine. Daniel Barbará, Sharad Mehrotra, Padmavathi Vallabhaneni. ICDE 1996. (distributed information systems and www)

1995

 * A Database Interface for File Updates. Serge Abiteboul, Sophie Cluet, Tova Milo. ACM SIGMOD Conference 1995. (documents)
 * Copy Detection Mechanisms for Digital Documents. Sergey Brin, James Davis, Hector Garcia-Molina. ACM SIGMOD Conference 1995. (documents)
 * Duplicate Removal in Information System Dissemination. Tak W. Yan, Hector Garcia-Molina. VLDB 1995. (distributed information retrieval)
 * Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies. Luis Gravano, Hector Garcia-Molina. VLDB 1995. (distributed information retrieval)
 * Join Queries with External Text Sources: Execution and Optimization Techniques. Surajit Chaudhuri, Umeshwar Dayal, Tak W. Yan. ACM SIGMOD Conference 1995. (documents)
 * W3QS: A Query System for the World-Wide Web. David Konopnicki, Oded Shmueli. VLDB 1995. (distributed information retrieval)

1994

 * From Structured Documents to Novel Query Facilities. Vassilis Christophides, Serge Abiteboul, Sophie Cluet, Michel Scholl. ACM SIGMOD Conference 1994. (textual databases)
 * Incremental Updates of Inverted Lists for Text Document Retrieval. Anthony Tomasic, Hector Garcia-Molina, Kurt A. Shoens. ACM SIGMOD Conference 1994. (textual databases)
 * Optimizing Queries on Files. Mariano P. Consens, Tova Milo. ACM SIGMOD Conference 1994. (textual databases)
 * Reasoning about Strings in Databases. Gösta Grahne, Matti Nykänen, Esko Ukkonen. PODS 1994. (text databases)
 * Tutorial: Text Dominated Databases, Theory Practice and Experience. Gaston H. Gonnet. PODS 1994. (text databases)

1988

 * Design and Implementation of an Extensible Database Management System Supporting User Defined Data Types and Functions. Volker Linnemann, Klaus Küspert, Peter Dadam, Peter Pistor, R. Erbe, Alfons Kemper, Norbert Südkamp, Georg Walch, Mechtild Wallrath. VLDB 1988. (textual and extensible dbms)
 * Extended User-Defined Indexing with Application to Textual Databases. Clifford A. Lynch, Michael Stonebraker. VLDB 1988. (textual and extensible dbms)
 * Fast Text Access Methods for Optical and Large Magnetic Disks: Designs and Performance Comparison. Christos Faloutsos, Raphael Chan. VLDB 1988. (textual and extensible dbms)