Schema matching bibliography

A. Agresti. {\em Categorical Data Analysis}. Wiley, New York, NY, 1990.
 * [Agr90]{agresti90}

N. Ashish and C. Knoblock. Wrapper generation for semi-structured internet sources. {\em SIGMOD Record}, 26(4):8--15, 1997.
 * [AK97]{ashish97}

J. Biskup and B. Convent. A formal view integration method. In {\em Proceedings of the ACM Conf. on Management of Data (SIGMOD)}, 1986.
 * [BC86]{biskup86}

S. Bergamaschi, S. Castano, M. Vincini, and D. Beneventano. Semantic integration of heterogeneous information sources. {\em Data and Knowledge Engineering}, 36(3):215--249, 2001.
 * [BCVB01]{momis}

P. Bernstein. Applying model management to classical meta data problems. In {\em Proceedings of the Conf. on Innovative Data Systems Research (CIDR)}, 2003.
 * [Ber03]{bernstein-cidr03}

D. Brickley and R. Guha. Resource description framework schema specification 1.0, 2000.
 * [BG00]{rdf}

P. Bernstein, A. Halevy, and R. Pottinger. A vision for management of complex models. {\em ACM SIGMOD Record}, 29(4):55--63, 2000.
 * [BHP00]{phil-vision-paper}

J. Broekstra, M. Klein, S. Decker, D. Fensel, F. van Harmelen, and I. Horrocks. Enabling knowledge representation on the {W}eb by extending {RDF} schema. In {\em Proceedings of the Tenth Int. World Wide Web Conference}, 2001.
 * [BKD{\etalchar{+}}01]{fensel01}

T. Berners-Lee, J. Hendler, and O. Lassila. The {S}emantic {W}eb. {\em Scientific American}, 284(5), 2001.
 * [BLHL01]{berners-lee}

C. Batini, M. Lenzerini, and S. B. Navathe. A comparative analysis of methodologies for database schema integration. {\em ACM Computing Surveys}, 18(4):323--364, 1986.
 * [BLN86]{bln86}

J. Berlin and A. Motro. Autoplex: {A}utomated discovery of content for virtual databases. In {\em Proceedings of the Conf. on Cooperative Information Systems (CoopIS)}, 2001.
 * [BM01]{autoplex}

J. Berlin and A. Motro. Database schema matching using machine learning with feature selection. In {\em Proceedings of the Conf. on Advanced Information Systems Engineering (CAiSE)}, 2002.
 * [BM02]{automatch}

S. Castano and V. De Antonellis. A schema analysis and reconciliation tool environment. In {\em Proceedings of the Int. Database Engineering and Applications Symposium (IDEAS)}, 1999.
 * [CA99]{artemis}

S. Chakrabarti, B. Dom, and P. Indyk. Enhanced hypertext categorization using hyperlinks. In {\em Proceedings of the ACM SIGMOD Conference}, 1998.
 * [CDI98]{hypertext}

D. Calvanese, G. De Giacomo, and M. Lenzerini. Ontology of integration and integration of ontologies. In {\em Proceedings of the 2001 Description Logic Workshop (DL 2001)}, 2001.
 * [CGL01]{calvanese}

W. Cohen and H. Hirsh. Joins that generalize: Text classification using {WHIRL}. In {\em Proc. of the Fourth Int. Conf. on Knowledge Discovery and Data Mining (KDD)}, 1998.
 * [CH98]{cohen-kdd98}

H. Chalupsky. Ontomorph: A translation system for symbolic knowledge. In {\em Principles of Knowledge Representation and Reasoning}, 2000.
 * [Cha00]{ontomorph}

C. Clifton, E. Housman, and A. Rosenthal. Experience with a combined approach to attribute-matching across heterogeneous databases. In {\em Proc. of the IFIP Working Conference on Data Semantics (DS-7)}, 1997.
 * [CHR97]{clifton97}

Donald D. Chamberlin, Jonathan Robie, and Daniela Florescu. Quilt: An {XML} query language for heterogeneous data sources. In {\em WebDB (Informal Proceedings) 2000}, pages 53--62, 2000.
 * [CRF00]{quilt}

T. M. Cover and J. A. Thomas. {\em Elements of Information Theory}. Wiley, New York, NY, 1991.
 * [CT91]{coverthomas}

www.daml.org.
 * [dam]{daml}

A. Doan, P. Domingos, and A. Halevy. Reconciling schemas of disparate data sources: A machine learning approach. In {\em Proceedings of the ACM SIGMOD Conference}, 2001.
 * [DDH01]{lsd}

A. Doan, P. Domingos, and A. Halevy. Learning to match the database schemas: A multistrategy approach. {\em Machine Learning}, 2003. Special Issue on Multistrategy Learning. To Appear.
 * [DDH03]{lsd-mlj}

A. Deutsch, M. Fernandez, D. Florescu, A. Levy, and D. Suciu. A query language for {XML}. In {\em Proceedings of the International World Wide Web Conference, Toronto, CA}, 1999.
 * [DFF{\etalchar{+}}99]{xmlql}

R. O. Duda and P. E. Hart. {\em Pattern Classification and Scene Analysis}. John Wiley and Sons, New York, 1974.
 * [DH74]{dudahart}

T. Dasu, T. Johnson, S. Muthukrishnan, and V. Shkapenyuk. Mining database structure; or, how to build a data quality browser. In {\em Proceedings of the ACM Conf. on Management of Data (SIGMOD)}, 2002.
 * [DJMS02]{bellman-system}

A. Doan, J. Madhavan, P. Domingos, and A. Halevy. Learning to map ontologies on the {S}emantic {W}eb. In {\em Proceedings of the World-Wide Web Conference (WWW-02)}, 2002.
 * [DMDH02]{glue}

H. Do, S. Melnik, and E. Rahm. Comparison of schema matching evaluations. In {\em Proceedings of the 2nd Int. Workshop on Web Databases (German Informatics Society)}, 2002.
 * [DMR02]{erhard-eval}

P. Domingos and M. Pazzani. On the optimality of the simple {B}ayesian classifier under zero-one loss. {\em Machine Learning}, 29:103--130, 1997.
 * [DP97]{domingos&pazzani97}

S. Donoho and L. Rendell. Constructive induction using fragmentary knowledge. In {\em Proc. of the 13th Int. Conf. on Machine Learning}, pages 113--121, 1996.
 * [DR96]{donoho96}

H. Do and E. Rahm. Coma: A system for flexible combination of schema matching approaches. In {\em Proceedings of the 28th Conf. on Very Large Databases (VLDB)}, 2002.
 * [DR02]{coma}

D. Embley, D. Jackman, and L. Xu. Multifaceted exploitation of metadata for attribute match discovery in information integration. In {\em Proceedings of the WIIW Workshop}, 2001.
 * [EJX01]{embley01}

A. K. Elmagarmid and C. Pu. Guest editors' introduction to the special issue on heterogeneous databases. {\em ACM Computing Surveys}, 22(3):175--178, 1990.
 * [EP90]{ep90}

D. Fensel. {\em Ontologies: {S}ilver {B}ullet for {K}nowledge {M}anagement and {E}lectronic {C}ommerce}. Springer-Verlag, 2001.
 * [Fen01]{fensel-book01}

Dayne Freitag. Machine learning for information extraction in informal domains. {\em Ph.D. Thesis}, 1998. Dept. of Computer Science, Carnegie Mellon University.
 * [Fre98]{freitag-thesis}

M. Friedman and D. Weld. Efficiently executing information-gathering plans. In {\em Proc. of the Int. Joint Conf. of AI (IJCAI)}, 1997.
 * [FW97]{friedman-ijcai97}

H. Garcia-Molina, Y. Papakonstantinou, D. Quass, A. Rajaraman, Y. Sagiv, J. Ullman, and J. Widom. The {TSIMMIS} project: Integration of heterogeneous information sources. {\em Journal of Intelligent Inf. Systems}, 8(2), 1997.
 * [GMPQ{\etalchar{+}}97]{tsimmis97}

J. Hammer, H. Garcia-Molina, S. Nestorov, R. Yerneni, M. Breunig, and V. Vassalos. Template-based wrappers in the {TSIMMIS} system (system demonstration). In {\em ACM Sigmod Record}, Tucson, Arizona, 1998.
 * [HGMN{\etalchar{+}}98]{hammer97}

J. Heflin and J. Hendler. A portrait of the {S}emantic {W}eb in action. {\em IEEE Intelligent Systems}, 16(2), 2001.
 * [HH01]{shoe}

P. Hart, N. Nilsson, and B. Raphael. Correction to ``a formal basis for the heuristic determination of minimum cost paths''. {\em SIGART Newsletter}, 37:28--29, 1972.
 * [HNR72]{hart72}

R.A. Hummel and S.W. Zucker. On the foundations of relaxation labeling processes. {\em PAMI}, 5(3):267--287, May 1983.
 * [HZ83]{rlvision}

{\em IEEE Intelligent Systems}, 16(2), 2001.
 * [iee01]{ieee}

Z. Ives, D. Florescu, M. Friedman, A. Levy, and D. Weld. An adaptive query execution system for data integration. In {\em Proc. of SIGMOD}, 1999.
 * [IFF{\etalchar{+}}99]{tukwila}

Z. Ives, A. Levy, J. Madhavan, R. Pottinger, S. Saroiu, I. Tatarinov, S. Betzler, Q. Chen, E. Jaslikowska, J. Su, and W. Yeung. Self-organizing data sharing communities with sagres. In {\em Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data}, page 582, 2000.
 * [ILM{\etalchar{+}}00]{sagres}

C. Knoblock, S. Minton, J. Ambite, N. Ashish, P. Modi, I. Muslea, A. Philpot, and S. Tejada. Modeling web sources for information integration. In {\em Proc. of the National Conference on Artificial Intelligence (AAAI)}, 1998.
 * [KMA{\etalchar{+}}98]{ariadne}

G. Keim, N. Shazeer, M. Littman, S. Agarwal, C. Cheves, J. Fitzgerald, J. Grosland, F. Jiang, S. Pollard, and K. Weinmeister. {PROVERB}: The probabilistic cruciverbalist. In {\em Proc. of the 16th National Conf. on Artificial Intelligence ({AAAI}-99)}, pages 710--717, 1999.
 * [KSL{\etalchar{+}}99]{proverb}

N. Kushmerick. Wrapper induction: Efficiency and expressiveness. {\em Artificial Intelligence}, 118(1--2):15--68, 2000.
 * [Kus00a]{kushmerick2000}

N. Kushmerick. Wrapper verification. {\em World Wide Web Journal}, 3(2):79--94, 2000.
 * [Kus00b]{kushmerickwrapper}

W. Li and C. Clifton. Semantic integration in heterogeneous databases using neural networks. In {\em Proceedings of the Conf. on Very Large Databases (VLDB)}, 1994.
 * [LC94]{semint-vldb}

W. Li and C. Clifton. {SEMINT}: A tool for identifying attribute correspondence in heterogeneous databases using neural networks. {\em Data and Knowledge Engineering}, 33:49--84, 2000.
 * [LC00]{semint00}

W. Li, C. Clifton, and S. Liu. Database integration using neural network: implementation and experience. {\em Knowledge and Information Systems}, 2(1):73--96, 2000.
 * [LCL00]{semint-journal}

M. Lacher and G. Groh. Facilitating the exchange of explicit knowledge through ontology mappings. In {\em Proceedings of the 14th Int. FLAIRS conference}, 2001.
 * [LG01]{lacher}

D. Lin. An information-theoretic definition of similarity. In {\em Proceedings of the International Conference on Machine Learning (ICML)}, 1998.
 * [Lin98]{infosim}

E. Lambrecht, S. Kambhampati, and S. Gnanaprakasam. Optimizing recursive information gathering plans. In {\em Proc. of the Int. Joint Conf. on AI (IJCAI)}, 1999.
 * [LKG99]{emerac}

S. Lloyd. An optimization approach to relaxation labeling algorithms. {\em Image and Vision Computing}, 1(2), 1983.
 * [Llo83]{lloyd83}

A. Y. Levy, A. Rajaraman, and J. Ordille. Querying heterogeneous information sources using source descriptions. In {\em Proc. of {VLDB}}, 1996.
 * [LRO96]{levy-im2-96}

J. Madhavan, P.A. Bernstein, and E. Rahm. Generic schema matching with {C}upid. In {\em Proceedings of the International Conference on Very Large Databases (VLDB)}, 2001.
 * [MBR01]{cupid}

D. McGuinness, R. Fikes, J. Rice, and S. Wilder. The {C}himaera ontology environment. In {\em Proceedings of the 17th National Conference on Artificial Intelligence}, 2000.
 * [MFRW00]{chimaera}

J. Madhavan, A. Halevy, P. Domingos, and P. Bernstein. Representing and reasoning about mappings between domain models. In {\em Proceedings of the National AI Conference (AAAI-02)}, 2002.
 * [MHDB02]{madhavan-aaai02}

R. Miller, L. Haas, and M. Hernandez. Schema mapping as query discovery. In {\em Proc. of {VLDB}}, 2000.
 * [MHH00]{miller00}

P. Mork, A. Halevy, and P. Tarczy-Hornoch. A model of data integration system of biomedical data applied to online genetic databases. In {\em Proceedings of the Symposium of the American Medical Informatics Association}, 2001.
 * [MHTH01]{peter-mork-paper}

S. Melnik, H. Garcia-Molina, and E. Rahm. Similarity flooding: a versatile graph matching algorithm. In {\em Proceedings of the International Conference on Data Engineering (ICDE)}, 2002.
 * [MMGR02]{simflood}

A. McCallum and K. Nigam. A comparison of event models for {N}aive {B}ayes text classification. In {\em Proceedings of the AAAI-98 Workshop on Learning for Text Categorization}, 1998.
 * [MN98]{mccallum-twoevents}

C. Manning and H. Sch{\"{u}}tze. {\em Foundations of Statistical Natural Language Processing}, pages 575--608. The MIT Press, Cambridge, US, 1999.
 * [MS99]{manning99}

A. Maedche and S. Staab. Ontology learning for the {S}emantic {W}eb. {\em IEEE Intelligent Systems}, 16(2), 2001.
 * [MS01]{onto-learn}

R. Michalski and G. Tecuci, editors. {\em Machine Learning: A Multistrategy Approach}. Morgan Kaufmann, 1994.
 * [MT94]{michalski&tecuci94}

P. Mitra, G. Wiederhold, and J. Jannink. Semi-automatic integration of knowledge sources. In {\em Proceedings of Fusion'99}.
 * [MWJ]{skat}

T. Milo and S. Zohar. Using schema matching to simplify heterogeneous data translation. In {\em Proceedings of the International Conference on Very Large Databases (VLDB)}, 1998.
 * [MZ98]{transcm}

F. Naumann, C.-T. Ho, X. Tian, L. Haas, and N. Megiddo. Attribute classification using feature analysis. In {\em Proceedings of the Int. Conf. on Data Engineering (ICDE)}, 2002.
 * [NHT{\etalchar{+}}02]{howard}

N.F. Noy and M.A. Musen. {PROMPT}: Algorithm and tool for automated ontology merging and alignment. In {\em Proceedings of the National Conference on Artificial Intelligence (AAAI)}, 2000.
 * [NM00]{prompt-noy}

N.F. Noy and M.A. Musen. Anchor-{PROMPT}: Using non-local context for semantic {M}atching. In {\em Proceedings of the Workshop on Ontologies and Information Sharing at the International Joint Conference on Artificial Intelligence (IJCAI)}, 2001.
 * [NM01]{anchor-prompt}

N.F. Noy and M.A. Musen. Prompt{D}iff: A fixed-point algorithm for comparing ontology versions. In {\em Proceedings of the Nat. Conf. on Artificial Intelligence (AAAI)}, 2002.
 * [NM02]{noy-aaai02}

B. Omelayenko. Learning of ontologies for the {W}eb: the analysis of existent approaches. In {\em Proceedings of the International Workshop on Web Dynamics}, 2001.
 * [Ome01]{borys}

http://ontobroker.semanticweb.org.
 * [ont]{ontobroker}

L. Padro. A hybrid environment for syntax-semantic tagging, 1998.
 * [Pad98]{padro-hybrid}

R. Pottinger and P. Bernstein. Creating a mediated schema based on initial correspondences. {\em IEEE Data Engineering Bulletin}, 25(3), 2002.
 * [PB02]{pottinger02}

M. Perkowitz and O. Etzioni. Category translation: Learning to understand information on the {I}nternet. In {\em Proc. of Int. Joint Conf. on AI (IJCAI)}, 1995.
 * [PE95]{perkowitz&etzioni95}

V. Punyakanok and D. Roth. The use of classifiers in sequential inference. In {\em Proceedings of the Conference on Neural Information Processing Systems (NIPS-00)}, 2000.
 * [PR00]{roth-nips00}

C. Parent and S. Spaccapietra. Issues and approaches of database integration. {\em Communications of the ACM}, 41(5):166--178, 1998.
 * [PS98]{ps98}

L. Palopoli, D. Sacca, G. Terracina, and D. Ursino. A unified graph-based framework for deriving nominal interscheme properties, type conflicts, and object cluster similarities. In {\em Proceedings of the Conf. on Cooperative Information Systems (CoopIS)}, 1999.
 * [PSTU99]{pstu99}

L. Palopoli, D. Sacca, and D. Ursino. Semi-automatic, semantic discovery of properties from database schemes. In {\em Proc. of the Int. Database Engineering and Applications Symposium (IDEAS-98)}, pages 244--253, 1998.
 * [PSU98]{palopoli98}

L. Palopoli, G. Terracina, and D. Ursino. The system {DIKE}: towards the semi-automatic synthesis of cooperative information systems and data warehouses. In {\em Proceedings of the {ADBIS}-{DASFAA} Conf.}, 2000.
 * [PTU00]{ptu00}

L. Popa, Y. Velegrakis, M. Hernandez, R. J. Miller, and R. Fagin. Translating web data. In {\em Proceedings of the Int. Conf. on Very Large Databases (VLDB)}, 2002.
 * [PVH{\etalchar{+}}02]{popa02}

E. Rahm and P.A. Bernstein. On matching schemas automatically. {\em VLDB Journal}, 10(4), 2001.
 * [RB01]{survey}

E. Rahm and H. Do. Data cleaning: Problems and current approaches. {\em IEEE Data Engineering Bulletin}, 2000.
 * [RD00]{rahm-do-cleaning}

I. Ryutaro, T. Hideaki, and H. Shinichi. Rule induction for concept hierarchy alignment. In {\em Proceedings of the 2nd Workshop on Ontology Learning at the 17th Int. Joint Conf. on AI (IJCAI)}, 2001.
 * [RHS01]{hical}

A. Rosenthal, F. Manola, and S. Renner. Getting data to applications: Why we fail, and how we can do better. In {\em Proceedings of the AFCEA Federal Database Conference}, 2000.
 * [RMR00]{arnie-get-data}

A. Rosenthal, S. Renner, L. Seligman, and F. Manola. Data integration needs an industrial revolution. In {\em Proceedings of the Workshop on Foundations of Data Integration}, 2001.
 * [RRSM01]{arnie-revolution}

A. Rosenthal and L. Seligman. Scalability issues in data integration. In {\em Proceedings of the AFCEA Federal Database Conference}, 2001.
 * [RS01]{arnie-scalability}

A. P. Sheth and J. A. Larson. Federated database systems for managing distributed, heterogeneous, and autonomous databases. {\em ACM Computing Surveys}, 22(3):183--236, 1990.
 * [SL90]{sl90}

L. Seligman and A. Rosenthal. The impact of {XML} in databases and data sharing. {\em IEEE Computer}, 2001.
 * [SR01]{arnie-impact}

L. Seligman, A. Rosenthal, P. Lehner, and A. Smith. Data integration: Where does the time go? {\em IEEE Data Engineering Bulletin}, 2002.
 * [SRLS02]{arnie-time}

L. Todorovski and S. Dzeroski. Declarative bias in equation discovery. In {\em Proceedings of the Int. Conf. on Machine Learning (ICML)}, 1997.
 * [TD97]{lagrame}

K. M. Ting and I. H. Witten. Issues in stacked generalization. {\em Journal of Artificial Intelligence Research}, 10:271--289, 1999.
 * [TW99]{ting&witten99}

{UDB}: {T}he unified database for human genome computing. http://bioinformatics.weizmann.ac.il/udb.
 * [{UDB}]{bio:udb}

C. J. van Rijsbergen. {\em Information {R}etrieval}. London: Butterworths, 1979. Second Edition.
 * [vR79]{ir-book}

D. Wolpert. Stacked generalization. {\em Neural Networks}, 5:241--259, 1992.
 * [Wol92]{wolpert92}

Wordnet: {A} lexical database for the {E}nglish language. http://www.cogsci.princeton.edu/~wn.
 * [Wor]{wordnet}

Extensible markup language ({XML}) 1.0. www.w3.org/TR/1998/REC-xml-19980210, 1998. W3C Recommendation.
 * [XML98]{xml}

X{Q}uery: {A}n {XML} query language. http://www.w3.org/TR/xquery.
 * [Xqu]{xquery}

{XSL} {T}ransformations ({XSLT}), version 1.0. http://www.w3.org/TR/xslt, 13 August 1999. W3C Working Draft.
 * [XSL99]{xslt}

L.L. Yan, R.J. Miller, L.M. Haas, and R. Fagin. Data driven understanding and refinement of schema mappings. In {\em Proceedings of the ACM SIGMOD}, 2001.
 * [YMHF01]{clio}

J. Yi and N. Sundaresan. A classifier for semi-structured documents. In {\em Proc. of the 6th Int. Conf. on Knowledge Discovery and Data Mining ({KDD}-2000)}, 2000.
 * [YS00]{yi00}