Efthimiadis, E.N.
Languages: English
Types: Doctoral thesis
Subjects: Z665
This thesis investigates interactive query expansion (IQE) within the context of a relevance feedback system that uses term weighting and ranking to search online databases available through online vendors. Previous evaluations of relevance feedback systems were made under laboratory conditions and not in a real operational environment. The research presented in this thesis followed the idea of testing probabilistic retrieval techniques in an operational environment. Its overall aim was to investigate the process of IQE from various points of view, including effectiveness. The INSPEC database, on both Data-Star and ESA-IRS, was searched online using CIRT, a front-end system that allows probabilistic term weighting, ranking and relevance feedback.

The thesis is divided into three parts. Part I covers background information and the relevant literature, with special emphasis on relevance weighting theory (the Binary Independence Model), the approaches to automatic and semi-automatic query expansion, the ZOOM facility of ESA-IRS and the CIRT front-end. Part II comprises three Pilot case studies. It introduces the idea of interactive query expansion and places it within the weighted environment of CIRT. Each Pilot study looked at a different aspect of the query expansion process as carried out through a front-end. The Pilot studies were used to answer methodological questions as well as research questions about the query expansion terms. The knowledge and experience gained from the Pilots were then applied to the methodology of the study proper (Part III). Part III discusses the Experiment and the evaluation of the six ranking algorithms. The Experiment was conducted under real operational conditions using a real system, real requests, and real interaction.
Emphasis was placed on the characteristics of the interaction, especially on the selection of terms for query expansion. Data were collected from 25 searches. The data collection mechanisms included questionnaires, transaction logs, and relevance evaluations. The results of the Experiment are presented in Chapter 10, divided according to their bearing on query expansion into main results and other findings. The main results discuss issues that relate directly to query expansion, retrieval effectiveness, the correspondence of the online-to-offline relevance judgements, and the performance of the w(p - q) ranking algorithm. Finally, a comparative evaluation of six ranking algorithms was performed. The yardstick for the evaluation was provided by the user relevance judgements on the lists of candidate terms for query expansion. The evaluation focused on whether there are any similarities in the performance of the algorithms and on how algorithms with similar performance treat terms.

This abstract refers only to the main conclusions drawn from the results of the Experiment: (1) On average, one third of the terms presented in the list of candidate terms were identified by the users as potentially useful for query expansion. (2) These terms were mainly judged to be either variant expressions (synonyms) of, or alternative (related) terms to, the initial query terms. However, a substantial portion of the selected terms were identified as representing new ideas. (3) The relationship of the 5 best terms chosen by the users for query expansion to the initial query terms was: (a) 34% have no relationship or other type of correspondence with a query term; (b) 66% of the query expansion terms have a relationship which makes the term a (b1) narrower term (70%), (b2) broader term (5%), or (b3) related term (25%). (4) The results provide some evidence for the effectiveness of interactive query expansion.
The initial search produced on average 3 highly relevant documents at a precision of 34%; the query expansion search produced on average 9 further highly relevant documents at slightly higher precision. (5) The results demonstrated the effectiveness of the w(p - q) algorithm for ranking terms for query expansion, within the context of the Experiment. (6) The main results of the comparative evaluation of the six ranking algorithms, i.e. w(p - q), EMIM, F4, F4modified, Porter and ZOOM, are that: (a) w(p - q) and EMIM performed best; and (b) performance is very similar within each of the pairs w(p - q) and EMIM, and F4 and F4modified. (7) A new ranking algorithm is proposed as a result of the evaluation of the six algorithms. Finally, an investigation is by definition an exploratory study that generates hypotheses for future research. Recommendations and proposals for future research are given. The conclusions highlight the need for more research on weighted systems in operational environments, for a comparative evaluation of automatic versus interactive query expansion, and for user studies of searching in weighted systems.
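To make the idea of ranking candidate terms concrete, the following is a minimal Python sketch of two of the measures named above, F4 and w(p - q). The abstract does not give the formulas, so the definitions used here are an assumption taken from the general literature: F4 as the Robertson/Sparck Jones relevance weight with the usual 0.5 correction, and w(p - q) as that weight multiplied by the difference between the term's occurrence rate in relevant and in non-relevant documents. The function names and all counts are hypothetical and are not data from the thesis.

```python
import math


def f4_weight(r, n, R, N):
    """F4 relevance weight with the 0.5 correction (assumed standard form).

    r: relevant documents containing the term
    n: all documents containing the term
    R: relevant documents known so far
    N: documents in the collection
    """
    return math.log(
        ((r + 0.5) * (N - n - R + r + 0.5))
        / ((R - r + 0.5) * (n - r + 0.5))
    )


def wpq_score(r, n, R, N):
    """w(p - q): the F4 weight scaled by (p - q) (assumed form).

    p is the share of relevant documents containing the term,
    q the share of non-relevant documents containing it.
    """
    p = r / R
    q = (n - r) / (N - R)
    return f4_weight(r, n, R, N) * (p - q)


def rank_terms(term_stats, R, N):
    """Return (term, score) pairs sorted by descending w(p - q)."""
    scored = [(term, wpq_score(r, n, R, N)) for term, (r, n) in term_stats.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts for illustration only: term -> (r, n)
toy_stats = {
    "system": (9, 600),     # frequent everywhere, so weakly discriminating
    "feedback": (8, 40),    # concentrated in the relevant documents
    "retrieval": (3, 30),   # rare, but found in few relevant documents
}

for term, score in rank_terms(toy_stats, R=10, N=1000):
    print(f"{term:10s} {score:.3f}")
```

In this sketch a term that occurs in most relevant documents but also throughout the collection ("system") ranks below a rarer term concentrated in the relevant set ("feedback"). A candidate list ordered this way is the kind of list a searcher would then pick expansion terms from.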