Soomro, K.
Languages: English
Types: Doctoral thesis
Workflows are a way to describe a series of computations on raw e-Science data. These data may be MRI brain scans, data from a high-energy physics detector or metric data from an earth observation project. To derive meaningful knowledge from the data, they must be processed and analysed. Workflows have emerged as the principal mechanism for describing and enacting complex e-Science analyses on distributed infrastructures such as grids. Scientific users face a number of challenges when designing workflows. These challenges include selecting appropriate components for their tasks, specifying dependencies between them and selecting appropriate parameter values. These tasks become especially challenging as workflows become increasingly large. For example, the CIVET workflow consists of up to 108 components. Building the workflow by hand and specifying all the links can become quite cumbersome for scientific users.

Traditionally, recommender systems have been employed to assist users in such time-consuming and tedious tasks. One approach taken by recommender systems is to predict what the user is attempting to do, drawing on workflow semantics on the one hand and historical usage patterns on the other. Semantics-based systems attempt to infer a user's intentions based on the available semantics. Pattern-based systems attempt to extract usage patterns from previously constructed workflows and match those patterns to the workflow under construction. The use of historical patterns adds dynamism to the suggestions, as the system can learn and adapt with "experience". However, in cases where there are no previous patterns to draw upon, pattern-based systems fail to perform. Semantics-based systems, on the other hand, infer from static information, so they always have something to draw upon.
However, that information first has to be encoded into the semantic repository for the system to draw upon it, which is a time-consuming and tedious task in itself. Moreover, semantics-based systems do not learn and adapt with experience. Both approaches have distinct but complementary features and drawbacks. By combining the two approaches, the drawbacks of each approach can be addressed.

This thesis presents HyDRA, a novel hybrid framework that combines frequent usage patterns and workflow semantics to generate suggestions. The functions performed by the framework include: a) extracting frequent functional usage patterns; b) identifying the semantics of unknown components; and c) generating accurate and meaningful suggestions. Challenges to mining frequent patterns include ensuring that meaningful and useful patterns are extracted. For this purpose, only patterns that occur above a minimum frequency threshold are mined. Moreover, instead of just groups of specific components, the pattern mining algorithm takes into account workflow component semantics. This allows the system to identify different types of components that perform a single composite function. One of the challenges in maintaining a semantic repository is to keep the repository up to date. This involves identifying new items and inferring their semantics. In this regard, a minor contribution of this research is a semantic inference engine that is responsible for function b). This engine also uses pre-defined workflow component semantics to infer new semantic properties and generate more accurate suggestions. The overall suggestion generation algorithm is also presented.

HyDRA has been evaluated using workflows from the Laboratory of Neuro Imaging (LONI) repository. These workflows have been chosen for their structural and functional characteristics that help to evaluate the framework in different scenarios.
The system is also compared with another existing pattern-based system to show a clear improvement in the accuracy of the suggestions generated.
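
The hybrid idea described in the abstract can be illustrated with a small Python sketch. This is not HyDRA's algorithm, which the abstract does not specify. It is a minimal illustration under stated assumptions: patterns are reduced to single links between components, support is counted as the number of workflows containing a link, and semantics are reduced to a lookup table. All component names, function types, and helper names (`COMPONENT_TYPE`, `SEMANTIC_SUCCESSORS`, `mine_frequent_links`, `suggest_next`) are hypothetical.

```python
from collections import Counter

# Hypothetical semantics: each concrete component maps to a function type.
COMPONENT_TYPE = {
    "strip_a": "skull_stripping",
    "strip_b": "skull_stripping",
    "align_a": "registration",
    "align_b": "registration",
    "segment_a": "segmentation",
}

# Hypothetical static knowledge: function types that may follow a given type.
SEMANTIC_SUCCESSORS = {
    "skull_stripping": ["registration"],
    "registration": ["segmentation"],
    "segmentation": [],
}


def mine_frequent_links(workflows, min_support):
    """Keep type-level links seen in at least `min_support` workflows.

    Each workflow is a list of (source, target) component links. Links are
    lifted to function types first, so different components that perform
    the same function contribute to the same pattern.
    """
    counts = Counter()
    for links in workflows:
        typed = {(COMPONENT_TYPE[a], COMPONENT_TYPE[b]) for a, b in links}
        counts.update(typed)  # each workflow counts a given link once
    return {link: n for link, n in counts.items() if n >= min_support}


def suggest_next(component, frequent_links):
    """Suggest function types to follow `component`.

    Frequent patterns are preferred, ranked by support. When no pattern
    exists for this component's type, fall back to the static semantics.
    """
    ctype = COMPONENT_TYPE[component]
    ranked = sorted(
        ((n, succ) for (pred, succ), n in frequent_links.items() if pred == ctype),
        reverse=True,
    )
    if ranked:
        return [succ for _, succ in ranked]
    return list(SEMANTIC_SUCCESSORS.get(ctype, []))


# Hypothetical history of three previously constructed workflows.
history = [
    [("strip_a", "align_a")],
    [("strip_b", "align_b")],
    [("strip_a", "segment_a")],
]
frequent = mine_frequent_links(history, min_support=2)
```

In this toy history, the link from skull stripping to registration appears in two workflows through two different component pairs, so it passes the threshold only because links are counted at the level of function types. Calling `suggest_next("strip_a", frequent)` draws on that mined pattern, while `suggest_next("align_a", frequent)` has no pattern to use and falls back to the static table.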