Remember Me
Or use your Academic/Social account:


Or use your Academic/Social account:


You have just completed your registration at OpenAire.

Before you can login to the site, you will need to activate your account. An e-mail will be sent to you with the proper instructions.


Please note that this site is currently undergoing Beta testing.
Any new content you create is not guaranteed to be present to the final version of the site upon release.

Thank you for your patience,
OpenAire Dev Team.

Close This Message


Verify Password:
Verify E-mail:
*All Fields Are Required.
Please Verify You Are Human:
fbtwitterlinkedinvimeoflicker grey 14rssslideshare1
Di Fatta, Giuseppe; Berthold, Michael R. (2005)
Languages: English
Types: Unknown
Subjects: frequent subgraph mining, biochemical databases, Distributed computing, dynamic load balancing
ddc: ddc:004
Frequent pattern discovery in structured data is receiving\ud an increasing attention in many application areas of sciences. However, the computational complexity and the large amount of data to be explored often make the sequential algorithms unsuitable. In this context high performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimation is available for task workloads, which show a power-law distribution in a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well known National Cancer Institute’s HIV-screening dataset.
  • The results below are discovered through our pilot algorithms. Let us know how we are doing!

    • [1] M. Deshpande, M. Kuramochi, and G. Karypis, “Frequent sub-structure-based approaches for classifying chemical compounds,” in Proceedings of IEEE International Conference on Data Mining (ICDM'03), Melbourne, Florida, USA, Nov. 19-22, 2003.
    • [2] T. Washio and H. Motoda, “State of the art of graphbased data mining,” ACM SIGKDD Explorations Newsletter, vol. 5, no. 1, pp. 59-68, July 2003.
    • [3] R. Agrawal, T. Imielinski, and A. N. Swami, “Mining association rules between sets of items in large databases,” in Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, D.C., May 26-28,.
    • [4] M. Zaki, S. Parthasarathy, M. Ogihara, and W. Li, “New algorithms for fast discovery of association rules,” in Proceedings of 3rd Int. Conf. on Knowledge Discovery and Data Mining (KDD'97), 1997, pp. 283-296.
    • [5] M. Deshpande, M. Kuramochi, and G. Karypis, “Automated approaches for classifying structures,” in Proceedings of Workshop on Data Mining in Bioinformatics (BioKDD), 2002, pp. 11-18.
    • [6] X. Yan and J. Han, “gspan: Graph-based substructure pattern mining,” in Proceedings of the IEEE International Conference on Data Mining (ICDM'02), Maebashi City, Japan, 2002.
    • [7] C. Borgelt and M. R. Berthold, “Mining molecular fragments: Finding relevant substructures of molecules,” in IEEE International Conference on Data Mining (ICDM 2002), Maebashi, Japan, Dec. 9-12, 2002, pp. 51-58.
    • [8] S. Kramer, L. de Raedt, and C. Helma, “Molecular feature mining in hiv data,” in Proceedings of 7th Int. Conf. on Knowledge Discovery and Data Mining, (KDD'01), San Francisco, CA, 2001, pp. 136-143.
    • [9] M. R. Garey and D. S. Johnson, Computers and intractability: a guide to the theory of NPcompleteness. W. H. Freeman, 1979.
    • [10] S. Skiena, Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Addison-Wesley, 1990.
    • [11] E. M. Luks, “Isomorphism of graphs of bounded valence can be tested in polynomial time,” Journal of Computer and System Sciences, vol. 25, pp. 42-65, Aug. 1982.
    • [12] M. J. Zaki, “Parallel and distributed association mining: A survey,” IEEE Concurrency, vol. 7, no. 4, pp. 14-25, 1999.
    • [13] G. Di Fatta and M. R. Berthold, “Distributed mining of molecular fragments,” in IEEE DM-Grid Workshop of the Int. Conf. on Data Mining (ICDM 2004), Brighton, UK, Nov. 1-4, 2004.
    • [14] C. Wang and S. Parthasarathy, “Parallel algorithms for mining frequent structural motifs in scientific data,” in Proceedings of the 18th Annual International Conference on Supercomputing (ICS'04), Saint Malo, France, June 26 - July 01, 2004.
    • [15] F. Schreiber and H. Schwbbermeyer, “Towards motif detection in networks: Frequency concepts and flexible search,” in Proceedings of the International Workshop on Network Tools and Applications in Biology (NETTAB04), Camerino, Italy, Sept. 5-7, 2004, pp. 91-102.
    • [16] R. Agrawal and J. Shafer, “Parallel mining of association rules,” IEEE Transactions on Knowledge and Data Engineering, vol. 8, no. 6, pp. 962-969, Dec. 1996.
    • [17] E. Han, G. Karypis, and V. Kumar, “Scalable parallel data mining for association rules,” IEEE Transactions on Knowledge and Data Engineering, vol. 12, no. 3, pp. 337-352, May/June 2000.
    • [18] R. Finkel and U. Manber, “Dib - a distributed implementation of backtracking,” ACM Transactions on Programming Languages and Systems, vol. 9 (2), pp. 235-256, Apr. 1987.
    • [19] V. Kumar, A. Grama, and V. N. Rao, “Scalable load balancing techniques for parallel computer,” Journal of Parallel and Distributed Computing, vol. 22, no. 1, pp. 60-79, July 1994.
    • [20] R. Karp and Y. Zhang, “A randomized parallel branch-and-bound procedure,” in Proceedings of the 20 Annual ACM Symposium on Theory of Computing (STOC 1988), 1988, pp. 290-300.
    • [21] S. Chakrabarti, A. Ranade, and K. Yelick, “Randomized load-balancing for tree-structured computation,” in Proceedings of the Scalable High Performance Computing Conference (SHPCC '94), Knoxville, TN, May 23-25, 1994, pp. 666-673.
    • [22] Y. Chung, J. Park, and S. Yoon, “An asynchronous algorithm for balancing unpredictable workload on distributed-memory machines,” ETRI Journal, vol. 20, no. 4, pp. 346-360, Dec. 1998.
  • No related research data.
  • No similar publications.

Share - Bookmark

Cite this article