Miranda, Cupertino; Pop, Antoniu; Dumont, Philippe; Cohen, Albert; Duranton, Marc (2010)
Publisher: HAL CCSD
Languages: English
Types: Conference object
Subjects: [ INFO.INFO-PL ] Computer Science [cs]/Programming Languages [cs.PL]
Tuning applications for multicore systems involves subtle concurrency concepts and target-dependent optimizations. This paper advocates a streaming execution model, called ER, in which persistent processes communicate and synchronize through a multi-producer, multi-consumer sliding window. Considering media and signal processing applications, we demonstrate the scalability and efficiency advantages of streaming over data-driven scheduling. To exploit these benefits in compilers for parallel languages, we propose an intermediate representation that enables the compilation of data-flow tasks into streaming processes. This intermediate representation also facilitates the application of classical compiler optimizations to concurrent programs.
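As a rough illustration of the idea of persistent processes synchronizing through a sliding window, the following Python sketch may help. It is not the paper's model or API: the class `SlidingWindowStream`, its `push` and `pop` methods, and `run_demo` are hypothetical names, the sketch is simplified to a single producer with several consumers, and it uses a lock and condition variable purely for brevity.

```python
import threading


class SlidingWindowStream:
    """Hypothetical bounded stream: one producer, several consumers.

    Every consumer reads every element. The window is the range of
    elements published by the producer but not yet released by the
    slowest consumer.
    """

    def __init__(self, capacity, n_consumers):
        self.capacity = capacity
        self.buf = [None] * capacity      # circular storage for the window
        self.head = 0                     # number of elements published
        self.tails = [0] * n_consumers    # number released by each consumer
        self.cond = threading.Condition()

    def push(self, value):
        with self.cond:
            # Stall while the slowest consumer still holds the slot
            # that this write would overwrite.
            self.cond.wait_for(
                lambda: self.head - min(self.tails) < self.capacity)
            self.buf[self.head % self.capacity] = value
            self.head += 1                # publish the element
            self.cond.notify_all()

    def pop(self, consumer_id):
        with self.cond:
            # Wait until the producer has published past our position.
            self.cond.wait_for(
                lambda: self.head > self.tails[consumer_id])
            value = self.buf[self.tails[consumer_id] % self.capacity]
            self.tails[consumer_id] += 1  # release the slot
            self.cond.notify_all()
            return value


def run_demo(n_items=20, capacity=4, n_consumers=2):
    """Run one persistent producer and n_consumers persistent consumers."""
    stream = SlidingWindowStream(capacity, n_consumers)
    results = [[] for _ in range(n_consumers)]

    def producer():
        for i in range(n_items):
            stream.push(i)

    def consumer(cid):
        for _ in range(n_items):
            results[cid].append(stream.pop(cid))

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=consumer, args=(c,))
                for c in range(n_consumers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_demo()` returns one list per consumer. Each thread lives for the whole run and blocks only when the window is full (producer) or empty (consumer), in contrast to a data-driven scheme in which a new task would be scheduled for each unit of work.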

Related to

  • FET Proactive (fet-fp7): FET proactive 1: Concurrent Tera-device Computing
  • FET Proactive (fet-fp7): Exploiting dataflow parallelism in Teradevice Computing
