Jain, Abhishek Kumar; Maskell, Douglas L.; Fahmy, Suhaib A.
Publisher: IEEE
Languages: English
Types: Unknown
Subjects: TK, QA76
Combining processors with hardware accelerators has become the norm, with systems-on-chip (SoCs) ever present in modern compute devices. Heterogeneous programmable system-on-chip platforms, sometimes referred to as hybrid FPGAs, tightly couple general-purpose processors with high-performance reconfigurable fabrics, providing a more flexible alternative. We can now think of a software application with hardware-accelerated portions that are reconfigured at runtime. While such ideas have been explored in the past, modern hybrid FPGAs are the first commercial platforms to enable this move to a more software-oriented view, where reconfiguration enables hardware resources to be shared by multiple tasks in a bigger application. However, while the rapidly increasing logic density and more capable hard resources found in modern hybrid FPGA devices should make them widely deployable, they remain constrained within specialist application domains. This is due to both design productivity issues and a lack of suitable hardware abstraction to eliminate the need for working with platform-specific details, as server and desktop virtualization has done in a more general sense. To allow mainstream adoption of FPGA-based accelerators in general-purpose computing, there is a need to virtualize FPGAs and make them more accessible to application developers who are accustomed to software API abstractions and fast development cycles. In this paper, we discuss the role of overlay architectures in enabling general-purpose FPGA application acceleration.