Vesper, Malte; Koch, Dirk; Vipin, Kizheppatt; Fahmy, Suhaib A. (2016)
Publisher: Institute of Electrical and Electronics Engineers
Languages: English
Types: Contribution for newspaper or weekly magazine
Subjects: Computer Networks and Communications, Computer Science Applications, Control and Optimization, TK, QA76

Many FPGA-based accelerators are constrained by the available resources, and multi-FPGA solutions can be necessary for building more capable systems. Available PCIe solutions provide only FPGA-to-Host communication. In this paper we present JetStream, an open-source [1] modular PCIe 3 library that supports not only fast FPGA-to-Host communication but also direct FPGA-to-FPGA communication, which fully bypasses the memory subsystem. The direct mode saves memory bandwidth in multicast modes and allows multiple FPGAs to be connected in various software-defined topologies. We show the benefits of JetStream with a large FIR filter spanning four FPGA boards, achieving throughputs of up to 7.09 GB/s per link. Utilizing direct FPGA-to-FPGA transfers reduces the required memory bandwidth by up to 75%.
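
The two figures in the abstract can be put in context with a rough back-of-the-envelope sketch. It rests on assumptions that are not stated in the abstract: an x8 link width, a simple chain of FPGAs, and an accounting in which every hop routed through the host costs one memory read and one memory write. The function names are hypothetical and are not part of JetStream.

```python
from fractions import Fraction


def pcie3_line_rate_gbytes(lanes: int) -> float:
    """Upper bound on PCIe 3.0 throughput in GB/s for a given lane count.

    PCIe 3.0 signals at 8 GT/s per lane with 128b/130b encoding.
    Packet header overhead is ignored, so real payload rates are lower.
    """
    gigatransfers_per_lane = 8
    encoding_efficiency = Fraction(128, 130)
    bits_per_byte = 8
    return float(gigatransfers_per_lane * lanes * encoding_efficiency / bits_per_byte)


def memory_passes_via_host(n_fpgas: int) -> int:
    """Assumed model: each FPGA reads its input from host memory
    and writes its output back, so every stage costs two passes."""
    return 2 * n_fpgas


def memory_passes_direct(n_fpgas: int) -> int:
    """Assumed model: only the first FPGA reads from host memory and
    only the last one writes back; intermediate hops bypass memory."""
    return 2


def memory_traffic_saving(n_fpgas: int) -> Fraction:
    """Fraction of memory passes avoided by direct FPGA-to-FPGA transfers."""
    via_host = memory_passes_via_host(n_fpgas)
    direct = memory_passes_direct(n_fpgas)
    return Fraction(via_host - direct, via_host)


if __name__ == "__main__":
    limit = pcie3_line_rate_gbytes(lanes=8)  # x8 width is an assumption
    print(f"x8 line-rate bound: {limit:.3f} GB/s")
    print(f"7.09 GB/s is {7.09 / limit:.0%} of that bound")
    print(f"saving for a 4-FPGA chain: {memory_traffic_saving(4)}")
```

Under this accounting a four-FPGA chain avoids 3/4 of the memory passes, which is consistent with the "up to 75%" figure, and 7.09 GB/s would be about 90% of the assumed x8 line rate. The paper itself may account for bandwidth differently.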

    • [1] M. Vesper, “JetStream GitHub page,” 2016. [Online]. Available: https://maltevesper.github.io/JetStream/
    • [2] K. Ovtcharov, O. Ruwase, J.-Y. Kim, J. Fowers, K. Strauss, and E. S. Chung, “Accelerating deep convolutional neural networks using specialized hardware,” Microsoft Research Whitepaper, vol. 2, 2015.
    • [3] Maxeler Technologies, “MPC-X Series,” 2010, https://www.maxeler.com/products/mpc-xseries/, accessed 28.03.2016.
    • [4] PCI-SIG, PCI Express Base Specification, Revision 3.0, PCI-SIG Std., 2010.
    • [5] Northwest Logic, IP Core Size & Speed (Xilinx FPGAs) v4.56, Northwest Logic, 2016.
    • [6] Xillybus, IP core product brief v1.8, Xillybus, Jan 2016.
    • [7] J. Stuecheli, B. Blaner, C. Johns, and M. Siegel, “CAPI: A coherent accelerator processor interface,” IBM Journal of Research and Development, vol. 59, no. 1, 2015.
    • [8] D. de la Chevallerie, J. Korinth, and A. Koch, “ffLink: A Lightweight High-Performance Open-Source PCI Express Gen3 Interface for Reconfigurable Accelerators,” in International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies (HEART), 2015.
    • [9] M. Jacobsen and R. Kastner, “RIFFA 2.0: A reusable integration framework for FPGA accelerators,” in International Conference on Field Programmable Logic and Applications (FPL), 2013.
    • [10] K. Vipin and S. A. Fahmy, “DyRACT: A partial reconfiguration enabled accelerator and test platform,” in International Conference on Field Programmable Logic and Applications (FPL), Sept 2014.
    • [11] J. Gong, T. Wang, J. Chen, H. Wu, F. Ye, S. Lu, and J. Cong, “An efficient and flexible host-FPGA PCIe communication library,” in International Conference on Field Programmable Logic and Applications (FPL), 2014, pp. 1-6.
    • [12] A. d. C. Lucas, S. Heithecker, and R. Ernst, “FlexWAFE-a high-end real-time stream processing library for FPGAs,” in Design Automation Conference (DAC), 2007.
    • [13] R. Bittner, “Speedy bus mastering PCI express,” in International Conference on Field Programmable Logic and Applications (FPL), 2012.
    • [14] H. Kavianipour, S. Muschter, and C. Bohm, “High performance FPGA-based DMA interface for PCIe,” IEEE Transactions on Nuclear Science, vol. 61, no. 2, 2014.
    • [15] K. Vipin, S. Shreejith, D. Gunasekera, S. A. Fahmy, and N. Kapre, “System-level FPGA device driver with high-level synthesis support,” in International Conference on Field-Programmable Technology (FPT), 2013, pp. 128-135.
    • [16] A. Goldhammer and J. Ayer Jr, “Understanding performance of PCI express systems,” Xilinx WP350, Sept, vol. 4, 2008.
    • [17] R. Scherzinger, “Avoiding PCI Express link performance surprises,” 2006.
    • [18] Xilinx, Virtex-7 FPGA Gen3 Integrated Block for PCI Express v4.1, Xilinx, Sep 2015.
    • [19] Arm, “AMBA 4 AXI4-Stream protocol specification,” Mar. 2010. [Online]. Available: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0051a/index.html
    • [20] J. D. McCalpin, “STREAM: Sustainable Memory Bandwidth in High Performance Computers,” University of Virginia, Charlottesville, Virginia, Tech. Rep., 1991-2007. [Online]. Available: http://www.cs.virginia.edu/stream/
    • [21] Altera, Stratix V Avalon-ST Interface for PCIe Solutions, Altera, Nov 2015.