Publisher: IEEE
Languages: English
Types: Article
Subjects: QA76, QA76.76


Large-scale servers with hundreds of hosts and tens of thousands of cores are becoming common. To exploit these platforms software must be both scalable and reliable, and distributed actor languages like Erlang are a proven technology in this area. While distributed Erlang conceptually supports the engineering of large-scale reliable systems, in practice it has some scalability limits that force developers to depart from the standard language mechanisms at scale. In earlier work we have explored these scalability limitations, and addressed them by providing a Scalable Distributed (SD) Erlang library that partitions the network of Erlang Virtual Machines (VMs) into scalable groups (s_groups). This paper presents the first systematic evaluation of SD Erlang s_groups and associated tools, and how they can be used. We present a comprehensive evaluation of the scalability and reliability of SD Erlang using three typical benchmarks and a case study. We demonstrate that s_groups improve the scalability of reliable and unreliable Erlang applications on up to 256 hosts (6144 cores). We show that SD Erlang preserves the class-leading distributed Erlang reliability model, but scales far better than the standard model. We present a novel, systematic, and tool-supported approach for refactoring distributed Erlang applications into SD Erlang. We outline the new and improved monitoring, debugging and deployment tools for large-scale SD Erlang applications. We demonstrate the scaling characteristics of key tools on systems comprising up to 10K Erlang VMs.
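
As a rough illustration of why partitioning the network of VMs can help, the sketch below compares connection counts. It relies on background knowledge not stated in the abstract: by default, distributed Erlang nodes connect transitively and so tend to form a full mesh. The grouped topology (equal-sized groups, one gateway node per group, all gateways joined in one extra group) and the function names are assumptions made for illustration only. They are not the configuration evaluated in the paper.

```python
def full_mesh_connections(n: int) -> int:
    """Number of pairwise connections when all n nodes are interconnected."""
    return n * (n - 1) // 2


def grouped_connections(n: int, group_size: int) -> int:
    """Connection count for a hypothetical grouped topology.

    Assumption (illustrative only): nodes are split into equal groups of
    `group_size`, each group is fully connected internally, and one gateway
    node per group belongs to an additional group containing all gateways.
    """
    if group_size <= 0 or n % group_size != 0:
        raise ValueError("n must be a positive multiple of group_size")
    groups = n // group_size
    within_groups = groups * full_mesh_connections(group_size)
    between_gateways = full_mesh_connections(groups)
    return within_groups + between_gateways


if __name__ == "__main__":
    nodes = 256  # matches the host count quoted in the abstract
    print("full mesh:", full_mesh_connections(nodes))
    print("16 groups of 16:", grouped_connections(nodes, 16))
```

Under these assumptions the full mesh grows quadratically with the number of nodes, while the grouped layout grows far more slowly. That is one intuition for the scalability gap the abstract reports.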