The Herald of the Siberian State University of Telecommunications and Information Science

Application of Thread-Local Garbage Collection to Distributed Systems for Large-Scale Data Processing

https://doi.org/10.55648/1998-6920-2022-16-1-77-88

Abstract

Thread-local garbage collection (TLGC) is an automatic memory management technique that associates memory areas with a specific application thread. These areas (thread-local heaps) can be processed independently, allowing other mutator threads to keep executing concurrently. Improved scalability and throughput make a thread-local memory manager an attractive alternative to conventional GC algorithms. This paper discusses the effectiveness of thread-local GC applied to distributed systems for large-scale data processing. Experimental results show that the proposed approach increases overall system throughput and demonstrate that TLGC is a suitable choice for memory-intensive, fault-tolerant distributed systems.

About the Authors

A. Yu. Filatov
Novosibirsk State University; Huawei Research Center
Russian Federation

Alexander Yu. Filatov, Assistant, Novosibirsk State University; Leading engineer, Huawei Research Center

Novosibirsk



V. V. Mikheev
Novosibirsk State University; Huawei Research Center
Russian Federation

Vitaly V. Mikheev, Lecturer, Novosibirsk State University; Senior expert, Huawei Research Center

Novosibirsk



For citations:


Filatov A.Yu., Mikheev V.V. Application of Thread-Local Garbage Collection to Distributed Systems for Large-Scale Data Processing. The Herald of the Siberian State University of Telecommunications and Information Science. 2022;(1):77-88. (In Russ.) https://doi.org/10.55648/1998-6920-2022-16-1-77-88


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1998-6920 (Print)