Добрый день! Уважаемые студенты, Вашему вниманию представляется курсовая работа на тему: «Protobuf Guided Static Field Stripping and Profiling»
Оригинальность работы 93%
Аннотация
Protocol Buffers – протокол сериализации структурированных данных. На сегодняшний день, это один из самых оптимальных методов сериализации и десериализации, который признан и используется многими компаниями и корпорациями. Несмотря на свою популярность, Protocol Buffers может быть оптимизирован для некоторых задач. Структуры, используемые высоконагруженными большими сервисами, растут достаточно быстро. Это приводит к тому, что некоторые поля не используются, при этом protoc компилятор генерирует все их методы. Избавление от неиспользуемого кода положительно влияет на производительность сервисов и нагрузку на серверы. В данной работе подробно описывается эта проблема, рассматриваются различные подходу к удалению “мертвого” кода и предлагается решение, основанное на двухэтапной компиляции. Реализованный алгоритм показал значительные результаты: увеличение производительности трех больших сервисов компании Google до 0.64%.
Abstract
Protocol Buffers is a language-neutral, extensible mechanism for serializing structured data. Today it is one of the most efficient data serialization techniques and is used by many companies in almost every big project. Despite its popularity, it is still not as efficient as it could be for some tasks. In high-loaded systems, proto schemes grow extremely fast, and not all parts of it are used. At the same time, protoc compiler generates all the field methods, even if it is not used. That leads to the arising of dead code. This thesis illuminates the optimization gap in the protoc compiler, describes different approaches applicable to that problem, and proposes the solution based on two-step compilation. The implemented algorithm showed positive results: the performance of three big Google services was measured, and the performance improved up to 0.64%.
Keywords: Protocol Buffers, dead code elimination, compiler optimization, protoc
Contents
1
Introduction
5
1.1
Relevance and Significance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5
1.2
Problem Overview and Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6
2
Literature Review
7
2.1
Mark-Sweep Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7
2.2
Post-Link Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8
2.3
FDO................................................
8
3
Protocol Buffers. Key terms
9
3.1
Proto Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9
3.2
Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10
3.3
protoc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10
3.4
Generated code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
11
3.5
Protobuf Usage Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
12
4
C++ Compilation Process
12
4.1
Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
13
4.2
Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
13
4.3
Linkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
14
5
Problem Statement
14
6
Proposed Solution
15
6.1
Main ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
15
6.2
Programs, Compiler Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
16
6.2.1
static . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
16
6.2.2
volatile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
16
6.2.3
-fdata-sections -ffunction-sections . . . . . . . . . . . . . . . . . . . . . . . . . . . .
17
6.2.4
-g -Wl, --gc-sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
17
6.2.5
objdump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
17
6.2.6
c++filt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
18
6.2.7
__builtin_trap() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
18
3
6.3
Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
18
6.4
Stripping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
21
6.5
Problems Encountered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
22
6.6
Other Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
23
6.7
Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
23
7
Evaluation
23
8
Future work
24
9
Conclusion
25
10
References
26
10 References
[1] Protocol Buffers documentation. Accessed: Feb. 22 2021 [Online]. Available: https://developers.google.com/protocol-buffers
[2] Github fork of Protobuf repository with the implemented project. Accessed: May 3 2021 [Online]. Available: https://github.com/khasanovaa/protobuf
[3] Popi´c S. et al. Performance evaluation of using Protocol Buffers in the Internet of Things communication //2016 InternationalConference on Smart Systems and Technologies (SST). – IEEE, 2016. – С. 261-265.
[4] Wikipedia: Protocol Buffers. Accessed Feb. 2 2021 [Online]. Available: https://en.wikipedia.org/wiki/Protocol_Buffers
[5] Alphabet investor relations. Accessed: Feb. 22 2021 [Online]. Available: https://abc.xyz/investor/
[6] Protocol Buffers implementation. Accessed: Feb. 22 2021 [Online]. Available: https://github.com/protocolbuffers/protobuf
[7] Cooper K., Torczon L. Engineering a compiler. – Elsevier, 2011.
[8] Endo T., Taura K., Yonezawa A. A scalable mark-sweep garbage collector on large-scale shared-memory machines //SC’97: Proceedings of the 1997 ACM/IEEE Conference on Supercomputing. – IEEE, 1997. – С. 48-48.
[9] Garner R., Blackburn S. M., Frampton D. Effective prefetch for mark-sweep garbage collection //Proceedings of the 6th international symposium on Memory management. – 2007. – С. 43-54.
[10] Armstrong J., Virding R. One pass real-time generational mark-sweep garbage collection //International Workshop on Memory Management. – Springer, Berlin, Heidelberg, 1995. – С. 313-322.
26
[11] Panchenko M. et al. Bolt: a practical binary optimizer for data centers and beyond //2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). – IEEE, 2019. – С. 2-14.
[12] Smith M. D. Overcoming the challenges to feedback-directed optimization (keynote talk) //Proceedings of the ACM SIGPLAN workshop on Dynamic and adaptive compilation and optimization. – 2000. – С. 1-11.
[13] Chen D., Moseley T., Li D. X. AutoFDO: Automatic feedback-directed op-timization for warehouse-scale applications //2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). – IEEE, 2016. – С. 12-23.
[14] Wikipedia: Data segment. Accessed May 5 2021 [Online]. Available:
https://en.wikipedia.org/wiki/Data_segment
[15] De Sutter B., De Bus B., De Bosschere K. Link-time binary rewriting techniques for program compaction //ACM Transactions on Programming Languages and Systems (TOPLAS). – 2005. – Т. 27. – №. 5. – С. 882-945.
[16] Static keyword documentation, cppreference. Accessed: May 3 2021 [Online]. Available:
https://en.cppreference.com/w/cpp/keyword/static
[17] Volatile type qualifier documentation, cppreference. Accessed: May 2 2021 [On-line]. Available:
https://en.cppreference.com/w/cpp/language/cv
[18] Reducing Size of Executables with Unused Subprogram/Data Elimination, GCC GNU documentation. Accessed: May 2 2021 [Online]. Available:
https://gcc.gnu.org/onlinedocs/gnat_ugn/Reducing-Size-of-Executables-wi
002fData-Elimination.html
[19] Clang command line argument reference, Clang documentation. Accessed: May
2 2021 [Online]. Available:
https://clang.llvm.org/docs/ClangCommandLineReference.html
27
[20] MAN documentation – objdump. Accessed: May 3 2021 [Online]. Available:
https://man7.org/linux/man-pages/man1/objdump.1.html
[21] MAN documentation – c++filt. Accessed: May 3 2021 [Online]. Available:
https://man7.org/linux/man-pages/man1/c++filt.1.html
[22] GCC GNU documentation – Other Built-in Functions Provided by GCC. Ac-cessed: May 4 2021 [Online]. Available:
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
[23] AST Matcher Reference. Accessed: May 5 2021 [Online]. Available:
https://clang.llvm.org/docs/LibASTMatchersReference.html
[24] Verma, Abhishek, et al. "Large-scale cluster management at Google with Borg." Proceedings of the Tenth European Conference on Computer Systems. 2015.