Compression schemes for advanced data structures have become the challenge of today. Information theory has traditionally dealt with conventional data such as text, image, or video. In contrast, most data available today is multitype and context-dependent. To meet this challenge, we have recently initiated a systematic study of advanced data structures such as unlabeled graphs [1]. In this paper, we continue this program by considering trees with statistically correlated vertex names. Trees come in many forms, but here we deal with binary plane trees (where order of subtrees matters) and their non-plane version. Furthermore, we assume that each symbol of a vertex name depends in a Markovian sense on the corresponding symbol of the parent vertex name. We first evaluate the entropy for both types of trees. Then we propose two compression schemes COMPRESSPTREE for plane trees with correlated names, and COMPRESSNPTREE for non-plane trees. We prove that the former scheme achieves the lower bound within two bits, while the latter is within 1% of the optimal compression.
Authors
Additional information
- DOI
- Digital Object Identifier link open in new tab 10.1109/isit.2016.7541492
- Category
- Aktywność konferencyjna
- Type
- publikacja w wydawnictwie zbiorowym recenzowanym (także w materiałach konferencyjnych)
- Language
- angielski
- Publication year
- 2016