On Indexes of Ordered Trees for Subtrees and Tree Patterns and Their Space Complexities

Indexy uspořádaných stromů pro podstromy a stromové vzorky a jejich prostorové složitosti

Publisher

České vysoké učení technické v Praze
Czech Technical University in Prague

Abstract

This doctoral thesis deals with methods of indexing a tree for subtrees and for tree patterns. Two types of indexes are considered. The first is the index of a tree for subtrees, i.e. a full index that accepts all subtrees of a given tree. The second is the index of a tree for tree patterns, i.e. a full index that accepts all tree patterns that match a given tree at any of its nodes. The results are divided into three parts. As the first result, the thesis presents a deterministic pushdown automaton called the tree compression automaton (TCA), which serves several purposes: as an index of the subject tree(s) for subtrees, as a subtree matcher, as a means of computing subtree repeats, and as a means of compressing the indexed tree(s). An algorithm for converting a TCA to a finite tree automaton (FTA) [18] is given. As the second result, the thesis presents a linear-space index of a tree for tree patterns, together with a fast searching algorithm that uses this index. It is shown that the index, together with the searching algorithm, is an efficient simulation of a non-deterministic tree pattern pushdown automaton, which accepts all tree patterns that match a given tree. As the third result, the thesis investigates the space complexities of deterministic finite tree automata and deterministic tree pattern pushdown automata (PDAs). Both kinds of automata represent an index of a tree for tree patterns, and both have non-deterministic variants of linear size. The thesis shows that there exist trees such that any deterministic finite tree automaton used as an index of these trees for tree patterns has size exponential in the size of the indexed trees. A related result is demonstrated for deterministic tree pattern PDAs. The results are a part of arbology research [50]. Arbology is an algorithmic discipline that deals with the processing of trees and bases its approach on pushdown automata.

