Understanding Formal Arrow-Connected Diagrams and Free-form Sketches
Document type
Doctoral thesis
Author
Bresler, Martin
Supervisors
Hlaváč, Václav
Průša, Daniel
Field of study
Artificial Intelligence and Biocybernetics
Study programme
Electrical Engineering and Information Technology
Degree-granting institution
Czech Technical University in Prague. Faculty of Electrical Engineering. Department of Cybernetics.
Abstract
Drawing has been a natural way for humans to express their thoughts and ideas since
ancient times. People are used to creating and understanding illustrations. Many
well-defined formal notations have evolved, including technical drawings, musical scores,
and various diagrams. On the other hand, people often use free-form sketches or ad hoc
intuitive notations. Electronic devices equipped with a touch screen are part of our daily
lives and make it easier to create and share drawings. Naturally, methods for automatic
recognition and understanding of drawings are increasingly desired. Recognition
is much easier and more desirable in the case of formal drawings consisting of well-defined
entities. A computer system can work further with such a recognized drawing: beautify
the drawing, rearrange a diagram, perform some actions according to a diagram, or
manufacture something according to a technical drawing. On the other hand, recognition is
extremely difficult and potentially unnecessary in the case of free-form sketches. It is
often enough to provide the user with tools for easier creation and manipulation of
sketches. The reason is that these sketches are meant to be understood by humans only,
and the computer system serves for their creation, storage, and sharing. We
can therefore classify drawings into formal drawings and free-form sketches.
This thesis deals with two tasks: recognition of formally defined diagrams and segmentation
of objects of interest in free-form sketches. These two apparently different
topics are strongly related. We assume that the drawing is created on an electronic
device and thus is in the form of a sequence of strokes rather than a raster image.
We propose a recognition framework for arrow-connected diagrams. We introduce a
model for recognition by selection of symbol candidates, based on evaluation of relations
between candidates using a set of predicates. It is suitable for simpler structures
in which the relations are explicitly given by symbols (arrows, in the case of diagrams).
Knowledge of a specific diagram domain is used. The two domains considered are
flowcharts and finite automata. We created a benchmark database of diagrams from these
two domains. Although the individual pipeline steps are tailored to these domains, the
system can be readily adapted to other domains. The recognition pipeline consists of the
following major steps: text/non-text separation, symbol segmentation, symbol classification,
and structural analysis. We performed a comparison with state-of-the-art methods for
recognition of flowcharts and finite automata and verified that our approach outperforms
them. We also analysed our system thoroughly and identified the most frequent
causes of recognition failures.
The situation is more complicated in the case of free-form sketching, where the user
can draw and write anything freely. We cannot expect any particular structure. We can
often find a combination of pictorial drawing with more structured figures and text in
the form of labels. The classical segmentation by recognition cannot be used here. In some
cases, understanding a sketch might be difficult even for humans. Specifically,
the identification of an individual object in a drawing may differ from user to user and
from case to case. Recent research showed that single-linkage agglomerative clustering
of strokes with a trainable distance function can be used for segmentation of objects
from predefined symbol classes in formal drawings. The proper distance function can
be learned from annotated data, but this requires a lot of data and time. We propose
an approach that combines several pre-trained distance functions for particular structures
to segment objects in free-form sketches. We show that the desired segmentation
can be achieved in many cases by combining distance functions trained for very
general object types such as rows, columns, words, or compact images. We also show that
the best combination of distance functions can be found from very limited data in real
time. We propose a segmentation approach that estimates the optimal combination
of clustering distance functions from an initial selection of one object. It results in
a segmentation into objects that have characteristics similar to the initial one. Based on
this approach, we designed a selection tool with additional functionality that allows
the user to select and manipulate the segmented objects seamlessly. The method is suitable
for fast rearrangement of sketches during collaborative content creation (brainstorming).
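
The clustering idea described above can be illustrated with a minimal Python sketch. It is
not the implementation from the thesis: the two distance functions are hand-written
stand-ins for trained ones, and the names horizontal_gap, vertical_gap, combined,
single_linkage and fit_weights, the fixed threshold, and the grid search over weights are
all assumptions made for illustration. The sketch clusters strokes by single linkage under
a weighted sum of distance functions and picks the first weight combination for which one
user-selected object comes out as exactly one cluster.

```python
from itertools import product

# A stroke is a list of (x, y) points; a sketch is a list of strokes.

def centroid(stroke):
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    return sum(xs) / len(xs), sum(ys) / len(ys)

# Hypothetical stand-ins for pre-trained distance functions.
def horizontal_gap(a, b):
    """Small for strokes lying side by side (row-like structure)."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    return abs(ax - bx) + 4.0 * abs(ay - by)

def vertical_gap(a, b):
    """Small for strokes lying above each other (column-like structure)."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    return 4.0 * abs(ax - bx) + abs(ay - by)

def combined(weights, functions):
    """Weighted sum of several distance functions."""
    def dist(a, b):
        return sum(w * f(a, b) for w, f in zip(weights, functions))
    return dist

def single_linkage(strokes, dist, threshold):
    """Single-linkage clustering cut at `threshold`.

    Cutting a single-linkage hierarchy at a threshold yields the connected
    components of the graph that links every pair closer than the threshold,
    so a union-find over all pairs is sufficient here.
    """
    parent = list(range(len(strokes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(strokes)):
        for j in range(i + 1, len(strokes)):
            if dist(strokes[i], strokes[j]) <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(strokes)):
        clusters.setdefault(find(i), set()).add(i)
    return list(clusters.values())

def fit_weights(strokes, selected, functions, threshold, steps=5):
    """Return the first weight tuple on a coarse grid for which the selected
    stroke indices form exactly one cluster, or None if there is none."""
    grid = [k / (steps - 1) for k in range(steps)]
    for weights in product(grid, repeat=len(functions)):
        if not any(weights):
            continue  # the all-zero combination carries no information
        clusters = single_linkage(strokes, combined(weights, functions), threshold)
        if set(selected) in clusters:
            return weights
    return None

# Six short strokes arranged in two rows of three.
strokes = [[(x, y), (x + 2, y)] for y in (0, 10) for x in (0, 10, 20)]
functions = [horizontal_gap, vertical_gap]

# The user selects the first row (strokes 0, 1, 2) as one object.
weights = fit_weights(strokes, {0, 1, 2}, functions, threshold=15.0)
clusters = single_linkage(strokes, combined(weights, functions), threshold=15.0)
print(weights, clusters)
```

In this toy layout, selecting the first row leads to a weighting that also groups the
second row as one object, which mirrors the behaviour described above: the segmentation
returns objects with characteristics similar to the initial selection.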