-
Finite Automata for Efficient Graph Recognition
Authors:
Frank Drewes,
Berthold Hoffmann,
Mark Minas
Abstract:
Engelfriet and Vereijken have shown that linear graph grammars based on hyperedge replacement generate graph languages that can be considered as interpretations of regular string languages over typed symbols. In this paper we show that finite automata can be lifted from strings to graphs within the same framework. For the efficient recognition of graphs with these automata, we make them determinis…
▽ More
Engelfriet and Vereijken have shown that linear graph grammars based on hyperedge replacement generate graph languages that can be considered as interpretations of regular string languages over typed symbols. In this paper we show that finite automata can be lifted from strings to graphs within the same framework. For the efficient recognition of graphs with these automata, we make them deterministic by a modified powerset construction, and state sufficient conditions under which deterministic finite graph automata recognize graphs without the need to use backtracking.
△ Less
Submitted 23 April, 2024;
originally announced April 2024.
-
Generating Semantic Graph Corpora with Graph Expansion Grammar
Authors:
Eric Andersson,
Johanna Björklund,
Frank Drewes,
Anna Jonsson
Abstract:
We introduce Lovelace, a tool for creating corpora of semantic graphs. The system uses graph expansion grammar as a representational language, thus allowing users to craft a grammar that describes a corpus with desired properties. When given such grammar as input, the system generates a set of output graphs that are well-formed according to the grammar, i.e., a graph bank. The generation process…
▽ More
We introduce Lovelace, a tool for creating corpora of semantic graphs. The system uses graph expansion grammar as a representational language, thus allowing users to craft a grammar that describes a corpus with desired properties. When given such grammar as input, the system generates a set of output graphs that are well-formed according to the grammar, i.e., a graph bank. The generation process can be controlled via a number of configurable parameters that allow the user to, for example, specify a range of desired output graph sizes. Central use cases are the creation of synthetic data to augment existing corpora, and as a pedagogical tool for teaching formal language theory.
△ Less
Submitted 15 September, 2023;
originally announced September 2023.
-
Optimal Strategies for Static Black-Peg AB Game With Two and Three Pegs
Authors:
Gerold Jäger,
Frank Drewes
Abstract:
The AB~Game is a game similar to the popular game Mastermind. We study a version of this game called Static Black-Peg AB~Game. It is played by two players, the codemaker and the codebreaker. The codemaker creates a so-called secret by placing a color from a set of $c$ colors on each of $p \le c$ pegs, subject to the condition that every color is used at most once. The codebreaker tries to determin…
▽ More
The AB~Game is a game similar to the popular game Mastermind. We study a version of this game called Static Black-Peg AB~Game. It is played by two players, the codemaker and the codebreaker. The codemaker creates a so-called secret by placing a color from a set of $c$ colors on each of $p \le c$ pegs, subject to the condition that every color is used at most once. The codebreaker tries to determine the secret by asking questions, where all questions are given at once and each question is a possible secret. As an answer the codemaker reveals the number of correctly placed colors for each of the questions. After that, the codebreaker only has one more try to determine the secret and thus to win the game.
For given $p$ and $c$, our goal is to find the smallest number $k$ of questions the codebreaker needs to win, regardless of the secret, and the corresponding list of questions, called a $(k+1)$-strategy. We present a $\lceil 4c/3 \rceil-1)$-strategy for $p=2$ for all $c \ge 2$, and a $\lfloor (3c-1)/2 \rfloor$-strategy for $p=3$ for all $c \ge 4$ and show the optimality of both strategies, i.e., we prove that no $(k+1)$-strategy for a smaller $k$ exists.
△ Less
Submitted 10 October, 2022;
originally announced October 2022.
-
An Algebraic Approach to Learning and Grounding
Authors:
Johanna Björklund,
Adam Dahlgren Lindström,
Frank Drewes
Abstract:
We consider the problem of learning the semantics of composite algebraic expressions from examples. The outcome is a versatile framework for studying learning tasks that can be put into the following abstract form: The input is a partial algebra $\alg$ and a finite set of examples $(\varphi_1, O_1), (\varphi_2, O_2), \ldots$, each consisting of an algebraic term $\varphi_i$ and a set of objects~…
▽ More
We consider the problem of learning the semantics of composite algebraic expressions from examples. The outcome is a versatile framework for studying learning tasks that can be put into the following abstract form: The input is a partial algebra $\alg$ and a finite set of examples $(\varphi_1, O_1), (\varphi_2, O_2), \ldots$, each consisting of an algebraic term $\varphi_i$ and a set of objects~$O_i$. The objective is to simultaneously fill in the missing algebraic operations in $\alg$ and ground the variables of every $\varphi_i$ in $O_i$, so that the combined value of the terms is optimised. We demonstrate the applicability of this framework through case studies in grammatical inference, picture-language learning, and the grounding of logic scene descriptions.
△ Less
Submitted 4 July, 2022; v1 submitted 6 April, 2022;
originally announced April 2022.
-
Polynomial Graph Parsing with Non-Structural Reentrancies
Authors:
Johanna Björklund,
Frank Drewes,
Anna Jonsson
Abstract:
Graph-based semantic representations are valuable in natural language processing, where it is often simple and effective to represent linguistic concepts as nodes, and relations as edges between them. Several attempts has been made to find a generative device that is sufficiently powerful to represent languages of semantic graphs, while at the same allowing efficient parsing. We add to this line o…
▽ More
Graph-based semantic representations are valuable in natural language processing, where it is often simple and effective to represent linguistic concepts as nodes, and relations as edges between them. Several attempts has been made to find a generative device that is sufficiently powerful to represent languages of semantic graphs, while at the same allowing efficient parsing. We add to this line of work by introducing graph extension grammar, which consists of an algebra over graphs together with a regular tree grammar that generates expressions over the operations of the algebra. Due to the design of the operations, these grammars can generate graphs with non-structural reentrancies; a type of node-sharing that is excessively common in formalisms such as abstract meaning representation, but for which existing devices offer little support. We provide a parsing algorithm for graph extension grammars, which is proved to be correct and run in polynomial time.
△ Less
Submitted 7 May, 2021; v1 submitted 5 May, 2021;
originally announced May 2021.
-
Probing Multimodal Embeddings for Linguistic Properties: the Visual-Semantic Case
Authors:
Adam Dahlgren Lindström,
Suna Bensch,
Johanna Björklund,
Frank Drewes
Abstract:
Semantic embeddings have advanced the state of the art for countless natural language processing tasks, and various extensions to multimodal domains, such as visual-semantic embeddings, have been proposed. While the power of visual-semantic embeddings comes from the distillation and enrichment of information through machine learning, their inner workings are poorly understood and there is a shorta…
▽ More
Semantic embeddings have advanced the state of the art for countless natural language processing tasks, and various extensions to multimodal domains, such as visual-semantic embeddings, have been proposed. While the power of visual-semantic embeddings comes from the distillation and enrichment of information through machine learning, their inner workings are poorly understood and there is a shortage of analysis tools. To address this problem, we generalize the notion of probing tasks to the visual-semantic case. To this end, we (i) discuss the formalization of probing tasks for embeddings of image-caption pairs, (ii) define three concrete probing tasks within our general framework, (iii) train classifiers to probe for those properties, and (iv) compare various state-of-the-art embeddings under the lens of the proposed probing tasks. Our experiments reveal an up to 12% increase in accuracy on visual-semantic embeddings compared to the corresponding unimodal embeddings, which suggest that the text and image dimensions represented in the former do complement each other.
△ Less
Submitted 22 February, 2021;
originally announced February 2021.
-
Formalization and Correctness of Predictive Shift-Reduce Parsers for Graph Grammars based on Hyperedge Replacement
Authors:
Frank Drewes,
Berthold Hoffmann,
Mark Minas
Abstract:
Hyperedge replacement (HR) grammars can generate NP-complete graph languages, which makes parsing hard even for fixed HR languages. Therefore, we study predictive shift-reduce (PSR) parsing that yields efficient parsers for a subclass of HR grammars, by generalizing the concepts of SLR(1) string parsing to graphs. We formalize the construction of PSR parsers and show that it is correct. PSR parser…
▽ More
Hyperedge replacement (HR) grammars can generate NP-complete graph languages, which makes parsing hard even for fixed HR languages. Therefore, we study predictive shift-reduce (PSR) parsing that yields efficient parsers for a subclass of HR grammars, by generalizing the concepts of SLR(1) string parsing to graphs. We formalize the construction of PSR parsers and show that it is correct. PSR parsers run in linear space and time, and are more efficient than the predictive top-down (PTD) parsers recently developed by the authors.
△ Less
Submitted 14 January, 2019; v1 submitted 31 December, 2018;
originally announced December 2018.
-
Analyzing Catastrophic Backtracking Behavior in Practical Regular Expression Matching
Authors:
Martin Berglund,
Frank Drewes,
Brink van der Merwe
Abstract:
We develop a formal perspective on how regular expression matching works in Java, a popular representative of the category of regex-directed matching engines. In particular, we define an automata model which captures all the aspects needed to study such matching engines in a formal way. Based on this, we propose two types of static analysis, which take a regular expression and tell whether there…
▽ More
We develop a formal perspective on how regular expression matching works in Java, a popular representative of the category of regex-directed matching engines. In particular, we define an automata model which captures all the aspects needed to study such matching engines in a formal way. Based on this, we propose two types of static analysis, which take a regular expression and tell whether there exists a family of strings which makes Java-style matching run in exponential time.
△ Less
Submitted 21 May, 2014;
originally announced May 2014.
-
Term Graph Rewriting and Parallel Term Rewriting
Authors:
Andrea Corradini,
Frank Drewes
Abstract:
The relationship between Term Graph Rewriting and Term Rewriting is well understood: a single term graph reduction may correspond to several term reductions, due to sharing. It is also known that if term graphs are allowed to contain cycles, then one term graph reduction may correspond to infinitely many term reductions. We stress that this fact can be interpreted in two ways. According to the "s…
▽ More
The relationship between Term Graph Rewriting and Term Rewriting is well understood: a single term graph reduction may correspond to several term reductions, due to sharing. It is also known that if term graphs are allowed to contain cycles, then one term graph reduction may correspond to infinitely many term reductions. We stress that this fact can be interpreted in two ways. According to the "sequential interpretation", a term graph reduction corresponds to an infinite sequence of term reductions, as formalized by Kennaway et.al. using strongly converging derivations over the complete metric space of infinite terms. Instead according to the "parallel interpretation" a term graph reduction corresponds to the parallel reduction of an infinite set of redexes in a rational term. We formalize the latter notion by exploiting the complete partial order of infinite and possibly partial terms, and we stress that this interpretation allows to explain the result of reducing circular redexes in several approaches to term graph rewriting.
△ Less
Submitted 13 February, 2011;
originally announced February 2011.