Learning formal grammars: some selected methods
We discuss three selected approaches to learning formal grammars in the Gold paradigm of identification in the limit. The first one is a general, practically useful criterion of learnability, and the remaining ones are methods of computing grammars from finite data.
It is often hard to prove the learnability of a class of grammars (languages) using Angluin’s finite tell-tale criterion or Blum and Blum’s locking sequences. A useful sufficient condition for learnability is finite elasticity, introduced in (Wright 1989, Motoki, Shinohara and Wright 1991). (Kanazawa 1996, 1998) shows that finite elasticity is preserved by counter-images determined by finitely-valued relations. We show that this theorem easily yields several classical results, e.g., the learnability of context-sensitive grammars admitting at most k rules.
Unification of sets of terms (formulae) is a standard technique in automated deduction. We briefly discuss learning methods based on unification of sets of types which learn type grammars from (structured) data. These methods were elaborated in (Buszkowski 1987, Buszkowski and Penn 1990) and refined in (Kanazawa 1996, 1998, Marciniec 1994, 1997, 2011, Foret 2001, Costa Florêncio 2003) and others. Finite elasticity is used to prove the conver
gence of the resulting learning functions. We exhibit natural connections with belief-revision based learning methods, studied in (Gierasimczuk 2010).
Syntactic concept lattices
At the end, we outline a recent method of (Clark 2010, 2011), modified in (Leiss 2012), which uses a Galois connection between sets of strings and sets of contexts. This method allows to learn context-free grammars, satisfying finite context property.
Wright, K. (1989). Identification of unions of languages drawn from an identifiable class. In: The 1989 Workshop on Computational Learning Theory, 328–333, San Mateo Calif.: Morgan Kaufmann.
Motoki, T., Shinohara, T., Wright, K. (1991). The correct definition of finite elasticity: Corrigendum to identification of unions. In: The Fourth Workshop on Computational Learning Theory, 375, San Mateo Calif.: Morgan Kaufmann.
Kanazawa, M. (1996). Identification in the limit of categorial grammars. Journal of Logic, Language and Information 5: 115–155.
Kanazawa, M. (1998). Learnable Classes of Categorial Grammars. CSLI Publications and FoLLI, Stanford.
Buszkowski, W. (1987). Discovery procedures for categorial grammars. In: Categories, Polymorphism and Unification, (E. Klein and J. van Benthem eds.), 36–64, University of Amsterdam.
Buszkowski, W., Penn, G. (1990). Categorial grammars determined from linguistic data by unification. Studia Logica 49: 431–454.
Marciniec, J. (1994). Learning categorial grammars by unification with negative constraints. Journal of Applied Non-Classical Logics 4: 181–200.
Marciniec, J. (1997). Infinite set unification with application to categorial grammar. Studia Logica 58: 339–355.
Marciniec, J. (2011). Tarski’s Principle, Categorial Grammars and Learnability. In: Language and Automata Theory and Applications, (A. Horia Dediu et al. eds.), Lecture Notes in Computer Science 6638: 378–389, Springer.
Foret, A. (2001). On mixing deduction and substitution in Lambek Categorial Grammars. In: Logical Aspects of Computational Linguistics, (P. de Groote et al. eds.), Lecture Notes in Computer Science 2099: 158–174, Springer.
Costa Florêncio, C. (2003), Learning Categorial Grammars. PhD Thesis, University of Utrecht.
Gierasimczuk, N. (2010). Knowing One’s Limits. Logical Analysis of Inductive Inference., PhD Thesis, University of Amsterdam.
Clark, A. (2010). Learning context-free grammars with the syntactic concept lattice. In: Proc. International Colloquium ICGI, (J.M. Sempere and P. Garcia eds.), Lecture Notes in Computer Science 6339: 38–51, Springer.
Clark, A. (2011). A learnable representation of syntax using residuated lattices. In: Formal Grammar 2009, (P. de Groote et al. eds.), Lecture Notes in Computer Science 5591: 183–198, Springer.
Leiss, H. (2012), Learning CFGs with the Finite Context Property. A note on A. Clark’s algorithm. Manuscript, University of Munich.