Automating Formalization by Statistical and Semantic Parsing of Mathematics

Cezary Kaliszyk, Josef Urban, Jiří Vyskočil

8th International Conference on Interactive Theorem Proving, Lecture Notes in Computer Science 10499, pp. 12 – 27, 2017.


We discuss progress in our project, which aims to automate formalization by combining natural language processing with deep semantic understanding of mathematical expressions. We introduce the overall motivation and ideas behind this project, and then propose a context-based parsing approach that combines efficient statistical learning of deep parse trees with their semantic pruning by type checking and large-theory automated theorem proving. We show that our learning method allows efficient use of a large amount of contextual information, which in turn significantly boosts the precision of the statistical parsing and also makes it more efficient. This leads to a large improvement over our first results in parsing theorems from the Flyspeck corpus.


doi:10.1007/978-3-319-66107-0_2 | © Standard Springer LNCS Copyright


author = {Cezary Kaliszyk and Josef Urban and Ji\v{r}\'{\i} Vysko\v{c}il},
title = {Automating Formalization by Statistical and Semantic Parsing of Mathematics},
booktitle = {8th International Conference on Interactive Theorem Proving (ITP 2017)},
pages = {12--27},
year = {2017},
url = {},
doi = {10.1007/978-3-319-66107-0_2},
editor = {Mauricio Ayala{-}Rinc{\'{o}}n and C{\'{e}}sar A. Mu{\~{n}}oz},
series = {Lecture Notes in Computer Science},
volume = {10499},
publisher = {Springer},