Knowledge graphs (KGs) provide structured, verifiable grounding for large language models (LLMs), but current LLM-based systems commonly use KGs as auxiliary structures for text retrieval, leaving their intrinsic quality underexplored. In this work, we propose Wikontic, a multi-stage pipeline that constructs KGs from open-domain texts by extracting candidate triplets with qualifiers, enforcing Wikidata-based type and relation constraints, and normalizing entities to reduce duplication. The resulting KGs are compact, ontology-consistent, and well-connected; on MuSiQue, the correct answer entity appears in 96% of generated triplets. On HotpotQA, our triplets-only setup achieves 76.0 F1, and on MuSiQue 59.8 F1, matching or surpassing several retrieval-augmented generation baselines that still require textual context. In addition, Wikontic attains state-of-the-art information-retention performance on the MINE-1 benchmark (86%), outperforming prior KG construction methods. Wikontic is also efficient at build time: KG construction uses less than 1,000 output tokens, about 3× fewer than AriGraph and less than 1/20 of GraphRAG. The proposed pipeline improves the quality of the generated KG and offers a scalable solution for leveraging structured knowledge in LLMs. Wikontic is available at https://github.com/screemix/Wikontic.
@inproceedings{chepurova-etal-2026-wikontic,
  title={Wikontic: Constructing {W}ikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models},
  author={Chepurova, Alla and Bulatov, Aydar and Burtsev, Mikhail and Kuratov, Yuri},
  editor={Demberg, Vera and Inui, Kentaro and Marquez, Llu{\'i}s},
  booktitle={Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 1: Long Papers)},
  month=mar,
  year={2026},
  address={Rabat, Morocco},
  publisher={Association for Computational Linguistics},
  url={https://aclanthology.org/2026.eacl-long.388/},
  doi={10.18653/v1/2026.eacl-long.388},
  pages={8304--8319},
  isbn={979-8-89176-380-7},
}
GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent
Yuri Kuratov, Matvey Kairov, Aydar Bulatov, Ivan Rodkin, and Mikhail Burtsev
@misc{kuratov2026gradmem,
  title={GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent},
  author={Kuratov, Yuri and Kairov, Matvey and Bulatov, Aydar and Rodkin, Ivan and Burtsev, Mikhail},
  year={2026},
  archiveprefix={arXiv},
  primaryclass={cs.CL},
  url={https://arxiv.org/abs/2603.13875},
}
2025
GENA-LM: a family of open-source foundational DNA language models for long sequences
Veniamin Fishman, Yuri Kuratov, Aleksei Shmelev, Maxim Petrov, Dmitry Penzar, Denis Shepelin, Nikolay Chekanov, Olga Kardymon, and Mikhail Burtsev
Recent advancements in genomics, propelled by artificial intelligence, have unlocked unprecedented capabilities in interpreting genomic sequences, mitigating the need for exhaustive experimental analysis of complex, intertwined molecular processes inherent in DNA function. A significant challenge, however, resides in accurately decoding genomic sequences, which inherently involves comprehending rich contextual information dispersed across thousands of nucleotides. To address this need, we introduce GENA language model (GENA-LM), a suite of transformer-based foundational DNA language models capable of handling input lengths up to 36 000 base pairs. Notably, integrating the newly developed recurrent memory mechanism allows these models to process even larger DNA segments. We provide pre-trained versions of GENA-LM, including multispecies and taxon-specific models, demonstrating their capability for fine-tuning and addressing a spectrum of complex biological tasks with modest computational demands. While language models have already achieved significant breakthroughs in protein biology, GENA-LM showcases a similarly promising potential for reshaping the landscape of genomics and multi-omics data analysis. All models are publicly available on GitHub (https://github.com/AIRI-Institute/GENA_LM) and on HuggingFace (https://huggingface.co/AIRI-Institute). In addition, we provide a web service (https://dnalm.airi.net/) allowing user-friendly DNA annotation with GENA-LM models.
@article{fishman2025genalm,
  author={Fishman, Veniamin and Kuratov, Yuri and Shmelev, Aleksei and Petrov, Maxim and Penzar, Dmitry and Shepelin, Denis and Chekanov, Nikolay and Kardymon, Olga and Burtsev, Mikhail},
  title={GENA-LM: a family of open-source foundational DNA language models for long sequences},
  journal={Nucleic Acids Research},
  volume={53},
  number={2},
  pages={gkae1310},
  year={2025},
  month=jan,
  issn={1362-4962},
  doi={10.1093/nar/gkae1310},
  url={https://doi.org/10.1093/nar/gkae1310},
}
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity
Yuri Kuratov, Mikhail Arkhipov, Aydar Bulatov, and Mikhail Burtsev
In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Jul 2025
A range of recent works addresses the problem of compression of sequence of tokens into a shorter sequence of real-valued vectors to be used as inputs instead of token embeddings or key-value cache. These approaches are focused on reduction of the amount of compute in existing language models rather than minimization of number of bits needed to store text. Despite relying on powerful models as encoders, the maximum attainable lossless compression ratio is typically not higher than x10. This fact is highly intriguing because, in theory, the maximum information capacity of large real-valued vectors is far beyond the presented rates even for 16-bit precision and a modest vector size. In this work, we explore the limits of compression by replacing the encoder with a per-sample optimization procedure. We show that vectors with compression ratios up to x1500 exist, which highlights two orders of magnitude gap between existing and practically attainable solutions. Furthermore, we empirically show that the compression limits are determined not by the length of the input but by the amount of uncertainty to be reduced, namely, the cross-entropy loss on this sequence without any conditioning. The obtained limits highlight the substantial gap between the theoretical capacity of input embeddings and their practical utilization, suggesting significant room for optimization in model design.
@inproceedings{kuratov2025cramming,
  title={Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity},
  author={Kuratov, Yuri and Arkhipov, Mikhail and Bulatov, Aydar and Burtsev, Mikhail},
  editor={Che, Wanxiang and Nabende, Joyce and Shutova, Ekaterina and Pilehvar, Mohammad Taher},
  booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  month=jul,
  year={2025},
  address={Vienna, Austria},
  publisher={Association for Computational Linguistics},
  url={https://aclanthology.org/2025.acl-long.948/},
  doi={10.18653/v1/2025.acl-long.948},
  pages={19323--19339},
  isbn={979-8-89176-251-0},
}
2024
Prompt Me One More Time: A Two-Step Knowledge Extraction Pipeline with Ontology-Based Verification
Alla Chepurova, Yuri Kuratov, Aydar Bulatov, and Mikhail Burtsev
In Proceedings of TextGraphs-17: Graph-based Methods for Natural Language Processing, Aug 2024
This study explores a method for extending real-world knowledge graphs (specifically, Wikidata) by extracting triplets from texts with the aid of Large Language Models (LLMs). We propose a two-step pipeline that includes the initial extraction of entity candidates, followed by their refinement and linkage to the canonical entities and relations of the knowledge graph. Finally, we utilize Wikidata relation constraints to select only verified triplets. We compare our approach to a model that was fine-tuned on a machine-generated dataset and demonstrate that it performs better on natural data. Our results suggest that LLM-based triplet extraction from texts, with subsequent verification, is a viable method for real-world applications.
@inproceedings{chepurova-etal-2024-prompt,
  title={Prompt Me One More Time: A Two-Step Knowledge Extraction Pipeline with Ontology-Based Verification},
  author={Chepurova, Alla and Kuratov, Yuri and Bulatov, Aydar and Burtsev, Mikhail},
  editor={Ustalov, Dmitry and Gao, Yanjun and Panchenko, Alexander and Tutubalina, Elena and Nikishina, Irina and Ramesh, Arti and Sakhovskiy, Andrey and Usbeck, Ricardo and Penn, Gerald and Valentino, Marco},
  booktitle={Proceedings of TextGraphs-17: Graph-based Methods for Natural Language Processing},
  month=aug,
  year={2024},
  address={Bangkok, Thailand},
  publisher={Association for Computational Linguistics},
  url={https://aclanthology.org/2024.textgraphs-1.5/},
  pages={61--77},
}
Beyond Attention: Breaking the Limits of Transformer Context Length with Recurrent Memory
Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, and Mikhail Burtsev
Proceedings of the AAAI Conference on Artificial Intelligence, Mar 2024
@article{bulatov2024beyond,
  title={Beyond Attention: Breaking the Limits of Transformer Context Length with Recurrent Memory},
  volume={38},
  url={https://ojs.aaai.org/index.php/AAAI/article/view/29722},
  doi={10.1609/aaai.v38i16.29722},
  number={16},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  author={Bulatov, Aydar and Kuratov, Yuri and Kapushev, Yermek and Burtsev, Mikhail},
  year={2024},
  month=mar,
  pages={17700--17708},
}
Associative Recurrent Memory Transformer
Ivan Rodkin, Yuri Kuratov, Aydar Bulatov, and Mikhail Burtsev
In Next Generation of Sequence Modeling Architectures Workshop at ICML 2024, 2024
@inproceedings{rodkin2024associative,
  title={Associative Recurrent Memory Transformer},
  author={Rodkin, Ivan and Kuratov, Yuri and Bulatov, Aydar and Burtsev, Mikhail},
  booktitle={Next Generation of Sequence Modeling Architectures Workshop at ICML 2024},
  year={2024},
}
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Ivan Rodkin, Dmitry Sorokin, Artyom Sorokin, and Mikhail Burtsev
In Advances in Neural Information Processing Systems, 2024
@inproceedings{kuratov2024babilong,
  author={Kuratov, Yuri and Bulatov, Aydar and Anokhin, Petr and Rodkin, Ivan and Sorokin, Dmitry and Sorokin, Artyom and Burtsev, Mikhail},
  booktitle={Advances in Neural Information Processing Systems},
  doi={10.52202/079017-3381},
  editor={Globerson, A. and Mackey, L. and Belgrave, D. and Fan, A. and Paquet, U. and Tomczak, J. and Zhang, C.},
  pages={106519--106554},
  publisher={Curran Associates, Inc.},
  title={BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack},
  url={https://proceedings.neurips.cc/paper_files/paper/2024/file/c0d62e70dbc659cc9bd44cbcf1cb652f-Paper-Datasets_and_Benchmarks_Track.pdf},
  volume={37},
  year={2024},
}
2022
Recurrent Memory Transformer
Aydar Bulatov, Yury Kuratov, and Mikhail Burtsev
In Advances in Neural Information Processing Systems, 2022
@inproceedings{rmt_2022,
  author={Bulatov, Aydar and Kuratov, Yury and Burtsev, Mikhail},
  booktitle={Advances in Neural Information Processing Systems},
  editor={Koyejo, S. and Mohamed, S. and Agarwal, A. and Belgrave, D. and Cho, K. and Oh, A.},
  pages={11079--11091},
  publisher={Curran Associates, Inc.},
  title={Recurrent Memory Transformer},
  url={https://proceedings.neurips.cc/paper_files/paper/2022/file/47e288629a6996a17ce50b90a056a0e1-Paper-Conference.pdf},
  volume={35},
  year={2022},
}