Computational Approaches to Automatic Poetry Generation and Evaluation: A Survey
Abstract
This survey provides a comprehensive synthesis of research on automatic poetry generation and evaluation from 2017 to 2025. We examine computational approaches that leverage pre-trained large language models (LLMs), multimodal architectures, and specialized algorithms for handling poetic constraints such as meter, rhyme, and stanza structure. In addition to surveying generative methods, we analyze practices in data engineering, including corpus construction, annotation, and preprocessing tools tailored to poetry. Evaluation receives particular attention: we review automatic metrics, LLM-as-a-judge methods, and human-centered protocols, discussing their strengths and limitations. Compared with prior surveys, our work emphasizes (1) the dominant role of LLMs in both generation and evaluation, (2) a taxonomy of poetry generation tasks categorized by interaction modality, (3) systematic coverage of dataset engineering challenges, and (4) a comprehensive analysis of automatic and human evaluation approaches and their respective drawbacks. By consolidating advances across diverse research lines, we show how poetry serves as a challenging benchmark for controllable text generation, multimodal grounding, and human-aligned evaluation. Building on this perspective, the survey summarizes current methods and open challenges in the generation, control, and evaluation of poetic and lyrical text.