
Guiding Large Language Models to Generate Computer-Parsable Content

Abstract: We propose a method for guiding Large Language Models (LLMs) to generate structured content that adheres to specific conventions, without fine-tuning. Coroutine-based generation constraints, expressed through a pre-agreed context-free grammar (CFG), direct the LLM during decoding so that its output complies with a formal language. This improves the stability and consistency with which target data structures, types, or instructions are generated, and reduces the complexity of application development. In our experiments, the error rates of GPT-2 and Gemma exceed 95% for domain-specific languages (DSLs) longer than 36 and 282 tokens, respectively. We introduce YieldLang, a coroutine-based DSL generation framework, and evaluate it with LLMs on a range of tasks, including JSON and Mermaid flowchart generation. Compared to benchmarks, our approach improves accuracy by a factor of 1.09 to 11.6, and LLMs need only about 16.5% of the samples to generate JSON effectively. This makes LLM-generated content more usable by computer programs.
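The idea of constraining decoding with a coroutine can be illustrated with a minimal Python sketch. This is not YieldLang's API: the names `digit_list`, `constrained_decode`, and `toy_model` are hypothetical, the grammar is a toy one for lists such as `[7,7]`, and the "model" is a fixed preference ranking standing in for an LLM. The grammar is a generator that yields the set of tokens allowed next and receives the chosen token through `send`.

```python
from typing import Callable, Generator, List, Set

# A grammar coroutine yields the allowed next tokens and receives the chosen one.
Grammar = Generator[Set[str], str, None]
DIGITS = set("0123456789")


def digit_list(max_items: int = 3) -> Grammar:
    """Hypothetical toy grammar for strings like "[1,2]" (at most max_items digits)."""
    token = yield {"["}
    token = yield DIGITS | {"]"}          # first item, or close an empty list
    items = 0
    while token != "]":
        items += 1                         # the token just received was a digit
        token = yield ({"]"} if items >= max_items else {",", "]"})
        if token == ",":
            token = yield DIGITS           # a comma must be followed by a digit


def constrained_decode(grammar: Grammar,
                       pick: Callable[[List[str], Set[str]], str]) -> str:
    """Drive the grammar coroutine, letting `pick` choose among allowed tokens only."""
    out: List[str] = []
    try:
        allowed = next(grammar)
        while True:
            token = pick(out, allowed)
            if token not in allowed:
                raise ValueError(f"token {token!r} violates the grammar")
            out.append(token)
            allowed = grammar.send(token)
    except StopIteration:
        # The grammar coroutine finished, so the output is complete.
        return "".join(out)


def toy_model(prefix: List[str], allowed: Set[str]) -> str:
    """Stand-in for an LLM (an assumption): a fixed ranking filtered by the grammar."""
    ranking = ["x", "7", ",", "]", "["]   # "x" is never grammatical, so it is masked
    for tok in ranking:
        if tok in allowed:
            return tok
    return min(allowed)


result = constrained_decode(digit_list(max_items=3), toy_model)
print(result)  # prints [7,7,7]
```

With a real LLM, `pick` would instead mask the logits of every token outside `allowed` before sampling; the coroutine itself would stay unchanged.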

Version history

[V2] 2024-04-23 14:22:16 ChinaXiv:202404.00272V2
[V1] 2024-04-21 22:45:22 ChinaXiv:202404.00272v1
Peer review status
Awaiting review