  • Design and Implementation of a Heterogeneous Environment Information Retrieval System for LAMOST

    Subjects: Astronomy >> Astrophysical processes submitted time 2020-06-09 Cooperative journals: 《天文研究与技术》

    Abstract: Environment information is part of the operational status of the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST, also known as the Guo Shoujing Telescope) and supports the telescope's routine maintenance, observation, and later data processing. This information comes from multiple subsystems whose data stores were designed by different institutions, so the storage environment is heterogeneous and difficult to search in a unified way. We propose a method for retrieving heterogeneous environment information based on asynchronous coroutines. The method uses a database proxy object factory to integrate multiple databases and their retrieval fields, offers remote services through a dedicated client and a web browser, and retrieves data from the heterogeneous databases via custom retrieval commands. The design takes future upgrades into account by reserving interfaces for additional databases and applications. A retrieval system built on this method has been deployed in the LAMOST operating environment; in practice it has simplified retrieval of operating-environment data, improved work efficiency and the precision of data acquisition, and raised the efficiency of maintenance and observation.
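
    The abstract's core mechanism is a proxy-object factory that hides the differences between back-end databases behind a single async interface. The following is a minimal sketch of that idea, assuming hypothetical class and method names (DBProxy, MySQLProxy, ProxyFactory.create, query); it is illustrative only, not the deployed system's actual API.

    ```python
    import asyncio

    class DBProxy:
        """Common async interface that every database proxy implements."""
        async def query(self, command: str) -> list:
            raise NotImplementedError

    class MySQLProxy(DBProxy):
        async def query(self, command: str) -> list:
            await asyncio.sleep(0.01)  # stand-in for a real async driver call
            return [f"mysql result for: {command}"]

    class SQLiteProxy(DBProxy):
        async def query(self, command: str) -> list:
            await asyncio.sleep(0.01)
            return [f"sqlite result for: {command}"]

    class ProxyFactory:
        """Maps a database type name to a proxy instance."""
        _registry = {"mysql": MySQLProxy, "sqlite": SQLiteProxy}

        @classmethod
        def create(cls, db_type: str) -> DBProxy:
            return cls._registry[db_type]()

    async def search_all(command: str, db_types: list) -> list:
        # One custom retrieval command fans out to all databases concurrently.
        proxies = [ProxyFactory.create(t) for t in db_types]
        results = await asyncio.gather(*(p.query(command) for p in proxies))
        return [row for rows in results for row in rows]

    if __name__ == "__main__":
        print(asyncio.run(search_all("temperature last 1h", ["mysql", "sqlite"])))
    ```

    Under this pattern, registering one more proxy class in the factory is enough to bring another database online, which matches the abstract's point about reserving interfaces for later additions.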

  • A Python-Based Distributed Status Collection and Monitoring System for LAMOST Control Nodes

    Subjects: Astronomy >> Astronomical Instruments and Techniques submitted time 2018-06-22 Cooperative journals: 《天文研究与技术》

    Abstract: The control of modern large-scale astronomical telescopes is generally accomplished by multiple independent (or clustered) computers, and the performance and efficiency of each node computer directly affect the operational stability of the whole telescope. A resource monitoring software system is therefore an important and essential management tool for the operation and maintenance of large telescopes. Such a system collects and stores the hardware resource information of each node, and it can further monitor and analyze this information. When necessary, the system provides observers with warning messages and suggestions in order to eliminate hidden dangers and improve the observational efficiency of the telescope as a whole. Based on a thorough analysis of the engineering requirements, we design and develop a resource monitoring system for LAMOST. The system is implemented in Python using asynchronous coroutines and provides a variety of human-computer interaction modules and extension interfaces. We have deployed it in the actual project environment and achieved effective results. This work also provides a valuable reference for managing and maintaining other large telescopes.
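
    As a concrete illustration of the collect-monitor-warn loop described above, here is a minimal coroutine-based sketch. The node addresses, status fields, and warning threshold are assumptions made for the example; a real deployment would query an agent on each node rather than fabricating a reply.

    ```python
    import asyncio
    import json

    NODES = {"node-01": "10.0.0.1", "node-02": "10.0.0.2"}  # hypothetical nodes
    CPU_WARN_PERCENT = 90.0                                  # assumed threshold

    async def fetch_status(name: str, host: str) -> dict:
        # Placeholder for a real request to a node agent (e.g., over HTTP).
        await asyncio.sleep(0.05)
        return {"node": name, "host": host, "cpu_percent": 42.0, "mem_percent": 61.5}

    def check(status: dict):
        """Return a warning message if a resource exceeds its threshold."""
        if status["cpu_percent"] > CPU_WARN_PERCENT:
            return f"WARNING {status['node']}: CPU at {status['cpu_percent']}%"
        return None

    async def poll_once() -> None:
        # Coroutines poll every node concurrently instead of one by one.
        statuses = await asyncio.gather(
            *(fetch_status(n, h) for n, h in NODES.items())
        )
        for s in statuses:
            print(json.dumps(s))  # storage/forwarding point in a real system
            msg = check(s)
            if msg is not None:
                print(msg)

    if __name__ == "__main__":
        asyncio.run(poll_once())
    ```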

  • The Slides for Guiding Large Language Models to Generate Computer-Parsable Content

    Subjects: Computer Science >> Computer Software; Linguistics and Applied Linguistics >> Linguistics and Applied Linguistics submitted time 2024-04-21

    Abstract: These slides present the research on Guiding Large Language Models to Generate Computer-Parsable Content, covering its background, motivation, method, results, prospects, and acknowledgements. For the full paper, please refer to: https://arxiv.org/abs/2404.05499

  • Constraining Large Language Model for Generating Computer-Parsable Content

    Subjects: Computer Science >> Computer Software; Linguistics and Applied Linguistics >> Linguistics and Applied Linguistics submitted time 2024-04-07

    Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in learning patterns from massive text corpora, including word relationships, sentence structures, and even complex semantic and pragmatic information. However, it remains challenging to induce pre-trained language models to generate structured content that strictly follows specific conventions. We propose a scheme for guiding LLMs to generate highly usable content for computers, without fine-tuning or additional neural network inference, by introducing coroutine-based generation constraints derived from a pre-agreed context-free grammar (CFG): the grammar guides the autoregressive Transformer to sample only valid tokens during its decoding phase, so that the output forms a formal language conforming to the program's conventions. This effectively improves the stability and consistency of LLMs in generating target data structures, types, or instructions, and reduces the difficulty of application development and integration. Through a bracket-pair matching experiment, we first verified that the error rate of models such as GPT-2 and Gemma reaches 95% when the generated DSL length exceeds 36 and 282, respectively, which illustrates a performance problem of some current LLMs in generating specific DSLs. We then present YieldLang, a coroutine-based DSL generation framework, and conduct experiments with LLMs on multiple task datasets, including JSON, Mermaid flowchart, and function call expression generation. These experiments show that our approach improves accuracy by a factor of 1.09 to 11.6 over the baselines and, in the best case, reduces the number of samples the LLM needs to generate valid JSON to about 16.5% of the baseline, effectively improving the usability of LLM-generated content for computer programs.
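
    To make the coroutine-based constraint idea concrete, here is a minimal sketch for the balanced-bracket DSL mentioned in the abstract. The grammar coroutine yields the set of tokens that are legal at each step, and random.choice stands in for the LLM's constrained sampling over masked logits. Function names and the length budget are illustrative assumptions, not YieldLang's actual API.

    ```python
    import random

    def bracket_grammar(max_len: int):
        """Yield the allowed-token set at each step; receive the chosen token."""
        depth = 0
        steps = 0
        while True:
            remaining = max_len - steps
            if depth == 0:
                allowed = {"("}      # must open at depth 0
            elif depth >= remaining:
                allowed = {")"}      # must close to fit the length budget
            else:
                allowed = {"(", ")"}
            token = yield allowed
            steps += 1
            depth += 1 if token == "(" else -1
            if depth == 0:
                return               # a complete balanced string

    def generate(max_len: int = 36) -> str:
        out = []
        grammar = bracket_grammar(max_len)
        allowed = next(grammar)      # prime the coroutine
        try:
            while True:
                # An LLM decoder would mask its logits to `allowed` and sample.
                token = random.choice(sorted(allowed))
                out.append(token)
                allowed = grammar.send(token)
        except StopIteration:
            pass
        return "".join(out)

    if __name__ == "__main__":
        s = generate()
        assert s.count("(") == s.count(")")  # well-formed by construction
        print(s)
    ```

    Because illegal tokens are never sampled, the output is valid at any length, which is how grammar constraints avoid the length-dependent error rates observed for unconstrained models.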