Abstract:
The digital transformation of enterprises requires whole-life-cycle management of data. Data identification and archiving are an important means of addressing the problem that traditional unstructured documents are difficult to process directly with big data technology. Building on enterprise data governance, this paper introduces master data management into data archiving and divides enterprise data into three categories: master data, transaction data, and analytical data. The macro identification method is used to identify these three types of data. The scope of data archiving is then determined, and metadata such as ER diagrams, data dictionaries, and data lineage diagrams are included in the scope of metadata archiving. Integrating archived data into the construction of the enterprise data lake is proposed as the best path for data archiving. The archives department can accelerate its integration into the national big data strategy by implementing a "dual system" of electronic file and data archiving, piloting data archiving in large state-owned enterprises, and actively participating in data governance to enhance the data literacy of the archives work team.