Your conditions: 安璐
  • User Profiling Based on the Behaviour and Content Combined Model

    Subjects: Library Science, Information Science >> Library Science; Submitted: 2023-08-27; Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/significance] To identify and filter out online reviews from irrational investors, improve the professionalism and quality of comments, and promote rational investment, this article carries out a user profiling study, taking as its example the task of identifying whether users of the Guba website are noise investors. [Method/process] A deep user representation learning method was used to learn from text information such as users' posts. A behavior and content combined model (BCCM) was then proposed that also incorporates behavior characteristics such as number of fans, influence, bar age, and number of posts, and an empirical, comparative study was carried out on an annotated data set. [Result/conclusion] The experimental results showed that the BCCM achieved an F1 score of 79.47%, outperforming the Decision Tree (69.90%), SVM (75.61%), KNN (73.21%), and ANN (74.83%) models. In the specific user profiling task of identifying noise traders, obtaining text content characteristics through deep user representation learning can markedly improve the evaluation metrics of user profiling.

  • Research of Abstractive Chinese Text Summarization Based on Seq2seq Model

    Subjects: Library Science, Information Science >> Library Science; Submitted: 2023-07-26; Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/significance] To handle out-of-vocabulary (OOV) words in text summarization while avoiding repetition within the generated summary, this article studies how to solve the OOV problem and the self-duplication problem. [Method/process] Based on the sequence-to-sequence (seq2seq) model, a pointer generator module and a coverage processing module are added. The pointer generator module addresses the OOV problem by copying OOV words from the source text into the abstractive summary. The coverage processing module addresses the duplication problem by discouraging the attention mechanism from attending to the same position repeatedly. The model is evaluated experimentally on the Chinese summarization dataset LCSTS. [Result/conclusion] The experimental results show that the ROUGE scores of the generated summaries are much higher than those of the seq2seq model and the extractive model, indicating that in abstractive Chinese text summarization, the pointer generator module and the coverage mechanism module can effectively solve the OOV problem and the repetition problem, thereby significantly improving summary quality.
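
    The two modules described in this abstract can be illustrated with a toy calculation. The sketch below assumes the commonly used pointer-generator formulation (a generation probability `p_gen` that mixes the vocabulary distribution with the attention weights over source tokens, plus a coverage vector that accumulates past attention). The abstract does not give the paper's exact equations, so the function names and the toy inputs here are illustrative assumptions only.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generating from the vocabulary with copying from the source.

    p_gen         -- probability of generating rather than copying (assumed scalar)
    vocab_dist    -- dict mapping in-vocabulary word -> generation probability
    attention     -- attention weight for each source position
    source_tokens -- the source tokens, which may include OOV words
    """
    mixed = defaultdict(float)
    for word, prob in vocab_dist.items():
        mixed[word] += p_gen * prob
    # Copy mass: every source position contributes its attention weight,
    # so an OOV source word can still receive probability.
    for weight, token in zip(attention, source_tokens):
        mixed[token] += (1.0 - p_gen) * weight
    return dict(mixed)


def coverage_loss(attention, coverage):
    """Penalise attention that overlaps with positions already attended to."""
    return sum(min(a, c) for a, c in zip(attention, coverage))


def update_coverage(coverage, attention):
    """Coverage is the running sum of attention from previous decoding steps."""
    return [c + a for c, a in zip(coverage, attention)]


# Toy example: "Wuhan" is not in the vocabulary but appears twice in the source.
vocab_dist = {"the": 0.5, "city": 0.5}
source_tokens = ["Wuhan", "city", "Wuhan"]
attention = [0.5, 0.25, 0.25]

dist = final_distribution(0.5, vocab_dist, attention, source_tokens)

coverage = [0.0, 0.0, 0.0]
first_step_loss = coverage_loss(attention, coverage)   # nothing attended yet
coverage = update_coverage(coverage, attention)
repeat_loss = coverage_loss(attention, coverage)        # same attention again
shifted_loss = coverage_loss([0.0, 0.5, 0.5], coverage)  # attention moved on
```

    In this toy run the OOV word "Wuhan" ends up with non-zero probability purely through copying, and repeating the same attention pattern incurs a larger coverage penalty than shifting attention elsewhere.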

  • Research on Scale Adaptation of Text Sentiment Analysis Algorithm in Big Data Environment: Using Twitter as Data Source

    Subjects: Library Science, Information Science >> Library Science; Submitted: 2023-07-26; Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/significance] This paper studies the scale adaptation problem of textual sentiment analysis in a big data environment. It provides a reference for researchers in information science who must balance efficiency against cost when analyzing data in a big data environment. [Method/process] We use the Sentiment140 dataset from Stanford University. Building on an analysis of traditional sentiment analysis algorithms, we propose five textual sentiment analysis algorithms for big data, test how well they adapt to different environments and data sizes, and compare them empirically in terms of accuracy, scalability, and efficiency. [Result/conclusion] The experimental results show that the cluster built in this paper has good operational efficiency, correctness, and scalability. Spark clusters are more efficient at processing large-scale text sentiment analysis data, and this efficiency advantage becomes more pronounced as the data size increases. In terms of resource utilization, the overall operating efficiency of the cluster changes significantly as the number of nodes and cores increases. We find that a configuration of five slave nodes with 4 cores and 4 GB of memory saves resource costs while still completing the classification task efficiently.

  • Propagation Prediction of Police Microblog Entries Based on Heterogeneous Information Network

    Subjects: Library Science, Information Science >> Information Science; Submitted: 2023-04-01; Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/significance] This study aimed to predict whether microblog users would retweet or comment on microblog entries containing wanted information. We also evaluated the important features affecting the spread of wanted microblog entries, to help public security departments improve their operational performance and strengthen communication and cooperation between the police and the public. [Method/process] Based on the characteristics of wanted microblog entries, we combined user features, time features, and structure features, and extracted event features from the entries, such as location keywords, time keywords, and the wanted level. The XGBoost algorithm was used to calculate the importance of the different features in retweet and comment prediction. Combining the features of the transmission network with node attributes, we trained and evaluated a prediction model based on heterogeneous information network embedding. [Result/conclusion] The AUC values on the retweeting and commenting data sets are 0.737 and 0.799 respectively. Because the model integrates network structure characteristics and the attributes of different nodes, it is closer to real-world heterogeneous information networks and achieves higher accuracy than the traditional link prediction model. In addition, the feature importance results showed that, among the proposed event features, the keyword features had the highest importance of all features affecting the prediction of whether microblog entries are retweeted and commented on.
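
    For readers unfamiliar with the AUC metric reported in this abstract, the following minimal sketch computes it from its pairwise definition: the probability that a randomly chosen positive example (for instance, a user who did retweet) receives a higher score than a randomly chosen negative one, with ties counted as half. The labels and scores are made-up toy values, not data from the study, and the function name `roc_auc` is an illustrative choice.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels -- 1 for a positive example, 0 for a negative example
    scores -- model scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: 1 = user retweeted, 0 = user did not (hypothetical values).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)
```

    This quadratic-time version is meant only to show what the number means; a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.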

  • Research on the Collaboration of Security & Safety Intelligence Work in the Big Data Environment: Taking Counter-Terrorism Intelligence Work as an Example

    Subjects: Library Science, Information Science >> Information Science; Submitted: 2023-04-01; Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/significance] Big data places high demands on collaboration in security & safety intelligence work. Studying the problems of, and solutions for, this collaboration helps the departments involved in security & safety intelligence work together and improves the effectiveness of that work. [Method/process] This paper discussed the problems that security & safety intelligence cooperation may face in the big data environment. Taking counter-terrorism intelligence as an example and following the intelligence work process, it analyzed the participants in security & safety intelligence work and their cooperation needs, and put forward a cooperation scheme for counter-terrorism intelligence work. [Result/conclusion] Under the guidance of the counter-terrorism intelligence requirements issued by the counter-terrorism leading group, the Ministry of Public Security and other professional departments cooperate with general business departments such as the People's Bank of China, the Ministry of Transportation, the Ministry of Industry and Information Technology, and the General Administration of Customs, as well as with the financial, transportation, telecommunications, and medical industries, non-profit sectors, the masses, and other social forces, to collect, process, analyze, apply, and deliver counter-terrorism intelligence in specific fields.

  • Research on the Model of Adversarial Entity Relation Extraction in Cross-Lingual Context

    Subjects: Library Science, Information Science >> Information Science; Submitted: 2023-04-01; Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/significance] From the perspective of entity relation extraction, this paper extends the knowledge acquisition task from a single-language context to a cross-lingual context and improves relation extraction for low-resource languages. [Method/process] This paper proposed a Cross-Lingual Adversarial Relation Extraction (CLARE) framework, which decomposes cross-lingual relation extraction into parallel corpus acquisition and adversarial adaptation relation extraction. Through dictionary expansion or self-learning methods, the source-language relation extraction data set was converted into a target-language data set. On this basis, the feature representation of the source language was transferred to the target language using adversarial feature adaptation, and the resulting trained target-language relation extraction network was used to classify target-language instances. [Result/conclusion] The method was applied to English-Chinese and Chinese-English cross-lingual relation extraction tasks based on the ACE2005 multilingual dataset. The Macro-F1 values of the best models on the two tasks are 0.8801 and 0.8422 respectively, indicating that the proposed CLARE framework can significantly improve entity relation extraction for low-resource languages. The results are significant for improving relation extraction models in cross-lingual contexts and for promoting the application of entity relation extraction research in information science.
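
    The Macro-F1 metric reported in this abstract averages the per-class F1 scores with equal weight, so rare relation types count as much as frequent ones. The sketch below shows one common way to compute it. Which classes the paper includes in the average is not stated in the abstract, so averaging over every label seen in either list is an assumption here, and the relation labels and predictions are toy values.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example with two relation types (labels and predictions are made up).
y_true = ["ORG-AFF", "ORG-AFF", "PART-WHOLE", "PART-WHOLE"]
y_pred = ["ORG-AFF", "PART-WHOLE", "PART-WHOLE", "PART-WHOLE"]
score = macro_f1(y_true, y_pred)
```

    Here the per-class F1 scores are 2/3 for ORG-AFF and 4/5 for PART-WHOLE, and the macro average is their plain mean.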

  • Emergency Severity Assessment and Early Warning Mechanism in the Social Media Environment

    Subjects: Library Science, Information Science >> Information Science; Submitted: 2023-04-01; Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/significance] Because public emergencies spread and deteriorate rapidly, emergency management departments need to evaluate the severity of emergencies in real time and establish a scientific early warning mechanism. Microblogging and other social media platforms provide rich clues for the real-time study and judgment of emergencies. [Method/process] This study constructed severity assessment indexes for emergencies along the dimensions of the netizens' role, the Internet media's role, the spread of events, and the attitudes and feelings of netizens. Methods were proposed for analyzing the influence tendency of the severity indexes and for comparing their features. A total of 1,107,308 microblog entries regarding four social security incidents were investigated: the "8.24 Yueqing Girl Riding Murder", the "5.6 Zhengzhou Stewardess Taxi Murder", and the "8.27 Kunshan Knife-Cutting Case" in 2018, and the "Ctrip Kindergarten Abuse Incident" in 2017. [Result/conclusion] The study establishes a quantitative classification standard for public emergencies and provides methodological guidance and data support for governments to take emergency management measures in time.

  • User Role Formation and Transformation of Socialized Q&A Platforms in the Context of Infectious Disease Outbreaks: Taking the Zhihu Platform as an Example

    Subjects: Library Science, Information Science >> Information Science; Submitted: 2023-04-01; Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/Significance] To explore user role classification methods, the key factors of role formation, and the characteristics of and differences in role transformation on Q&A platforms in the context of infectious disease outbreaks. [Method/Process] A total of 702,927 records related to the COVID-19 epidemic were collected from Q&A platforms. User roles were analyzed along the dimensions of participation and value. The factors influencing the formation of community user roles were constructed from information user factors, information factors, and information environment factors. The key factors affecting the formation of different roles were analyzed by combining a multi-classification model with the SHapley Additive exPlanations (SHAP) model. The FP-growth association rule algorithm was used to mine behavior patterns and topic characteristics during the transformation between roles. [Result/Conclusion] The results show that users tend to keep their roles unchanged, and that when roles do change, the transformation is mainly towards active or diving (lurking) roles. The amount of information is the key factor in the formation of different roles. The extent of change in user role transformation characteristics differs significantly between transformation stages, and user role transformation behaviors differ significantly in all transformation stages.
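
    The association rules mentioned in this abstract are scored by support and confidence. The sketch below is not FP-growth itself: it finds frequent itemsets by brute-force enumeration, which yields the same itemsets on small inputs but does not scale, and is included only to show what the mined quantities mean. The behavior tags, the `min_support` value, and the function names are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    Brute-force enumeration by itemset size (NOT FP-growth). It stops early
    once no itemset of a given size is frequent, since no larger one can be.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t) / n
            if support >= min_support:
                result[itemset] = support
                found = True
        if not found:
            break
    return result


def confidence(itemsets, antecedent, consequent):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    return itemsets[antecedent | consequent] / itemsets[antecedent]


# Toy example: each set lists behaviors seen for one user (hypothetical tags).
transactions = [
    {"asks", "comments"},
    {"asks", "comments", "answers"},
    {"asks"},
    {"answers", "comments"},
]
itemsets = frequent_itemsets(transactions, min_support=0.5)
conf_answers_to_comments = confidence(
    itemsets, frozenset({"answers"}), frozenset({"comments"})
)
conf_asks_to_comments = confidence(
    itemsets, frozenset({"asks"}), frozenset({"comments"})
)
```

    In this toy data every user who answers also comments, so that rule has confidence 1.0, while only two of the three users who ask also comment.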