Keyword analysis (which aspects does keyword analysis cover) - Figure 1

Today we introduce one application of CiteSpace: keyword analysis (part 1). The main topic is the first part of keyword analysis, namely how the clusters are derived.

1

Understanding the Q and S values

Keyword analysis (which aspects does keyword analysis cover) - Figure 2

Note one point here: under Node Types, only one type is selected for each analysis. Choosing author generates an author co-occurrence map; choosing institution generates an institution co-occurrence map; choosing keyword generates a keyword co-occurrence map. Today we choose keyword and analyze the keyword co-occurrence map.
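
To make "keyword co-occurrence" concrete, here is a minimal sketch of how co-occurrence counts can be built from per-paper keyword lists. It is an illustration only, not CiteSpace's internal code; the function name `keyword_cooccurrence` and the sample `records` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records):
    """Count how often each unordered keyword pair appears in the same record."""
    pair_counts = Counter()
    for keywords in records:
        # Deduplicate within a record and sort so (a, b) and (b, a) share one key.
        unique = sorted(set(keywords))
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Hypothetical input: one keyword list per paper.
records = [
    ["citespace", "clustering", "keyword"],
    ["citespace", "keyword"],
    ["clustering", "modularity"],
]
counts = keyword_cooccurrence(records)
```

Each pair count can then be read as the weight of the edge between two keyword nodes in the co-occurrence network.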

The smaller the cluster number, the more keywords the cluster contains. Each cluster is made up of multiple closely related keywords, and the cluster export report shows which keywords each cluster includes.
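
The numbering rule can be sketched as follows: sort the clusters by size, largest first, and number them in that order. This is an assumption-laden illustration; the helper `number_clusters_by_size`, the sample clusters, and the choice to start numbering at 0 are not taken from the article.

```python
def number_clusters_by_size(clusters):
    """Assign IDs so that a smaller ID always means a cluster with at least as many keywords."""
    # sorted() is stable, so clusters of equal size keep their original order.
    ordered = sorted(clusters, key=len, reverse=True)
    # Assumption: numbering starts at 0.
    return dict(enumerate(ordered))


# Hypothetical clusters, each a list of keywords.
clusters = [
    ["a", "b"],
    ["c", "d", "e", "f"],
    ["g", "h", "i"],
]
numbered = number_clusters_by_size(clusters)
```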

After clustering, the Q and S values appear. They indicate how good the clustering is, and the commonly cited thresholds are:

Modularity: a modularity value (Q value) > 0.3 means the cluster structure is significant.

Silhouette: a mean silhouette value (S value) > 0.5 means the clustering is reasonable, and > 0.7 means the clustering is convincing.

Here, Q = 0.8669, which is greater than 0.3, so the cluster structure is significant; S = 0.9639, which is greater than 0.7, so the clustering result is convincing.
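
The rule of thumb above can be written as a small helper. This is only a sketch of the thresholds quoted in this article; the function name `interpret_clustering` and the fallback label "weak" (for S at or below 0.5) are my own assumptions.

```python
def interpret_clustering(q, s):
    """Apply the Q > 0.3 and S > 0.5 / S > 0.7 thresholds quoted in the text."""
    structure = "significant" if q > 0.3 else "not significant"
    if s > 0.7:
        quality = "convincing"
    elif s > 0.5:
        quality = "reasonable"
    else:
        quality = "weak"  # assumed label; the article names no category here
    return structure, quality


# Values from the example in this article.
result = interpret_clustering(0.8669, 0.9639)
```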

Each cluster consists of keywords from the co-occurrence network. CiteSpace groups closely related keywords into clusters and gives each keyword a value; the keyword with the largest value in a cluster is the representative of that cluster.
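
Picking the representative keyword amounts to taking the maximum value within each cluster. The sketch below assumes the values have already been exported into a nested dictionary; the structure, the keywords, and the numbers are all hypothetical.

```python
def cluster_representatives(keyword_values):
    """Return, for each cluster, the keyword with the largest value."""
    return {
        cluster_id: max(values, key=values.get)
        for cluster_id, values in keyword_values.items()
        if values  # skip empty clusters
    }


# Hypothetical data: cluster id -> {keyword: value}.
keyword_values = {
    0: {"knowledge map": 12.4, "citespace": 30.1, "visualization": 8.7},
    1: {"deep learning": 15.0, "neural network": 9.2},
}
representatives = cluster_representatives(keyword_values)
```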

2

Steps for performing clustering

Keyword analysis (which aspects does keyword analysis cover) - Figure 3

There are three algorithms for labeling the clusters: LSI (latent semantic indexing), LLR (log-likelihood ratio), and MI (mutual information). The LLR algorithm is generally used for analysis.
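
To give a feel for what a log-likelihood ratio measures, here is a minimal sketch of the standard G² statistic on a 2x2 table of term counts inside and outside a cluster. This is an assumption for illustration: the article does not describe how CiteSpace computes LLR, and the function name and the example counts are hypothetical.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table.

    k11: records in the cluster that contain the term
    k12: records in the cluster that do not contain the term
    k21: records outside the cluster that contain the term
    k22: records outside the cluster that do not contain the term
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


# Hypothetical counts: a term concentrated in the cluster vs. one spread evenly.
concentrated = log_likelihood_ratio(30, 10, 5, 55)
evenly_spread = log_likelihood_ratio(10, 10, 10, 10)
```

A term that is concentrated in one cluster scores high, which is why such terms make good cluster labels.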

Method 1: Select Clusters in the menu bar, then click Summary of clusters and save labels as whitelists to get the clustering table below. The top terms column shows the result of the LSI algorithm, the next column shows the result of the LLR algorithm, and the column after that shows the result of the MI algorithm.

Keyword analysis (which aspects does keyword analysis cover) - Figure 4
Keyword analysis (which aspects does keyword analysis cover) - Figure 5

Method 2: Select Clusters in the menu bar, then click Clusters explorer and choose Open to get the same clustering table as with Method 1.

References: text: Baidu; images: Weibo; translation: Baidu Translate

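This article is original content by LearningYard新学苑. Some images and text come from the internet; if there is any infringement, please contact us.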