How do I plot the cost/inertia values in sklearn KMeans?

html5 • November 25, 2022, 9:44 pm • Q&A

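Is it possible to plot the KMeans cost values? I want to plot the cost value against the KMeans iterations, as in the figure below.

Could you point me to any related threads? Thanks.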

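Answer

Inertia in KMeans

By cost, I assume you mean the inertia value that is computed at each iteration of a K-means run.

The K-means algorithm aims to choose centroids that minimize the inertia, also called the within-cluster sum-of-squares criterion. Inertia can be thought of as a measure of how internally coherent the clusters are.

This is the quantity KMeans tries to reduce with each iteration.

More details here.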
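
To make the definition concrete, here is a minimal pure-Python sketch of the within-cluster sum of squares. The function name `inertia` and the toy data are illustrative assumptions, not part of the original answer or of scikit-learn's API. For unweighted samples, this is the quantity that a fitted model reports as `kmeans.inertia_`.

```python
def inertia(points, centers):
    """Sum, over all points, of the squared distance to the nearest center."""
    total = 0.0
    for p in points:
        # Squared Euclidean distance from p to each center; keep the smallest
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return total

# Toy data (hypothetical): two tight pairs of points and one center per pair
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]

print(inertia(points, centers))  # each point is at squared distance 1 from its center -> 4.0
```

A lower value means the points sit closer to their assigned centroids, which is why the value shrinks as the iterations proceed.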

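Printing the inertia value at each iteration

After fitting KMeans(), you can get the final inertia value from kmeans.inertia_. If you want the inertia value at each iteration, one way is to set verbose=2.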

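import numpy as np
from sklearn.cluster import KMeans

def train_kmeans(X):
    # verbose=2 prints the inertia at every iteration; n_init=1 runs a single initialization
    kmeans = KMeans(n_clusters=5, verbose=2, n_init=1)
    kmeans.fit(X)
    return kmeans

X = np.random.random((1000, 7))  # dummy data: 1000 samples, 7 features
train_kmeans(X)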

Initialization complete
Iteration 0, inertia 545.5728914456803
Iteration 1, inertia 440.5225419317938
Iteration 2, inertia 431.87478970379755
Iteration 3, inertia 427.52125502838504
Iteration 4, inertia 425.75105209622967
Iteration 5, inertia 424.7788124997543
Iteration 6, inertia 424.2111904252263
Iteration 7, inertia 423.7217490965455
Iteration 8, inertia 423.29439165408354
Iteration 9, inertia 422.9243615021072
Iteration 10, inertia 422.54144662407566
Iteration 11, inertia 422.2677910840504
Iteration 12, inertia 421.98686844470336
Iteration 13, inertia 421.76289612029376
Iteration 14, inertia 421.59241427498324
Iteration 15, inertia 421.36516415785724
Iteration 16, inertia 421.23801796298704
Iteration 17, inertia 421.1065220191125
Iteration 18, inertia 420.85788031236586
Iteration 19, inertia 420.6053961581343
Iteration 20, inertia 420.4998816171483
Iteration 21, inertia 420.4436034595902
Iteration 22, inertia 420.39833211852346
Iteration 23, inertia 420.3583721574586
Iteration 24, inertia 420.32684273674226
Iteration 25, inertia 420.2786269304449
Iteration 26, inertia 420.24149714604516
Iteration 27, inertia 420.22255866139835
Iteration 28, inertia 420.2075247585145
Iteration 29, inertia 420.19985517233584
Iteration 30, inertia 420.18983415887305
Iteration 31, inertia 420.18584733421886
Converged at iteration 31: center shift 8.716337631121295e-33 within tolerance 8.370287188573764e-06

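Note: KMeans can re-initialize its centroids several times (controlled by n_init) and runs up to max_iter iterations for each initialization. To get a single list of inertia values, you must set n_init=1 so that only one initialization happens while the model is fitted. If you set n_init to a higher value, you will see several initializations, each with its own list of iterations, in the printed output.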
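
If you do run with a higher n_init, the printed log can still be separated into one list per initialization. The following is a rough sketch that assumes the log looks like the output shown above, with each run starting at an "Initialization complete" line. `split_runs` is a hypothetical helper name, and the log format is not a documented interface, so it may differ between scikit-learn versions.

```python
def split_runs(log_text):
    """Group 'Iteration i, inertia v' values by initialization.

    Assumes every run begins with an 'Initialization complete' line,
    as in the verbose output shown above.
    """
    runs = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("Initialization complete"):
            runs.append([])  # start a new run
        elif line.startswith("Iteration") and "inertia" in line:
            if not runs:  # tolerate a log without the header line
                runs.append([])
            # Take the text after 'inertia'; drop a trailing period if present
            value = line.split("inertia", 1)[1].strip().rstrip(".")
            runs[-1].append(float(value))
    return runs

# Hand-written sample in the same shape as the output above
sample = (
    "Initialization complete\n"
    "Iteration 0, inertia 545.5\n"
    "Iteration 1, inertia 440.5\n"
    "Converged at iteration 1: center shift 0.0 within tolerance 1e-05\n"
    "Initialization complete\n"
    "Iteration 0, inertia 600.0\n"
)
print(split_runs(sample))
```

Each inner list can then be plotted as its own curve, one per initialization.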

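Plotting the inertia value at each iteration

The problem is that (as far as I know) there is no sklearn parameter that stores these inertia values in a variable. So you may need to write a wrapper around it that redirects the verbose standard output into a text variable, and then extract the inertia value for each iteration.

You can use StringIO to capture the printed output produced by verbose=2, extract the values, and plot them.

Here is the full code: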

import io
import sys
import numpy as np
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Dummy data
X = np.random.random((1000, 7))

def train_kmeans(X):
    kmeans = KMeans(n_clusters=5, verbose=2, n_init=1)  # <-- n_init=1, verbose=2
    kmeans.fit(X)
    return kmeans

# HELPER FUNCTION
# Calls f(inp) and returns both its return value and everything it printed.
# Here, the return value is the model and the printed text is the verbose inertia log.

def redirect_wrapper(f, inp):
    old_stdout = sys.stdout
    new_stdout = io.StringIO()
    sys.stdout = new_stdout
    try:
        returned = f(inp)                # <- call function
        printed = new_stdout.getvalue()  # <- store printed output
    finally:
        sys.stdout = old_stdout          # <- always restore stdout, even on error
    return returned, printed


returned, printed = redirect_wrapper(train_kmeans, X)

# Extract inertia values from the "Iteration i, inertia v" lines
# (rstrip('.') tolerates a trailing period, in case the log line ends with one)
inertia = [float(line.split('inertia')[1].strip().rstrip('.'))
           for line in printed.split('\n') if line.startswith('Iteration')]

# Plot!
plt.plot(inertia)
plt.xlabel('Iteration')
plt.ylabel('Inertia')
plt.show()

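Edit: I have updated my answer with a general helper function that calls a given function (one that both returns and prints something) and hands back its printed output and its return value separately. In this case, the return value is the model, and the printed output is stored as text in a variable.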
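
An alternative that avoids parsing printed output is sketched below. It is not the approach of the answer above, and the name `inertia_history`, its parameters, and the random choice of starting centers are all assumptions made for illustration. The idea is to fit repeatedly with max_iter=1, pass the previous centers in through init, and record inertia_ after each single step. The recorded values are taken after each update, so they may not match the verbose log exactly.

```python
import numpy as np
from sklearn.cluster import KMeans

def inertia_history(X, n_clusters=5, max_steps=50, seed=0):
    """Record inertia_ after each single K-means step (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from randomly chosen data points as initial centers
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    history = []
    for _ in range(max_steps):
        # One iteration per fit, warm-started from the previous centers
        km = KMeans(n_clusters=n_clusters, init=centers, n_init=1, max_iter=1)
        km.fit(X)
        history.append(km.inertia_)
        if np.allclose(km.cluster_centers_, centers):
            break  # centers stopped moving, treat as converged
        centers = km.cluster_centers_
    return history

X = np.random.default_rng(0).random((1000, 7))  # dummy data, as in the answer
history = inertia_history(X)
print(len(history), history[-1])
```

The resulting list can be passed straight to plt.plot(history). The trade-off is that every step creates and fits a new estimator, which is slower than a single fit.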
