Python data analysis and visualization case study: industrial vehicle data analysis based on the k-means algorithm (.zip)

26 files in total

xml: 7

py: 7

csv: 3


Data analysis

Data visualization

k-means algorithm

27 views · uploaded 2022-12-07 19:04:32 · 1.04MB ZIP


工业汽车数据分析案例.zip (26 files)

Industrial_vehicle_data_analysis-master

工业汽车数据集应用实践

实验数据

Auto_Data_Labels.csv 3KB

Automobile price data _Raw_.csv 26KB

Auto_Data_Features.csv 64KB

预处理

预处理之后的数据.xlsx 25KB

数据预处理.py 1KB

聚类分析

将要聚类分析的数据.xlsx 16KB

聚类可视化.png 42KB

聚类分析.py 2KB

特征工程

特征工程.py 2KB

.idea

code.iml 474B

misc.xml 185B

csv-plugin.xml 909B

modules.xml 260B

.gitignore 47B

inspectionProfiles

profiles_settings.xml 174B

课程代码

ApplicationOfClustering.ipynb 711KB

kmeans案例.py 1KB

肘方法的应用.py 1KB

.idea

课程代码.iml 441B

misc.xml 185B

modules.xml 276B

.gitignore 47B

inspectionProfiles

profiles_settings.xml 174B

IntroToUnsupervisedLearning.ipynb 592KB

不同数据集的聚类及CH系数应用.py 1KB

图片压缩实战.py 2KB

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin
from sklearn.datasets import load_sample_image
from sklearn.utils import shuffle
from time import time

n_colors = 64

# Load the Summer Palace photo
china = load_sample_image("china.jpg")

# Convert to floats instead of the default 8 bits integer coding. Dividing by
# 255 is important so that plt.imshow works well on float data (need to
# be in the range [0-1])
china = np.array(china, dtype=np.float64) / 255

# Load Image and transform to a 2D numpy array.
w, h, d = original_shape = tuple(china.shape)
assert d == 3
image_array = np.reshape(china, (w * h, d))

print("Fitting model on a small sub-sample of the data")
t0 = time()
image_array_sample = shuffle(image_array, random_state=0)[:1000]
kmeans = KMeans(n_clusters=n_colors, random_state=0).fit(image_array_sample)
print("done in %0.3fs." % (time() - t0))

# Get labels for all points
print("Predicting color indices on the full image (k-means)")
t0 = time()
labels = kmeans.predict(image_array)
print("done in %0.3fs." % (time() - t0))

# codebook_random = shuffle(image_array, random_state=0)[:n_colors + 1]
# print("Predicting color indices on the full image (random)")
# t0 = time()
# labels_random = pairwise_distances_argmin(codebook_random,
#                                           image_array,
#                                           axis=0)
# print("done in %0.3fs." % (time() - t0))


def recreate_image(codebook, labels, w, h):
    """Recreate the (compressed) image from the code book & labels"""
    d = codebook.shape[1]
    image = np.zeros((w, h, d))
    label_idx = 0
    for i in range(w):
        for j in range(h):
            image[i][j] = codebook[labels[label_idx]]
            label_idx += 1
    return image


# Display all results, alongside original image
plt.figure(1)
plt.clf()
ax = plt.axes([0, 0, 1, 1])
plt.axis('off')
plt.title('Original image (96,615 colors)')
plt.imshow(china)

plt.figure(2)
plt.clf()
ax = plt.axes([0, 0, 1, 1])
plt.axis('off')
plt.title('Quantized image (64 colors, K-Means)')
plt.imshow(recreate_image(kmeans.cluster_centers_, labels, w, h))

# plt.figure(3)
# plt.clf()
# ax = plt.axes([0, 0, 1, 1])
# plt.axis('off')
# plt.title('Quantized image (64 colors, Random)')
# plt.imshow(recreate_image(codebook_random, labels_random, w, h))

plt.show()
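The script above reduces the photo to a 64-color palette and stores one palette index per pixel. As a rough sketch of why that counts as compression, the snippet below estimates raw (unencoded) storage before and after quantization. The helper name `palette_storage_bits` is hypothetical, the 640 × 427 image size is an assumption about the sample photo, and the estimate ignores any JPEG or PNG encoding, so treat the numbers as illustrative only.

```python
import math


def palette_storage_bits(width, height, n_colors, bits_per_channel=8, channels=3):
    """Estimate raw storage (in bits) before and after palette quantization.

    Hypothetical helper for illustration; it ignores file-format overhead
    and any entropy coding.
    """
    n_pixels = width * height
    # Original: every pixel stores all of its colour channels.
    original = n_pixels * channels * bits_per_channel
    # Quantized: every pixel stores one palette index, plus the palette itself.
    index_bits = max(1, math.ceil(math.log2(n_colors)))
    quantized = n_pixels * index_bits + n_colors * channels * bits_per_channel
    return original, quantized


# Assumed image size (640 x 427) and the 64-colour palette used in the script.
original, quantized = palette_storage_bits(640, 427, n_colors=64)
ratio = original / quantized
print(f"original: {original} bits, quantized: {quantized} bits, ratio: {ratio:.2f}x")
```

Under these assumptions each pixel shrinks from 24 bits to a 6-bit index, so the raw size drops by roughly a factor of four; the palette itself adds only 64 × 24 bits.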
