Sorting Python dictionaries

Dictionary problems

Navigation:

1. Where the problem came from
2. Learning dict
*3. Using numpy

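1. Where the problem came from

While implementing kNN for cs231n assignment1, I needed to count the elements of a list and find the one that occurs most often.

The problem itself is not hard, but using Python's dict for it showed me that my understanding of dictionaries still had some gaps.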

def predict_labels(self, dists, k=1):
    """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.
    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance between the ith test point and the jth training point.
    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].  
    """
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test)
    for i in range(num_test):
      # A list of length k storing the labels of the k nearest neighbors to
      # the ith test point.
      closest_y = []
      #########################################################################
      # TODO:                                                                 #
      # Use the distance matrix to find the k nearest neighbors of the ith    #
      # testing point, and use self.y_train to find the labels of these       #
      # neighbors. Store these labels in closest_y.                           #
      # Hint: Look up the function numpy.argsort.                             #
      #########################################################################
      order=np.argsort(dists[i,:])
      closest_y=self.y_train[order[:k]]
      #########################################################################
      # TODO:                                                                 #
      # Now that you have found the labels of the k nearest neighbors, you    #
      # need to find the most common label in the list closest_y of labels.   #
      # Store this label in y_pred[i]. Break ties by choosing the smaller     #
      # label.                                                                #
      #########################################################################
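      # First attempt: convert closest_y to a dict directly. This was the wrong
      # idea, because dict() expects key-value pairs, not a flat list of labels.
      #closest_dict=dict(closest_y)
      #dict_key=sorted(closest_dict.keys(),reverse=True)
      #y_pred[i]=dict_key[0]
      # Use Counter to build a label -> count dict, then take the label with
      # the highest count, breaking ties by choosing the smaller label.
      c=Counter(closest_y)
      y_pred[i]=min(c.keys(),key=lambda label:(-c[label],label))
      # Alternative for non-negative integer labels:
      #y_pred[i]=np.argmax(np.bincount(closest_y))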
    return y_pred
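
The voting step can also be tried out on its own, outside the classifier. The following is only a minimal sketch, assuming the labels are hashable and can be compared with each other; `vote` is a hypothetical helper name and is not part of the assignment code.

```python
from collections import Counter

def vote(labels):
    """Return the most common label; ties go to the smaller label."""
    counts = Counter(labels)
    # Sort key: higher count first (hence the negation), then smaller label.
    return min(counts, key=lambda label: (-counts[label], label))

# 4 and 9 both occur twice, so the smaller label wins the tie.
print(vote([4, 8, 4, 7, 9, 9]))
```

Putting the negated count and the label into one tuple lets a single call to min() handle both the majority vote and the tie-break.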

2. The dict type

Creating a dictionary

Using the dict() function or a literal

a = dict(one=1, two=2, three=3)
b = {'one': 1, 'two': 2, 'three': 3}
c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
d = dict([('two', 2), ('one', 1), ('three', 3)])
e = dict({'three': 3, 'one': 1, 'two': 2})
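
A quick way to confirm that these forms build the same dictionary is to compare them. This is a small sketch that simply repeats the five definitions so it runs on its own:

```python
a = dict(one=1, two=2, three=3)
b = {'one': 1, 'two': 2, 'three': 3}
c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
d = dict([('two', 2), ('one', 1), ('three', 3)])
e = dict({'three': 3, 'one': 1, 'two': 2})

# Dict equality compares key-value pairs and ignores the order
# in which the pairs were written.
print(a == b == c == d == e)
```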

Counter turns a list or array into a dict of counts

from collections import Counter
a=[4,8,4,7,9,9]
dict_a=Counter(a)
print(dict_a)

# Counter({4: 2, 9: 2, 8: 1, 7: 1})
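
Since Counter is a subclass of dict, it can be used wherever a dict is expected, and it adds a few conveniences for counting. A short sketch of the ones relevant here, reusing the same list (the variable name `counts` is just for illustration):

```python
from collections import Counter

counts = Counter([4, 8, 4, 7, 9, 9])

print(isinstance(counts, dict))  # Counter is a dict subclass
print(counts[4])                 # count of an element that is present
print(counts[5])                 # a missing element counts as 0, no KeyError
print(counts.most_common(1))     # list with the most frequent (element, count) pair
```

Note that most_common() on its own does not implement the "smaller label wins" tie-break required by the assignment, so the tie still has to be handled explicitly.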

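A dict cannot be indexed by position. (Since Python 3.7 a dict preserves insertion order, but lookups still go by key.)

Given dic = {'a':1 , 'b':2 , 'c': 3}, dic[0] is not a syntax error; it raises KeyError at run time, because 0 is looked up as a key and no such key exists.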
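
The behaviour of dic[0] is easy to check. The sketch below assumes Python 3.7 or later, where iteration follows insertion order; `first_key` and `first_value` are illustrative names.

```python
dic = {'a': 1, 'b': 2, 'c': 3}

try:
    dic[0]
except KeyError as err:
    # 0 is treated as a key, and the dict has no such key.
    print("KeyError:", err)

# Positional access has to go through a sequence built from the items.
first_key, first_value = list(dic.items())[0]
print(first_key, first_value)
```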

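Implementing dictionary sorting

Sort by key
Sort by value

Use the built-in sorted() function

sorted(iterable, key=None, reverse=False)

key a function that extracts the sort criterion from each element, for example the key or the value of a pair
reverse ascending or descending; the default False sorts in ascending order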

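# Common forms
dic = {'a':3 , 'b':2 , 'c': 1}
sorted(dic.items(),key=lambda d:d[0],reverse=True)
# sort by key, descending
# [('c', 1), ('b', 2), ('a', 3)]
sorted(dic.items(),key=lambda d:d[1],reverse=True)
# sort by the second element of each pair (the value), descending
# [('a', 3), ('b', 2), ('c', 1)]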

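Note

No matter how you sort, the dict itself keeps its original order; sorted() returns a new list and leaves the dict unchanged.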
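
This can be seen directly, and if a dict in sorted order is really wanted, one option is to build a new dict from the sorted pairs. A minimal sketch, assuming Python 3.7 or later (where dict preserves insertion order); `by_value` and `sorted_dic` are illustrative names.

```python
dic = {'a': 3, 'b': 2, 'c': 1}

# sorted() returns a new list of (key, value) tuples, ascending by value.
by_value = sorted(dic.items(), key=lambda kv: kv[1])
print(by_value)

# The original dict is untouched and still iterates as a, b, c.
print(list(dic))

# Building a new dict from the sorted pairs keeps the sorted order.
sorted_dic = dict(by_value)
print(list(sorted_dic))
```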

In Python 2, the iterator version is iteritems().

In Python 3, items() returns a view of the dict's key-value pairs:

>>> dic.items()
dict_items([('a', 3), ('b', 2), ('c', 1)])

lambda expressions

# Create an anonymous function object
>>> func=lambda x:x+2
>>> func(2)
4

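In sorted(dic.items(), key=lambda asd: asd[1]), sorted() passes each element of the first argument (a key-value pair) to the function given as key, and that function picks out either the key ([0]) or the value ([1]) to sort by.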
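
To make the role of the key function concrete, the sketch below first applies the lambda to each pair by hand, then compares it with operator.itemgetter from the standard library, which is a common alternative to an indexing lambda. The variable names are illustrative.

```python
from operator import itemgetter

dic = {'a': 3, 'b': 2, 'c': 1}

# What sorted() sees for each pair when key=lambda asd: asd[1].
get_value = lambda asd: asd[1]
print([get_value(pair) for pair in dic.items()])

# itemgetter(1) builds a function equivalent to lambda pair: pair[1].
with_lambda = sorted(dic.items(), key=get_value)
with_itemgetter = sorted(dic.items(), key=itemgetter(1))
print(with_lambda == with_itemgetter)
```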

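3. Using numpy functions

Finding the most frequent element in an array is useful in voting functions.

The bincount function makes this simpler; note that it only accepts non-negative integers.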

>>> import numpy as np
>>> y=[3,3,5,6,7,9,9,8,8,8]
>>> np.bincount(y)
array([0, 0, 0, 2, 0, 1, 1, 1, 3, 2])
>>> np.argmax(np.bincount(y)) # index of the largest count, i.e. the most frequent element
8
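
When the labels may be negative or are not integers, bincount cannot be used. One possible alternative is np.unique with return_counts=True. This is only a sketch of that approach, and `most_common_value` is a hypothetical helper name.

```python
import numpy as np

def most_common_value(arr):
    """Return the most frequent value in arr; ties go to the smaller value."""
    # np.unique returns the distinct values in sorted order and,
    # with return_counts=True, how often each one occurs.
    values, counts = np.unique(arr, return_counts=True)
    # argmax returns the first maximum, which is the smallest tied value.
    return values[np.argmax(counts)]

y = [3, 3, 5, 6, 7, 9, 9, 8, 8, 8]
print(most_common_value(y))
print(most_common_value([-1, -1, 2, 2, 0]))
```

Because the values come back sorted and argmax picks the first maximum, this keeps the same "smaller label wins" tie-break as the bincount version.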