2019-07-26 01:57 · Edited · Beijing University of Posts and Telecommunications · Algorithm Engineer


4. Data visualization: Visualizing earnings based on college majors (2010-2012)

The dataset is stored in the file recent-grads.csv. It contains information on the earnings of college majors in the US from 2010 to 2012.

It can be downloaded from here: https://github.com/fivethirtyeight/data/tree/master/college-majors

In this project, I will explore the dataset, look for patterns in earnings across majors, and plot them with the matplotlib library.

The code was written in Jupyter.
Load the data:

import pandas as pd

recent_grads=pd.read_csv('./data/recent-grads.csv')
print(recent_grads.columns)
print(recent_grads.info())
print(recent_grads.describe())
print(recent_grads.head(1))

Handle missing values:

raw_data_count=recent_grads.shape[0]
print(raw_data_count)
cleaned_data_count=recent_grads.dropna().shape[0]
print(cleaned_data_count)

Output:
173
172
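
The two counts above differ by one, so exactly one row contains a missing value. A minimal sketch for locating such rows before dropping them follows. The small DataFrame is made-up illustration data (only the column names echo the dataset), and `rows_with_missing` is a hypothetical helper name:

```python
import numpy as np
import pandas as pd

def rows_with_missing(df):
    """Return only the rows of df that contain at least one NaN."""
    return df[df.isna().any(axis=1)]

# Made-up illustration data, not taken from recent-grads.csv.
demo = pd.DataFrame({
    'Major': ['A', 'B', 'C'],
    'Men': [100.0, np.nan, 300.0],
    'Median': [50000, 40000, 30000],
})

missing = rows_with_missing(demo)   # the row for major 'B'
cleaned = demo.dropna()             # returns a new DataFrame; demo is unchanged
print(missing)
print(len(demo), len(cleaned))
```

Because dropna() returns a new DataFrame rather than modifying the original, the result has to be assigned (for example recent_grads = recent_grads.dropna()) if later steps should work on the cleaned data.
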
Draw scatter plots to look at the relationships between attributes:

import matplotlib.pyplot as plt
%matplotlib inline

recent_grads.plot(x='Full_time',y='Median',kind='scatter')
recent_grads.plot(x='Unemployed',y='Median',kind='scatter')
recent_grads.plot(x='Men',y='Median',kind='scatter')
recent_grads.plot(x='Women',y='Median',kind='scatter')

This produces four scatter plots of Median against Full_time, Unemployed, Men, and Women.
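
Scatter plots show a relationship visually; a correlation coefficient can put a rough number on it. The sketch below is one possible way to do that, not part of the original analysis. The data is made up, and `correlation_with_median` is a hypothetical helper name:

```python
import pandas as pd

def correlation_with_median(df, columns, target='Median'):
    """Pearson correlation of each listed column with the target column."""
    return df[columns].corrwith(df[target])

# Made-up illustration data: one perfectly increasing and one
# perfectly decreasing relationship with Median.
demo = pd.DataFrame({
    'Full_time': [1.0, 2.0, 3.0, 4.0],
    'Unemployed': [4.0, 3.0, 2.0, 1.0],
    'Median': [10.0, 20.0, 30.0, 40.0],
})

corr = correlation_with_median(demo, ['Full_time', 'Unemployed'])
print(corr)
```

On the real data the same call could be made with recent_grads and the four column names used in the plots; values near 0 would suggest little linear relationship with Median.
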

Next, we draw histograms to look at the distribution of each attribute:

columns=['Median','Employed','Unemployment_rate','Women','Men']
recent_grads['Men'].hist()
fig=plt.figure(figsize=(6,18))
for i,col in enumerate(columns):
    ax=fig.add_subplot(len(columns),1,i+1)
    # draw each histogram on its own subplot and label it
    recent_grads[col].hist(ax=ax,color='orange')
    ax.set_title(col)
plt.show()
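
To make clear what each histogram represents: hist() splits the value range into equal-width bins (10 by default in pandas) and draws the number of values in each bin. A small sketch of that binning step using NumPy, with made-up values and 3 bins chosen only for readability:

```python
import numpy as np

# Made-up values, not taken from the dataset.
values = np.array([1, 2, 3, 4, 5, 6])

# counts[i] is the number of values between edges[i] and edges[i + 1].
counts, edges = np.histogram(values, bins=3)
print(counts)
print(edges)
```
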

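To make the relationship between the number of employed graduates and salary easier to see, use the scatter_matrix function to build a scatter plot matrix: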

from pandas.plotting import scatter_matrix
scatter_matrix(recent_grads[['Employed','Median']],figsize=(10,10),color='red')

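A note on this matrix: by default, the diagonal cells show the histogram of each column, and the off-diagonal cells show the scatter plot of one column against the other.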

Next, let's do something more interesting and look at the share of women in the 10 highest-paying and 10 lowest-paying majors:

recent_grads[:10].plot.bar(x='Major',y='ShareWomen')
plt.legend(loc='upper left')
plt.title('The 10 highest paying majors.')
# the last 10 rows (a slice starting at 162 would return 11 of the 173 rows)
recent_grads[-10:].plot(x='Major',y='ShareWomen',kind='bar')
plt.title('The 10 lowest paying majors.')
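
Taking the first and last rows only gives the highest and lowest paying majors if the rows are already ordered by median salary. A sketch that selects them explicitly instead, assuming the salary column is Median as in the plots; the data here is made up and n is set to 2 only to keep the example small:

```python
import pandas as pd

# Made-up illustration data, deliberately not sorted by Median.
demo = pd.DataFrame({
    'Major': ['A', 'B', 'C', 'D', 'E'],
    'Median': [30000, 90000, 50000, 20000, 70000],
    'ShareWomen': [0.8, 0.1, 0.5, 0.9, 0.3],
})

n = 2
top = demo.nlargest(n, 'Median')      # highest Median first
bottom = demo.nsmallest(n, 'Median')  # lowest Median first
print(top[['Major', 'ShareWomen']])
print(bottom[['Major', 'ShareWomen']])
```

On the real data, recent_grads.nlargest(10, 'Median').plot.bar(x='Major', y='ShareWomen') would then draw the same kind of chart without depending on row order.
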

Compare the numbers of men and women in the highest-paying majors:

recent_grads[:10].plot.bar(x='Major',y=['Men','Women'])
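
The Men and Women columns are head counts, so the bars depend on how large each major is. A sketch of turning the counts into a share, which makes majors of different sizes comparable; this assumes the share of women is Women / (Men + Women), and the data and the column name WomenShareCheck are made up for illustration:

```python
import pandas as pd

# Made-up illustration data, not taken from the dataset.
demo = pd.DataFrame({
    'Major': ['A', 'B'],
    'Men': [75, 20],
    'Women': [25, 80],
})

# Share of women among all graduates counted for each major.
demo['WomenShareCheck'] = demo['Women'] / (demo['Men'] + demo['Women'])
print(demo)
```
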
