2019-07-28 12:01 (edited) · Beijing Institute of Technology · Algorithm Engineer


Influence Maximization in Social Networks: Implementing the Linear Threshold (LT) Model in Python

1. Environment setup

2. Implementing the LT diffusion model

3. Testing the LT diffusion model

4. Data in the test file Wiki-Vote.txt


1. Environment setup

Environment: Windows 7, PyCharm, Anaconda2

In this implementation, every node's threshold is set to 0.5.
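As a rough illustration of what a threshold of 0.5 means, the sketch below checks the activation rule for a single node, assuming every incoming edge carries the default weight of 1/in-degree used by the implementation in section 2. The helper name `is_activated` is hypothetical and is not part of the original code.

```python
def is_activated(active_in_neighbors, in_degree, threshold=0.5):
    """Return True if the summed influence of active in-neighbours reaches the threshold.

    Assumes every incoming edge has weight 1/in_degree (the default in section 2).
    """
    if in_degree == 0:
        return False  # a node with no in-edges can never be influenced
    influence = active_in_neighbors * (1.0 / in_degree)
    return influence >= threshold


print(is_activated(1, 2))  # one of two in-neighbours active: 0.5 >= 0.5
print(is_activated(1, 3))  # one of three in-neighbours active: about 0.33 < 0.5
```

Under this assumption, a node becomes active as soon as at least half of its in-neighbours are active.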

2. Implementing the LT diffusion model

linear_threshold.py (LT diffusion model)

# -*- coding: utf-8 -*-
"""
Implement linear threshold models
Influence maximization in social networks: implementation of the
linear threshold (LT) diffusion model
"""
import copy
import networkx as nx

__all__ = ['linear_threshold']

#-------------------------------------------------------------------------
#  Some Famous Diffusion Models
#-------------------------------------------------------------------------

def linear_threshold(G, seeds, steps=0):           # linear threshold (LT) diffusion
  """
  Parameters
  ----------
  G : networkx graph
      The graph on which influence spreads.

  seeds: list of nodes
      The seed nodes of the graph

  steps: int
      The number of steps (rounds, i.e. depth) to diffuse
      When steps <= 0, the model diffuses until no more nodes
      can be activated

  Return
  ------
  layer_i_nodes : list of list of activated nodes
    layer_i_nodes[0]: the seeds
    layer_i_nodes[k]: the nodes newly activated at the kth diffusion step

  Notes
  -----
  1. Each node is supposed to have an attribute "threshold".  If not, the
     default value is given (0.5).
  2. Each edge is supposed to have an attribute "influence".  If not, the
     default value is given (1/in_degree of the target node).

  References
  ----------
  [1] Granovetter, Mark. Threshold models of collective behavior.
      The American Journal of Sociology, 1978.
  """

  # is_multigraph() also catches subclasses of MultiGraph / MultiDiGraph
  if G.is_multigraph():
    raise ValueError(
        "linear_threshold() is not defined for graphs with multiedges.")

  # make sure the seeds are in the graph
  for s in seeds:
    if s not in G:
      raise ValueError("seed %r is not in graph" % (s,))

  # change to directed graph
  if not G.is_directed():
    DG = G.to_directed()
  else:
    DG = copy.deepcopy(G)        # deep copy, so the caller's graph is not modified

  # init thresholds
  # (DG.nodes[n] needs networkx >= 2.0; the older DG.node[n] was removed in 2.4)
  for n in DG.nodes():
    if 'threshold' not in DG.nodes[n]:
      DG.nodes[n]['threshold'] = 0.5
    elif DG.nodes[n]['threshold'] > 1:
      raise ValueError("node threshold: %s cannot be larger than 1"
                       % DG.nodes[n]['threshold'])

  # init influences
  in_deg = DG.in_degree()       # in-degree of every node
  for e in DG.edges():
    if 'influence' not in DG[e[0]][e[1]]:
      DG[e[0]][e[1]]['influence'] = 1.0 / in_deg[e[1]]    # default weight: 1 / in-degree of the target
    elif DG[e[0]][e[1]]['influence'] > 1:
      raise ValueError("edge influence: %s cannot be larger than 1"
                       % DG[e[0]][e[1]]['influence'])

  # perform diffusion
  A = copy.deepcopy(seeds)
  if steps <= 0:
    # perform diffusion until no more nodes can be activated
    return _diffuse_all(DG, A)
  # perform diffusion for at most "steps" rounds only
  return _diffuse_k_rounds(DG, A, steps)

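def _diffuse_all(G, A):
  # Diffuse until a round activates nothing new.  Note that the layer of
  # that final round is still appended, so the last layer is always empty.
  layer_i_nodes = [ ]
  layer_i_nodes.append([i for i in A])
  while True:
    len_old = len(A)
    A, activated_nodes_of_this_round = _diffuse_one_round(G, A)
    layer_i_nodes.append(activated_nodes_of_this_round)
    if len(A) == len_old:
      break
  return layer_i_nodes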

def _diffuse_k_rounds(G, A, steps):
  layer_i_nodes = [ ]
  layer_i_nodes.append([i for i in A])
  while steps > 0 and len(A) < len(G):
    len_old = len(A)
    A, activated_nodes_of_this_round = _diffuse_one_round(G, A)
    layer_i_nodes.append(activated_nodes_of_this_round)
    if len(A) == len_old:
      break
    steps -= 1
  return layer_i_nodes

def _diffuse_one_round(G, A):
  # One synchronous round: an inactive node becomes active when the summed
  # influence of its already-active in-neighbours reaches its threshold.
  active = set(A)               # set copy of A for fast membership tests
  activated_nodes_of_this_round = set()
  for s in A:
    for nb in G.successors(s):
      if nb in active:
        continue
      active_nb = active.intersection(G.predecessors(nb))
      if _influence_sum(G, active_nb, nb) >= G.nodes[nb]['threshold']:
        activated_nodes_of_this_round.add(nb)
  A.extend(activated_nodes_of_this_round)
  return A, list(activated_nodes_of_this_round)

def _influence_sum(G, froms, to):
  influence_sum = 0.0
  for f in froms:
    influence_sum += G[f][to]['influence']
  return influence_sum
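To make the layer-by-layer behaviour easier to follow, here is a dependency-free sketch of the same idea on a plain edge list. It is a simplified illustration, not the module above: the function name `lt_layers` and the toy graph are assumptions, every node uses the same threshold, every edge uses the default weight 1/in-degree, and the sketch stops without appending a trailing empty layer.

```python
def lt_layers(edges, seeds, threshold=0.5):
    """Layered LT diffusion over directed (source, target) edges.

    Simplifying assumptions: one shared threshold and edge weight 1/in-degree.
    """
    preds, succs = {}, {}
    for u, v in edges:
        preds.setdefault(v, set()).add(u)
        succs.setdefault(u, set()).add(v)

    active = set(seeds)
    layers = [list(seeds)]
    while True:
        newly = set()
        for s in active:
            for nb in succs.get(s, ()):
                if nb in active:
                    continue
                # Each in-edge of nb weighs 1/in-degree, so the influence sum
                # is (number of active in-neighbours) / in-degree.
                influence = len(preds[nb] & active) / float(len(preds[nb]))
                if influence >= threshold:
                    newly.add(nb)
        if not newly:
            break  # no node was activated in this round: diffusion is over
        active |= newly
        layers.append(sorted(newly))
    return layers


# Toy graph (hypothetical): node 5 has three in-neighbours, only one of which
# ever becomes active, so it stays inactive.
edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (6, 5), (7, 5)]
print(lt_layers(edges, [1]))  # [[1], [2, 3], [4]]
```

Node 4 is activated in the second round because both of its in-neighbours are active by then, while node 5 only ever receives 1/3 of the total influence, which is below 0.5.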

3. Testing the LT diffusion model

test_linear_threshold.py (test script for the LT model)

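#!/usr/bin/env python
# coding=UTF-8                 # needed on Python 2 when the file contains non-ASCII characters
"""Test Diffusion Models
----------------------------
"""
from timeit import default_timer   # portable timer; time.clock() was removed in Python 3.8

import networkx as nx
from linear_threshold import linear_threshold

if __name__ == '__main__':
    start = default_timer()
    datasets = []
    with open("Wiki-Vote.txt", "r") as f:          # read the edge data
        for row in f:
            row = row.strip()
            if not row or row.startswith('#'):     # skip blank lines and the header comment
                continue
            split_row = row.split()                # node IDs separated by a tab or spaces
            datasets.append((int(split_row[0]), int(split_row[1])))  # store each edge as a tuple

    G = nx.DiGraph()                   # create an empty directed graph G
    G.add_edges_from(datasets)         # add the list of edges to G
    layers = linear_threshold(G, [25], 2)   # run the LT model from seed node 25 for at most 2 rounds
    # Drop the last layer.  With steps=2 this discards the second round, so
    # only the seeds and the nodes they activate in the first round remain.
    del layers[-1]
    length = 0
    for i in range(len(layers)):
        length = length + len(layers[i])
    lengths = length - len(layers[0])   # number of nodes activated by the seeds
    end = default_timer()
    # Output on the edges listed in section 4 (order within a layer may vary)
    print(layers)  #[[25], [33, 3, 6, 8, 55, 80, 50, 19, 54, 23, 75, 28, 29, 30, 35]]
    print(lengths) #15
    print('Running time: %s Seconds' % (end - start))  # print the running time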
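If the goal is simply to count every node activated in the returned layers, one possible alternative to deleting a layer and subtracting the seeds is to sum over everything after the first layer. The helper name `count_activated` is hypothetical; this is a sketch, not part of the original test script.

```python
def count_activated(layers):
    """Count nodes activated after the seed layer; empty layers add nothing."""
    return sum(len(layer) for layer in layers[1:])


# Example layer lists in the shape returned by linear_threshold():
# layers[0] holds the seeds, later entries hold newly activated nodes.
print(count_activated([[25], [33, 3, 6], []]))  # 3
print(count_activated([[25]]))                  # 0
```

A trailing empty layer, such as the one produced when diffusion runs until nothing new is activated, contributes zero to the sum, so it does not need to be removed first.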

4. Data in the test file Wiki-Vote.txt

Note: the test file Wiki-Vote.txt contains the following data (each row is one directed edge of the graph)

# FromNodeId ToNodeId
30 1412
30 3352
30 5254
30 5543
30 7478
3 28
3 30
3 39
3 54
3 108
3 152
3 178
3 182
3 214
3 271
3 286
3 300
3 348
3 349
3 371
3 567
3 581
3 584
3 586
3 590
3 604
3 611
3 8283
25 3
25 6
25 8
25 19
25 23
25 28
25 29
25 30
25 33
25 35
25 50
25 54
25 55
25 75
25 80
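The test output of 15 activated nodes for seed 25 can be checked by hand on these edges. The sketch below recomputes the in-degrees and default edge weights (1/in-degree) for the out-neighbours of node 25, assuming only node 25 is active and every threshold is 0.5. The dictionary layout and variable names are illustrative choices, not part of the original code.

```python
from collections import Counter

# The edges listed above, grouped by source node.
targets = {
    30: [1412, 3352, 5254, 5543, 7478],
    3: [28, 30, 39, 54, 108, 152, 178, 182, 214, 271, 286, 300, 348, 349,
        371, 567, 581, 584, 586, 590, 604, 611, 8283],
    25: [3, 6, 8, 19, 23, 28, 29, 30, 33, 35, 50, 54, 55, 75, 80],
}

# In-degree of every node that appears as an edge target.
in_degree = Counter(v for vs in targets.values() for v in vs)

# Out-neighbours of 25 that also receive an edge from another node.
shared = sorted(v for v in targets[25] if in_degree[v] > 1)
print(shared)  # [28, 30, 54]

# With only node 25 active, each out-neighbour receives 1/in-degree influence.
activated = [v for v in targets[25] if 1.0 / in_degree[v] >= 0.5]
print(len(activated))  # 15
```

Nodes 28, 30 and 54 each have in-degree 2, so the edge from 25 contributes exactly 0.5 and just meets the threshold; the other twelve out-neighbours have in-degree 1 and receive the full weight of 1.0.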
