CCF YOCSEF哈尔滨
“大数据 之 图数据计算模型和框架”
学术报告会 邀请函
中国计算机学会青年计算机科技论坛-哈尔滨分论坛
CCF Young Computer Scientists & Engineers Forum-Harbin Branch
CCF YOCSEF-HARBIN
于
哈尔滨工程大学 21号楼4楼426室
举行 学术报告会,敬请光临
报告会题目
大数据 之 图数据计算模型和框架
程 序
8:00—8:30 签到;
8:30—11:30 报告会
特邀讲者:于 旭 教授,香港中文大学
演讲题目:I/O Efficient: Computing SCCs in Massive Graphs
特邀讲者:邹 磊 副教授,北京大学
演讲题目:基于图的海量RDF数据管理
11:30-11:40 全体合影
12:00-14:00 午餐
执行主席:
潘海为 哈尔滨工程大学 副教授,YOCSEF哈尔滨主席
张志强 哈尔滨工程大学 教授,YOCSEF哈尔滨 荣誉委员
张可佳 哈尔滨工程大学 讲师,YOCSEF哈尔滨 委员
参加人员:IT领域专业人士、媒体、其他有兴趣者
大数据 之 图数据计算模型和框架
“大数据”时代已经降临,在商业、经济及其他领域中,决策将日益基于数据和分析而作出,而并非基于经验和直觉。图数据是大数据的重要组成部分,研究重点关注的是数据的关联性。图数据不仅仅是指互联网上的图数据。现在应用处理的很多数据都具有图数据的特性,而且这种趋势随着应用及数据的日趋复杂变得愈来愈明显。如何有效地从数据中获得有价值的信息,势必使数据的关联性得到更多的关注。对于大量的关联性操作来说,关系型数据库的处理能力有限,因此作为非关系型数据库NoSQL重要分支的图数据库应运而生。本次报告会我们邀请了国内外知名学者分享188体育投注:大数据时代的图数据的计算模型和框架,图数据处理方法,并进行深入的研讨。
特邀讲者: 于旭
Dr Jeffrey Xu Yu is a Professor in the Department of Systems Engineering and Engineering Management, the
报告题目:
I/O Efficient: Computing SCCs in Massive Graphs
报告提要:
A strongly connected component (SCC) is a maximal subgraph of a directed graph G in which every pair of nodes are reachable from each other in the SCC. With such a property, a general directed graph can be represented by a directed acyclic graph (DAG) by contracting an SCC of G to a node in DAG. In many real applications that need graph pattern matching, topological sorting, or reachability query processing, the best way to deal with a general directed graph is to deal with its DAG representation. Therefore, finding all SCCs in a directed graph G is a critical operation. The existing in-memory algorithms based on depth first search (DFS) can find all SCCs in linear time w.r.t. the size of a graph. However, when a graph cannot resident entirely in the main memory, the existing external or semi-external algorithms to find all SCCs have limitation to achieve high I/O efficiency. In this talk, we discuss new I/O efficient semi-external algorithms to find all SCCs for a massive directed graph G that cannot reside in main memory entirely. To overcome the deficiency of the existing DFS based semi-external algorithm that heavily relies on a total order, we explore a weak order based on which we investigate new algorithms. We give a new two phase algorithm, namely, tree construction and tree search. In the tree construction phase, a spanning tree of G can be constructed in bounded sequential scans of G. In the tree search phase, it needs to sequentially scan the graph once to find all SCCs. In addition, we discuss a new single phase algorithm, which combines the tree construction and tree search phases into a single phase, with three new optimization techniques. They are early acceptance, early rejection, and batch processing. By the single phase algorithm with the new optimization techniques, we can significantly reduce the number of I/Os and CPU cost.
特邀讲者: 邹磊
邹磊,副教授,男,1981年7月出生,安徽安庆人。邹磊于2003年和2009年毕业于华中科技大学计算机科学与技术学院,分别获得工学学士学位和工学博士学位。攻读博士期间先后访问香港科技大学计算机科学与工程系和加拿大滑铁卢大学计算机系。其博士学位论文《图数据库中的子图查询方法研究》于2010年获得中国计算机学会优秀博士学位论文提名奖和湖北省优秀博士学位论文奖。邹磊于2009年7月入职北京大学计算机科学技术研究所,任讲师。其于2012年8月晋升为副教授,硕士生导师。目前主要方向为“海量图数据的管理”和“基于图的RDF数据管理”等数据库研究领域。详细信息见其个人主页 http://www.icst.pku.edu.cn/intro/leizou/index.html。
报告题目:
基于图的海量RDF数据管理
报告提要:
RDF(资源描述框架)是W
所有评论仅代表网友意见