博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Parallel Decision Tree
阅读量:7249 次
发布时间:2019-06-29

本文共 3235 字,大约阅读时间需要 10 分钟。

Decision Tree such as C4.5 is easy to parallel. Following is an example.

This is a non-parallel version:

public void learnFromDataSet(Iterable
> dataset){ for(Sample sample : dataset){ model.addSample((MapBasedBinarySample
)sample); } Queue
> Q = new LinkedList
>(); TreeNode
root = model.selectRootTreeNode(); model.addTreeNode(root); Q.add(root); while (!Q.isEmpty()){ TreeNode v = Q.poll(); if(v.getDepth() >= model.getMaxDepth()){ continue; } FeatureSplit
featureSplit = model.selectFeature(v); if(featureSplit.getFeatureId() == null){ continue; } v.setFeatureSplit(featureSplit); Pair
, TreeNode
> children = model.newTreeNode(v, featureSplit); TreeNode leftNode = children.getKey(); TreeNode rightNode = children.getValue(); if(leftNode != null && leftNode.getSampleSize() > model.getMinSampleSizeInNode()){ v.setLeft(leftNode); model.addTreeNode(leftNode); Q.add(leftNode); } if(rightNode != null && rightNode.getSampleSize() > model.getMinSampleSizeInNode()){ v.setRight(rightNode); model.addTreeNode(rightNode); Q.add(rightNode); } } }

And this is a parallel version:

public class NodeSplitThread implements Runnable{        private TreeNode
node = null; private Queue
> Q = null; public NodeSplitThread(TreeNode
node, Queue
> Q){ this.node = node; this.Q = Q; } @Override public void run() { if(node.getDepth() >= model.getMaxDepth()){ return; } FeatureSplit
featureSplit = model.selectFeature(node); if(featureSplit.getFeatureId() == null){ return; } node.setFeatureSplit(featureSplit); Pair
, TreeNode
> children = model.newTreeNode(node, featureSplit); TreeNode
leftNode = children.getKey(); TreeNode
rightNode = children.getValue(); if(leftNode != null && leftNode.getSampleSize() > model.getMinSampleSizeInNode()){ node.setLeft(leftNode); model.addTreeNode(leftNode); Q.add(leftNode); } if(rightNode != null && rightNode.getSampleSize() > model.getMinSampleSizeInNode()){ node.setRight(rightNode); model.addTreeNode(rightNode); Q.add(rightNode); } } } public List
> pollTopN(Queue
> Q, int n){ List
> ret = new ArrayList
>(); for(int i = 0; i < n; ++i){ if(Q.isEmpty()) break; TreeNode
node = Q.poll(); ret.add(node); } return ret; } @Override public void learnFromDataSet(Iterable
> dataset){ for(Sample sample : dataset){ model.addSample((MapBasedBinarySample
)sample); } Queue
> Q = new ConcurrentLinkedQueue
>(); TreeNode
root = model.selectRootTreeNode(); model.addTreeNode(root); Q.add(root); ExecutorService threadPool = Executors.newFixedThreadPool(10); while (!Q.isEmpty()){ List
> nodes = pollTopN(Q, 10); List
tasks = new ArrayList
(nodes.size()); for(TreeNode
node : nodes){ Future task = threadPool.submit(new NodeSplitThread(node, Q)); tasks.add(task); } for(Future task : tasks){ try { task.get(); } catch (InterruptedException e) { continue; } catch (ExecutionException e) { continue; } } } threadPool.shutdown(); try { threadPool.awaitTermination(60, TimeUnit.SECONDS); } catch (InterruptedException e) { threadPool.shutdownNow(); Thread.interrupted(); } threadPool.shutdownNow(); }

 

转载地址:http://mxqbm.baihongyu.com/

你可能感兴趣的文章
日期类问题
查看>>
区块链入门之基础知识
查看>>
mysql锁(Innodb)
查看>>
小程序开发之影分身术
查看>>
磨刀霍霍:爬爬爬爬爬爬虫爬起来~
查看>>
RxJava中的Observable,多Subscribers
查看>>
I/O模型和Java NIO源码分析
查看>>
第二天-《企业应用架构模式》-组织领域逻辑
查看>>
日志服务与SIEM(如Splunk)集成方案实战
查看>>
解决packet_write_wait: Connection to...: Broken pipe
查看>>
图学ES6-3.变量的解构赋值
查看>>
web3j的maven插件
查看>>
帮你理清React的生命周期
查看>>
堆和堆排序
查看>>
新手也能看懂,消息队列其实很简单
查看>>
全网稀缺的快应用开源项目-熊宝儿歌故事QuickApp
查看>>
【大数据实践】KSQL流处理——如何将多个STREAM输出到一个TOPIC
查看>>
Vue组件通信的几种方式
查看>>
09.Java数据算法
查看>>
git日常使用经验总结
查看>>