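I am new to Hadoop/Giraph and Java. As part of a task, I downloaded the Cloudera Quickstart VM and installed Giraph on top of it. I am working through the book "Practical Graph Analytics with Apache Giraph" by Roman Shaposhnik, Claudio Martella, and Dionysios Logothetis, and I tried to run its first example (the Twitter Followership Graph) from page 111.
Edit: Apparently the examples in the book (published in 2015) depend on a much older Hadoop version than the one shipped with the current (2017) Cloudera Quickstart VM. How can I get the examples to run?
Original post:
I am running the following GiraphHelloWorld.java program: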
import org.apache.giraph.edge.Edge;
import org.apache.giraph.GiraphRunner;
import org.apache.giraph.graph.BasicComputation;
import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.util.ToolRunner;

// Giraph applications are custom classes that typically use
// BasicComputation class for all their defaults... except for
// the compute method that has to be defined
public class GiraphHelloWorld extends
        BasicComputation<IntWritable, IntWritable,
                NullWritable, NullWritable> {

    @Override
    public void compute(Vertex<IntWritable, IntWritable, NullWritable> vertex,
                        Iterable<NullWritable> messages) {
        System.out.print("Hello world from the: " + vertex.getId().toString() + " who is following:");
        // iterating over vertex's neighbors
        for (Edge<IntWritable, NullWritable> e : vertex.getEdges()) {
            System.out.print(" " + e.getTargetVertexId());
        }
        System.out.println("");
        // signaling the end of the current BSP computation for the current vertex
        vertex.voteToHalt();
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new GiraphRunner(), args));
    }
}
I ran the following commands in the terminal to execute the program:
export HADOOP_HOME=/usr/lib/hadoop
export GIRAPH_HOME=/usr/local/giraph
export HADOOP_CONF_DIR=$GIRAPH_HOME/conf
PATH=$HADOOP_HOME/bin:$GIRAPH_HOME/bin:$PATH
giraph target/book-examples-1.0.0-jar-with-dependencies.jar GiraphHelloWorld -vip /home/cloudera/src/main/resources/1 -vif org.apache.giraph.io.formats.IntIntNullTextInputFormat -w 1 -ca giraph.SplitMasterWorker=false,giraph.logLevel=error
This produced the following error:
rker=false,giraph.logLevel=error
No lib directory, assuming dev environment
HADOOP_CONF_DIR=/usr/local/giraph/conf
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/cloudera/workspace/first/target/book-examples-1.0.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2017-12-08 16:46:24,917 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(336)) - No edge input format specified. Ensure your InputFormat does not require one.
2017-12-08 16:46:24,926 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(346)) - No vertex output format specified. Ensure your OutputFormat does not require one.
2017-12-08 16:46:24,926 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(361)) - No edge output format specified. Ensure your OutputFormat does not require one.
2017-12-08 16:46:24,957 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(402)) - Setting custom argument [giraph.SplitMasterWorker] to [false] in GiraphConfiguration
2017-12-08 16:46:24,957 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(402)) - Setting custom argument [giraph.logLevel] to [error] in GiraphConfiguration
2017-12-08 16:46:25,329 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapreduce.job.counters.limit is deprecated. Instead, use mapreduce.job.counters.max
2017-12-08 16:46:25,330 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
2017-12-08 16:46:25,330 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.job.reduce.memory.mb is deprecated. Instead, use mapreduce.reduce.memory.mb
2017-12-08 16:46:25,330 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2017-12-08 16:46:25,332 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapreduce.user.classpath.first is deprecated. Instead, use mapreduce.job.user.classpath.first
2017-12-08 16:46:25,332 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
2017-12-08 16:46:25,336 INFO [main] job.GiraphJob (GiraphJob.java:run(226)) - run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
2017-12-08 16:46:25,339 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-12-08 16:46:25,401 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - session.id is deprecated. Instead, use dfs.metrics.session-id
2017-12-08 16:46:25,405 INFO [main] jvm.JvmMetrics (JvmMetrics.java:init(76)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:43)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:270)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:143)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:259)
at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:94)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
The Maven pom.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<project>
<modelVersion>4.0.0</modelVersion>
<groupId>giraph</groupId>
<artifactId>book-examples</artifactId>
<version>1.0.0</version>
<dependencies>
<dependency>
<groupId>org.apache.giraph</groupId>
<artifactId>giraph-core</artifactId>
<version>1.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.9.0</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4</version>
<executions>
<execution>
<id>create-jar-bundle</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
<repositories>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
</project>
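Please let me know if anything else is needed. Thanks in advance for your help!
Best answer
The version problem was resolved when I created my own pom file with the dependencies the Giraph project needs: a Giraph build for Hadoop 2 (1.2.0-hadoop2) together with the CDH versions of the Hadoop artifacts.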
`
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com</groupId>
<artifactId>R4.giraphshortestpath</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>R4.giraphshortestpath</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<repositories>
<repository>
<id>cloudera</id>
<name>cloudera repository</name>
<url>https://repository.cloudera.com/content/repositories/releases/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>org.apache.giraph</groupId>
<artifactId>giraph-parent</artifactId>
<version>1.2.0-hadoop2</version>
<type>pom</type>
</dependency>
<dependency>
<groupId>org.apache.giraph</groupId>
<artifactId>giraph-core</artifactId>
<version>1.2.0-hadoop2</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.6.0-cdh5.12.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.6.0-mr1-cdh5.12.0</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4</version>
<executions>
<execution>
<id>create-jar-bundle</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
`
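The `IncompatibleClassChangeError` in the question ("Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected") is the usual symptom of this mismatch: `JobContext` is a concrete class in Hadoop 1.x and an interface in Hadoop 2.x, so Giraph classes compiled against one cannot link against the other. To check which variant a given classpath actually provides, a minimal sketch along the following lines may help. The class name `JobContextProbe` and the helper `kindOf` are hypothetical names made up for this illustration; they are not part of Giraph or Hadoop.

```java
// Hypothetical diagnostic, not part of Giraph or Hadoop: reports whether a
// type on the current classpath is a class, an interface, or absent.
final class JobContextProbe {

    // Returns "interface", "class", or "missing" for the given type name.
    // The type is loaded without running its static initializers.
    static String kindOf(String className) {
        try {
            Class<?> type = Class.forName(
                    className, false, JobContextProbe.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the stack trace from the question.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.JobContext";
        System.out.println(name + " -> " + kindOf(name));
    }
}
```

Run it with the same classpath the job uses, for example `java -cp "$(hadoop classpath):." JobContextProbe`. If it reports "interface", the cluster is on the Hadoop 2 API, and the Giraph artifact needs to be a Hadoop 2 build such as the `1.2.0-hadoop2` version used in the pom above.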
Source: the Stack Overflow question "java - How to update the 'Practical Graph Analytics with Apache Giraph' examples to run on the current Cloudera Quickstart VM": https://stackoverflow.com/questions/47724275/