java.lang.IllegalArgumentException: java.net.UnknownHostException: dfscluster
Analysis:
Solution:
The HDFS cluster name "dfscluster" cannot be resolved. The hdfs-site.xml file lives under Hadoop's etc/hadoop directory; copy it into Spark's conf directory and restart Spark.
For example, run a script to distribute it to every machine in the Spark cluster:
[bdata@bdata4 hadoop]$ for i in 34 35 36 37 38; do scp hdfs-site.xml 192.168.10.$i:/u01/spark-1.5.1/conf/; done
Exception in thread "main" java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
Analysis:
Problem: the above error is raised when running against a YARN cluster or client:
[bdata@bdata4 bin]$ ./spark-sql --master yarn-client
Exception in thread "main" java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
Solution
As the message suggests, set the HADOOP_CONF_DIR or YARN_CONF_DIR environment variable:
export HADOOP_HOME=/u01/hadoop-2.6.1
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
PATH=$PATH:$HOME/.local/bin:$HOME/bin:$SQOOP_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin
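As a quick sanity check before relaunching spark-sql, you can verify that the variable points at a real directory. A minimal sketch, using the path from the example above (adjust it to your own installation):

```shell
# Sanity-check sketch: the directory HADOOP_CONF_DIR points at must exist
# (and contain the Hadoop XML configs) before Spark is launched.
HADOOP_CONF_DIR=/u01/hadoop-2.6.1/etc/hadoop
if [ -d "$HADOOP_CONF_DIR" ]; then
  echo "config dir found"
else
  echo "HADOOP_CONF_DIR does not point at an existing directory yet"
fi
```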
14. Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in
[Stage 0:> (0 + 4) / 42]2016-01-15 11:28:16,512 [org.apache.spark.scheduler.TaskSchedulerImpl]-[ERROR] Lost executor 0 on 192.168.10.38: remote Rpc client disassociated
[Stage 0:> (0 + 4) / 42]2016-01-15 11:28:23,188 [org.apache.spark.scheduler.TaskSchedulerImpl]-[ERROR] Lost executor 1 on 192.168.10.38: remote Rpc client disassociated
[Stage 0:> (0 + 4) / 42]2016-01-15 11:28:29,203 [org.apache.spark.scheduler.TaskSchedulerImpl]-[ERROR] Lost executor 2 on 192.168.10.38: remote Rpc client disassociated
[Stage 0:> (0 + 4) / 42]2016-01-15 11:28:36,319 [org.apache.spark.scheduler.TaskSchedulerImpl]-[ERROR] Lost executor 3 on 192.168.10.38: remote Rpc client disassociated
2016-01-15 11:28:36,321 [org.apache.spark.scheduler.TaskSetManager]-[ERROR] Task 3 in stage 0.0 failed 4 times; aborting job
Exception in thread "main" org.apache.spark.SparkException : Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 14,
38): ExecutorLostFailure (executor 3 lost)
Analysis:
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
Solution
The root cause here is that the volume of source data is too large for the machines' memory: the job runs so long that executors time out and disconnect, and the data cannot be exchanged for computation. It is therefore necessary to increase executor memory.
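One way to raise executor memory is to persist the setting in Spark's defaults file. A minimal sketch, where the file name and the 4g size are illustrative placeholders rather than values from the original:

```shell
# Sketch: write a larger executor memory into a Spark defaults file.
# The file name and the 4g size are placeholders; tune them to your hosts.
conf=spark-defaults.conf.sketch
echo "spark.executor.memory 4g" > "$conf"
cat "$conf"
```

The same value can also be passed per job with spark-submit's --executor-memory flag instead of editing the defaults file.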
After a long wait with no response, the server's web UI shows memory and cores available but none allocated, as in the figure below:
Analysis:
[Stage 0:> (0 + 0) / 42]
Or the log shows:
16/01/15 14:18:56 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Solution
The cause of the above is that the memory requested through spark.executor.memory exceeds what the machines actually have, so the job cannot run and waits until enough memory becomes free. Reduce the executor memory for the task by setting a smaller value.
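A quick way to see whether the configured value even fits the machine can be sketched as below; the 8192 MB total is a stand-in for the real figure you would read from free -m on the worker:

```shell
# Sketch: compare the requested executor memory against a worker's physical RAM.
# total_mb would come from: free -m | awk '/^Mem:/{print $2}' on the worker.
requested_mb=2048   # e.g. spark.executor.memory=2g
total_mb=8192       # placeholder for the worker's actual total memory in MB
if [ "$requested_mb" -le "$total_mb" ]; then
  echo "fits: executors can be allocated"
else
  echo "too large: lower spark.executor.memory"
fi
```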
After integrating Spark Streaming with Kafka, reading messages fails with: OffsetOutOfRangeException
Analysis:
Solution: when working with Kafka, check whether the message body exceeds the consumer's default fetch size of 1 MB; if it does, raise fetch.message.max.bytes above that default so larger messages can be fetched.
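For the 0.8-era Kafka consumer that Spark Streaming integrated with at the time, the relevant setting is fetch.message.max.bytes (default 1048576 bytes). A sketch of bumping it in a consumer properties file, where the file name and the 10 MB value are illustrative placeholders:

```shell
# Sketch: raise the consumer fetch size above the 1 MB default so that
# larger messages can be read. File name and 10 MB value are placeholders.
props=consumer.properties.sketch
echo "fetch.message.max.bytes=10485760" > "$props"   # 10 MB
cat "$props"
```

The same key can also be passed in the kafkaParams map handed to KafkaUtils when creating the stream, rather than through a properties file.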