Why testing a custom-built Spark on the server appeared to have no effect

Background

Spark is already deployed on the server, with the corresponding environment variables configured. To meet a new requirement, I compiled the Spark source in IDEA, exported a tgz package, and unpacked it onto the server. To tell the two installations apart, the pre-existing Spark is called S1 and the newly built one S2, and S2's Welcome banner text was modified.

Demonstration

S1's location on the server, with the environment variable pointing at it:

[hadoop@hadoop001 ~]$ echo $SPARK_HOME
/home/hadoop/app/spark

S2's location:

[hadoop@hadoop001 spark-3.2.0-bin-custom-spark]$ pwd
/home/hadoop/source/spark-3.2.0-bin-custom-spark

To test S2, I ran the following command from the S2 directory:

[hadoop@hadoop001 spark-3.2.0-bin-custom-spark]$ ./bin/spark-shell

(Screenshot: spark-shell starts and prints S1's default Welcome banner.)

The Welcome text shown was S1's default, so it was actually S1 that started.

I then tried starting S2 by its absolute path:

(Screenshot: spark-shell launched by absolute path, again printing S1's default Welcome banner.)

Same result: it was still S1 that started.
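Before reading the launcher scripts, a quick way to confirm which installation actually came up is to inspect the running JVM from another terminal. A minimal sketch (nothing here is Spark-specific; the brackets in the grep pattern just stop grep from matching its own process):

echo "$SPARK_HOME"        # the directory the launcher scripts will use
ps -ef | grep '[j]ava'    # the JVM command line shows which install's jars were loaded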

Cause

Reading the spark-shell script:

#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#
# Shell script for starting the Spark Shell REPL

cygwin=false
case "$(uname)" in
CYGWIN*) cygwin=true;;
esac

# Enter posix mode for bash
set -o posix

if [ -z "${SPARK_HOME}" ]; then
source "$(dirname "$0")"/find-spark-home
fi

export _SPARK_CMD_USAGE="Usage: ./bin/spark-shell [options]

The excerpt is cut off partway through the usage string, but the relevant logic is already visible. The shebang runs the script under whatever bash /usr/bin/env finds. The script then checks ${SPARK_HOME}: if it is already set, that value wins, and everything downstream runs out of that directory (spark-shell ultimately hands off to "${SPARK_HOME}"/bin/spark-submit). Only when ${SPARK_HOME} is unset does it source find-spark-home, which derives SPARK_HOME from the script's own location. Since ${SPARK_HOME} pointed at S1, every invocation of S2's spark-shell, even by absolute path, ended up launching S1.
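For intuition, the fallback branch behaves roughly like the sketch below. This is a simplification, not the actual contents of find-spark-home (the real script also handles pip-installed layouts), and the hand-off to spark-submit happens further down in spark-shell:

# Simplified model: if SPARK_HOME is unset, derive it from this script's location...
if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "$(dirname "$0")/.." && pwd)"
fi
# ...then hand off to spark-submit under whichever SPARK_HOME won
"${SPARK_HOME}/bin/spark-submit" --class org.apache.spark.repl.Main --name "Spark shell" "$@"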

Fix

So the fix is to either remove the ${SPARK_HOME} environment variable or point it at the version you actually want to run.

Verify with echo $SPARK_HOME that the variable is now empty or points at the expected path.
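For example, a one-liner that makes an unset variable visible (plain bash parameter expansion; the fallback text is arbitrary):

echo "${SPARK_HOME:-<unset or empty>}"   # prints the fallback text once the variable is gone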

There are a few ways to remove the environment variable (a combined sketch follows the list):

  • unset VAL: temporary; only affects the current shell session.
  • export -n VAL: removes the named variable from the export list. The variable itself is not deleted; it simply stops being passed into the environment of subsequently executed commands.
  • Edit the config file (by default ~/.bash_profile): takes effect only after logging out and reconnecting.
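A minimal sketch of all three options (the sed pattern assumes SPARK_HOME is set on its own line in ~/.bash_profile; adjust to your setup):

# Option 1: gone for the current session only
unset SPARK_HOME

# Option 2: keep the shell variable, but stop exporting it to child processes
export -n SPARK_HOME

# Option 3: delete the line from the profile; takes effect on the next login
sed -i '/SPARK_HOME/d' ~/.bash_profile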

Testing again. Note in the transcript below that after editing ~/.bash_profile, source does not clear the variable: re-sourcing only re-runs what remains in the file, and it cannot un-export a value the current shell already holds. The old path only disappears after logging out and back in:

[hadoop@hadoop001 spark-3.2.0-bin-custom-spark]$ vi ~/.bash_profile 
[hadoop@hadoop001 spark-3.2.0-bin-custom-spark]$ source ~/.bash_profile
[hadoop@hadoop001 spark-3.2.0-bin-custom-spark]$ echo $SPARK_HOME
/home/hadoop/app/spark
[hadoop@hadoop001 spark-3.2.0-bin-custom-spark]$ exit
logout
[root@hadoop001 ~]# su - hadoop
[hadoop@hadoop001 ~]$ echo $SPARK_HOME

[hadoop@hadoop001 ~]$ source/spark-3.2.0-bin-custom-spark/bin/spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/01/21 19:51:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://hadoop001:4040
Spark context available as 'sc' (master = local[*], app id = local-1642794689485).
Spark session available as 'spark'.
Welcome to new Spark
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.0
      /_/

Using Scala version 2.12.14 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
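The modified Welcome banner confirms that S2 is now the one starting. If S2 should become the default on this machine, the other option from the fix section is to re-point the variable instead of deleting it; a sketch using this post's paths:

# In ~/.bash_profile: point SPARK_HOME at the custom build instead of removing it
export SPARK_HOME=/home/hadoop/source/spark-3.2.0-bin-custom-spark
export PATH="$SPARK_HOME/bin:$PATH"

After reconnecting, a plain spark-shell will then launch S2 from anywhere.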