1. Download Hive
This guide uses apache-hive-2.1.1. Download Hive and extract it locally as needed. Download address: http://archive.apache.org/dist/hive/
Path after extraction:
[root@hadoop-master apache-hive-2.1.1]# pwd
/usr/local/hive/apache-hive-2.1.1
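For reference, the download and extraction might look like this (a sketch: the archive name follows the Apache archive layout, and the extracted apache-hive-2.1.1-bin directory is assumed to be renamed to match the path used above):
# download from the Apache archive and unpack under /usr/local/hive
cd /usr/local/hive
wget http://archive.apache.org/dist/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz
tar -zxvf apache-hive-2.1.1-bin.tar.gz
# rename so the path matches the one used throughout this article
mv apache-hive-2.1.1-bin apache-hive-2.1.1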
2. Install the MySQL Database
There are three ways to set up Hive:
Embedded Derby mode: with Derby storage, running hive creates a derby file and a metastore_db directory in the current directory. The drawback of this mode is that only one Hive client at a time can use the database in a given directory.
Local mode: this mode requires a MySQL server running locally, configured as shown below (for both MySQL-backed modes, the MySQL driver jar must be copied into $HIVE_HOME/lib).
Multi-user mode: this mode requires a MySQL server running on a remote machine, and the metastore service must be started on the Hive server.
At bottom, the three modes differ only in where the metadata is stored. This article uses the multi-user mode.
For MySQL installation, see "Installing MySQL 5.7.24 on CentOS 7".
3. Modify the Configuration Files
3.1 The hive-site.xml configuration file
First go into the directory below and edit the hive-site.xml file; create it if it does not exist.
[root@hadoop-master conf]# vi /usr/local/hive/apache-hive-2.1.1/conf/hive-site.xml
The contents of hive-site.xml:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <!-- Metastore database connection URL -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop-master:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <!-- Database driver; the MySQL driver is used here -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <!-- Database username; fill in your own -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <!-- Database password; fill in your own -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.autoCreateTables</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.autoCreateColumns</name>
    <value>true</value>
  </property>
  <!-- Location of the Hive warehouse on HDFS -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <!-- Temporary location for added resource files -->
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/usr/local/hive/tmp/resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <!-- Before Hive 0.9, hive.exec.dynamic.partition had to be set to true; from 0.9 on it defaults to true -->
  <property>
    <name>hive.exec.dynamic.partition</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
  </property>
  <!-- Log locations -->
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/usr/local/hive/tmp/hiveJobsLog</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/usr/local/hive/tmp/resourcesLog</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/usr/local/hive/tmp/hiveRunLog</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/usr/local/hive/tmp/opertitionLog</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
  </property>
  <!-- HWI (Hive Web Interface) settings -->
  <property>
    <name>hive.hwi.war.file</name>
    <value>/usr/local/hive/apache-hive-2.1.1/lib/hive-hwi-2.1.1.jar</value>
    <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
  </property>
  <property>
    <name>hive.hwi.listen.host</name>
    <value>hadoop-master</value>
    <description>This is the host address the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
    <description>This is the port the Hive Web Interface will listen on</description>
  </property>
  <!-- HiveServer2 no longer needs the hive.metastore.local setting (if hive.metastore.uris is empty, the metastore is local; otherwise it is remote). For a remote metastore, just set hive.metastore.uris. -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://hadoop-master:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>hadoop-master</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.http.port</name>
    <value>10001</value>
  </property>
  <property>
    <name>hive.server2.thrift.http.path</name>
    <value>cliservice</value>
  </property>
  <!-- HiveServer2 web UI -->
  <property>
    <name>hive.server2.webui.host</name>
    <value>hadoop-master</value>
  </property>
  <property>
    <name>hive.server2.webui.port</name>
    <value>10002</value>
  </property>
  <property>
    <name>hive.scratch.dir.permission</name>
    <value>755</value>
  </property>
  <!-- When using Hive on Spark, an out-of-memory error can occur if the following is not set -->
  <property>
    <name>spark.driver.extraJavaOptions</name>
    <value>-XX:PermSize=128M -XX:MaxPermSize=512M</value>
  </property>
  <!-- Authentication mode; set to NONE here, i.e., no authentication for now (testing only) -->
  <property>
    <name>hive.server2.authentication</name>
    <value>NONE</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>
      Enforce metastore schema version consistency.
      True: Verify that version information stored in metastore is compatible with one from Hive jars. Also disable automatic
      schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
      proper metastore schema migration. (Default)
      False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
    </description>
  </property>
</configuration>
Remember to create the directories referenced in the configuration:
[root@hadoop-master tmp]# pwd
/usr/local/hive/tmp
[root@hadoop-master tmp]# ll
total 40
drwxr-xr-x. 2 root root 6 Feb 21 22:47 hiveJobsLog
drwxr-xr-x. 2 root root 6 Feb 21 21:20 hiveRunLog
drwxr-xr-x. 2 root root 6 Feb 21 21:20 opertitionLog
drwxr-xr-x. 2 root root 6 Feb 21 21:19 resources
drwxr-xr-x. 2 root root 6 Feb 21 21:20 resourcesLog
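If the directories do not exist yet, a bash one-liner creates them all (the paths match the hive-site.xml values above):
mkdir -p /usr/local/hive/tmp/{hiveJobsLog,hiveRunLog,opertitionLog,resources,resourcesLog}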
3.2 The hive-env.sh configuration file
First, go into Hive's configuration directory:
cd /usr/local/hive/apache-hive-2.1.1/conf
Then make copies of the needed configuration files, i.e., drop the .template suffix:
cp hive-env.sh.template hive-env.sh
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
cp hive-log4j2.properties.template hive-log4j2.properties
The conf directory now contains hive-env.sh, hive-exec-log4j2.properties, and hive-log4j2.properties.
Append the following at the end of hive-env.sh:
JAVA_HOME=/usr/local/jdk/jdk1.8.0_261
HADOOP_HOME=/usr/local/hadoop/apps/hadoop-2.7.3
HIVE_HOME=/usr/local/hive/apache-hive-2.1.1
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_AUX_JARS_PATH=$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$HADOOP_HOME/lib:$HIVE_HOME/lib
export HADOOP_OPTS="-Dorg.xerial.snappy.tempdir=/tmp -Dorg.xerial.snappy.lib.name=libsnappyjava.jnilib $HADOOP_OPTS"
4. Add the MySQL Driver to Hive
Place mysql-connector-java-5.1.30.jar in the following directory:
[root@hadoop-master lib]# pwd
/usr/local/hive/apache-hive-2.1.1/lib
Then grant it permissions:
chmod 777 mysql-connector-java-5.1.30.jar
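Putting the step together (a sketch; /root is a hypothetical download location, adjust it to wherever you saved the jar):
# copy the MySQL JDBC driver into Hive's lib directory and open up its permissions
cp /root/mysql-connector-java-5.1.30.jar /usr/local/hive/apache-hive-2.1.1/lib/
chmod 777 /usr/local/hive/apache-hive-2.1.1/lib/mysql-connector-java-5.1.30.jar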
5. Add the hive Command to the Environment Variables
[root@hadoop-master bin]# vi /etc/profile
Add HIVE_HOME to the environment variables:
# Java environment variables
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_261
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin
# Hadoop environment variables
export HADOOP_HOME=/usr/local/hadoop/apps/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Hive environment variables
export HIVE_HOME=/usr/local/hive/apache-hive-2.1.1
export PATH=$PATH:$HIVE_HOME/bin:$HIVE_HOME/sbin
After adding these, reload the profile:
[root@hadoop-master bin]# source /etc/profile
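To confirm the hive command is now on the PATH, you can print its version (assuming the steps above succeeded):
hive --version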
6. Initialize Hive
Choose either MySQL or Derby as the metadata database.
Note: first check whether MySQL contains leftover Hive metadata; if so, delete it first.
schematool -dbType mysql -initSchema ## MySQL as the metastore database
Here mysql means MySQL is used as the database that stores Hive's metadata. If you are not using MySQL as the metastore database, run:
schematool -dbType derby -initSchema ## Derby as the metastore database
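After initialization, schematool's -info option offers a quick sanity check that the schema was created and matches the Hive version (assuming the MySQL connection settings above are correct):
schematool -dbType mysql -info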
7. Start the Metastore Service
Before running Hive, the metastore service must be started, otherwise errors occur. Clients connect to the metastore service, and the metastore in turn connects to the MySQL database to read and write metadata. With a metastore service, multiple clients can connect at the same time, and those clients do not need to know the MySQL username and password; they only need to connect to the metastore service.
hive --service metastore &
Start the hiveserver2 service so that the server can be reached remotely:
hive --service hiveserver2 &
Remote access address:
jdbc:hive2://hadoop-master:10000
You can also check it from the web UI:
http://hadoop-master:10002/
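To confirm both services came up, a quick sanity check is to look for the two background processes and the listening ports (the port numbers match the configuration above; netstat may need the net-tools package on CentOS 7):
jps # metastore and hiveserver2 each show up as a RunJar process
netstat -nltp | grep -E '9083|10000|10002'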
8. Testing
8.1 Create the HDFS directory
First, if Hadoop is not yet running, make sure it starts up properly. Then create Hive's warehouse directory:
hdfs dfs -mkdir -p /user/hive/warehouse
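The Apache Hive getting-started guide additionally recommends a /tmp directory on HDFS and group write permission on both; optional here, but worth doing if you run into permission errors:
hdfs dfs -mkdir -p /tmp
hdfs dfs -chmod g+w /tmp
hdfs dfs -chmod g+w /user/hive/warehouse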
8.2 Copy Hive to the other machines in the cluster
Copy the Hive installation on hadoop-master to the other machines:
scp -r /usr/local/hive/* hadoop-slave3:/usr/local/hive/
scp -r /usr/local/hive/* hadoop-slave2:/usr/local/hive/
scp -r /usr/local/hive/* hadoop-slave1:/usr/local/hive/
Also remember to update the configuration file /etc/profile on the other machines:
# Java environment variables
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_261
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin
# Hadoop environment variables
export HADOOP_HOME=/usr/local/hadoop/apps/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Hive environment variables
export HIVE_HOME=/usr/local/hive/apache-hive-2.1.1
export PATH=$PATH:$HIVE_HOME/bin:$HIVE_HOME/sbin
Then reload the environment configuration file:
source /etc/profile
8.3 Startup test
On hadoop-slave1, simply type hive, then run the following tests:
[root@hadoop-slave1 hive]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/jdk/jdk1.8.0_261/bin:/usr/local/hadoop/apps/hadoop-2.7.3/bin:/usr/local/hadoop/apps/hadoop-2.7.3/sbin::/usr/local/hive/apache-hive-2.1.1/bin:/usr/local/hive/apache-hive-2.1.1/sbin::/root/bin:/usr/local/jdk/jdk1.8.0_261/bin:/usr/local/hadoop/apps/hadoop-2.7.3/bin:/usr/local/hadoop/apps/hadoop-2.7.3/sbin:/usr/local/hive/apache-hive-2.1.1/bin:/usr/local/hive/apache-hive-2.1.1/sbin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/apache-hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/apps/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in file:/usr/local/hive/apache-hive-2.1.1/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> create database hive_test;
OK
Time taken: 0.29 seconds
hive> show databases;
OK
default
hive_test
Time taken: 0.026 seconds, Fetched: 2 row(s)
hive> use hive_test;
OK
Time taken: 0.042 seconds
hive> create table book (id bigint, name string) row format delimited fields terminated by '\t';
OK
Time taken: 0.408 seconds
hive> show tables;
OK
book
Time taken: 0.027 seconds, Fetched: 1 row(s)
hive> select * from book;
OK
Time taken: 1.111 seconds
hive>
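Since book is declared with tab-delimited fields, one way to exercise it is to load a small local file (a sketch; /tmp/book.txt is a hypothetical file of tab-separated id/name pairs such as "1<TAB>hadoop"):
hive> load data local inpath '/tmp/book.txt' into table book;
hive> select * from book;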
8.4 Remote Hive access test
Connect using beeline:
[root@hadoop-master hive]# beeline
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/jdk/jdk1.8.0_261/bin:/usr/local/hadoop/apps/hadoop-2.7.3/bin:/usr/local/hadoop/apps/hadoop-2.7.3/sbin:/usr/local/hive/apache-hive-2.1.1/bin:/usr/local/hive/apache-hive-2.1.1/sbin:/usr/local/git/git-2.9.5/bin:/usr/local/go/bin:/root/bin)
Beeline version 2.1.1 by Apache Hive
beeline> !connect jdbc:hive2://hadoop-master:10000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/apache-hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/apps/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://hadoop-master:10000
Enter username for jdbc:hive2://hadoop-master:10000: root
Enter password for jdbc:hive2://hadoop-master:10000: ****
Connected to: Apache Hive (version 2.1.1)
Driver: Hive JDBC (version 2.1.1)
21/06/27 11:26:43 [main]: WARN jdbc.HiveConnection: Request to set autoCommit to false; Hive does not support autoCommit=false.
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hadoop-master:10000> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
| hive_test |
+----------------+--+
2 rows selected (0.385 seconds)
0: jdbc:hive2://hadoop-master:10000>
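beeline can also run non-interactively, which is handy for scripting (a sketch, assuming the root username and password configured in hive-site.xml above):
beeline -u jdbc:hive2://hadoop-master:10000 -n root -p root -e 'show databases;'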