准备
并没有集群一说, 只要Hadoop以集群的模式运行, Hive就可以在Hadoop集群中运行。Hive 是基于 Hadoop 的一个数据仓库工具,可以将结构化的数据文件映射为一张数据库表,并提供完整的 SQL 查询功能,将类 SQL 语句转换为 MapReduce 任务执行。
Hive的下载地址
http://mirror.bit.edu.cn/apache/hive/
这里使用的版本是apache-hive-2.3.7-bin.tar.gz
环境准备
hadoop-2.7.3
三台机器:
master 、slave1、slave2
解压
tar -zxvf apache-hive-2.3.7-bin.tar.gz
ln -s apache-hive-2.3.7-bin hive-2.3.7
设置环境变量
export HIVE_HOME=/home/hadoop/hive-2.3.7
PATH=$PATH:$HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin:$ZOOKEEPER_HOME/bin:$HIVE_HOME/bin
准备mysql的环境
配置hive
1. 修改hive-site.xml
cp hive-default.xml.template hive-site.xml
vi hive-site.xml
修改如下四项的内容
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>mysql.用户名</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mysql.密码</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>mysql
<value>mysql.url</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>mysql.driver</value>
2. mysql驱动复制到$HIVE_HOME/lib下
这里使用的是mysql-connector-java-8.0.18.jar
3.创建hive数据库
4.初始化mysql中hive的schema
schematool -dbType mysql -initSchema
[root@master bin]# schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apache-hive-2.3.7-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://192.168.0.7:3306/hive?serverTimezone=UTC
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
5.启动hive
[root@master bin]# hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apache-hive-2.3.7-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in file:/home/hadoop/apache-hive-2.3.7-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
hive> show databases;
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
OK
default
Time taken: 11.624 seconds, Fetched: 1 row(s)
hive> show tables;
OK
Time taken: 0.131 seconds
hive> create table employee (id bigint,name string) row format delimited fields terminated by '\t';
OK
Time taken: 1.393 seconds
hive> select * from employee;
OK
Time taken: 3.214 seconds
hive>
创建表
结果
错误1
把conf下的文件中含有的${system:java.io.tmpdir}换成绝对路径
Logging initialized using configuration in file:/home/hadoop/apache-hive-2.3.7-bin/conf/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bhive.session.id%7D_resources
at org.apache.hadoop.fs.Path.initialize(Path.java:205)
at org.apache.hadoop.fs.Path.<init>(Path.java:171)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:666)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:586)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:553)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:750)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bhive.session.id%7D_resources
at java.net.URI.checkPath(URI.java:1823)
at java.net.URI.<init>(URI.java:745)
at org.apache.hadoop.fs.Path.initialize(Path.java:202)
... 12 more
版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。
文章由极客之音整理,本文链接:https://www.bmabk.com/index.php/post/140798.html