Hive-on-Spark r囧r小猫 2021-11-05 03:38 <div id="article\_content" class="article\_content clearfix"> <div class="article-copyright"> Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. <a class="copy-right-url" href="https://blog.csdn.net/zuochang_liu/article/details/82292076">https://blog.csdn.net/zuochang_liu/article/details/82292076</a> </div> <link rel="stylesheet" href="https://csdnimg.cn/release/phoenix/template/css/ck\_htmledit\_views-3019150162.css"> <div class="htmledit\_views" id="content\_views">
<h1><a name="t0"></a>1 Introduction to Hive on Spark</h1>
<p style="margin-left:0pt;">Hive on Spark, as the term is used in this article, has little to do with Hive itself: it only adopts Hive's conventions (HQL, the metastore database, UDFs, and the serialization/deserialization mechanism). Strictly speaking, this setup is Spark SQL with Hive support rather than the Hive project's own Spark execution engine.</p>
<p style="margin-left:0pt;">Hive's original execution engine is MapReduce, which is slow because intermediate results between jobs are written to HDFS.</p>
<p style="margin-left:0pt;">Hive on Spark uses RDDs (DataFrames) instead and runs on a Spark cluster.</p>
<p style="margin-left:0pt;">The data to be computed is stored in HDFS. The MySQL metastore database stores only descriptions of the Hive tables: which databases and tables exist, how many columns each table has, the type of each column, and where in HDFS each table's data is located.</p>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">How does Hive differ from MySQL?</p>
<p style="margin-left:0pt;">Hive is a data warehouse: it stores data and analyzes it. The data volumes in a warehouse are large, so an analysis usually takes a long time.</p>
<p style="margin-left:0pt;">MySQL is a relational database: it handles low-latency create, read, update, and delete operations on relational data.</p>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">Does the Hive metastore database hold the data to be computed?</p>
<p style="margin-left:0pt;">No. It holds only descriptive information about the Hive warehouse, such as its tables and columns.</p>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">Where is the data to be computed actually stored?</p>
<p style="margin-left:0pt;">In HDFS.</p>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">What the Hive metastore database does</p>
<p style="margin-left:0pt;">It establishes a mapping. When an HQL statement is executed, the descriptive information is first looked up in the MySQL metastore database, tasks are generated from that information, and the tasks are then submitted to the Spark cluster for execution.</p>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">Hive on Spark uses only Hive's standards and conventions; it works just as well without a Hive deployment.</p>
<p
style="margin-left:0pt;">Hive: the metadata is stored in MySQL, and the actual data is stored in HDFS.</p>
<h1 style="margin-left:0pt;"><a name="t1"></a>2 Installing MySQL</h1>
<p>The MySQL database serves as the metastore used by Hive.</p>
<h1><a name="t2"></a>3 Configuring Hive on Spark</h1>
<p style="margin-left:0pt;">The metastore tables are generated according to the Hive configuration file.</p>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">spark-sql is Spark's interactive command line for writing SQL.</p>
<p style="margin-left:0pt;">If starting spark-sql directly in local mode reports this error:</p>
<p style="margin-left:0pt;"><img alt="" class="has" height="77" src="https://img-blog.csdn.net/20180901235524809?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3p1b2NoYW5nX2xpdQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70" width="813"></p>
<p style="margin-left:0pt;">the cause is the Hadoop configuration parameters that have been set:</p>
<p style="margin-left:0pt;"><img alt="" class="has" height="204" src="https://img-blog.csdn.net/20180901235541736?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3p1b2NoYW5nX2xpdQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70" width="813"></p>
<p style="margin-left:0pt;">Run the test commands:</p>
<p style="margin-left:0pt;">create table test (name string);</p>
<p style="margin-left:0pt;">insert into test values("xxtest");</p>
<p style="margin-left:0pt;"><img alt="" class="has" height="47" src="https://img-blog.csdn.net/20180901235849375?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3p1b2NoYW5nX2xpdQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70" width="813"></p>
<p style="margin-left:0pt;">In local mode, with no hive-site.xml present, the embedded Derby database is used for the metastore by default, and data is stored on the local filesystem.</p>
<p style="margin-left:0pt;"><strong><span style="color:#ff0000;"><strong>To use Hive's conventions, put the Hive configuration file in Spark's conf directory</strong></span></strong></p>
<p style="margin-left:0pt;">cd /root/apps/spark-2.2.0-bin-hadoop2.7/conf/</p>
<p style="margin-left:0pt;">vi hive-site.xml</p>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">The hive-site.xml file:</p>
<pre class="has" name="code"><code class="hljs xml">&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;javax.jdo.option.ConnectionURL&lt;/name&gt;
    &lt;value&gt;jdbc:mysql://hdp-01:3306/hive?createDatabaseIfNotExist=true&lt;/value&gt;
    &lt;description&gt;JDBC connect string for a JDBC metastore&lt;/description&gt;
  &lt;/property&gt;

  &lt;property&gt;
    &lt;name&gt;javax.jdo.option.ConnectionDriverName&lt;/name&gt;
    &lt;value&gt;com.mysql.jdbc.Driver&lt;/value&gt;
    &lt;description&gt;Driver class name for a JDBC metastore&lt;/description&gt;
  &lt;/property&gt;

  &lt;property&gt;
    &lt;name&gt;javax.jdo.option.ConnectionUserName&lt;/name&gt;
    &lt;value&gt;root&lt;/value&gt;
    &lt;description&gt;username to use against metastore database&lt;/description&gt;
  &lt;/property&gt;

  &lt;property&gt;
    &lt;name&gt;javax.jdo.option.ConnectionPassword&lt;/name&gt;
    &lt;value&gt;123456&lt;/value&gt;
    &lt;description&gt;password to use against metastore database&lt;/description&gt;
  &lt;/property&gt;
&lt;/configuration&gt;</code></pre>
<p style="margin-left:0pt;">Send this configuration file to the other nodes in the cluster:</p>
<blockquote>
<p style="margin-left:0pt;">cd /root/apps/spark-2.2.0-bin-hadoop2.7/conf/</p>
<p style="margin-left:0pt;">for i in 2 3; do scp hive-site.xml hdp-0$i:`pwd`; done</p>
</blockquote>
<p style="margin-left:0pt;">Stop Spark and start it again (stop-all.sh, then start-all.sh).</p>
<p style="margin-left:0pt;">When spark-sql starts,</p>
<p style="margin-left:0pt;">the error shown below appears because the MySQL driver JAR needed to talk to MySQL is missing.</p>
<p style="margin-left:0pt;">Solution 1: supply the MySQL JAR with --jars or --driver-class-path</p>
<p style="margin-left:0pt;">Solution 2: add the MySQL JAR to the $SPARK_HOME/jars directory</p>
<p style="margin-left:0pt;"><img alt="" class="has" height="83"
src="https://img-blog.csdn.net/20180902000211826?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3p1b2NoYW5nX2xpdQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70" width="813"></p>
<p style="margin-left:0pt;">Specify the cluster at startup (if no master is given, local mode is the default):</p>
<p style="margin-left:0pt;">spark-sql --master spark://hdp-01:7077 --jars /root/mysql-connector-java-5.1.38.jar</p>
<p style="margin-left:0pt;">Spark SQL creates a database in MySQL. In its DBS table, manually change DB_LOCATION_URI to the HDFS address:</p>
<p style="margin-left:0pt;"><img alt="" class="has" height="170" src="https://img-blog.csdn.net/20180902000504421?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3p1b2NoYW5nX2xpdQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70" width="813"></p>
<p style="margin-left:0pt;">hdfs://hdp-01:9000/user/hive/spark-warehouse</p>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">Also check that the storage path of the tables you created yourself is an HDFS directory.</p>
<p style="margin-left:0pt;"><img alt="" class="has" height="122" src="https://img-blog.csdn.net/20180902000652708?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3p1b2NoYW5nX2xpdQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70" width="813"></p>
<p style="margin-left:0pt;">After a spark-sql job has run, it can be viewed in the cluster's monitoring UI:</p>
<p style="margin-left:0pt;"><img alt="" class="has" height="369" src="https://img-blog.csdn.net/2018090200071394?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3p1b2NoYW5nX2xpdQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70" width="813"></p>
<p style="margin-left:0pt;">As before, a SparkSubmit process will be present.</p>
<p style="margin-left:0pt;"> </p>
<h1 style="margin-left:0pt;"><a name="t3"></a>4 Programming in IDEA</h1>
<p style="margin-left:0pt;"><strong><strong>Enable Spark's Hive support first</strong></strong></p>
<pre class="has" name="code"><code class="hljs kotlin">import org.apache.spark.sql.SparkSession

// Hive support must be enabled for Spark to work with Hive's metastore and tables
val session = SparkSession.builder()
  .master("local")
  .appName("xx")
  .enableHiveSupport() // enable Hive support; the spark-hive dependency is also required
  .getOrCreate()</code></pre>
<p style="margin-left:0pt;">Add Spark's Hive compatibility dependency:</p>
<pre class="has" name="code"><code class="hljs xml">&lt;!-- Spark SQL support for Hive --&gt;
&lt;dependency&gt;
  &lt;groupId&gt;org.apache.spark&lt;/groupId&gt;
  &lt;artifactId&gt;spark-hive_2.11&lt;/artifactId&gt;
  &lt;version&gt;${spark.version}&lt;/version&gt;
&lt;/dependency&gt;</code></pre>
<p style="margin-left:0pt;">To run locally, also copy hive-site.xml into the resources directory.</p>
<p style="margin-left:0pt;">The resources directory holds the current project's configuration files.</p>
<p style="margin-left:0pt;"><img alt="" class="has" height="209" src="https://img-blog.csdn.net/2018090200102849?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3p1b2NoYW5nX2xpdQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70" width="320"></p>
<p style="margin-left:0pt;">Write the code and test it in local mode:</p>
<pre class="has" name="code"><code class="hljs go">// Run a query
val query = session.sql("select * from t_access_times")
query.show()
// Release resources
session.close()</code></pre>
<p style="margin-left:0pt;">When creating a table, the client has to impersonate a user identity:</p>
<pre class="has" name="code"><code class="hljs java">System.setProperty("HADOOP_USER_NAME", "root") // make the client act as user root
// Or add the JVM option -DHADOOP_USER_NAME=root</code></pre>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">Basic operations</p>
<pre class="has" name="code"><code class="hljs sql">// Total amount per user per month
// session.sql("select username,month,sum(salary) as salary from t_access_times group by username,month")
// Create a table
// session.sql("create table t_access1(username string,month string,salary int) row format delimited fields terminated by ','")

// Drop a table
// session.sql("drop table t_access1")

// Insert data
// session.sql("insert into t_access1 select * from t_access_times")
// .show()
// Overwrite the table's data
// session.sql("insert overwrite table t_access1 select * from t_access_times where username='A'")

// Load new data, overwriting the existing data
// C,2015-01,10
// C,2015-01,20
// session.sql("load data local inpath 't_access_time_log' overwrite into table t_access1")

// Truncate the table
// session.sql("truncate table t_access1")

// .show()

// Write custom data
import org.apache.spark.sql.Dataset
import session.implicits._ // encoders needed by createDataset, map, and toDF

val access: Dataset[String] = session.createDataset(List("b,2015-01,10", "c,2015-02,20"))

val accessdf = access.map({
  t =&gt;
    val lines = t.split(",")
    (lines(0), lines(1), lines(2).toInt)
}).toDF("username", "month", "salary")

// .show()

accessdf.createTempView("t_ac")
// session.sql("insert into t_access1 select * from t_ac")

// Overwrite mode (SaveMode.Overwrite) recreates the table from the given schema
// Local mode supports only overwrite, and this setting must be added on the SparkSession builder:
// .config("spark.sql.warehouse.dir", "hdfs://hdp-01:9000/user/hive/warehouse")
accessdf
  .write.mode("overwrite").saveAsTable("t_access1")</code></pre>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">Running on the cluster:</p>
<p style="margin-left:0pt;">Add the hive-site.xml configuration file to the $SPARK_HOME/conf directory and restart Spark.</p>
<p style="margin-left:0pt;">Upload a MySQL connector driver (SparkSubmit also has to connect to MySQL to fetch the metadata).</p>
<p style="margin-left:0pt;">spark-sql --master spark://hdp-01:7077 --driver-class-path /root/mysql-connector-java-5.1.38.jar</p>
<p style="margin-left:0pt;">--class xx.jar</p>
<p style="margin-left:0pt;"> </p>
<p style="margin-left:0pt;">Then write the code to run:</p>
<pre class="has" name="code"><code class="hljs sql"><ol class="hljs-ln" style="width:1249px"><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="1"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> // Query a Hive table</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="2"></div></div><div class="hljs-ln-code"><div
class="hljs-ln-line">// session.sql("<span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> t_access_times<span class="hljs-string"><span class="hljs-string">")</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="3"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string">// .show()</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="4"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="5"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> // Create the table</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="6"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string">// session.sql("</span><span class="hljs-keyword">create</span> <span class="hljs-keyword">table</span> t_access1(username <span class="hljs-keyword">string</span>,<span class="hljs-keyword">month</span> <span class="hljs-keyword">string</span>,salary <span class="hljs-built_in">int</span>) <span class="hljs-keyword">row</span> <span class="hljs-keyword">format</span> <span class="hljs-keyword">delimited</span> <span class="hljs-keyword">fields</span> <span class="hljs-keyword">terminated</span> <span class="hljs-keyword">by</span> <span class="hljs-string">','</span><span class="hljs-string"><span class="hljs-string">")</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="7"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" 
data-line-number="8"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="9"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string">// session.sql("</span><span class="hljs-keyword">insert</span> <span class="hljs-keyword">into</span> t_access1 <span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> t_access_times<span class="hljs-string"><span class="hljs-string">")</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="10"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string">// .show()</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="11"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="12"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> // Write data</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="13"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> val access: Dataset[String] = session.createDataset(List("</span>b,<span class="hljs-number">2015</span><span class="hljs-number">-01</span>,<span class="hljs-number">10</span><span class="hljs-string">", "</span>c,<span class="hljs-number">2015</span><span class="hljs-number">-02</span>,<span class="hljs-number">20</span><span class="hljs-string"><span class="hljs-string">"))</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="14"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span 
class="hljs-string"></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="15"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> val accessdf = access.map({</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="16"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> t =></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="17"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> val lines = t.split("</span>,<span class="hljs-string"><span class="hljs-string">")</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="18"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> (lines(0), lines(1), lines(2).toInt)</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="19"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> }).toDF("</span>username<span class="hljs-string">", "</span><span class="hljs-keyword">month</span><span class="hljs-string">", "</span>salary<span class="hljs-string"><span class="hljs-string">")</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="20"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="21"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="22"></div></div><div class="hljs-ln-code"><div 
class="hljs-ln-line"><span class="hljs-string"> accessdf.createTempView("</span>v_tmp<span class="hljs-string"><span class="hljs-string">")</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="23"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> // Insert data</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="24"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string">// session.sql("</span><span class="hljs-keyword">insert</span> overwrite <span class="hljs-keyword">table</span> t_access1 <span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> v_tmp<span class="hljs-string"><span class="hljs-string">")</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="25"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> session.sql("</span><span class="hljs-keyword">insert</span> <span class="hljs-keyword">into</span> t_access1 <span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> v_tmp<span class="hljs-string"><span class="hljs-string">")</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="26"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string">// .show()</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="27"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="28"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string">// Write to the table with the insertInto API</span></div></div></li><li><div 
class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="29"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string">accessdf.write.insertInto("</span>databaseName.tableName<span class="hljs-string"><span class="hljs-string">")</span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="30"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-string"> session.close()</span></div></div></li></ol></code><div class="hljs-button {2}" data-title="复制" οnclick="hljs.copyCode(event)"></div></pre> <h1 style="margin-left:0pt;"><a name="t4"></a>5 Ways to connect to Spark SQL</h1> <h2 style="margin-left:0pt;"><a name="t5"></a>5.1 Interactive command line</h2> <p><img alt="" class="has" height="47" src="https://img-blog.csdnimg.cn/20190729232532323.png" width="693"></p> <p style="margin-left:0pt;">spark-sql (runs in local mode)</p> <p style="margin-left:0pt;">spark-sql --master spark://hdp-01:7077 (runs in cluster mode)</p> <p style="margin-left:0pt;">Without a hive-site.xml file, spark-sql uses the Derby database by default, and data is written under the directory where the command was run (spark-warehouse).</p> <p style="margin-left:0pt;"><img alt="" class="has" height="87" src="https://img-blog.csdnimg.cn/20190729232631578.png" width="693"></p> <p style="margin-left:0pt;">Only with hive-site.xml in place is the metadata managed by MySQL and the data stored in HDFS.</p> <p style="margin-left:0pt;"> </p> <h2 style="margin-left:0pt;"><a name="t6"></a>5.2 Connecting over JDBC</h2> <p>On the server, edit the configuration file hive-site.xml:</p> <blockquote> <p style="margin-left:0pt;"><property></p> <p style="margin-left:0pt;"><name>hive.server2.thrift.bind.host</name></p> <p style="margin-left:0pt;"><value><span style="color:#ff0000;">hdp-03</span></value></p> <p style="margin-left:0pt;"><description>Bind host on which to run the HiveServer2 Thrift service.</description></p> <p style="margin-left:0pt;"></property></p> <p style="margin-left:0pt;"><property></p> <p style="margin-left:0pt;"><name>hive.server2.thrift.port</name></p> <p style="margin-left:0pt;"><value><span 
style="color:#ff0000;">10000</span></value></p> <p style="margin-left:0pt;"><description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description></p> <p style="margin-left:0pt;"></property></p> </blockquote> <p> </p> <p>Start the server:</p> <p><img alt="" class="has" height="32" src="https://img-blog.csdnimg.cn/20190729232858663.png" width="693"></p> <p style="margin-left:0pt;">The server process shows up as SparkSubmit:</p> <p style="margin-left:0pt;"><img alt="" class="has" height="181" src="https://img-blog.csdnimg.cn/20190729232923760.png" width="609"></p> <p style="margin-left:0pt;">Start the client:</p> <p style="margin-left:0pt;"><img alt="" class="has" height="157" src="https://img-blog.csdnimg.cn/201907292329372.png" width="693"></p> <p style="margin-left:0pt;"> </p> </div> </div>
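<p style="margin-left:0pt;">Besides a command-line client, a program can reach the same Thrift server over JDBC. The following is only a rough sketch, assuming the Hive JDBC driver (hive-jdbc) is on the classpath and the server listens on hdp-03:10000 as configured above. The object name ThriftJdbcClient, the user name root, the empty password and the default database are placeholders for illustration, not values taken from the original setup.</p>
<pre class="has" name="code"><code class="hljs scala">import java.sql.DriverManager

// Hypothetical client; the object name is illustrative only.
object ThriftJdbcClient {
  def main(args: Array[String]): Unit = {
    // Register the Hive JDBC driver explicitly.
    Class.forName("org.apache.hive.jdbc.HiveDriver")

    // Host and port match hive.server2.thrift.bind.host and hive.server2.thrift.port.
    // The database "default", user "root" and empty password are assumptions.
    val url = "jdbc:hive2://hdp-03:10000/default"
    val conn = DriverManager.getConnection(url, "root", "")
    try {
      val stmt = conn.createStatement()
      // t_access1 is the table written earlier in this article (username, month, salary).
      val rs = stmt.executeQuery("select * from t_access1")
      while (rs.next()) {
        println(s"${rs.getString(1)}, ${rs.getString(2)}, ${rs.getInt(3)}")
      }
    } finally {
      // Always release the connection, even if the query fails.
      conn.close()
    }
  }
}</code></pre>
<p style="margin-left:0pt;">The query itself runs as a Spark job inside the server process, so the client only needs the JDBC driver and network access to the configured host and port.</p>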