yarn-ats

Love The Way You Lie 2022-10-27 05:27

Log in to ZooKeeper and check whether the znode `/atsv2-hbase-secure/meta-region-server` exists.

    su - zookeeper
    kinit -kt /etc/security/keytabs/zk.service.keytab zookeeper/bg6.test.com.cn@HADOOP.COM
    sh /usr/hdp/3.1.0.0-78/zookeeper/bin/zkCli.sh -server bg6.test.com.cn:2181

Listing the ZooKeeper root shows that there is no `atsv2-hbase-secure` znode, so the error is inevitable.

    [zk: bg6.test.com.cn:2181(CONNECTED) 0] ls /
    [hive, cluster, brokers, infra-solr, kafka-acl, kafka-acl-changes, admin, isr_change_notification, log_dir_event_notification, kafka-acl-extended, rmstore, kafka-acl-extended-changes, consumers, latest_producer_id_block, hbase, registry, controller, zookeeper, delegation_token, hiveserver2, controller_epoch, hiveserver2-leader, kafka-manager, ambari-metrics-cluster, apache_atlas, config, kylin]

[Some problems encountered with an Ambari cluster][ambari _]: the confusing part of that article is that the path used in `hdfs dfs -mv /atsv2/hbase/tmp/` does not exist on this cluster.

    su - yarn
    kinit -kt /etc/security/keytabs/yarn.service.keytab yarn/bg3.test.com.cn@HADOOP.COM
    yarn app -list

    -bash-4.2$ yarn app -list
    21/02/03 11:11:39 INFO client.RMProxy: Connecting to ResourceManager at bg4.test.com.cn/10.128.2.171:8050
    21/02/03 11:11:39 INFO client.AHSProxy: Connecting to Application History server at bg3.test.com.cn/10.128.2.121:10200
    Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):2
    Application-Id                  Application-Name         Application-Type  User   Queue    State    Final-State  Progress  Tracking-URL
    application_1611552872177_1446  SparkSQL::10.128.2.210   SPARK             hbase  default  RUNNING  UNDEFINED    10%       http://bg1.test.com.cn:4041
    application_1611552872177_0144  Thrift JDBC/ODBC Server  SPARK             spark  default  RUNNING  UNDEFINED    10%       http://bg1.test.com.cn:4040

The list above contains no `ats-hbase` application. Switching to the `yarn-ats` user gives the same result: still no `ats-hbase`. The destroy step described in [Remove ats-hbase before switching between clusters][] therefore cannot be run, because the `ats-hbase` application simply does not exist.

    su - yarn-ats
    kinit -kt /etc/security/keytabs/yarn-ats.hbase-client.headless.keytab yarn-ats-test_data@HADOOP.COM

    [yarn-ats@bg7 ~]$ yarn app -list
    21/02/03 11:21:44 INFO client.RMProxy: Connecting to ResourceManager at bg4.test.com.cn/10.128.2.171:8050
    21/02/03 11:21:44 INFO client.AHSProxy: Connecting to Application History server at bg3.test.com.cn/10.128.2.121:10200
    Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):2
    Application-Id                  Application-Name         Application-Type  User   Queue    State    Final-State  Progress  Tracking-URL
    application_1611552872177_1446  SparkSQL::10.128.2.210   SPARK             hbase  default  RUNNING  UNDEFINED    10%       http://bg1.test.com.cn:4041
    application_1611552872177_0144  Thrift JDBC/ODBC Server  SPARK             spark  default  RUNNING  UNDEFINED    10%       http://bg1.test.com.cn:4040

Next, run:

    su - yarn-ats
    kinit -kt /etc/security/keytabs/yarn-ats.hbase-client.headless.keytab yarn-ats-test_data@HADOOP.COM
    hdfs dfs -rm -R ./3.1.0.0-78/*
    hdfs dfs -ls ./3.1.0.0-78/*
    exit
    su - hdfs
    kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-test_data@HADOOP.COM
    hadoop fs -rm -R /services/sync/yarn-ats/hbase.yarnfile

Following [Ambari HDFS startup errors: notes on problems encountered when starting an Ambari environment][ambari hdfs _Ambari], I changed a few settings, replacing YARN's `/atsv2-hbase-secure` with HBase's `/hbase`. Changing it directly causes problems.

According to [Yarn timeline service v2.0 starts successfully but querying logs fails: AbstractChannel$AnnotatedConnectException: Connection refused][Yarn timeline service v2.0_AbstractChannel_AnnotatedConnectException_ Connection refused], the configuration below is wrong, because yarn-ats should use its internal HBase rather than an external one.

    use_external_hbase=true
    is_hbase_system_service_launch=true

Run:

    su - hdfs
    kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-test_data@HADOOP.COM
    hdfs dfs -rm -r /atsv2
    hdfs dfs -ls /atsv2

Following [Configure External HBase for Timeline Service 2.0][], set `use_external_hbase` to true.

`tail -fn100 /appdata/home/hadoop/var/log/hadoop-yarn/yarn/hadoop-yarn-resourcemanager-bg4.test.com.cn.log`

Also worth reading: [Timeline Service 2.0 on HDP][HDP _ Timeline Service 2.0] and [Timeline Service v.2 (HDP 3.1): parameter configuration and related environment][Timeline Service v.2 _HDP3.1].

    su - hbase
    kinit -kt /etc/security/keytabs/hbase.headless.keytab hbase-test_data@HADOOP.COM
    hbase --config /etc/hadoop/3.1.0.0-78/0/embedded-yarn-ats-hbase shell

When everything is healthy, the output is:

    TABLE
    prod.timelineservice.app_flow
    prod.timelineservice.application
    prod.timelineservice.entity
    prod.timelineservice.flowactivity
    prod.timelineservice.flowrun
    prod.timelineservice.subapplication
    6 row(s)
    Took 0.0257 seconds
    => ["prod.timelineservice.app_flow", "prod.timelineservice.application", "prod.timelineservice.entity", "prod.timelineservice.flowactivity", "prod.timelineservice.flowrun", "prod.timelineservice.subapplication"]

When it is broken, the output is:

    ERROR: KeeperErrorCode = NoNode for /atsv2-hbase-secure/master

    Show cluster status. Can be 'summary', 'simple', 'detailed', or 'replication'. The default is 'summary'. Examples:

      hbase> status
      hbase> status 'simple'
      hbase> status 'summary'
      hbase> status 'detailed'
      hbase> status 'replication'
      hbase> status 'replication', 'source'
      hbase> status 'replication', 'sink'

With `is_hbase_system_service_launch` and `use_external_hbase` both set to false, the following exception is reported:

    2021-02-03 14:54:21,077 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=36, started=38336 ms ago, cancelled=false, msg=org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-secure/meta-region-server, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at null

[Big data development pitfalls][Link 1]

With that, the problem was solved. The root cause was that hbase.master.info.port, HBase Master Port, hbase.regionserver.info.port and hbase.regionserver.port under `Advanced yarn-hbase-site` must not be the same as the ports used by HBase, because from the point of view of `yarn-ats` the cluster's HBase is an external service.

In addition, the two switches must both be false:

    use_external_hbase=false
    is_hbase_system_service_launch=false

    [root@sh102 ~]# lsof -i:17010
    COMMAND   PID     USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    java    28419 yarn-ats  570u  IPv6 538518      0t0  TCP *:17010 (LISTEN)

Below is the `yarn-hbase-site` configuration in YARN:

![1][]

Below is the port configuration of `hbase-site` in HBase:

![1][1 1]


[ambari _]: https://www.cnblogs.com/fbiswt/p/12455364.html
[Remove ats-hbase before switching between clusters]: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.1/data-operating-system/content/remove_ats_hbase_before_switching_between_clusters.html
[ambari hdfs _Ambari]: https://blog.csdn.net/weixin_39946364/article/details/111857328
[Yarn timeline service v2.0_AbstractChannel_AnnotatedConnectException_ Connection refused]: https://blog.csdn.net/u011940366/article/details/107207455/
[Configure External HBase for Timeline Service 2.0]: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/data-operating-system/content/configure_hbase_for_timeline_service_2.0.html
[HDP _ Timeline Service 2.0]: https://blog.csdn.net/github_39577257/article/details/112727690
[Timeline Service v.2 _HDP3.1]: https://blog.51cto.com/1196740/2387464
[Link 1]: https://www.wyx.cloudns.asia/blog/2019/12/21/big_develop_tip
[1]: /images/20221024/f68c3266fa854b4e91e10f65d78b7d39.png
[1 1]: /images/20221024/41f333d0f6bc4f4f82b7498634aa0613.png
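The root cause found in this post, the embedded ATS HBase (`yarn-hbase-site`) sharing ports with the cluster HBase (`hbase-site`), can be checked mechanically before restarting anything. The following is only a minimal Python sketch and not part of the original troubleshooting: the function name `find_port_conflicts` is hypothetical, `hbase.master.port` is assumed to be the property that Ambari labels "HBase Master Port", and the sample port values are illustrative, not read from this cluster.

```python
# Properties the post names as the ones that must differ between
# yarn-hbase-site (embedded ATS HBase) and hbase-site (cluster HBase).
# Assumption: Ambari's "HBase Master Port" corresponds to hbase.master.port.
PORT_PROPERTIES = (
    "hbase.master.port",
    "hbase.master.info.port",
    "hbase.regionserver.port",
    "hbase.regionserver.info.port",
)


def find_port_conflicts(yarn_hbase_site, hbase_site):
    """Return {port: [(yarn_property, hbase_property), ...]} for shared ports.

    Both arguments are plain dicts of property name -> port (int or str).
    Every yarn-hbase-site port is compared with every hbase-site port, so a
    clash between different properties is reported as well.
    """
    conflicts = {}
    for yarn_prop in PORT_PROPERTIES:
        yarn_port = yarn_hbase_site.get(yarn_prop)
        if yarn_port is None:
            continue  # property not set in this config, nothing to compare
        for hbase_prop in PORT_PROPERTIES:
            hbase_port = hbase_site.get(hbase_prop)
            if hbase_port is not None and int(hbase_port) == int(yarn_port):
                conflicts.setdefault(int(yarn_port), []).append(
                    (yarn_prop, hbase_prop)
                )
    return conflicts


if __name__ == "__main__":
    # Illustrative values only; copy the real ones from the Ambari UI.
    hbase_site = {
        "hbase.master.port": 16000,
        "hbase.master.info.port": 16010,
        "hbase.regionserver.port": 16020,
        "hbase.regionserver.info.port": 16030,
    }
    yarn_hbase_site_clashing = dict(hbase_site)  # same ports as cluster HBase
    yarn_hbase_site_separate = {
        "hbase.master.port": 17000,
        "hbase.master.info.port": 17010,
        "hbase.regionserver.port": 17020,
        "hbase.regionserver.info.port": 17030,
    }
    print("clashing:", find_port_conflicts(yarn_hbase_site_clashing, hbase_site))
    print("separate:", find_port_conflicts(yarn_hbase_site_separate, hbase_site))
```

An empty result means the two configurations use disjoint ports. The check only compares configured values; whether a port is actually bound on a host still has to be confirmed with something like the `lsof -i:17010` call shown above.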