vim /home/hadoop//Linkis20191218/metadata/target/classes/com/webank/wedatasphere/linkis/metadata/hive/dao/impl/HiveMetaDao.xml
<selectid="getDbsByUser"resultType="java.lang.String"parameterType="java.lang.String">
<!--select NAME from( select t2.NAME as NAME from DB_PRIVS t1, DBS t2 where (lcase(t1.PRINCIPAL_NAME) = #{userName,jdbcType=VARCHAR} OR t1.PRINCIPAL_NAME IN (SELECT ROLE FROM(SELECT r.ROLE_NAME AS ROLE, u.PRINCIPAL_NAME AS USER FROM ROLES r LEFT JOIN (SELECT * FROM ROLE_MAP WHERE PRINCIPAL_TYPE = 'USER') u ON r.ROLE_ID = u.ROLE_ID)AS T where T.USER = #{userName,jdbcType=VARCHAR})) and lcase(t1.DB_PRIV) in ('select','all') and t1.DB_ID =t2.DB_ID union all select t3.NAME as NAME from TBL_PRIVS t1, TBLS t2 , DBS t3 where t1.TBL_ID=t2.TBL_ID and lcase(t1.TBL_PRIV) in ('select','all') and ( lcase(t1.PRINCIPAL_NAME) = #{userName,jdbcType=VARCHAR} or lcase(t1.PRINCIPAL_NAME) in (SELECT ROLE FROM(SELECT r.ROLE_NAME AS ROLE, u.PRINCIPAL_NAME AS USER FROM ROLES r LEFT JOIN (SELECT * FROM ROLE_MAP WHERE PRINCIPAL_TYPE = 'USER') u ON r.ROLE_ID = u.ROLE_ID)AS T where T.USER = #{userName,jdbcType=VARCHAR})) and t2.DB_ID=t3.DB_ID) a GROUP BY NAME order by NAME-->
select name from DBS
</select>
<selectid="getTablesByDbNameAndUser"resultType="map"parameterType="map">
<!--select t2.TBL_NAME as NAME, t2.TBL_TYPE as TYPE, t2.CREATE_TIME as CREATE_TIME, t2.LAST_ACCESS_TIME as LAST_ACCESS_TIME, t2.OWNER as OWNER from DB_PRIVS t1,TBLS t2, DBS t3 where t1.DB_ID =t3.DB_ID and t2.DB_ID=t3.DB_ID and lcase(t1.DB_PRIV) in ('select','all') and lcase(t1.PRINCIPAL_NAME) = #{userName,jdbcType=VARCHAR} and t3.NAME = #{dbName,jdbcType=VARCHAR} union select t2.TBL_NAME as NAME, t2.TBL_TYPE as TYPE, t2.CREATE_TIME as CREATE_TIME, t2.LAST_ACCESS_TIME as LAST_ACCESS_TIME, t2.OWNER as OWNER from DB_PRIVS t1,TBLS t2, DBS t3 where t1.DB_ID =t3.DB_ID and t2.DB_ID=t3.DB_ID and lcase(t1.DB_PRIV) in ('select','all') and lcase(t1.PRINCIPAL_NAME) in (select ROLE_NAME from ROLES where ROLE_ID in (select ROLE_ID from ROLE_MAP where PRINCIPAL_NAME = #{userName,jdbcType=VARCHAR})) and t3.NAME = #{dbName,jdbcType=VARCHAR} union select t2.TBL_NAME as NAME, t2.TBL_TYPE as TYPE, t2.CREATE_TIME as CREATE_TIME, t2.LAST_ACCESS_TIME as LAST_ACCESS_TIME, t2.OWNER as OWNER from TBL_PRIVS t1, TBLS t2 , DBS t3 where t1.TBL_ID=t2.TBL_ID and t2.DB_ID=t3.DB_ID and lcase(t1.TBL_PRIV) in ('select','all') and t1.PRINCIPAL_NAME = #{userName,jdbcType=VARCHAR} and t3.NAME = #{dbName,jdbcType=VARCHAR} union select t2.TBL_NAME as NAME, t2.TBL_TYPE as TYPE, t2.CREATE_TIME as CREATE_TIME, t2.LAST_ACCESS_TIME as LAST_ACCESS_TIME, t2.OWNER as OWNER from TBL_PRIVS t1, TBLS t2 , DBS t3 where t1.TBL_ID=t2.TBL_ID and t2.DB_ID=t3.DB_ID and lcase(t1.TBL_PRIV) in ('select','all') and t1.PRINCIPAL_NAME in (select ROLE_NAME from ROLES where ROLE_ID in (select ROLE_ID from ROLE_MAP where PRINCIPAL_NAME = #{userName,jdbcType=VARCHAR})) and t3.NAME = #{dbName,jdbcType=VARCHAR} order by NAME;-->
select t2.TBL_NAME as NAME, t2.TBL_TYPE as TYPE, t2.CREATE_TIME as CREATE_TIME, t2.LAST_ACCESS_TIME as LAST_ACCESS_TIME, t2.OWNER as OWNER
from TBLS t2 , DBS t3
where
t2.DB_ID=t3.DB_ID
and t3.NAME = #{dbName,jdbcType=VARCHAR}
</select>
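The simplified queries above drop the per-user permission filtering of the commented-out originals and simply list every database (and every table of the requested database) in the metastore. As a quick sanity check you can run the same statement directly against the metastore; the sketch below is hypothetical and assumes a MySQL metastore whose schema and user are both named hive (adjust host and credentials to your environment):

# Hypothetical check against the Hive metastore (MySQL metastore, schema "hive" assumed).
mysql -h metastore-host -u hive -p hive -e "SELECT NAME FROM DBS;"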
1. Application Scenario
The big data management platform DMP of 珑玺科技 is built on the Hadoop ecosystem, around tools such as Hive and Spark. Previously all related tasks were implemented as shell scripts run by scheduled jobs; as business requirements became more complex, the accumulated scripts became harder and harder to maintain, raising the cost of later iterations and of communication between project members. At that point we came across this project that WeBank had just open-sourced, which lets a single platform cover the whole big data workflow: data collection, deduplication, cleansing, normalization and so on. Almost all of the work in data governance can now flow smoothly through DSS, improving the overall experience of big data management and analysis. Our DMP platform is integrated mainly on Hadoop 2.6, with its functionality focused on metadata processing, data deduplication, cleansing and standardization, implementing the platform's OLAP workflows. The business requirements we ultimately delivered fall into four main areas:

2. Problems Solved
Multiple data source support

The main data sources the platform supports are MySQL and Hive, and it even supports TiDB, a distributed database from the newest NewSQL generation, as well as other third-party data formats. Before adopting this platform we had to perform frequent data conversions, and the context-switching cost was fairly high; after introducing this system, the data exchange module integrated in the platform lets us bring in all kinds of data sources for cross analysis very smoothly, which has improved the efficiency of managing multiple data sources.

Data script development
Before DSS was deployed, every task on the platform was a hand-written shell script implementing the big data analysis flow. As the business iterated and requirements grew, the maintainability of those scripts got worse and worse. With the data development module Scriptis, which is fully compatible with hql, MySQL and PySpark, we could reuse almost all of our previous scripts, and both the execution process and the results are visualized, which has improved the efficiency of data analysis work.

3. Best Practices
Reading data from Alibaba Cloud OSS
Most of our data is stored in Alibaba Cloud Object Storage Service (OSS), so the jar files for reading OSS have to be configured separately: sync the Alibaba Cloud OSS jars into the lib directory of linkis.

Attachment: notes on integrating Alibaba Cloud OSS with CDH
$ cd aliyun-oss-jars/
$ ls -l
total 2932
-rw-r--r-- 1 hadoop hadoop 116337 Jan  2 10:59 aliyun-java-sdk-core-3.4.0.jar
-rw-r--r-- 1 hadoop hadoop 788137 Jan  2 10:59 aliyun-java-sdk-ecs-4.2.0.jar
-rw-r--r-- 1 hadoop hadoop 215492 Jan  2 10:59 aliyun-java-sdk-ram-3.0.0.jar
-rw-r--r-- 1 hadoop hadoop  13277 Jan  2 10:59 aliyun-java-sdk-sts-3.0.0.jar
-rw-r--r-- 1 hadoop hadoop 562719 Jan  2 10:59 aliyun-sdk-oss-3.4.1.jar
-rw-r--r-- 1 hadoop hadoop  71074 Jan  2 15:12 hadoop-aliyun-2.6.0-cdh5.14.4.jar
-rw-r--r-- 1 hadoop hadoop 736658 Jan  2 15:10 httpclient-4.5.2.jar
-rw-r--r-- 1 hadoop hadoop 326724 Jan  2 15:10 httpcore-4.4.4.jar
-rw-r--r-- 1 hadoop hadoop 153115 Jan  2 15:10 jdom-1.1.jar
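On the CDH side, a minimal sketch (the parcel path and the bucket name are assumptions, not from the original post) is to put these jars on the Hadoop classpath and then verify that the filesystem commands can see the OSS bucket once the fs.oss.* credentials are configured in core-site.xml:

# Assumed CDH parcel layout; adjust to your installation.
sudo cp aliyun-oss-jars/*.jar /opt/cloudera/parcels/CDH/lib/hadoop/
# Requires fs.oss.endpoint / fs.oss.accessKeyId / fs.oss.accessKeySecret and
# fs.oss.impl=org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem in core-site.xml.
hadoop fs -ls oss://your-bucket/   # "your-bucket" is a placeholder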
Sync the Alibaba Cloud libraries to DSS:
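A minimal sketch of this sync step, assuming each Linkis/DSS microservice keeps its dependencies in its own lib directory (the deployment path below is a placeholder):

# Hypothetical paths; copy the OSS jars into the lib directory of every microservice
# that needs to read OSS, then restart those services.
for libdir in /path/to/linkis/*/lib; do
  cp aliyun-oss-jars/*.jar "$libdir"/
done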
The right-hand panel in Scriptis fails to load data and keeps refreshing
Step 1: Modify the file (the vim command and the HiveMetaDao.xml changes shown at the top of this post)
Step 2: Enter Linkis-20191218/metadata and recompile
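A sketch of the rebuild, assuming the module is built with Maven as in the upstream Linkis source tree:

cd /home/hadoop/Linkis20191218/metadata
# Skip tests for a quick rebuild; "mvn clean package" also works if you only need the jar.
mvn clean install -DskipTests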
Step 3: Confirm that the corresponding package has been updated
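To confirm the rebuilt artifact actually picked up the change, compare timestamps (the jar name glob is an assumption; the class-path location is the one shown at the top of this post):

ls -l /home/hadoop/Linkis20191218/metadata/target/classes/com/webank/wedatasphere/linkis/metadata/hive/dao/impl/HiveMetaDao.xml
ls -l /home/hadoop/Linkis20191218/metadata/target/*.jar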
Step 4: Restart all the linkis-metadata microservices
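A hypothetical restart, assuming the per-service start/stop scripts shipped with the deployment (script names and paths vary between Linkis versions, so adjust to your installation):

cd /path/to/linkis/linkis-metadata/bin   # placeholder deployment path
sh stop-metadata.sh                      # assumed script name
sh start-metadata.sh                     # assumed script name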
Error code: errCode: 10905
Error log
Solution:
Go to the linkis deployment directory and delete the packages of all the microservices below.
Software versions:
Hadoop 2.6.0
MySQL 5.7.26
Hive 1.1.0
HBase 1.2.0
Spark2 2.2.0