AlexRomeo
发布于 2024-06-03 / 79 阅读
0
0

MDX-for-Kylin-下载与部署

介绍

MDX for Kylin官网:https://github.com/Kyligence/mdx-kylin

MDX for Kylin 是基于 Mondrian 二次开发的、由 Kyligence 贡献的、使用 Apache Kylin 作为数据源的 MDX 查询引擎 。MDX for Kylin 的使用体验比较接近 Microsoft SSAS,可以集成多种数据分析工具,包括 Microsoft Excel、Tableau 等,可以为大数据分析场景下提供更极致的体验。

**MDX for Kylin 是在决策支持和业务分析中使用的分析数据引擎。MDX for Kylin 助力消除数据孤岛,统一数据口径,提高数据业务转化效率,提升运营决策能力。欢迎阅读 **技术文章 了解更多。

MDX for Kylin 相对其它开源 MDX 查询引擎,具有以下优势:

  • 更好支持 BI (Excel/Tableau/Power BI 等) 产品,适配 XMLA 协议;
  • 针对 BI 的 MDX Query 进行了特定优化重写;
  • 适配 Kylin 查询,通过 Kylin 的预计算能力加速 MDX 查询;
  • 通过简洁易懂的操作界面,提供了统一的指标定义和管理能力。

语义层 - 强大的业务语义层和MDX查询接口

MDX for Kylin 能自动同步 Kylin 中的模型,并基于这些模型进行语义定义,将数据模型转换为业务友好的语言,赋予数据业务价值。MDX for Kylin 语义层为业务提供统一的分析指标库,支持年累计,月累计,同比环比等复杂业务计算,统一业务数据口径。MDX for Kylin 语义层还提供了 MDX 查询接口,可以对接 Excel、Tableau、Smartbi 等 BI 工具进行多维分析。

细粒度权限管控

MDX for Kylin 提供了对于数据集所有语义信息的权限管控,满足分析平台对于数据安全管理的需求。

用户手册和源代码获取

在相关快速介绍和使用的基础上,Apache Kylin 社区提供了关于 MDX for Kylin 的用户使用手册。更多关于 MDX for Kylin 的介绍和使用,请参考:

https://kyligence.github.io/mdx-kylin/zh-hans/

源代码基于 Apache License V2 开源,源代码地址:

https://github.com/Kyligence/mdx-kylin

下载与安装

官方部署文档

MDX for Kylin 需要准备一个部署好的 Apache Kylin,Kylin 的适配版本等注意事项请查看:

https://github.com/Kyligence/mdx-kylin/issues/1

在准备好适配的 Kylin 后,大家可以参考手册来完成 MDX for Kylin 的安装和部署:

https://kyligence.github.io/mdx-kylin/zh-hans/installation/

简易部署文档

获取MDX程序及MDX for Kylin.jar包的地址,后面还会继续下载,这里只做一个记录。

# MDX
wget https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/tar/mdx-for-kylin-1.0.0-beta.tar.gz
# MDX for Kylin
wget https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/jars/kylin-server-base-4.0.0.jar

获取到程序及jar包后,启动kylin(因为当前是kylin在docker里面运行的,所以不需要执行下面的命令)

${KYLIN_HOME}/bin/kylin.sh start

替换kylin的jar包

cd ${KYLIN_HOME}/tomcat/webapps/kylin/WEB-IN
mv kylin-server-base-4.0.0.jar kylin-server-base-4.0.0.jar.bak
wget https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/jars/kylin-server-base-4.0.0.jar

修改tomcat目录下的jar包后,需要重新启动tomcat,也就意味着重新启动kylin

${KYLIN_HOME}/bin/kylin.sh restart

下载MDX for Kylin

首先进入kylin所在的docker

# 进入容器
docker exec -it hive-dw /bin/bash

# 下载MDX
wget https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/tar/mdx-for-kylin-1.0.0-beta.tar.gz

# 解压
tar -xvf mdx-for-kylin-1.0.0-beta.tar.gz

创建mdx需要的数据库

在mysql中创建mdx库,并执行脚本,可以把下面的内容保存成一个脚本文件(文件存放在E:\HK_Workspace\01授课\北城\2023年\上\数据仓库原理与实践\07-授课相关\mdx建表语句/mdx.sql):

create database if not exists mdx;
use mdx;
CREATE TABLE `calculate_measure` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dataset_id` int(11) NOT NULL,
  `name` varchar(300) NOT NULL,
  `expression` varchar(5000) NOT NULL,
  `format` varchar(150) NOT NULL,
  `format_type` varchar(50) DEFAULT '' COMMENT 'format type, such as currency',
  `measure_folder` varchar(200) NOT NULL DEFAULT '' COMMENT 'the folder which CM belongs to',
  `extend` varchar(3000) NOT NULL DEFAULT '',
  `visible_flag` tinyint(1) NOT NULL DEFAULT '1',
  `translation` varchar(500) DEFAULT NULL,
  `subfolder` varchar(100) DEFAULT NULL,
  `non_empty_behavior` varchar(3000) NOT NULL DEFAULT '[]',
  PRIMARY KEY (`id`),
  KEY `dataset_id_key` (`dataset_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='calculation measure table';

CREATE TABLE `common_dim_relation` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dataset_id` int(11) NOT NULL,
  `model` varchar(100) NOT NULL,
  `model_related` varchar(100) NOT NULL,
  `relation` varchar(2000) NOT NULL COMMENT 'the relation between two models, like dim_t1=dim_t1,dim_t2=dim_t2',
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_pk` (`dataset_id`,`model`,`model_related`),
  KEY `dataset_id_key` (`dataset_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='common dimension among models';

CREATE TABLE `custom_hierarchy` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dataset_id` int(11) NOT NULL,
  `model` varchar(100) NOT NULL,
  `dim_table` varchar(300) NOT NULL COMMENT 'such as:DEFAULT.KYLIN_SALES',
  `name` varchar(300) NOT NULL,
  `dim_col` varchar(300) NOT NULL,
  `description` varchar(1000) NOT NULL DEFAULT '',
  `translation` varchar(500) DEFAULT NULL,
  `weight_col` varchar(300) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `dataset_id_key` (`dataset_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='custom hierarchy table';

CREATE TABLE `dataset` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `project` varchar(100) NOT NULL,
  `dataset` varchar(100) NOT NULL,
  `status` varchar(10) NOT NULL DEFAULT 'NORMAL' COMMENT 'dataset status: NORMAL | BROKEN',
  `broken_msg` text COMMENT 'the broken detail info when dataset is broken',
  `canvas` varchar(5000) NOT NULL DEFAULT '',
  `front_v` varchar(20) NOT NULL DEFAULT 'v0.1',
  `create_user` varchar(255) NOT NULL,
  `create_time` bigint(20) NOT NULL DEFAULT '0',
  `modify_time` bigint(20) NOT NULL DEFAULT '0',
  `extend` varchar(5000) NOT NULL DEFAULT '',
  `translation_types` varchar(100) DEFAULT NULL,
  `access` int(11) DEFAULT '0' COMMENT '0: allow_list, 1:block_list',
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_pk` (`project`,`dataset`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='semantic-dataset';

CREATE TABLE `dim_table_model_rel` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dataset_id` int(11) NOT NULL,
  `model` varchar(100) NOT NULL,
  `dim_table` varchar(300) NOT NULL,
  `relation` int(11) NOT NULL COMMENT 'joint type,0:joint | 1:not joint | 2: many to many',
  `intermediate_dim_table` varchar(300) NOT NULL DEFAULT '' COMMENT 'when relation is 2,need to specify it',
  `primary_dim_col` varchar(300) NOT NULL DEFAULT '' COMMENT 'when relation is 2,need to specify it',
  PRIMARY KEY (`id`),
  KEY `dataset_id_key` (`dataset_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='the relationship between dimension table and model';

CREATE TABLE `mdx_info` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `mdx_version` varchar(50) DEFAULT NULL,
  `create_time` bigint(20) NOT NULL DEFAULT '0',
  `modify_time` bigint(20) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 COMMENT='mdx info';

CREATE TABLE `mdx_query` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `mdx_query_id` char(36) NOT NULL,
  `mdx_text` text NOT NULL,
  `start` bigint(20) DEFAULT NULL,
  `total_execution_time` bigint(20) DEFAULT NULL,
  `username` varchar(255) NOT NULL,
  `success` tinyint(1) NOT NULL DEFAULT '0',
  `project` varchar(255) NOT NULL,
  `application` varchar(64) NOT NULL,
  `mdx_cache_used` tinyint(1) NOT NULL DEFAULT '0',
  `before_connection` bigint(20) DEFAULT NULL,
  `connection` bigint(20) DEFAULT NULL,
  `hierarchy_load` bigint(20) DEFAULT NULL,
  `olaplayout_construction` bigint(20) DEFAULT NULL,
  `aggregationqueries_construction` bigint(20) DEFAULT NULL,
  `aggregationqueries_execution` bigint(20) DEFAULT NULL,
  `otherresult_construction` bigint(20) DEFAULT NULL,
  `network_package` int(11) DEFAULT NULL,
  `timeout` tinyint(1) NOT NULL DEFAULT '0',
  `message` text,
  `calculate_axes` bigint(20) DEFAULT NULL,
  `calculate_cell` bigint(20) DEFAULT NULL,
  `calculate_cellrequest_num` bigint(20) DEFAULT NULL,
  `create_rolapresult` bigint(20) DEFAULT NULL,
  `create_multidimensional_dataset` bigint(20) DEFAULT NULL,
  `marshall_soap_message` bigint(20) DEFAULT NULL,
  `dataset_name` varchar(255) NOT NULL,
  `gateway` tinyint(1) NOT NULL DEFAULT '0',
  `other_used` tinyint(1) NOT NULL DEFAULT '0',
  `node` varchar(64) DEFAULT NULL,
  `reserved_field_1` varchar(2000) DEFAULT NULL,
  `reserved_field_2` varchar(2000) DEFAULT NULL,
  `reserved_field_3` varchar(5000) DEFAULT NULL,
  `reserved_field_4` varchar(5000) DEFAULT NULL,
  `reserved_field_5` text,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_pk` (`mdx_query_id`),
  KEY `mdx_query_id_key` (`mdx_query_id`),
  KEY `project_key` (`project`),
  KEY `start_key` (`start`),
  KEY `success_key` (`success`),
  KEY `total_execution_time_key` (`total_execution_time`),
  KEY `dataset_name_key` (`dataset_name`),
  KEY `node_key` (`node`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='MDX QUERY table';

CREATE TABLE `measure_group` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dataset_id` int(11) NOT NULL,
  `name` varchar(300) NOT NULL,
  `dim_table` varchar(300) NOT NULL DEFAULT '',
  `calculate_measure` varchar(300) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `dataset_id_key` (`dataset_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='measure group info table';

CREATE TABLE `named_dim_col` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dataset_id` int(11) NOT NULL,
  `model` varchar(100) NOT NULL,
  `dim_table` varchar(300) NOT NULL COMMENT 'KYLIN_SALES',
  `dim_col` varchar(300) NOT NULL,
  `dim_col_alias` varchar(300) NOT NULL,
  `col_type` int(11) NOT NULL DEFAULT '0' COMMENT '0:default, 1:levelYears, 2:levelQuarters, 3:levelMonths, 4:levelWeeks',
  `data_type` varchar(50) NOT NULL DEFAULT '',
  `extend` varchar(3000) NOT NULL DEFAULT '',
  `visible_flag` tinyint(1) NOT NULL DEFAULT '1',
  `name_column` varchar(300) DEFAULT NULL,
  `value_column` varchar(300) DEFAULT NULL,
  `translation` varchar(500) DEFAULT NULL,
  `subfolder` varchar(100) DEFAULT NULL,
  `default_member` varchar(1000) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `dataset_id_key` (`dataset_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='table column alias and type info';

CREATE TABLE `named_dim_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dataset_id` int(11) NOT NULL,
  `model` varchar(100) NOT NULL,
  `dim_table` varchar(300) NOT NULL COMMENT 'KYLIN_SALES',
  `dim_table_alias` varchar(300) NOT NULL DEFAULT '',
  `dim_table_type` varchar(100) NOT NULL DEFAULT 'regular' COMMENT 'values: regular|time',
  `actual_table` varchar(300) NOT NULL DEFAULT '',
  `fact_table` varchar(300) NOT NULL DEFAULT '',
  `translation` varchar(500) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `dataset_id_key` (`dataset_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='dim table alias and type info';

CREATE TABLE `named_measure` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dataset_id` int(11) NOT NULL,
  `model` varchar(100) NOT NULL,
  `name` varchar(300) NOT NULL,
  `alias` varchar(300) NOT NULL COMMENT 'measure name renamed in MDX',
  `dim_column` varchar(300) NOT NULL DEFAULT '',
  `data_type` varchar(50) NOT NULL DEFAULT '',
  `expression` varchar(200) NOT NULL DEFAULT '',
  `extend` varchar(3000) NOT NULL DEFAULT '',
  `visible_flag` tinyint(1) NOT NULL DEFAULT '1',
  `translation` varchar(500) DEFAULT NULL,
  `format` varchar(150) DEFAULT NULL,
  `format_type` varchar(50) DEFAULT '' COMMENT 'format type, such as currency',
  `subfolder` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `dataset_id_key` (`dataset_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='measure alias info';

CREATE TABLE `named_set` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dataset_id` int(11) NOT NULL,
  `name` varchar(300) NOT NULL,
  `expression` varchar(5000) NOT NULL,
  `folder` varchar(200) NOT NULL DEFAULT '' COMMENT 'the folder which named set belongs to',
  `location` varchar(300) NOT NULL DEFAULT '' COMMENT 'the location which named set locates',
  `extend` varchar(3000) NOT NULL DEFAULT '',
  `visible_flag` tinyint(1) NOT NULL DEFAULT '1',
  `translation` varchar(500) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `dataset_id_key` (`dataset_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='named set info';
 
CREATE TABLE `role_info` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) NOT NULL,
  `extend` varchar(20000) NOT NULL DEFAULT '',
  `description` varchar(500) NOT NULL,
  `create_time` bigint(20) NOT NULL DEFAULT '0',
  `modify_time` bigint(20) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `name_key` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 COMMENT='role info';

CREATE TABLE `sql_query` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `mdx_query_id` char(36) NOT NULL,
  `sql_text` text NOT NULL,
  `sql_execution_time` bigint(20) DEFAULT NULL,
  `sql_cache_used` tinyint(1) NOT NULL DEFAULT '0',
  `exec_status` tinyint(1) NOT NULL DEFAULT '1',
  `ke_query_id` varchar(100) DEFAULT NULL,
  `reserved_field_1` varchar(2000) DEFAULT NULL,
  `reserved_field_2` varchar(2000) DEFAULT NULL,
  `reserved_field_3` varchar(5000) DEFAULT NULL,
  `reserved_field_4` varchar(5000) DEFAULT NULL,
  `reserved_field_5` text,
  PRIMARY KEY (`id`),
  KEY `mdx_query_id_key` (`mdx_query_id`),
  CONSTRAINT `sql_mdx_query_id` FOREIGN KEY (`mdx_query_id`) REFERENCES `mdx_query` (`mdx_query_id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='SQL QUERY table';

CREATE TABLE `user_info` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `username` varchar(255) NOT NULL,
  `password` varchar(4096) NOT NULL,
  `active` int(11) NOT NULL,
  `license_auth` int(11) NOT NULL,
  `login_count` int(11) NOT NULL DEFAULT '0',
  `last_login` bigint(20) NOT NULL DEFAULT '0',
  `create_time` bigint(20) NOT NULL DEFAULT '0',
  `conf_usr` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `username_key` (`username`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='user information';

在容器中执行命令:

# 需要注意mdx.sql文件的位置,如果不在当前目录下需要在前面加入路径
mysql -uroot -p123456 < mdx.sql

因为mdx需要去连接kylin,所以要通过修改mdx的配置文件把kylin的信息填写进去,文件在mdx的conf目录下。(其实下面的内容配置文件本身就有。可以逐个查找。)

# 打开文件
vim insight.properties

# 处理文件内容
# 这里是kylin配置信息
insight.kylin.host=localhost
insight.kylin.port=7070
# 这里是数据库配置信息
insight.database.type=mysql
insight.database.username=root
insight.database.ip=localhost
insight.database.name=mdx
insight.database.port=3306
# 不要修改这个密码字符串,这是个加密后的
insight.database.password=698d2c7907fc9b6dbe7f8a8c4cb0297a
# 这里是mdx的配置信息
insight.mdx.cluster.nodes=127.0.0.1:7080
insight.semantic.datasource-version=2
insight.semantic.port=7080
insight.mdx.jvm.xms=-Xms2g
insight.mdx.jvm.xmx=-Xmx2g

进入MDX目录,启动MDX

cd /home/admin/mdx-for-kylin-1.0.0-beta/bin/
./mdx.sh start

在首次启动时,会出现****Can't synchronize metadata. 此时需要登录界面,进行配置

Can't synchronize metadata. The connection user information or password maybe empty or has been changed, please contact system admin to update in Configuration page under Management. Please see logs/semantic.log for detail.

访问mdx

通过浏览器访问

http://192.168.159.101:7080/login

登录的账号和密码是admin/KYLIN

image-20230509224424712

**首次访问的时候强制使用kylin的用户名和密码(默认admin和KYLIN),**注意这里不是要求修改密码!

image-20230509224546152

点击ok后会看到同步成功,此后再启动mdx时就会恢复正常。

回到容器中重启mdx

./mdx.sh restart

返回的信息

Working directory: /home/admin/mdx-for-kylin-1.0.0-beta
MDX context path: /
Stop MDX for Kylin...
.
Semantic service has been stopped, pid=25025.
MDX for Kylin is now stopped.
Start MDX for Kylin...
MDX_HOME=/home/admin/mdx-for-kylin-1.0.0-beta
MDX_CONF=/home/admin/mdx-for-kylin-1.0.0-beta/conf
JAVA_HOME has been set,it's /home/admin/jdk1.8.0_141
JAVA=/home/admin/jdk1.8.0_141/bin/java
Use database: mysql
JVM memory minimum set as : -Xms2g
JVM memory maximum set as : -Xmx2g
Use /home/admin/mdx-for-kylin-1.0.0-beta/conf/log4j2.xml 
Time zone is specified as: GMT+8:00
Semantic service is starting at port 7080, please check the log at logs/semantic.log.
..........
MDX for Kylin is started. Now you can visit http://0.0.0.0:7080/ to explore.
Starting syncing metadata...
Successfully starting syncing metadata...

至此,MDX for Kylin安装完成。


评论