概述
官网:
The ENABLE_EXPR_REWRITES query option controls whether to enable or disable the query compile time optimizations. These optimizations rewrite the expression trees to a more compact and optimized form that helps avoid redundant expression evaluation at run time. Performance optimizations controlled by this query option include:
- Constant folding (added in )
- Extracting common conjuncts from disjunctions (added in )
- Simplify conditionals with constant conditions (added in )
自Impala 2.8(CDH 5.10)版本引入,默认为1,即true。设置为0,即false,可以禁用性能优化。
问题
注:Java代码通过hive-jdbc提交SQL到CDH集群:
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>2.1.1-cdh6.2.1</version>
</dependency>
AnalysisException: select list expression not produced by aggregation output (missing from GROUP BY clause?)
报错SQL有点复杂,精简失败。在SQL可视化查询验证平台执行也是报错:
解决方案:在SQL第一行增加:set ENABLE_EXPR_REWRITES = 0;
AnalysisException: GROUP BY expression must not contain aggregate functions: cnt
报错SQL如下:
select concat(cast(cnt as string),'个城市使用') city_cnt,
cnt city_num,level1_name name,level2_name,count(1) cnt
from (
select commodity_id,count(1) cnt
from edw.dim_gy_area_commodity_chn
where tag in (0) and isactive = 1 and status = 0
and price < 10000 and is_valid = 1
group by 1 ) a
join edw.dim_gy_commodity_chn cc on a.commodity_id = cc.id and is_valid = 1
join edw.dim_gy_commodity_category ct on cc.category_code = ct.level3_code
group by 1,2,3,4
之前是执行成功的,在CDH集群升级到7.1.7版本后执行失败。
另外,这个SQL在公司内部的SQL执行平台(adhoc platform)是执行成功:
hive查询引擎是通过beeline
,impala查询引擎通过impala-shell
。程序代码是通过hive-jdbc提交SQL到大数据平台,版本号见文首。
解决方案:在SQL第一行增加:set ENABLE_EXPR_REWRITES = 0;
经过验证,升级hive-jdbc
版本到3.1.3000.7.1.7.67-1
,不能解决问题。
IllegalStateException: null
在公司adhoc查询平台也是执行失败:
经过简化的SQL如下:
drop table if exists project_b.insure_channel_campaign0 ;
create table project_b.insure_channel_campaign0 as
select
2 as campaign_link,
0 as is_pay
from project_b.insure_sem_channel_cost a
left join project_b.insure_order b
on a.dte = b.insert_date and b.channel_name = 'SEM' and b.campaign_name regexp '直加' and b.insert_date >= '2021-10-01'
where a.dte between '2021-10-01' and '2021-10-31'
group by 1,2
union all
select
2 as campaign_link,
0 as is_pay
from project_b.insure_sem_channel_cost a
left join project_b.insure_order b
on a.dte = b.insert_date and b.channel_name = 'SEM' and (b.campaign_name regexp '年金|开门红' or b.utm_campaign regexp '开门红') and b.insert_date >= '2021-10-01'
where a.dte between '2021-10-01' and '2021-12-31'
group by 1,2
;
另外把with子句抽取出来:
with core as (
-- 省略中间若干
)
where cnt >= 20
又报错:AnalysisException: GROUP BY: ordinal must be >= 1: -1
很诡异的是,后面执行这个SQL又没有报错!!!
不算结论的结论:一般地,group by 1,2,3,
遇到columns里面有常量数字的,就可能会报错。
解决方案:在SQL第一行增加:set ENABLE_EXPR_REWRITES = 0;
但是!!!有些用户写的SQL,通过设置set ENABLE_EXPR_REWRITES = 0;
,不一定可以解决问题:
解决方案:
联系用户手动修改其SQL:
具体而言 :
cast(2 as int) as campaign_link,
cast(0 as int) as is_pay
AnalysisException: GROUP BY: ordinal must be >= 1: 0
也是在旧版平台可以执行,升级CDH版本后不能执行的问题。SQL如下:
select a.dt
,0 as fin_guarantee_id
from fin.fin_di_warranty a
where substr(a.dt,1,7) = substr(to_date(date_sub(now(),1)),1,7)
group by 1,2;
解决方法:前置增加set ENABLE_EXPR_REWRITES = 0;
参考
版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。
文章由极客之音整理,本文链接:https://www.bmabk.com/index.php/post/142103.html