Atguigu Big Data Technology: Hive (New), Chapter 9 Enterprise-Level Tuning
Chapter 9 Enterprise-Level Tuning
9.1 Fetch
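Fetch means that Hive can answer certain queries without running a MapReduce job. For example, for SELECT * FROM employees; Hive can simply read the files in the storage directory of the employees table and print the query results to the console.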
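In hive-default.xml.template, hive.fetch.task.conversion defaults to more; older versions of Hive defaulted to minimal. With the property set to more, full-table selects, selects on specific columns, and LIMIT queries all skip MapReduce.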
<property>
    <name>hive.fetch.task.conversion</name>
    <value>more</value>
    <description>
      Expects one of [none, minimal, more].
      Some select queries can be converted to single FETCH task minimizing latency.
      Currently the query should be single sourced not having any subquery and should not have
      any aggregations or distincts (which incurs RS), lateral views and joins.
      0. none : disable hive.fetch.task.conversion
      1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
      2. more  : SELECT, FILTER, LIMIT only (support TABLESAMPLE and virtual columns)
    </description>
</property>
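Before changing the property, it can help to check which value is actually in effect. The following is a minimal sketch, assuming an interactive Hive CLI or Beeline session: set followed by a property name and no value prints the current setting, and set with a value overrides it for the current session only.

```sql
-- Print the value currently in effect for this session.
set hive.fetch.task.conversion;

-- Override it for the current session only; the XML configuration files are not modified.
set hive.fetch.task.conversion=more;
```

A session-level override like this is convenient for experiments, because it disappears when the session ends and does not affect other users.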
Hands-on example:
1) Set hive.fetch.task.conversion to none and run the queries below; every one of them launches a MapReduce job.
hive (default)> set hive.fetch.task.conversion=none;
hive (default)> select * from emp;
hive (default)> select ename from emp;
hive (default)> select ename from emp limit 3;
2) Set hive.fetch.task.conversion to more and run the same queries; none of them launches a MapReduce job.
hive (default)> set hive.fetch.task.conversion=more;
hive (default)> select * from emp;
hive (default)> select ename from emp;
hive (default)> select ename from emp limit 3;
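One way to confirm how a query will be executed without running it is to look at its plan. The sketch below assumes the same emp table and uses EXPLAIN; the exact plan text varies by Hive version and execution engine, so no output is shown. The second query is an added counter-example: the property description above excludes distincts, so it should still be planned as a cluster job even with more.

```sql
set hive.fetch.task.conversion=more;

-- Simple projection with LIMIT: expected to be planned as a fetch task only.
explain select ename from emp limit 3;

-- DISTINCT requires a shuffle, so per the property description it is not
-- eligible for fetch conversion and should still include a job stage.
explain select distinct ename from emp;
```

Running the same EXPLAIN statements after set hive.fetch.task.conversion=none; should show a job stage for both queries, which makes the effect of the property easy to compare.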