Hive - Parquet format - OR clause in where not working as expected
I am creating a table and inserting data like below
CREATE TABLE `mydb.mytable`(
`seq_num` decimal(18,0),
`ins_upd_flag` char(1)
)
stored as parquet;
INSERT OVERWRITE TABLE mydb.mytable
SELECT 10457 SEQ_NUM,'I' INS_UPD_FLAG UNION ALL
SELECT 10462,'I' UNION ALL
SELECT 10461,'I' UNION ALL
SELECT 10460,'I' UNION ALL
SELECT 10459,'I' UNION ALL
SELECT 10456,'I' UNION ALL
SELECT 10458,'I';
As you can see above, the second column value is just 'I' and it does not have any extra characters.
When I run the following query, it does not result anything
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 OR INS_UPD_FLAG <> 'I' ;
However, if I run the below query it returns all the inserted rows as expected
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG)) -- both length =1
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 OR TRIM(INS_UPD_FLAG) <> 'I' ;
Any explanation on why this is happening and how to resolve the same? I need the results without using trim
Please note that this issue is happening only with OR condition. If I run the below query, I am getting results
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 and INS_UPD_FLAG = 'I' ;
Also, if I create table with ORC or Text Format, I am getting results without trim.
hive hiveql hadoop2
add a comment |
I am creating a table and inserting data like below
CREATE TABLE `mydb.mytable`(
`seq_num` decimal(18,0),
`ins_upd_flag` char(1)
)
stored as parquet;
INSERT OVERWRITE TABLE mydb.mytable
SELECT 10457 SEQ_NUM,'I' INS_UPD_FLAG UNION ALL
SELECT 10462,'I' UNION ALL
SELECT 10461,'I' UNION ALL
SELECT 10460,'I' UNION ALL
SELECT 10459,'I' UNION ALL
SELECT 10456,'I' UNION ALL
SELECT 10458,'I';
As you can see above, the second column value is just 'I' and it does not have any extra characters.
When I run the following query, it does not result anything
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 OR INS_UPD_FLAG <> 'I' ;
However, if I run the below query it returns all the inserted rows as expected
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG)) -- both length =1
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 OR TRIM(INS_UPD_FLAG) <> 'I' ;
Any explanation on why this is happening and how to resolve the same? I need the results without using trim
Please note that this issue is happening only with OR condition. If I run the below query, I am getting results
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 and INS_UPD_FLAG = 'I' ;
Also, if I create table with ORC or Text Format, I am getting results without trim.
hive hiveql hadoop2
add a comment |
I am creating a table and inserting data like below
CREATE TABLE `mydb.mytable`(
`seq_num` decimal(18,0),
`ins_upd_flag` char(1)
)
stored as parquet;
INSERT OVERWRITE TABLE mydb.mytable
SELECT 10457 SEQ_NUM,'I' INS_UPD_FLAG UNION ALL
SELECT 10462,'I' UNION ALL
SELECT 10461,'I' UNION ALL
SELECT 10460,'I' UNION ALL
SELECT 10459,'I' UNION ALL
SELECT 10456,'I' UNION ALL
SELECT 10458,'I';
As you can see above, the second column value is just 'I' and it does not have any extra characters.
When I run the following query, it does not result anything
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 OR INS_UPD_FLAG <> 'I' ;
However, if I run the below query it returns all the inserted rows as expected
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG)) -- both length =1
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 OR TRIM(INS_UPD_FLAG) <> 'I' ;
Any explanation on why this is happening and how to resolve the same? I need the results without using trim
Please note that this issue is happening only with OR condition. If I run the below query, I am getting results
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 and INS_UPD_FLAG = 'I' ;
Also, if I create table with ORC or Text Format, I am getting results without trim.
hive hiveql hadoop2
I am creating a table and inserting data like below
CREATE TABLE `mydb.mytable`(
`seq_num` decimal(18,0),
`ins_upd_flag` char(1)
)
stored as parquet;
INSERT OVERWRITE TABLE mydb.mytable
SELECT 10457 SEQ_NUM,'I' INS_UPD_FLAG UNION ALL
SELECT 10462,'I' UNION ALL
SELECT 10461,'I' UNION ALL
SELECT 10460,'I' UNION ALL
SELECT 10459,'I' UNION ALL
SELECT 10456,'I' UNION ALL
SELECT 10458,'I';
As you can see above, the second column value is just 'I' and it does not have any extra characters.
When I run the following query, it does not result anything
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 OR INS_UPD_FLAG <> 'I' ;
However, if I run the below query it returns all the inserted rows as expected
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG)) -- both length =1
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 OR TRIM(INS_UPD_FLAG) <> 'I' ;
Any explanation on why this is happening and how to resolve the same? I need the results without using trim
Please note that this issue is happening only with OR condition. If I run the below query, I am getting results
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0 and INS_UPD_FLAG = 'I' ;
Also, if I create table with ORC or Text Format, I am getting results without trim.
hive hiveql hadoop2
hive hiveql hadoop2
asked Nov 22 at 18:38
Mohan
3261210
3261210
add a comment |
add a comment |
3 Answers
3
active
oldest
votes
I managed to re-pro your issue and getting same results as you have explained.
I ran a query to see what exactly your where condition is producing:
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
and results are as following:
|seq_num|ins_upd_flag|_c1|_c2|_c3 |_c4 |_c5 |
|10457 |I |1 |1 |true|true|true|
|10462 |I |1 |1 |true|true|true|
|10461 |I |1 |1 |true|true|true|
|10460 |I |1 |1 |true|true|true|
|10459 |I |1 |1 |true|true|true|
|10456 |I |1 |1 |true|true|true|
|10458 |I |1 |1 |true|true|true|
As you can see it is returning true
for your join condition.
Out of curiosity, I wrapped the where condition around ()
and got the results without trim.
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
where (SEQ_NUM <> 0 or INS_UPD_FLAG = 'I')
Results: Same as previous query.
I am using Hive 1.2.1000 via Hue 2.6.1-227.
I don't know what magic ()
is doing and why query is not working without it.
Good finding. Unfortunately, the brackets '()' does not do anything in my version of hive (Hive 1.2.1000.2.6.5.0-292 on HDP). This looks like a bug with Hive 1.2
– Mohan
Nov 24 at 17:55
add a comment |
may i know reason why you dont want to use the trim function.As hadoop stores the data in files there is alot of chances that it can add the space while inserting.i faced this issue a lot of time while using parque format.it depends sometimes you face it some time you wont face it.
whatever it is can you run you query only based on this condition
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0
if it is not returning anything the way data inserted is wrong..please try it let me know
I do not want to use trim because I am having a similar char column as partitioned column and I am not sure if a condition with trim(partitioned_column) will be used for predicate push down
– Mohan
Nov 23 at 6:56
*I am not sure trim(partitioned_column) will be used for partition scan
– Mohan
Nov 23 at 7:02
add a comment |
The first thing is that I am not sure why you aren't seeing the expected results. I ran a very similar test, below, and received the expected results. What version of Hive are you on? I did the same tests on hive-1.1.0+cdh5.13.3+1307 (Cloudera distro w/v5.13):
CREATE TABLE test_cond_parq
(
seq_num decimal(10,0),
ind string
)
STORED AS PARQUET;
INSERT INTO TABLE test_cond_parq
SELECT 10457, 'I' UNION ALL
SELECT 10458, 'I'
;
Then I ran the following query, similar to yours above...
SELECT *, LENGTH(ind) len_ind, LENGTH(TRIM(ind)) len_trim_ind
FROM test_cond_parq
WHERE seq_num <> 0 OR ind <> 'I';
This returned the following results:
10457 | I | 1 | 1
10458 | I | 1 | 1
I am using Hive 1.2.1000.2.6.5.0-292
– Mohan
Nov 23 at 5:39
Thank you for trying it out. I am surprised why it is not working in the version I am using.
– Mohan
Nov 23 at 5:39
NP. What SQL editor are you running your test in? I did it in HUE so it could perhaps be related to that?
– ajdams
Nov 23 at 13:10
I ran my queries directly in Hive command line interface
– Mohan
Nov 23 at 14:01
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53436581%2fhive-parquet-format-or-clause-in-where-not-working-as-expected%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
I managed to re-pro your issue and getting same results as you have explained.
I ran a query to see what exactly your where condition is producing:
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
and results are as following:
|seq_num|ins_upd_flag|_c1|_c2|_c3 |_c4 |_c5 |
|10457 |I |1 |1 |true|true|true|
|10462 |I |1 |1 |true|true|true|
|10461 |I |1 |1 |true|true|true|
|10460 |I |1 |1 |true|true|true|
|10459 |I |1 |1 |true|true|true|
|10456 |I |1 |1 |true|true|true|
|10458 |I |1 |1 |true|true|true|
As you can see it is returning true
for your join condition.
Out of curiosity, I wrapped the where condition around ()
and got the results without trim.
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
where (SEQ_NUM <> 0 or INS_UPD_FLAG = 'I')
Results: Same as previous query.
I am using Hive 1.2.1000 via Hue 2.6.1-227.
I don't know what magic ()
is doing and why query is not working without it.
Good finding. Unfortunately, the brackets '()' does not do anything in my version of hive (Hive 1.2.1000.2.6.5.0-292 on HDP). This looks like a bug with Hive 1.2
– Mohan
Nov 24 at 17:55
add a comment |
I managed to re-pro your issue and getting same results as you have explained.
I ran a query to see what exactly your where condition is producing:
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
and results are as following:
|seq_num|ins_upd_flag|_c1|_c2|_c3 |_c4 |_c5 |
|10457 |I |1 |1 |true|true|true|
|10462 |I |1 |1 |true|true|true|
|10461 |I |1 |1 |true|true|true|
|10460 |I |1 |1 |true|true|true|
|10459 |I |1 |1 |true|true|true|
|10456 |I |1 |1 |true|true|true|
|10458 |I |1 |1 |true|true|true|
As you can see it is returning true
for your join condition.
Out of curiosity, I wrapped the where condition around ()
and got the results without trim.
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
where (SEQ_NUM <> 0 or INS_UPD_FLAG = 'I')
Results: Same as previous query.
I am using Hive 1.2.1000 via Hue 2.6.1-227.
I don't know what magic ()
is doing and why query is not working without it.
Good finding. Unfortunately, the brackets '()' does not do anything in my version of hive (Hive 1.2.1000.2.6.5.0-292 on HDP). This looks like a bug with Hive 1.2
– Mohan
Nov 24 at 17:55
add a comment |
I managed to re-pro your issue and getting same results as you have explained.
I ran a query to see what exactly your where condition is producing:
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
and results are as following:
|seq_num|ins_upd_flag|_c1|_c2|_c3 |_c4 |_c5 |
|10457 |I |1 |1 |true|true|true|
|10462 |I |1 |1 |true|true|true|
|10461 |I |1 |1 |true|true|true|
|10460 |I |1 |1 |true|true|true|
|10459 |I |1 |1 |true|true|true|
|10456 |I |1 |1 |true|true|true|
|10458 |I |1 |1 |true|true|true|
As you can see it is returning true
for your join condition.
Out of curiosity, I wrapped the where condition around ()
and got the results without trim.
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
where (SEQ_NUM <> 0 or INS_UPD_FLAG = 'I')
Results: Same as previous query.
I am using Hive 1.2.1000 via Hue 2.6.1-227.
I don't know what magic ()
is doing and why query is not working without it.
I managed to re-pro your issue and getting same results as you have explained.
I ran a query to see what exactly your where condition is producing:
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
and results are as following:
|seq_num|ins_upd_flag|_c1|_c2|_c3 |_c4 |_c5 |
|10457 |I |1 |1 |true|true|true|
|10462 |I |1 |1 |true|true|true|
|10461 |I |1 |1 |true|true|true|
|10460 |I |1 |1 |true|true|true|
|10459 |I |1 |1 |true|true|true|
|10456 |I |1 |1 |true|true|true|
|10458 |I |1 |1 |true|true|true|
As you can see it is returning true
for your join condition.
Out of curiosity, I wrapped the where condition around ()
and got the results without trim.
SELECT
T.*,
length(INS_UPD_FLAG),
length(trim(INS_UPD_FLAG)),
SEQ_NUM <> 0,
INS_UPD_FLAG = 'I',
SEQ_NUM <> 0 or INS_UPD_FLAG = 'I'
FROM mydb.mytable T
where (SEQ_NUM <> 0 or INS_UPD_FLAG = 'I')
Results: Same as previous query.
I am using Hive 1.2.1000 via Hue 2.6.1-227.
I don't know what magic ()
is doing and why query is not working without it.
answered Nov 24 at 5:35
Ambrish
2,71421631
2,71421631
Good finding. Unfortunately, the brackets '()' does not do anything in my version of hive (Hive 1.2.1000.2.6.5.0-292 on HDP). This looks like a bug with Hive 1.2
– Mohan
Nov 24 at 17:55
add a comment |
Good finding. Unfortunately, the brackets '()' does not do anything in my version of hive (Hive 1.2.1000.2.6.5.0-292 on HDP). This looks like a bug with Hive 1.2
– Mohan
Nov 24 at 17:55
Good finding. Unfortunately, the brackets '()' does not do anything in my version of hive (Hive 1.2.1000.2.6.5.0-292 on HDP). This looks like a bug with Hive 1.2
– Mohan
Nov 24 at 17:55
Good finding. Unfortunately, the brackets '()' does not do anything in my version of hive (Hive 1.2.1000.2.6.5.0-292 on HDP). This looks like a bug with Hive 1.2
– Mohan
Nov 24 at 17:55
add a comment |
may i know reason why you dont want to use the trim function.As hadoop stores the data in files there is alot of chances that it can add the space while inserting.i faced this issue a lot of time while using parque format.it depends sometimes you face it some time you wont face it.
whatever it is can you run you query only based on this condition
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0
if it is not returning anything the way data inserted is wrong..please try it let me know
I do not want to use trim because I am having a similar char column as partitioned column and I am not sure if a condition with trim(partitioned_column) will be used for predicate push down
– Mohan
Nov 23 at 6:56
*I am not sure trim(partitioned_column) will be used for partition scan
– Mohan
Nov 23 at 7:02
add a comment |
may i know reason why you dont want to use the trim function.As hadoop stores the data in files there is alot of chances that it can add the space while inserting.i faced this issue a lot of time while using parque format.it depends sometimes you face it some time you wont face it.
whatever it is can you run you query only based on this condition
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0
if it is not returning anything the way data inserted is wrong..please try it let me know
I do not want to use trim because I am having a similar char column as partitioned column and I am not sure if a condition with trim(partitioned_column) will be used for predicate push down
– Mohan
Nov 23 at 6:56
*I am not sure trim(partitioned_column) will be used for partition scan
– Mohan
Nov 23 at 7:02
add a comment |
may i know reason why you dont want to use the trim function.As hadoop stores the data in files there is alot of chances that it can add the space while inserting.i faced this issue a lot of time while using parque format.it depends sometimes you face it some time you wont face it.
whatever it is can you run you query only based on this condition
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0
if it is not returning anything the way data inserted is wrong..please try it let me know
may i know reason why you dont want to use the trim function.As hadoop stores the data in files there is alot of chances that it can add the space while inserting.i faced this issue a lot of time while using parque format.it depends sometimes you face it some time you wont face it.
whatever it is can you run you query only based on this condition
SELECT T.*, length(INS_UPD_FLAG), length(trim(INS_UPD_FLAG))
FROM mydb.mytable T
WHERE SEQ_NUM <> 0
if it is not returning anything the way data inserted is wrong..please try it let me know
answered Nov 23 at 3:27
sivaraj
143
143
I do not want to use trim because I am having a similar char column as partitioned column and I am not sure if a condition with trim(partitioned_column) will be used for predicate push down
– Mohan
Nov 23 at 6:56
*I am not sure trim(partitioned_column) will be used for partition scan
– Mohan
Nov 23 at 7:02
add a comment |
I do not want to use trim because I am having a similar char column as partitioned column and I am not sure if a condition with trim(partitioned_column) will be used for predicate push down
– Mohan
Nov 23 at 6:56
*I am not sure trim(partitioned_column) will be used for partition scan
– Mohan
Nov 23 at 7:02
I do not want to use trim because I am having a similar char column as partitioned column and I am not sure if a condition with trim(partitioned_column) will be used for predicate push down
– Mohan
Nov 23 at 6:56
I do not want to use trim because I am having a similar char column as partitioned column and I am not sure if a condition with trim(partitioned_column) will be used for predicate push down
– Mohan
Nov 23 at 6:56
*I am not sure trim(partitioned_column) will be used for partition scan
– Mohan
Nov 23 at 7:02
*I am not sure trim(partitioned_column) will be used for partition scan
– Mohan
Nov 23 at 7:02
add a comment |
The first thing is that I am not sure why you aren't seeing the expected results. I ran a very similar test, below, and received the expected results. What version of Hive are you on? I did the same tests on hive-1.1.0+cdh5.13.3+1307 (Cloudera distro w/v5.13):
CREATE TABLE test_cond_parq
(
seq_num decimal(10,0),
ind string
)
STORED AS PARQUET;
INSERT INTO TABLE test_cond_parq
SELECT 10457, 'I' UNION ALL
SELECT 10458, 'I'
;
Then I ran the following query, similar to yours above...
SELECT *, LENGTH(ind) len_ind, LENGTH(TRIM(ind)) len_trim_ind
FROM test_cond_parq
WHERE seq_num <> 0 OR ind <> 'I';
This returned the following results:
10457 | I | 1 | 1
10458 | I | 1 | 1
I am using Hive 1.2.1000.2.6.5.0-292
– Mohan
Nov 23 at 5:39
Thank you for trying it out. I am surprised why it is not working in the version I am using.
– Mohan
Nov 23 at 5:39
NP. What SQL editor are you running your test in? I did it in HUE so it could perhaps be related to that?
– ajdams
Nov 23 at 13:10
I ran my queries directly in Hive command line interface
– Mohan
Nov 23 at 14:01
add a comment |
The first thing is that I am not sure why you aren't seeing the expected results. I ran a very similar test, below, and received the expected results. What version of Hive are you on? I did the same tests on hive-1.1.0+cdh5.13.3+1307 (Cloudera distro w/v5.13):
CREATE TABLE test_cond_parq
(
seq_num decimal(10,0),
ind string
)
STORED AS PARQUET;
INSERT INTO TABLE test_cond_parq
SELECT 10457, 'I' UNION ALL
SELECT 10458, 'I'
;
Then I ran the following query, similar to yours above...
SELECT *, LENGTH(ind) len_ind, LENGTH(TRIM(ind)) len_trim_ind
FROM test_cond_parq
WHERE seq_num <> 0 OR ind <> 'I';
This returned the following results:
10457 | I | 1 | 1
10458 | I | 1 | 1
I am using Hive 1.2.1000.2.6.5.0-292
– Mohan
Nov 23 at 5:39
Thank you for trying it out. I am surprised why it is not working in the version I am using.
– Mohan
Nov 23 at 5:39
NP. What SQL editor are you running your test in? I did it in HUE so it could perhaps be related to that?
– ajdams
Nov 23 at 13:10
I ran my queries directly in Hive command line interface
– Mohan
Nov 23 at 14:01
add a comment |
The first thing is that I am not sure why you aren't seeing the expected results. I ran a very similar test, below, and received the expected results. What version of Hive are you on? I did the same tests on hive-1.1.0+cdh5.13.3+1307 (Cloudera distro w/v5.13):
CREATE TABLE test_cond_parq
(
seq_num decimal(10,0),
ind string
)
STORED AS PARQUET;
INSERT INTO TABLE test_cond_parq
SELECT 10457, 'I' UNION ALL
SELECT 10458, 'I'
;
Then I ran the following query, similar to yours above...
SELECT *, LENGTH(ind) len_ind, LENGTH(TRIM(ind)) len_trim_ind
FROM test_cond_parq
WHERE seq_num <> 0 OR ind <> 'I';
This returned the following results:
10457 | I | 1 | 1
10458 | I | 1 | 1
The first thing is that I am not sure why you aren't seeing the expected results. I ran a very similar test, below, and received the expected results. What version of Hive are you on? I did the same tests on hive-1.1.0+cdh5.13.3+1307 (Cloudera distro w/v5.13):
CREATE TABLE test_cond_parq
(
seq_num decimal(10,0),
ind string
)
STORED AS PARQUET;
INSERT INTO TABLE test_cond_parq
SELECT 10457, 'I' UNION ALL
SELECT 10458, 'I'
;
Then I ran the following query, similar to yours above...
SELECT *, LENGTH(ind) len_ind, LENGTH(TRIM(ind)) len_trim_ind
FROM test_cond_parq
WHERE seq_num <> 0 OR ind <> 'I';
This returned the following results:
10457 | I | 1 | 1
10458 | I | 1 | 1
edited Nov 24 at 5:18
Ambrish
2,71421631
2,71421631
answered Nov 22 at 23:01
ajdams
2,0121117
2,0121117
I am using Hive 1.2.1000.2.6.5.0-292
– Mohan
Nov 23 at 5:39
Thank you for trying it out. I am surprised why it is not working in the version I am using.
– Mohan
Nov 23 at 5:39
NP. What SQL editor are you running your test in? I did it in HUE so it could perhaps be related to that?
– ajdams
Nov 23 at 13:10
I ran my queries directly in Hive command line interface
– Mohan
Nov 23 at 14:01
add a comment |
I am using Hive 1.2.1000.2.6.5.0-292
– Mohan
Nov 23 at 5:39
Thank you for trying it out. I am surprised why it is not working in the version I am using.
– Mohan
Nov 23 at 5:39
NP. What SQL editor are you running your test in? I did it in HUE so it could perhaps be related to that?
– ajdams
Nov 23 at 13:10
I ran my queries directly in Hive command line interface
– Mohan
Nov 23 at 14:01
I am using Hive 1.2.1000.2.6.5.0-292
– Mohan
Nov 23 at 5:39
I am using Hive 1.2.1000.2.6.5.0-292
– Mohan
Nov 23 at 5:39
Thank you for trying it out. I am surprised why it is not working in the version I am using.
– Mohan
Nov 23 at 5:39
Thank you for trying it out. I am surprised why it is not working in the version I am using.
– Mohan
Nov 23 at 5:39
NP. What SQL editor are you running your test in? I did it in HUE so it could perhaps be related to that?
– ajdams
Nov 23 at 13:10
NP. What SQL editor are you running your test in? I did it in HUE so it could perhaps be related to that?
– ajdams
Nov 23 at 13:10
I ran my queries directly in Hive command line interface
– Mohan
Nov 23 at 14:01
I ran my queries directly in Hive command line interface
– Mohan
Nov 23 at 14:01
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53436581%2fhive-parquet-format-or-clause-in-where-not-working-as-expected%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown