Storing multi-byte data in a BLOB for single-byte Oracle deployments











Depending on client requirements, we configure our Oracle (version 12c) deployments to support either single-byte or multi-byte data, through the database character set setting. We need to cache third-party multi-byte data (JSON) for performance reasons. We found that we can encode the data as UTF-8 and persist the resulting bytes in a BLOB column of an Oracle table. This is a hack that lets us store multi-byte data in single-byte deployments. The approach comes with certain limitations:




  1. The data cannot be queried or updated through SQL code (stored procedures).

  2. Search operations, e.g. with the LIKE operator, cannot be performed.

  3. Every operation incurs marshaling and unmarshaling overhead at the application layer (Java).


Assuming we can live with these limitations, are there any other drawbacks that we should be aware of?
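
For concreteness, the round trip described above might look like the following minimal sketch. It assumes only plain JDK classes; the class name Utf8BlobCodec and its method names are hypothetical and used for illustration. Binding the resulting byte array to the BLOB column (for example with JDBC's PreparedStatement.setBytes and ResultSet.getBytes) is not shown.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: converts JSON text to and from the UTF-8 bytes stored in the BLOB.
final class Utf8BlobCodec {
    private Utf8BlobCodec() {}

    // Marshal: JSON text to UTF-8 bytes. The result is independent of the
    // database character set because the database only ever sees raw bytes.
    static byte[] encode(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Unmarshal: UTF-8 bytes read back from the BLOB to JSON text.
    static String decode(byte[] blobBytes) {
        return new String(blobBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The n-with-tilde is written as a Java escape to keep this source file ASCII-only.
        String json = "{\"UnicodeCharsTest\":\"ni\u00f1o\"}";
        byte[] stored = encode(json);
        // The one non-ASCII character takes two bytes in UTF-8,
        // so the byte count is one more than the character count.
        System.out.println(json.length() + " chars, " + stored.length + " bytes");
        System.out.println(decode(stored).equals(json)); // prints true
    }
}
```

Naming the charset explicitly on both sides matters here: relying on the JVM default charset would make the stored bytes depend on the host configuration.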



Thanks.




























  • 1) NVARCHAR2, NCLOB and NCHAR columns should be able to store multi-byte data even on single-byte installations. Can't you just declare all columns that are expected to contain multi-byte data as Nxxxx columns?
    – Carlo Sirna
    Nov 22 at 6:17






  • 2) If the requirement applies only to JSON columns: it is entirely possible to store JSON data as pure ASCII by escaping all characters whose code is greater than 127. For example, the JSON string '{"UnicodeCharsTest":"ni\u00f1o"}' represents the very same object as '{"UnicodeCharsTest" : "niño"}'. You could re-encode all JSON strings this way and store them. Oracle 12 also has both the JSON_VALUE function and the "field IS JSON" constraint, which let you correctly query values stored in JSON objects (you don't have to decode the escape sequences yourself).
    – Carlo Sirna
    Nov 22 at 6:25
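
The escaping described in the comment above can be sketched in Java roughly as follows. This is only an illustration: the class name JsonAsciiEscaper and the method escapeNonAscii are hypothetical, and the sketch assumes the input is already valid JSON. In valid JSON, non-ASCII characters can only occur inside string values, so replacing each of them with a \uXXXX escape leaves the represented object unchanged. Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), which JSON permits.

```java
// Hypothetical helper: rewrites a JSON document so that it contains only ASCII characters.
final class JsonAsciiEscaper {
    private JsonAsciiEscaper() {}

    // Replaces every UTF-16 code unit above 127 with a six-character JSON
    // escape (backslash, 'u', four hex digits). ASCII characters are copied as-is.
    static String escapeNonAscii(String json) {
        StringBuilder out = new StringBuilder(json.length());
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c > 127) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The n-with-tilde is written as a Java escape to keep this source file ASCII-only.
        String original = "{\"UnicodeCharsTest\":\"ni\u00f1o\"}";
        System.out.println(escapeNonAscii(original));
    }
}
```

The escaped text fits in an ordinary character column of a single-byte database, at the cost of six characters per escaped code unit.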










  • Nowadays the default is NLS_CHARACTERSET=AL32UTF8, i.e. UTF-8. Of course, UTF-8 also supports single-byte characters. Why would you still use a single-byte character set in 2018?
    – Wernfried Domscheit
    Nov 22 at 9:55






  • @AndyDufresne NCLOB/NCHAR/NVARCHAR2 have always been the official types to use for multi-byte character strings. It has always worked this way. What I am suggesting is not a hack.
    – Carlo Sirna
    Nov 22 at 17:47








  • @AndyDufresne: let me elaborate. Any installation of Oracle supports TWO character sets: the normal character set (which in old versions of Oracle defaulted to the single-byte character set matching the language used during installation; it is used for all normal VARCHAR2, CHAR and CLOB fields, and also for table names, column names, etc.) and the "national" character set, used for storing strings with unusual characters (Nxxx columns). Personally, I have never found an Oracle installation where the character set used for these other columns isn't a Unicode charset.
    – Carlo Sirna
    Nov 22 at 17:58















oracle
















asked Nov 22 at 3:35









Andy Dufresne








