How to set up a SQL/Hive connection with a Cloudera cluster to read data stored on the cluster























I want to retrieve data stored on a Hadoop Cloudera cluster via Hive, Spark, or SQL. I have written a SQL query that should fetch the data from the cluster.
Before running it, I want to understand how to set up a connection/cursor to the cluster so that it knows where to read from and write to.



sc = spark.sparkContext, or similarly a HiveContext or SparkContext, will not suffice.



We might need to give the URL of a node, and so on. How do I do that?



Any small example would suffice.


































  • If you want to query the data through Hive, you have to define the schema first: create a Hive table, load the data into it, and then run SQL-like queries against it. You basically define the source and destination address while creating the table in Hive, which determines where to read from and write to.
    – VIN
    Nov 22 at 14:24










  • Exactly, I agree. I just need an example of how to "define the source and destination address while creating the table" in Hive so that it knows where to read from and write to.
    – Tilo
    Nov 23 at 5:11










  • Please see the example in the answer below, and let me know if you still need help.
    – VIN
    Nov 23 at 14:28















hive apache-spark-sql hadoop-streaming














asked Nov 22 at 12:44 by Tilo
edited Nov 22 at 18:16 by VIN












1 Answer
accepted










There are two ways to create a table in Hive:

1- Create an external table. Its LOCATION clause points to the directory on the cluster where the data files are stored:



CREATE EXTERNAL TABLE IF NOT EXISTS names_text(
student_ID INT, FirstName STRING, LastName STRING,
year STRING, Major STRING)
COMMENT 'Student Names'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/andrena';


2- a) Create the schema for a managed table:



CREATE TABLE IF NOT EXISTS Names(
student_ID INT, FirstName STRING, LastName STRING,
year STRING, Major STRING)
COMMENT 'Student Names'
STORED AS ORC;


b) Copy the external table data into the managed table:



INSERT OVERWRITE TABLE Names SELECT * FROM names_text;


Finally, verify that the student names can be read from both the external table and the managed (internal) table:

SELECT * FROM names_text;

SELECT * FROM Names;
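
As a minimal sketch of how the table definition tells Hive where to read from: the LOCATION clause can also be written as a fully qualified HDFS URI instead of a bare path. The host name namenode.example.com, the port 8020, and the table name names_text_fq below are assumptions for illustration only; substitute the NameNode address of your own cluster.

```sql
-- Hypothetical NameNode host and port; replace with your cluster's values.
-- The table name names_text_fq is also made up for this sketch.
CREATE EXTERNAL TABLE IF NOT EXISTS names_text_fq(
  student_ID INT, FirstName STRING, LastName STRING,
  year STRING, Major STRING)
COMMENT 'Student Names (fully qualified location)'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'hdfs://namenode.example.com:8020/user/andrena';

-- Print the table metadata, including the Location that Hive reads from.
DESCRIBE FORMATTED names_text_fq;
```

If you connect through HiveServer2 (for example with Beeline or a JDBC client), the connection URL typically has the form jdbc:hive2://<host>:<port>/<database>, where 10000 is the usual default port. The actual host and port depend on how your cluster is configured.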





answered Nov 23 at 14:27 by VIN





























