Python: Convert XML to CSV file























I have an XML file like this:



<hierachy>
  <att>
    <Order>1</Order>
    <attval>Data</attval>
    <children>
      <att>
        <Order>1</Order>
        <attval>Studyval</attval>
      </att>
      <att>
        <Order>2</Order>
        <attval>Site</attval>
      </att>
    </children>
  </att>
  <att>
    <Order>2</Order>
    <attval>Info</attval>
    <children>
      <att>
        <Order>1</Order>
        <attval>age</attval>
      </att>
      <att>
        <Order>2</Order>
        <attval>gender</attval>
      </att>
    </children>
  </att>
</hierachy>


I'm trying to convert it to a CSV file like this:



Data,Studyval
Data,Site
Info,age
Info,gender


My problem is that the parent and child elements use the same tag names, 'att' and 'attval'. How do I tell Python to distinguish between the two and produce this output?



I tried this:



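import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')
rebase = tree.getroot()

list = []  # unused in this attempt (and shadows the built-in list)

for att in rebase.findall('att'):
    name = att.find('attval').text
    # att.findall('attval') matches only this element's own <attval>,
    # so val below is always the same text as name.
    for each_att in att.findall('attval'):
        try:
            val = att.find('attval').text
            print(name, val)
        except AttributeError:
            print(name)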


but it printed the same value twice on each line.

















Tags: python xml csv xpath elementtree














asked Aug 5 '15 at 23:59 by pam, edited Aug 6 '15 at 0:24
























1 Answer

Accepted answer:










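The inner loop in your code calls att.findall('attval'), which matches only the attval that is a direct child of that same att, so it re-reads the parent's value instead of reaching the nested elements. You don't need findall here: iterate the tree from top to bottom and grab the relevant element at each level.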



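from xml.etree import ElementTree

tree = ElementTree.parse('input.xml')
root = tree.getroot()

# Each direct child of the root is a top-level <att> (the parent).
for att in root:
    first = att.find('attval').text
    # The nested <att> elements sit inside this parent's <children>.
    for subatt in att.find('children'):
        second = subatt.find('attval').text
        print('{},{}'.format(first, second))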


Which gives:

$ python process.py
Data,Studyval
Data,Site
Info,age
Info,gender
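
To get an actual CSV file rather than terminal output, the same two-level loop can feed csv.writer. What follows is only a sketch under assumptions: the names xml_to_rows and write_csv, the inline SAMPLE string, and the choice to skip a parent that has no children element are all illustrative and do not come from the thread.

```python
import csv
import io
from xml.etree import ElementTree

# Inline copy of the question's XML so the sketch runs on its own.
SAMPLE = """<hierachy>
  <att><Order>1</Order><attval>Data</attval>
    <children>
      <att><Order>1</Order><attval>Studyval</attval></att>
      <att><Order>2</Order><attval>Site</attval></att>
    </children>
  </att>
  <att><Order>2</Order><attval>Info</attval>
    <children>
      <att><Order>1</Order><attval>age</attval></att>
      <att><Order>2</Order><attval>gender</attval></att>
    </children>
  </att>
</hierachy>"""


def xml_to_rows(root):
    """Yield one (parent attval, child attval) pair per nested <att>."""
    for att in root:
        first = att.find('attval').text
        children = att.find('children')
        if children is None:
            # Assumption: a parent without <children> contributes no rows.
            continue
        for subatt in children:
            yield first, subatt.find('attval').text


def write_csv(rows, fileobj):
    """Write the pairs to any text file object using the csv module."""
    writer = csv.writer(fileobj, lineterminator='\n')
    writer.writerows(rows)


root = ElementTree.fromstring(SAMPLE)
buffer = io.StringIO()
write_csv(xml_to_rows(root), buffer)
print(buffer.getvalue(), end='')
```

For a file on disk, the same write_csv call should work with a handle opened as open('output.csv', 'w', newline='') and a root obtained from ElementTree.parse('input.xml').getroot(); the output file name is a placeholder. Going through csv.writer instead of string formatting also takes care of quoting if a value ever contains a comma.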













answered Aug 6 '15 at 0:24 by Havok, edited Aug 6 '15 at 0:30












• That is perfect! Thanks a ton! – pam, Aug 6 '15 at 0:40
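
For anyone who wants to check how ElementTree matches same-named tags at different depths, here is a small sketch comparing a few findall paths on the question's XML. The inline SAMPLE string and the variable names are illustrative assumptions; the path expressions themselves are standard ElementTree syntax.

```python
from xml.etree import ElementTree

# Inline copy of the question's XML so the sketch runs on its own.
SAMPLE = """<hierachy>
  <att><Order>1</Order><attval>Data</attval>
    <children>
      <att><Order>1</Order><attval>Studyval</attval></att>
      <att><Order>2</Order><attval>Site</attval></att>
    </children>
  </att>
  <att><Order>2</Order><attval>Info</attval>
    <children>
      <att><Order>1</Order><attval>age</attval></att>
      <att><Order>2</Order><attval>gender</attval></att>
    </children>
  </att>
</hierachy>"""

root = ElementTree.fromstring(SAMPLE)

# A plain tag name matches direct children only: the two top-level <att>.
top_level = [a.find('attval').text for a in root.findall('att')]

# './/att' matches <att> at any depth below the root, in document order.
all_levels = [a.find('attval').text for a in root.findall('.//att')]

# An explicit path selects only the <att> elements nested under <children>.
nested = [a.find('attval').text for a in root.findall('att/children/att')]

# On a parent <att>, findall('attval') returns just its own <attval>.
own = [v.text for v in root.find('att').findall('attval')]

print(top_level)   # ['Data', 'Info']
print(all_levels)  # ['Data', 'Studyval', 'Site', 'Info', 'age', 'gender']
print(nested)      # ['Studyval', 'Site', 'age', 'gender']
print(own)         # ['Data']
```

The last line is the reason the attempt in the question printed the parent value twice: the inner findall never descends into children.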

















