Compare XML files using Python

I want to compare these two XML files:

File1.xml:

    <ngs_sample id="40332">
      <workflow value="salmonella" version="101_provisional" />
      <results>
        <gastro_prelim_st reason="not novel" success="false">
          <type st="1364" />
          <type st="9999" />
        </gastro_prelim_st>
      </results>
    </ngs_sample>

File2.xml:

    <ngs_sample id="40332">
      <workflow value="salmonella" version="101_provisional" />
      <results>
        <gastro_prelim_st reason="not novel" success="false">
          <type st="1364" />
        </gastro_prelim_st>
      </results>
    </ngs_sample>


I've used xmldiff to compare a.xml with b.xml:

    from xmldiff import main, formatting

    def compare_xmls(observed, expected):
        # DiffFormatter returns the edit script as plain text
        formatter = formatting.DiffFormatter()
        diff = main.diff_files(observed, expected, formatter=formatter)
        return diff

    out = compare_xmls("a.xml", "b.xml")
    print(out)

Output:

    [delete, /ngs_sample/results/gastro_prelim_st/type[2]]

Does anyone know how to identify the actual difference between the two XML files, i.e. what has been deleted compared to b.xml? Can anyone recommend another way of comparing XML files in Python?











  • For comparing differences in general I use WinMerge, so if you don't need to do it in Python, it's a pretty handy tool. But if you must, it seems the output already tells you the difference exactly (that the second type tag under ngs_sample/...prelim_st/ was deleted). Did you mean you wanted to see the values being deleted?
    – Idlehands
    Nov 22 at 14:33

  • Yes, I want to see what has been deleted, i.e. what the difference between the two XML files is.
    – Mark
    Nov 22 at 15:45

  • What exactly are you expecting from the output that's missing, then? It's already telling you that the second type tag has been deleted. As it stands it's not clear; it would be helpful if you stated your expected output instead.
    – Idlehands
    Nov 22 at 15:52

  • It would be helpful for it to say that <type st="9999" /> is deleted.
    – Mark
    Nov 22 at 16:25
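
On the "any other way" part of the question, here is a rough standard-library sketch that prints the removed element itself rather than its path. It is not xmldiff's algorithm: it flattens each document to one line per element and runs a plain line diff with difflib. The helpers `element_lines` and `removed_elements` are made-up names for illustration, and the sketch ignores text content and namespaces.

```python
import difflib
import xml.etree.ElementTree as ET

FILE1 = """\
<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
    <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
      <type st="9999" />
    </gastro_prelim_st>
  </results>
</ngs_sample>
"""

FILE2 = """\
<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
    <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
    </gastro_prelim_st>
  </results>
</ngs_sample>
"""


def element_lines(xml_text):
    """Return one line per element: indented by depth, attributes sorted by name."""
    lines = []

    def walk(elem, depth):
        attrs = "".join(
            f' {name}="{value}"' for name, value in sorted(elem.attrib.items())
        )
        lines.append("  " * depth + f"<{elem.tag}{attrs}>")
        for child in elem:
            walk(child, depth + 1)

    walk(ET.fromstring(xml_text), 0)
    return lines


def removed_elements(old_xml, new_xml):
    """Return the element lines present in old_xml but missing from new_xml."""
    diff = difflib.ndiff(element_lines(old_xml), element_lines(new_xml))
    # ndiff prefixes lines that exist only in the first sequence with "- "
    return [line[2:].strip() for line in diff if line.startswith("- ")]


print(removed_elements(FILE1, FILE2))
```

Because this is a line diff and not a tree diff, reordered siblings show up as a removal plus an addition, so it is only a reasonable fit for files with a stable element order.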
















Tags: python, xml, xmldiff






asked Nov 22 at 13:58 by Mark












2 Answers

You can switch to the XMLFormatter and filter the results manually:

    ...
    # Change formatter:
    formatter = formatting.XMLFormatter(normalize=formatting.WS_BOTH)

    ...

    # After `out` has been retrieved:
    import re
    for line in out.splitlines():
        # Keep only the lines that carry a diff:* marker
        if re.search(r'\bdiff:\w+', line):
            print(line)

    # Result:
    # <type st="9999" diff:delete=""/>
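
Putting the pieces of this answer together with the function from the question, a minimal end-to-end sketch might look like the following. `changed_lines` is a hypothetical helper name, and `sample` is a hand-written stand-in for the formatter output (modelled on the result line above), so the filter can be tried without any files on disk.

```python
import re


def compare_xmls(observed, expected):
    """Diff two XML files and return the annotated XML as a string."""
    # Imported here so the filter below also works without xmldiff installed.
    from xmldiff import main, formatting

    formatter = formatting.XMLFormatter(normalize=formatting.WS_BOTH)
    return main.diff_files(observed, expected, formatter=formatter)


def changed_lines(annotated_xml):
    """Hypothetical helper: keep only the lines that carry a diff:* marker."""
    return [
        line.strip()
        for line in annotated_xml.splitlines()
        if re.search(r"\bdiff:\w+", line)
    ]


# Hand-written stand-in for the formatter output (an assumption, not real output).
sample = '<type st="1364"/>\n<type st="9999" diff:delete=""/>'
print(changed_lines(sample))

# With real files (requires xmldiff):
# print(changed_lines(compare_xmls("a.xml", "b.xml")))
```

The regex only matches attribute names such as `diff:delete`, so a line that merely declares the `diff` namespace prefix should not be picked up.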





Use xmldiff to perform this exact task.

main.py:

    from xmldiff import main

    # Without a formatter, diff_files returns a list of edit actions
    diff = main.diff_files("file1.xml", "file2.xml")
    print(diff)

Output:

    [DeleteNode(node='/ngs_sample/results/gastro_prelim_st/type[2]')]
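
The path reported in the `DeleteNode` action can be looked up in the first file to show the deleted element itself. Below is a minimal sketch using only `xml.etree.ElementTree`; `element_at` is a hypothetical helper, and it assumes simple paths without namespaces, like the one in the output above.

```python
import xml.etree.ElementTree as ET

FILE1 = """\
<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
    <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
      <type st="9999" />
    </gastro_prelim_st>
  </results>
</ngs_sample>
"""


def element_at(xml_text, path):
    """Return the element that an absolute path such as /root/a/b[2] points to."""
    root = ET.fromstring(xml_text)
    segments = path.strip("/").split("/")
    # ElementTree only accepts relative paths, so check the root segment
    # by hand and search for the remaining segments from the root element.
    if segments[0].split("[")[0] != root.tag:
        raise ValueError(f"path does not start at <{root.tag}>: {path}")
    if len(segments) == 1:
        return root
    return root.find("./" + "/".join(segments[1:]))


deleted = element_at(FILE1, "/ngs_sample/results/gastro_prelim_st/type[2]")
deleted_xml = ET.tostring(deleted, encoding="unicode").strip()
print(deleted_xml)
```

Since xmldiff is built on lxml, parsing the file with `lxml.etree` and passing the path to the tree's `xpath()` method should be an alternative that avoids the manual path handling.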





  • Not sure if you read the question, but this doesn't answer my query.
    – Mark
    Nov 22 at 14:10











XMLFormatter answer: answered Nov 22 at 18:01 by Idlehands

























diff_files answer: answered Nov 22 at 14:09 by Victor 'Chris' Cabral











