Python column retains original updated 'NA'; never gets updated with float























When updating the dataframe column FractionOfVote, my first step was to add the new column with a default value
of 'NA'. I then parse the dataframe column Votes using split.



The following two functions work fine: 1) add_new_column_fraction(), 2) add_new_column_votes().



def add_new_column_fraction(df):
    df['FractionOfVote'] = 'NA'

def add_new_column_votes(df):
    df[['YesVotes','NumVotes']] = df['Votes'].str.split('/',expand=True)[[0,1]]


The problem code is in the function calc_fraction_ratio_for_votes():



def calc_fraction_ratio_for_votes(df):
    for idx, row in df.iterrows():
        numerator = row['YesVotes']
        denomerator = row['NumVotes']
        try:
            row['FractionOfVote'] = float(numerator) / float(denomerator)
        except ZeroDivisionError:
            row['FractionOfVote'] = 'NaN'


This function takes two other dataframe columns, YesVotes and NumVotes, and calculates a float value for the new
column, FractionOfVote, created earlier in add_new_column_fraction().



The logical error is that the column FractionOfVote retains the original 'NA' value and never receives the update from "row['FractionOfVote'] = float(numerator) / float(denomerator)": neither the calculated float value nor the 'NaN' from the "except ZeroDivisionError" branch.






































      python python-3.x pandas series divide-by-zero














      edited Nov 22 at 17:08









      jpp

      88k195099














      asked Nov 22 at 16:51









      user1857373

      316




























          2 Answers




















          accepted










          You should try and avoid Python-level loops. First ensure your series are numeric (if necessary):



          import numpy as np
          import pandas as pd

          df = pd.DataFrame({'Yes': [0, 3, 0, 10, 0],
                             'Num': [0, 5, 0, 30, 2]})

          num_cols = ['Yes', 'Num']
          df[num_cols] = df[num_cols].apply(pd.to_numeric, errors='coerce')


          Then use division and replace inf with NaN:



          print((df['Yes'] / df['Num']).replace(np.inf, np.nan))

          0         NaN
          1    0.600000
          2         NaN
          3    0.333333
          4    0.000000
          dtype: float64
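
Applied to the question's own columns, the same idea might look like the sketch below. Only the column names (Votes, YesVotes, NumVotes, FractionOfVote) come from the question; the helper name add_fraction_of_vote and the sample values are made up for illustration.

```python
import numpy as np
import pandas as pd


def add_fraction_of_vote(df):
    """Return a copy of df with YesVotes, NumVotes and FractionOfVote added."""
    out = df.copy()
    # Split 'yes/total' strings into two columns; the pieces are still strings.
    parts = out['Votes'].str.split('/', expand=True)
    # Coerce to numbers; anything unparseable becomes NaN instead of raising.
    out['YesVotes'] = pd.to_numeric(parts[0], errors='coerce')
    out['NumVotes'] = pd.to_numeric(parts[1], errors='coerce')
    # Vectorized division: x/0 gives inf and 0/0 gives NaN, so map inf to NaN.
    ratio = out['YesVotes'] / out['NumVotes']
    out['FractionOfVote'] = ratio.replace([np.inf, -np.inf], np.nan)
    return out


# Hypothetical sample data in the 'yes/total' format implied by the question.
sample = pd.DataFrame({'Votes': ['3/5', '0/0', '10/30', '0/2', '4/0']})
result = add_fraction_of_vote(sample)
print(result['FractionOfVote'])
```

Returning a new frame rather than modifying the argument is just one possible design; assigning the three columns directly on df would work as well.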


























          • Thanks, right on. Python-level loops on data frames appear to behave somewhat irregularly. Thanks for catching this, and for the recommendation to avoid Python loops on a data frame when a data-frame-level function is more appropriate.
            – user1857373
            Nov 22 at 17:26































          Why are you using iterrows() in the first place? You can achieve the same result with a vectorized implementation such as the one below:



          import numpy as np

          # Create the column and fill it with NaN by default
          df['FractionOfVote'] = np.nan

          # YesVotes/NumVotes hold strings after str.split, so convert before dividing
          yes = df['YesVotes'].astype(float)
          num = df['NumVotes'].astype(float)

          # Populate only the rows that have a positive denominator
          df.loc[num > 0, 'FractionOfVote'] = yes / num
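
As for why the loop in the question left the column unchanged: iterrows() yields each row as a separate Series, and the pandas documentation warns that this is often a copy, so writing to it may not reach the DataFrame. If a loop is really wanted, assigning through df.at with the row label is one small fix. A sketch with made-up data follows; the function name calc_fraction_with_loop is hypothetical.

```python
import numpy as np
import pandas as pd


def calc_fraction_with_loop(df):
    """Loop version that writes back to the DataFrame itself (modifies df)."""
    # Start from a float NaN column rather than the string 'NA'.
    df['FractionOfVote'] = np.nan
    for idx, row in df.iterrows():
        try:
            value = float(row['YesVotes']) / float(row['NumVotes'])
        except (ZeroDivisionError, ValueError, TypeError):
            # Zero denominator or unparseable/missing input: leave as NaN.
            value = np.nan
        # Assign via the frame, not via `row`, which is typically a copy.
        df.at[idx, 'FractionOfVote'] = value


# Hypothetical data; the values are strings, as str.split would produce.
votes = pd.DataFrame({'YesVotes': ['3', '0', '10'],
                      'NumVotes': ['5', '0', '30']})
calc_fraction_with_loop(votes)
print(votes)
```

The vectorized version above remains the better choice for anything but small frames; this only shows where the original assignment went astray.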
























          • Why was I using iterrows()? Too many years of Java iteration programming; it's still in my head :)
            – user1857373
            Nov 22 at 17:28











          answered Nov 22 at 17:06









          jpp

          88k195099




          edited Nov 22 at 17:11

























          answered Nov 22 at 17:09









          Julian Peller

          849511



