Optimization Error for Logistic Regression using Scipy.opt












I've been trying to implement the logistic regression exercise from Andrew Ng's course in Python, using scipy.optimize to minimize the cost function. However, I get a ValueError saying that the dimensions are mismatched. I've tried calling flatten() on my theta array, since scipy.optimize doesn't seem to work well with a single-column or single-row vector, but the problem persists. I've also tried reshaping the array, but that makes no difference and the same error appears.



Kindly point me in the right direction as to what is causing the problem and how to avoid it.



Thanks a million!



import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt

dataset = pd.read_csv("Students Exam Dataset.txt", names=["Exam 1", "Exam 2", "Admitted"])
print(dataset.head())

positive = dataset[dataset["Admitted"] == 1]
negative = dataset[dataset["Admitted"] == 0]

#Visualizing Dataset
plt.scatter(positive["Exam 1"], positive["Exam 2"], color="blue", marker="o", label="Admitted")
plt.scatter(negative["Exam 1"], negative["Exam 2"], color="red", marker="x", label="Not Admitted")
plt.xlabel("Exam 1 Score")
plt.ylabel("Exam 2 Score")
plt.title("Admission Graph")
plt.legend()
#plt.show()

#Preprocessing Data
dataset.insert(0, "x0", 1)
col = len(dataset.columns)
x = dataset.iloc[:,0:col-1].values
y = dataset.iloc[:,col-1:col].values
b = np.zeros([1,col-1])
m = len(y)
print(f"X Shape: {x.shape} Y Shape: {y.shape} B Shape: {b.shape}")

#Defining Functions
def hypothesis(x, y, b):
    h = 1 / (1+np.exp(-x @ b.T))
    return h

def cost(x, y, b):
    first = (y.T @ np.log(hypothesis(x, y, b)))
    second = (1-y).T @ np.log(1 - hypothesis(x, y, b))
    j = (-1/m) * np.sum(first+second)
    return j

def gradient(x, y, b):
    grad_step = ((hypothesis(x, y, b) - y) @ x.T) / m
    return b

#Output
initial_cost = cost(x, y, b)
print(f"\nInitial Cost = {initial_cost}")
final_cost = opt.fmin_tnc(func=cost, x0=b.flatten() , fprime=gradient, args=(x,y))
print(f"Final Cost = {final_cost} \nTheta = {b}")


Dataset : Student Dataset.txt
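
One likely cause of the ValueError, as far as I can tell from the code above: fmin_tnc calls func(x0, *args), so the parameter vector always arrives as the first argument. With the signature cost(x, y, b) and args=(x, y), the flattened theta is bound to x, the data matrix to y, and the labels to b, which would explain the shape mismatch in x @ b.T. Two further things look suspicious: gradient returns b rather than the computed gradient, and a 1-D theta combined with a column-vector y broadcasts h - y to an (m, m) array. Below is a minimal sketch of one way to arrange it, with the parameter vector first and y flattened. The synthetic data, the sigmoid helper, the eps guard and the names theta0, theta_opt and true_theta are my own assumptions for illustration, not part of the original script or dataset.

```python
import numpy as np
import scipy.optimize as opt


def sigmoid(z):
    # Logistic function applied element-wise.
    return 1.0 / (1.0 + np.exp(-z))


def cost(theta, X, y):
    # theta: (n,) parameter vector, passed FIRST by fmin_tnc.
    # X: (m, n) design matrix, y: (m,) labels in {0, 1}.
    h = sigmoid(X @ theta)
    eps = 1e-12  # assumption: small guard against log(0)
    return -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))


def gradient(theta, X, y):
    # Returns the gradient itself, shape (n,), matching theta.
    h = sigmoid(X @ theta)
    return X.T @ (h - y) / len(y)


# Synthetic stand-in for the exam data (assumption: the real file is not used here).
rng = np.random.default_rng(0)
m = 200
features = rng.normal(size=(m, 2))
X = np.hstack([np.ones((m, 1)), features])  # intercept column, like "x0"
true_theta = np.array([-0.5, 2.0, -1.0])    # hypothetical generating parameters
y = (rng.random(m) < sigmoid(X @ true_theta)).astype(float)  # 1-D labels

theta0 = np.zeros(X.shape[1])  # 1-D start vector, so no flatten() is needed
initial_cost = cost(theta0, X, y)

# fmin_tnc returns a tuple: (solution, number of function evaluations, return code).
theta_opt, n_evals, return_code = opt.fmin_tnc(
    func=cost, x0=theta0, fprime=gradient, args=(X, y)
)
final_cost = cost(theta_opt, X, y)

print(f"Initial cost = {initial_cost}")
print(f"Final cost = {final_cost}")
print(f"Theta = {theta_opt}")
```

Note that fmin_tnc returns the optimized parameters in a tuple rather than a cost value, and it does not modify the starting array in place, so printing b after the call would still show zeros. With the real data, y would need to be flattened first, for example with dataset["Admitted"].values, so that it has shape (m,).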










migrated from datascience.stackexchange.com Nov 20 '18 at 22:31


This question came from our site for Data science professionals, Machine Learning specialists, and those interested in learning more about the field.























      machine-learning python logistic-regression scipy






      asked Nov 20 '18 at 2:42









Antony John
