Optimization Error for Logistic Regression using Scipy.opt












I've been trying to implement Andrew Ng's logistic regression problem in Python, using scipy.optimize to minimize the cost function. However, I get a ValueError saying that the dimensions are mismatched. I've tried calling flatten() on my theta array, since scipy.optimize doesn't seem to work well with a single-column or single-row vector, but the problem persists. I've also tried reshaping the array, but that has no effect and the same error appears.



Kindly point me in the right direction as to what is causing the problem and how to avoid it.



Thanks a million!



import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt

dataset = pd.read_csv("Students Exam Dataset.txt", names=["Exam 1", "Exam 2", "Admitted"])
print(dataset.head())

positive = dataset[dataset["Admitted"] == 1]
negative = dataset[dataset["Admitted"] == 0]

#Visualizing Dataset
plt.scatter(positive["Exam 1"], positive["Exam 2"], color="blue", marker="o", label="Admitted")
plt.scatter(negative["Exam 1"], negative["Exam 2"], color="red", marker="x", label="Not Admitted")
plt.xlabel("Exam 1 Score")
plt.ylabel("Exam 2 Score")
plt.title("Admission Graph")
plt.legend()
#plt.show()

#Preprocessing Data
dataset.insert(0, "x0", 1)
col = len(dataset.columns)
x = dataset.iloc[:,0:col-1].values
y = dataset.iloc[:,col-1:col].values
b = np.zeros([1,col-1])
m = len(y)
print(f"X Shape: {x.shape} Y Shape: {y.shape} B Shape: {b.shape}")

#Defining Functions
def hypothesis(x, y, b):
    h = 1 / (1+np.exp(-x @ b.T))
    return h

def cost(x, y, b):
    first = (y.T @ np.log(hypothesis(x, y, b)))
    second = (1-y).T @ np.log(1 - hypothesis(x, y, b))
    j = (-1/m) * np.sum(first+second)
    return j

def gradient(x, y, b):
    grad_step = ((hypothesis(x, y, b) - y) @ x.T) / m
    return b

#Output
initial_cost = cost(x, y, b)
print(f"\nInitial Cost = {initial_cost}")
final_cost = opt.fmin_tnc(func=cost, x0=b.flatten(), fprime=gradient, args=(x,y))
print(f"Final Cost = {final_cost} \nTheta = {b}")


Dataset : Student Dataset.txt
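
For reference, here is a minimal sketch of a layout that avoids the shape mismatch. `scipy.optimize.fmin_tnc` calls `func(x, *args)` and `fprime(x, *args)`, so the parameter vector being optimized has to be the first argument of both functions and arrives as a 1-D array; in the code above it lands in the `x` slot instead, which is the likely source of the ValueError. The gradient also needs to return the computed gradient (a 1-D array with one entry per parameter) rather than `b`. The synthetic data, the names `sigmoid`, `theta0` and `true_theta`, and the small `eps` guard are assumptions made for illustration, since the original dataset file is not included here.

```python
import numpy as np
import scipy.optimize as opt


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cost(theta, X, y):
    # theta is passed first by fmin_tnc, as a 1-D array of shape (n,)
    h = sigmoid(X @ theta)
    eps = 1e-12  # assumption: small guard against log(0), not in the original code
    return -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))


def gradient(theta, X, y):
    # Returns a 1-D array with the same shape as theta
    h = sigmoid(X @ theta)
    return X.T @ (h - y) / len(y)


# Hypothetical stand-in for the exam dataset: an intercept column plus two features
rng = np.random.default_rng(0)
m = 100
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, 2))])
true_theta = np.array([0.5, 2.0, -1.0])  # illustrative values only
y = (rng.random(m) < sigmoid(X @ true_theta)).astype(float)  # 1-D labels, shape (m,)

theta0 = np.zeros(X.shape[1])
initial_cost = cost(theta0, X, y)

# fmin_tnc returns (solution, number of function evaluations, return code)
theta_opt, n_evals, return_code = opt.fmin_tnc(
    func=cost, x0=theta0, fprime=gradient, args=(X, y)
)
final_cost = cost(theta_opt, X, y)

print(f"Initial cost = {initial_cost}")
print(f"Final cost = {final_cost}")
print(f"Theta = {theta_opt}")
```

With the real data, the same shapes can be obtained by taking `y` as a 1-D array, for example `dataset.iloc[:, col-1].values` or `y.ravel()`, and starting from `np.zeros(col-1)`. Note also that the return value of `fmin_tnc` is a tuple whose first element is the optimized theta, not the final cost.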










migrated from datascience.stackexchange.com Nov 20 '18 at 22:31


This question came from our site for Data science professionals, Machine Learning specialists, and those interested in learning more about the field.























      machine-learning python logistic-regression scipy






asked Nov 20 '18 at 2:42 by Antony John



