  [Machine Learning] Regression & Classification
    Dev/Machine Learning · 2022. 1. 10. 15:21

    ๋ชจ๋‘๋ฅผ ์œ„ํ•œ ๋”ฅ๋Ÿฌ๋‹ (๊น€์„ฑํ›ˆ)

    Colab_ML01-02
    Colab_ML03
    Colab_ML04
    Colab_ML05
    Colab_ML06


    Regression

    1. Linear Regression

    ํ•˜๋‚˜์˜ ๋…๋ฆฝ๋ณ€์ˆ˜ x์— ์˜ํ•ด ์ข…์†๋ณ€์ˆ˜ y๊ฐ€ ๊ฒฐ์ •๋˜๋Š” ์„ ํ˜• ์ƒ๊ด€๊ด€๊ณ„๋ฅผ ๊ฐ€์ง„๋‹ค.

    1. data
    2. Hypothesis: H(x) = Wx + b(W: weight, b: bias)
      cost๊ฐ€ ๊ฐ€์žฅ ์ž‘์€ ๊ฐ€์„ค์„ ์„ ํƒ
      cost function(loss function) = cost(W, b) = E( square(H(x) - y) )
    3. goal: Minimize cost
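
    As a quick check on the cost function, here is a minimal hand computation, using the same training data as the code below (where y = x + 1):

    # Manual cost computation: cost(W, b) = (1/m) * Σ (W*x_i + b - y_i)^2
    x = [1, 2, 3, 4]
    y = [2, 3, 4, 5]

    def cost(W, b):
        return sum((W * xi + b - yi) ** 2 for xi, yi in zip(x, y)) / len(x)

    print(cost(1, 1))  # 0.0 -> H(x) = x + 1 fits the data exactly: the smallest possible cost
    print(cost(1, 0))  # 1.0 -> every prediction is off by 1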

     

    Code

    # Gradient Descent Algorithm
    
    import numpy as np
    import tensorflow as tf
    
    # 1. Set up the data set
    x_train = [1, 2, 3, 4]
    y_train = [2, 3, 4, 5]
    
    # 2. Build the model
    model = tf.keras.Sequential()
    # units == output shape, input_dim == input shape
    model.add(tf.keras.layers.Dense(units=1, input_dim=1))
    
    # 3. Configure the training process
    # SGD == stochastic gradient descent
    sgd = tf.keras.optimizers.SGD(learning_rate=0.1)
    # mse == mean_squared_error
    model.compile(loss='mse', optimizer=sgd)
    # prints a summary of the model to the terminal
    model.summary()
    
    # 4. Train the model
    # fit() executes training.
    # One epoch cannot feed all the data at once: the number of splits it is fed in
    # is the number of iterations, and the data size per iteration is the batch size.
    model.fit(x_train, y_train, epochs=100)
    
    # 5. Use the model
    # predict() returns the predicted values
    y_predict = model.predict(np.array([5, 4, 3]))
    print(y_predict)
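
    Note: fit() defaults to batch_size=32, which is larger than this 4-sample data set, so each epoch above runs as a single iteration. To see the epoch/iteration/batch-size relationship, the batch size can be set explicitly, e.g.:

    # 4 samples with batch_size=2 -> 2 iterations per epoch
    model.fit(x_train, y_train, epochs=100, batch_size=2)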
    

     

    Result

     

    2. Multi-Variable Linear Regression

     Hypothesis and Cost

     cost(W, b) = (1/m) * Σ (H(x_i) - y_i)^2   (where {(x_i, y_i)} is the training set)

     Among the candidate hypotheses H(x), select the one whose cost is the minimum.
     (On the graph, W lies on the x-axis and cost(W, b) on the y-axis.)

     Gradient Descent Algorithm

     The slope of the cost curve at an arbitrary point W decides whether W should increase or decrease. The algorithm keeps moving W toward the spot with the smallest slope (the W at which the slope converges to 0).

    If the slope is positive, W moves to the left (the next W is smaller); if the slope is negative, W moves to the right (the next W is larger).

    (The value subtracted from W is LearningRate(α) * cost'(W), the slope of the cost: W := W - α * ∂cost(W, b)/∂W.)
    (Constant factors that do not materially affect the computation are simplified away in the derivation.)

    When Linear Regression is run with the Gradient Descent Algorithm, the cost function is a convex function, so gradient descent reaches the global minimum regardless of the starting W.
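
    The update rule above can be sketched in a few lines of numpy (a minimal illustration; the data, starting point, and learning rate are arbitrary, and the bias b is omitted for brevity):

    import numpy as np
    
    x = np.array([1., 2., 3., 4.])
    y = np.array([1., 2., 3., 4.])
    
    W = 5.0        # arbitrary starting point
    alpha = 0.05   # learning rate
    for step in range(100):
        grad = np.mean(2 * (W * x - y) * x)  # slope of cost(W) = (1/m) * Σ (W*x_i - y_i)^2
        W -= alpha * grad                    # W := W - α * cost'(W)
    print(W)  # converges to 1.0, the global minimum of the convex cost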

     

    Code

    ## Drawing the loss graph
    
    import numpy as np
    import tensorflow as tf
    import matplotlib.pyplot as plt
    
    # 1. Set up the data set
    x_train = [1, 2, 3, 4]
    y_train = [1, 2, 3, 4]
    
    # 2. Build the model
    model = tf.keras.Sequential()
    # units == output shape, input_dim == input shape
    model.add(tf.keras.layers.Dense(units=1, input_dim=1))
    
    # 3. Configure the training process
    # SGD == stochastic gradient descent
    sgd = tf.keras.optimizers.SGD(learning_rate=0.1)
    # mse == mean_squared_error, (1/m) * Σ (y' - y)^2
    model.compile(loss='mse', optimizer=sgd)
    # prints a summary of the model to the terminal
    model.summary()
    
    # 4. Train the model
    # fit() trains the model and returns the training history
    history = model.fit(x_train, y_train, epochs=100)
    
    # 5. Use the model
    # predict() returns the predicted values
    y_predict = model.predict(np.array([5, 4]))
    print(y_predict)
    
    # Plot the training loss values
    plt.plot(history.history['loss'])
    plt.title('Model loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Train'], loc='upper left')
    plt.show()
    

     

    Result

    Multi-Variable Linear Regression

    When H(x) = Wx + b there is a single variable; with several variables, the hypothesis can be written compactly using matrix multiplication, so it helps to understand how matrix products behave.

    If the X matrix has shape [a, b], a is the number of instances and b is the number of variables.
    If the W matrix has shape [b, c], b is the number of variables and c is the number of Y values the hypothesis produces per instance (for linear regression, c = 1).
    Since H is the product of the X and W matrices, the H matrix has shape [a, c].

    In the matrix form the order of W and X is swapped, so writing H(X) = XW signals the matrix notation; it carries the same meaning as the hypothesis H(x) = Wx + b.
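
    These shapes can be checked with a small numpy sketch (the values are arbitrary):

    import numpy as np
    
    X = np.random.rand(5, 3)   # [a, b]: 5 instances, 3 variables
    W = np.random.rand(3, 1)   # [b, c]: 3 variables, 1 output per instance
    b = np.random.rand(1)
    
    H = X @ W + b              # H(X) = XW + b
    print(H.shape)             # (5, 1) == [a, c]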

     

    Code

    
    ## Multi-variable regression
    
    import tensorflow as tf
    import numpy as np
    
    # 1. Set up the data set
    x_data = [[73., 80., 75.], [93., 88., 93.], [89., 91., 90.], [96., 98., 100.], [73., 66., 70.]]
    y_data = [[152.], [185.], [180.], [196.], [142.]]
    
    # 2. Build the model
    model = tf.keras.Sequential()
    # input_dim=3 gives multi-variable regression
    model.add(tf.keras.layers.Dense(units=1, input_dim=3))
    # linear activation is the default, so this layer is optional
    model.add(tf.keras.layers.Activation('linear'))
    
    # 3. Configure the training process
    model.compile(loss='mse', optimizer=tf.keras.optimizers.SGD(learning_rate=1e-5))
    
    # 4. Print the model summary
    model.summary()
    
    # 5. Train the model
    history = model.fit(x_data, y_data, epochs=100)
    
    # 6. Use the model
    y_predict = model.predict(np.array([[72., 93., 90.], [60., 80., 80.]]))
    print(y_predict)
    

     

    Result

    Logistic Classification

    3. Binary Classification

    The outcome is discrete, so the labels need to be encoded as 0 and 1.

    Examples: spam detection (spam or ham), show or hide in the Facebook feed.

    Hypothesis

    H(x) = sigmoid(Wx + b) = 1 / (1 + e^-(Wx + b))

    Because Binary Classification takes the value 0 or 1, the hypothesis is built on the sigmoid (the S-shaped curve), whose output lies between 0 and 1. The H(x) above is the sigmoid function, also called the logistic function. Cost minimization is implemented with a Gradient Descent Optimizer.
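
    A quick numeric check of the sigmoid's behavior (the inputs are arbitrary):

    import numpy as np
    
    def sigmoid(z):
        return 1 / (1 + np.exp(-z))
    
    print(sigmoid(np.array([-10., 0., 10.])))  # ~[0.0000454, 0.5, 0.9999546]
    # Outputs always fall in (0, 1); thresholding at 0.5 yields the 0/1 encoding.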

     

    Cost

    c(H(x), y) = -log(H(x))      if y = 1
    c(H(x), y) = -log(1 - H(x))  if y = 0

                H(x) = 1    H(x) = 0
    Y = 1       0           INF
    Y = 0       INF         0
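
    The table can be checked numerically (a minimal sketch):

    import numpy as np
    
    def logistic_cost(h, y):
        # -log(H(x)) if y == 1, -log(1 - H(x)) if y == 0
        return -np.log(h) if y == 1 else -np.log(1 - h)
    
    eps = 1e-15  # keep log() away from 0
    print(logistic_cost(1 - eps, 1))  # ~0: H(x) -> 1 when Y = 1 costs nothing
    print(logistic_cost(eps, 1))      # ~34.5, growing toward INF as H(x) -> 0 when Y = 1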

     

    Code

    
    ## Binary classification
    
    import tensorflow as tf
    
    # 1. Set up the data set
    x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
    y_data = [[0], [0], [0], [1], [1], [1]]
    
    # 2. Build the model
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(units=1, input_dim=2))
    model.add(tf.keras.layers.Activation('sigmoid'))
    
    # 3. Configure the training process
    model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), metrics=['accuracy'])
    
    # 4. Print the model summary
    model.summary()
    
    # 5. Train the model
    # fit() returns a History object with per-epoch metrics
    history = model.fit(x_data, y_data, epochs=100)
    
    # 6. Use the model
    print("Accuracy: ", history.history['accuracy'][-1])
    

     

    Result

    4. Multinomial Classification

    The result is a choice of one class among several options.

    Assuming the data can be divided into three classes, Multinomial Classification is implemented by running three Binary Classifications, one per class.

    Hypothesis

    The three hypotheses are condensed into a single matrix expression. Where the binary case used the sigmoid function, here the softmax function produces the predictions: softmax returns values between 0 and 1, and all of its outputs sum to 1. One-hot encoding then sets the largest value in the array to 1 and the rest to 0, which yields the same form as Y_data.
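
    A numpy sketch of softmax and the one-hot step (the logits are arbitrary):

    import numpy as np
    
    def softmax(logits):
        e = np.exp(logits - np.max(logits))  # subtract max for numerical stability
        return e / e.sum()
    
    p = softmax(np.array([2.0, 1.0, 0.1]))
    print(p)        # [0.659 0.242 0.099] -> each in (0, 1)
    print(p.sum())  # 1.0 -> outputs sum to 1
    one_hot = (p == p.max()).astype(int)
    print(one_hot)  # [1 0 0] -> largest value becomes 1, the rest 0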

     

    Cost

    Cross-Entropy
    : in the end this gives the same result as the logistic cost function. Cost minimization again uses a Gradient Descent Optimizer.
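
    And the cross-entropy cost for a one-hot label (a minimal sketch; the probabilities are made up):

    import numpy as np
    
    def cross_entropy(y_onehot, p):
        return -np.sum(y_onehot * np.log(p))
    
    y = np.array([1, 0, 0])
    print(cross_entropy(y, np.array([0.7, 0.2, 0.1])))  # ~0.357: good prediction, low cost
    print(cross_entropy(y, np.array([0.1, 0.2, 0.7])))  # ~2.303: bad prediction, high cost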

     

    Code

    
    ## Multinomial classification
    
    import tensorflow as tf
    import numpy as np
    
    # 1. Set up the data set
    x_raw = [[1, 2, 1, 1],
             [2, 1, 3, 2],
             [3, 1, 3, 4],
             [4, 1, 5, 5],
             [1, 7, 5, 5],
             [1, 2, 5, 6],
             [1, 6, 6, 6],
             [1, 7, 7, 7]]
    # labels are one-hot encoded
    y_raw = [[0, 0, 1],
             [0, 0, 1],
             [0, 0, 1],
             [0, 1, 0],
             [0, 1, 0],
             [0, 1, 0],
             [1, 0, 0],
             [1, 0, 0]]
    
    x_data = np.array(x_raw, dtype=np.float32)
    y_data = np.array(y_raw, dtype=np.float32)
    
    nb_classes = 3
    
    # 2. Build the model
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(input_dim=4, units=nb_classes, use_bias=True))
    # use softmax activation: softmax = exp(logits) / reduce_sum(exp(logits), dim)
    model.add(tf.keras.layers.Activation('softmax'))
    
    # 3. Configure the training process
    # use loss == categorical_crossentropy for one-hot labels
    model.compile(loss='categorical_crossentropy', optimizer=tf.keras.optimizers.SGD(learning_rate=0.1), metrics=['accuracy'])
    
    # 4. Print the model summary
    model.summary()
    
    # 5. Train the model
    history = model.fit(x_data, y_data, epochs=1000)
    
    # 6. Use the model
    print('--------------')
    # output: class probabilities from softmax
    # argmax: returns the index of the largest value -> the predicted class
    a = model.predict(np.array([[1, 11, 7, 9]]))
    print(a, np.argmax(a, axis=1))
    
    print('--------------')
    # predict_classes() was removed in recent Keras; use np.argmax on predict() instead
    c = model.predict(np.array([[1, 1, 0, 1]]))
    c_class = np.argmax(c, axis=1)
    print(c, c_class)
    
    print('--------------')
    all_pred = model.predict(np.array([[1, 11, 7, 9], [1, 3, 4, 3], [1, 1, 0, 1]]))
    all_classes = np.argmax(all_pred, axis=1)
    print(all_pred, all_classes)
    

     

    Result


