2018-12-06 15:11:36 8 Comments

I am studying some machine learning on my own and I am practicing (in Python) with the assignments of the course held by Andrew Ng.

After completing the fourth exercise by hand, I thought I would redo it in Keras to practice with the library.

In the exercise we have 5000 images of handwritten digits, from 0 to 9. Each image is a 20x20 matrix. The dataset is stored in a matrix X of shape 5000x400 (each image has been 'unrolled') and the labels are stored in a matrix y of shape 5000x10. Each row of y is a one-hot vector.
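As a minimal sketch of that label layout, assuming the raw labels are integers from 0 to 9 (the names `one_hot` and `y_small` are illustrative and not taken from the assignment), the one-hot matrix could be built with NumPy like this:

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes))
    # Row i gets a 1 in the column given by labels[i]
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical labels for three images: digits 0, 3 and 9
y_small = one_hot([0, 3, 9])
print(y_small.shape)           # (3, 10)
print(y_small.argmax(axis=1))  # [0 3 9]
```

Taking `argmax` along each row recovers the original digit, which is a quick way to confirm that y is laid out as expected.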

The exercise asks us to implement backpropagation to maximize the log likelihood for a simple neural network with one input layer, one hidden layer and one output layer. The hidden layer has 25 neurons and the output layer has 10. We use sigmoid as the activation for both layers.
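For reference, the forward pass and objective described above can be sketched in NumPy. This is only a sketch under stated assumptions: the cost is taken to be the per-output Bernoulli negative log likelihood averaged over the m examples, with an L2 penalty scaled by λ/(2m) and applied to the weights but not the biases. The function names and the random stand-in data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """400 inputs -> 25 sigmoid hidden units -> 10 sigmoid outputs."""
    hidden = sigmoid(X @ W1 + b1)
    return sigmoid(hidden @ W2 + b2)

def cost(X, y, W1, b1, W2, b2, lam=1.0):
    """Averaged negative log likelihood plus an L2 penalty on the weights.

    Assumption: the penalty is scaled by lam / (2 * m) and skips the biases.
    """
    m = X.shape[0]
    h = forward(X, W1, b1, W2, b2)
    nll = -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m
    penalty = lam / (2 * m) * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    return nll + penalty

# Tiny random stand-in for the real data (shapes match the exercise, values do not)
rng = np.random.default_rng(0)
X_demo = rng.random((8, 400))
y_demo = np.eye(10)[rng.integers(0, 10, size=8)]
W1 = rng.uniform(-0.12, 0.12, (400, 25))
b1 = np.zeros(25)
W2 = rng.uniform(-0.12, 0.12, (25, 10))
b2 = np.zeros(10)

demo_cost = cost(X_demo, y_demo, W1, b1, W2, b2)
print(forward(X_demo, W1, b1, W2, b2).shape)  # (8, 10)
```

Under these assumptions the penalty is divided by the number of examples, so the same value of λ can weigh very differently in a framework that scales its regularizer another way.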

My code in Keras is this:

```
from keras import regularizers
from keras.layers import Dense
from keras.models import Sequential

# 400 inputs -> 25 sigmoid hidden units -> 10 sigmoid outputs, trained with full-batch SGD
model = Sequential()
model.add(Dense(25, input_shape=(400,), use_bias=True, kernel_regularizer=regularizers.l2(1), activation='sigmoid', kernel_initializer='glorot_uniform'))
model.add(Dense(10, use_bias=True, kernel_regularizer=regularizers.l2(1), activation='sigmoid', kernel_initializer='glorot_uniform'))
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(X, y, batch_size=5000, epochs=100, verbose=1)
```

Since I want this to be as similar as possible to the assignment, I have used the same initial weights as the assignment, the same regularization parameter, the same activations, and gradient descent as an optimizer (actually the assignment uses the Truncated Newton Method, but I don't think my problem lies here).

I thought I was doing everything correctly, but when I train the network I get 10% accuracy on the training dataset. Even when I play a little with the parameters, the accuracy doesn't change much. To understand the problem better, I tested it with smaller pieces of the dataset. For instance, if I select a subset of 100 elements containing x images of zero and 100-x images of one, I get x% training accuracy. My guess is that the network is optimizing the parameters to recognise only the first digit.
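One way to check that guess is to count how often each class ends up as the arg-max prediction. The sketch below uses a hypothetical score matrix `probs` in place of the real network output (in Keras this would come from `model.predict(X)`), and the helper names are illustrative:

```python
import numpy as np

def predicted_class_counts(probs, num_classes=10):
    """Count how often each class is the arg-max prediction."""
    return np.bincount(np.argmax(probs, axis=1), minlength=num_classes)

def accuracy(probs, y_one_hot):
    """Fraction of rows whose arg-max prediction matches the one-hot label."""
    hits = np.argmax(probs, axis=1) == np.argmax(y_one_hot, axis=1)
    return float(np.mean(hits))

# Hypothetical subset: x images of zero followed by 100 - x images of one
x = 30
y_sub = np.eye(10)[np.array([0] * x + [1] * (100 - x))]

# Stand-in for a collapsed model: class 0 always has the highest score
probs = np.tile(np.linspace(0.9, 0.1, 10), (100, 1))

print(predicted_class_counts(probs))  # all 100 predictions land on class 0
print(accuracy(probs, y_sub))         # 0.3, i.e. x%
```

If the counts from the real model look like this, with a single class absorbing every prediction, then an accuracy equal to that class's share of the data is exactly what one would expect.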

Now my questions are: what am I missing? Why isn't this the right implementation of the neural network described above?

## 1 comment

## @Shubham Panchal 2018-12-06 15:37:33

If you are practising on the MNIST dataset and classifying 10 digits, you have 10 classes to predict. Rather than sigmoid, you should use ReLU in the hidden layers (in your case, the first layer) and a softmax activation on the output layer. Use the categorical crossentropy loss function with the adam or sgd optimizer.
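To illustrate why softmax is the usual partner of categorical crossentropy, here is a small NumPy sketch comparing the two output activations on hypothetical pre-activation scores. It is not the Keras implementation; in Keras the suggested change would correspond to `activation='relu'` on the first `Dense` layer and `activation='softmax'` on the last.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical pre-activation scores for two images over 10 classes
rng = np.random.default_rng(0)
scores = rng.normal(size=(2, 10))

print(sigmoid(scores).sum(axis=1))  # generally not 1: ten independent scores
print(softmax(scores).sum(axis=1))  # each row sums to 1 (up to rounding)
```

Softmax turns each row into a probability distribution over the 10 classes, which is what a categorical loss assumes, whereas ten sigmoids produce independent per-class scores.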