Chapter 6

Build your own

You've got the math. Now bring it to life. This chapter is a guided lap through the workspace using problems whose answers you already understand.

Round 1 — XOR with a single hidden layer

XOR is the canonical "linearly inseparable" problem. A network with no hidden layer cannot solve it, but a single hidden layer of two neurons can.

  1. Open the workspace and go to Datasets. Pick XOR.
  2. Go to Architecture. Set the hidden layer to 2 neurons with tanh activation. Output layer: 1 neuron, sigmoid.
  3. In Training, set epochs = 500, lr = 0.1, optimizer adam. Click Start.
  4. Watch the loss curve. After ~200 epochs it should be near zero.

Switch to Visualization and look at the decision boundary. It should carve a non-linear shape that correctly separates the four corners of the XOR truth table.
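If you want to replay Round 1 outside the workspace, the whole network fits in a short numpy sketch. The seed, the init scale, and plain gradient descent in place of Adam are assumptions of this sketch, so it needs more epochs than the workspace run:

```python
import numpy as np

# Round 1 by hand: 2 tanh hidden neurons, 1 sigmoid output,
# binary cross-entropy, trained with plain gradient descent.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 2)); b1 = np.zeros(2)
W2 = rng.normal(0, 1, (2, 1)); b2 = np.zeros(1)
lr, losses = 0.5, []

for _ in range(5000):
    h = np.tanh(X @ W1 + b1)              # hidden layer: 2 tanh units
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))  # output: 1 sigmoid unit
    losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
    dz2 = (p - y) / len(X)                # BCE gradient wrt output logits
    dW2, db2 = h.T @ dz2, dz2.sum(0)
    dh = dz2 @ W2.T * (1 - h**2)          # tanh'(z) = 1 - tanh(z)^2
    dW1, db1 = X.T @ dh, dh.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(np.round(p.ravel(), 2))  # targets are [0, 1, 1, 0]
```

If a particular seed stalls, try another: with only two hidden units there is little slack, which is exactly what Round 2 exploits.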

Round 2 — break it on purpose

Now reduce the hidden layer to 1 neuron. Retrain. What happens?

The loss plateaus around 0.25 and the boundary is a single straight line that gets two out of four points right. This is the linear-separability failure from chapter 3, in your hands.
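You can confirm the failure is geometric rather than a training fluke with a brute-force check: no linear threshold unit classifies all four XOR points, no matter its weights. This sketch searches a dense grid of weight vectors and biases:

```python
import itertools
import numpy as np

# No single line separates XOR: over a dense grid of linear
# classifiers sign(w1*x1 + w2*x2 + b), the best gets 3 of 4 points.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

best = 0
for w1, w2, b in itertools.product(np.linspace(-2, 2, 21), repeat=3):
    pred = (X @ np.array([w1, w2]) + b > 0).astype(int)
    best = max(best, int((pred == y).sum()))

print(best)  # 3 — one corner is always misclassified
```

A finer grid would not help: XOR is provably not linearly separable, so 3 of 4 is the ceiling for any single straight boundary.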

Round 3 — multi-class with the spiral

Pick the 3-Class Spiral dataset. This needs serious capacity. Try:

  • Hidden layer: 16 ReLU.
  • Output layer: 3 softmax.
  • Loss: categorical_cross_entropy (the workspace switches to it automatically).
  • Train for 500 epochs.

Then halve the hidden size to 8. Then double it to 32. Watch how the decision boundary smoothness scales with capacity.
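To see what the workspace is feeding the network in Round 3, here is a sketch of a 3-class spiral generator (the exact radii, turns, and noise level are assumptions, not workspace values) together with the softmax / categorical cross-entropy pair it switches to for multi-class targets:

```python
import numpy as np

def make_spiral(points_per_class=100, classes=3, seed=0):
    """Three interleaved spiral arms, one per class."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for c in range(classes):
        r = np.linspace(0.05, 1, points_per_class)          # radius
        t = np.linspace(c * 4, (c + 1) * 4, points_per_class)
        t = t + rng.normal(0, 0.2, points_per_class)        # angular noise
        X.append(np.c_[r * np.sin(t), r * np.cos(t)])
        y.append(np.full(points_per_class, c))
    return np.vstack(X), np.concatenate(y)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, y):
    return -np.log(p[np.arange(len(y)), y]).mean()

X, y = make_spiral()
# Before training, uniform logits give loss ln(3) ≈ 1.099 — a handy
# sanity check that the loss wiring is correct.
p = softmax(np.zeros((len(X), 3)))
print(round(cross_entropy(p, y), 3))  # 1.099
```

The ln(3) check generalizes: an untrained k-class classifier should start near ln(k), and a starting loss far from that usually means a bug in the loss or the labels.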

Round 4 — feel the learning rate

On the same spiral, hold the architecture fixed and try learning rates 0.001, 0.01, 0.1, and 1.0. You'll see the four regimes from chapter 4:

  • 0.001 — barely moves. You'd need many more epochs.
  • 0.01 — clean, slow, reliable.
  • 0.1 — fast convergence.
  • 1.0 — chaotic loss, may diverge.
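The same four regimes show up on a toy loss you can run in your head. This sketch does gradient descent on L(w) = 1.1·w², where the curvature is chosen (an assumption, not a workspace value) so that the same four learning rates reproduce the same four behaviors:

```python
def final_w(lr, steps=50):
    """Gradient descent on L(w) = 1.1 * w**2, starting from w = 1."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2.2 * w   # dL/dw = 2.2 * w
    return w

# lr=0.001 barely moves, 0.01 is slow, 0.1 converges, 1.0 blows up.
for lr in [0.001, 0.01, 0.1, 1.0]:
    print(f"lr={lr}: |w| = {abs(final_w(lr)):.3g}")
```

Each step multiplies w by (1 − 2.2·lr), so the dividing line sits at lr ≈ 0.91: below it the factor has magnitude under 1 and w shrinks; above it every step overshoots the minimum by more than it corrects.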
Round 5 — your own data

Drop a CSV onto the Datasets tab. Numeric features only; categorical targets are auto-detected. Pick your target column and the workspace builds the dataset for you.

Start small — < 10 features, < 5,000 rows — and use the workspace as a fast sandbox. When a model works there, you've got a useful baseline before reaching for a full deep-learning framework.
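What the Datasets tab does with your CSV can be sketched with the standard library (the workspace's actual parsing is internal, so the details here are assumptions): numeric columns become features, and a string-valued target column is mapped to integer class labels.

```python
import csv
import io

# An in-memory stand-in for a dropped CSV file.
raw = io.StringIO(
    "x1,x2,label\n"
    "0.1,1.2,cat\n"
    "0.4,0.9,dog\n"
    "0.8,0.3,cat\n"
)
rows = list(csv.DictReader(raw))
target = "label"

classes = sorted({r[target] for r in rows})   # ['cat', 'dog']
X = [[float(v) for k, v in r.items() if k != target] for r in rows]
y = [classes.index(r[target]) for r in rows]  # cat -> 0, dog -> 1

print(X[0], y)
```

Doing this once by hand is a good habit anyway: if a "numeric" column refuses to parse as float, you've found the dirty data before the workspace (or any framework) has to guess.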