Chapter 6

Build your own

You've got the math. Now bring it to life. This chapter is a guided lap through the workspace using problems whose answers you already understand.

Round 1 — XOR with a single hidden layer

XOR is the canonical "linearly inseparable" problem. A network with no hidden layer cannot solve it, but a single hidden layer of two neurons can.

  1. Open the workspace and go to Datasets. Pick XOR.
  2. Go to Architecture. Set the hidden layer to 2 neurons with tanh activation. Output layer: 1 neuron, sigmoid.
  3. In Training, set epochs = 500, lr = 0.1, optimizer adam. Click Start.
  4. Watch the loss curve. After ~200 epochs it should be near zero.

Switch to Visualization and look at the decision boundary. It should carve a non-linear shape that correctly separates the four corners of the XOR truth table.
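If you want to replay Round 1 outside the workspace, the whole network fits in a short numpy sketch. The seed, the init scale, and plain gradient descent in place of Adam are assumptions of this sketch, so it needs more epochs than the workspace run:

```python
import numpy as np

# Round 1 by hand: 2 tanh hidden neurons, 1 sigmoid output,
# binary cross-entropy, trained with plain gradient descent.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 2)); b1 = np.zeros(2)
W2 = rng.normal(0, 1, (2, 1)); b2 = np.zeros(1)
lr, losses = 0.5, []

for _ in range(5000):
    h = np.tanh(X @ W1 + b1)              # hidden layer: 2 tanh units
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))  # output: 1 sigmoid unit
    losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
    dz2 = (p - y) / len(X)                # BCE gradient wrt output logits
    dW2, db2 = h.T @ dz2, dz2.sum(0)
    dh = dz2 @ W2.T * (1 - h**2)          # tanh'(z) = 1 - tanh(z)^2
    dW1, db1 = X.T @ dh, dh.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(np.round(p.ravel(), 2))  # targets are [0, 1, 1, 0]
```

If a particular seed stalls, try another: with only two hidden units there is little slack, which is exactly what Round 2 exploits.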

Round 2 — break it on purpose

Now reduce the hidden layer to 1 neuron. Retrain. What happens?

The loss plateaus around 0.25 and the boundary is a single straight line that gets two out of four points right. This is the linear-separability failure from chapter 3, in your hands.
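You can confirm the failure is geometric rather than a training fluke with a brute-force check: no linear threshold unit classifies all four XOR points, no matter its weights. This sketch searches a dense grid of weight vectors and biases:

```python
import itertools
import numpy as np

# No single line separates XOR: over a dense grid of linear
# classifiers sign(w1*x1 + w2*x2 + b), the best gets 3 of 4 points.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

best = 0
for w1, w2, b in itertools.product(np.linspace(-2, 2, 21), repeat=3):
    pred = (X @ np.array([w1, w2]) + b > 0).astype(int)
    best = max(best, int((pred == y).sum()))

print(best)  # 3 — one corner is always misclassified
```

A finer grid would not help: XOR is provably not linearly separable, so 3 of 4 is the ceiling for any single straight boundary.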

Round 3 — multi-class with the spiral

Pick the 3-Class Spiral dataset. This needs serious capacity. Try:

  • Hidden layer: 16 ReLU.
  • Output layer: 3 softmax.
  • Loss: categorical_cross_entropy (the workspace switches to it automatically).
  • Train for 500 epochs.

Then halve the hidden size to 8. Then double it to 32. Watch how the decision boundary smoothness scales with capacity.
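To see what the workspace is feeding the network in Round 3, here is a sketch of a 3-class spiral generator (the exact radii, turns, and noise level are assumptions, not workspace values) together with the softmax / categorical cross-entropy pair it switches to for multi-class targets:

```python
import numpy as np

def make_spiral(points_per_class=100, classes=3, seed=0):
    """Three interleaved spiral arms, one per class."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for c in range(classes):
        r = np.linspace(0.05, 1, points_per_class)          # radius
        t = np.linspace(c * 4, (c + 1) * 4, points_per_class)
        t = t + rng.normal(0, 0.2, points_per_class)        # angular noise
        X.append(np.c_[r * np.sin(t), r * np.cos(t)])
        y.append(np.full(points_per_class, c))
    return np.vstack(X), np.concatenate(y)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, y):
    return -np.log(p[np.arange(len(y)), y]).mean()

X, y = make_spiral()
# Before training, uniform logits give loss ln(3) ≈ 1.099 — a handy
# sanity check that the loss wiring is correct.
p = softmax(np.zeros((len(X), 3)))
print(round(cross_entropy(p, y), 3))  # 1.099
```

The ln(3) check generalizes: an untrained k-class classifier should start near ln(k), and a starting loss far from that usually means a bug in the loss or the labels.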

Round 4 — feel the learning rate

On the same spiral, hold the architecture fixed and try learning rates 0.001, 0.01, 0.1, and 1.0. You'll see the four regimes from chapter 4:

  • 0.001 — barely moves. You'd need many more epochs.
  • 0.01 — clean, slow, reliable.
  • 0.1 — fast convergence.
  • 1.0 — chaotic loss, may diverge.
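The same four regimes show up on a toy loss you can run in your head. This sketch does gradient descent on L(w) = 1.1·w², where the curvature is chosen (an assumption, not a workspace value) so that the same four learning rates reproduce the same four behaviors:

```python
def final_w(lr, steps=50):
    """Gradient descent on L(w) = 1.1 * w**2, starting from w = 1."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2.2 * w   # dL/dw = 2.2 * w
    return w

# lr=0.001 barely moves, 0.01 is slow, 0.1 converges, 1.0 blows up.
for lr in [0.001, 0.01, 0.1, 1.0]:
    print(f"lr={lr}: |w| = {abs(final_w(lr)):.3g}")
```

Each step multiplies w by (1 − 2.2·lr), so the dividing line sits at lr ≈ 0.91: below it the factor has magnitude under 1 and w shrinks; above it every step overshoots the minimum by more than it corrects.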
Round 5 — your own data

Drop a CSV onto the Datasets tab. Numeric features only; categorical targets are auto-detected. Pick your target column and the workspace builds the dataset for you.

Start small — < 10 features, < 5,000 rows — and use the workspace as a fast sandbox. When a model works there, you've got a useful baseline before reaching for a full deep-learning framework.
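What the Datasets tab does with your CSV can be sketched with the standard library (the workspace's actual parsing is internal, so the details here are assumptions): numeric columns become features, and a string-valued target column is mapped to integer class labels.

```python
import csv
import io

# An in-memory stand-in for a dropped CSV file.
raw = io.StringIO(
    "x1,x2,label\n"
    "0.1,1.2,cat\n"
    "0.4,0.9,dog\n"
    "0.8,0.3,cat\n"
)
rows = list(csv.DictReader(raw))
target = "label"

classes = sorted({r[target] for r in rows})   # ['cat', 'dog']
X = [[float(v) for k, v in r.items() if k != target] for r in rows]
y = [classes.index(r[target]) for r in rows]  # cat -> 0, dog -> 1

print(X[0], y)
```

Doing this once by hand is a good habit anyway: if a "numeric" column refuses to parse as float, you've found the dirty data before the workspace (or any framework) has to guess.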