Firstly to all my friends that made me write this, I hate you.

Functions are very very important in calculus. It’s vital for you to know the basics of a function. Read this super simple guide to understand more about functions.

Function of a line

Lines have functions. Every function can be plotted onto a graph. We want to find the function of a line having been given a line.

To work out the function of a line you need to find out how steep the line is, which we’ll call m, and find out what point should be on the line when x = 0, which we’ll call c.

It’s really handy to visualise this as much as possible. I’m not just stating some random numbers here: when I say x = 0 I mean the Y intercept, where the line crosses the Y axis.

This gives us the formula:

$$y = mx + c$$

In America I believe that this equation is written y = mx + b.


We can find the gradient of a line using this formula:

$$m = \frac{y_2 - y_1}{x_2 - x_1}$$

Or in other words:

$$m = \frac{\text{change in } y}{\text{change in } x}$$

How much Y changes relative to how much X has changed.

Given two points on a line, <1, 2> and <4, 5>, the gradient is (5 - 2) / (4 - 1) = 3/3 = 1. The gradient of this line is 1, so m = 1.
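If you want to check the arithmetic, here is a minimal Python sketch of the same calculation. The function name `gradient` and the choice to store points as (x, y) tuples are just my own picks for illustration.

```python
def gradient(p1, p2):
    """Return the gradient of the line through points p1 and p2.

    Each point is an (x, y) tuple. A vertical line (equal x values)
    has no defined gradient, so we raise an error instead of dividing by 0.
    """
    x1, y1 = p1
    x2, y2 = p2
    if x2 == x1:
        raise ValueError("vertical line: gradient is undefined")
    return (y2 - y1) / (x2 - x1)


print(gradient((1, 2), (4, 5)))  # 1.0
```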

Finding the offset

We also want to find the offset, where the line intercepts the Y axis.

[Figure: a straight line plotted on a graph. Source: BBC Bitesize]

In this example the y intercept (where the line crosses through the y axis) is 1.

So we already know some facts:

<0, c> is on the line but we don’t know where. We know this line must cross the y axis.

We also know that <x1, y1> is a point on the line.

We know the line has gradient m.

We know this gradient does not change, as it is a line.

So the gradient from <0, c> to <x1, y1> is the same as the gradient from <x1, y1> to <x2, y2>.

In other words, (y1 - c) / (x1 - 0) = m, so c (the value of y when x = 0) must be:

$$c = y_1 - m x_1$$

Really working out the offset is just a case of defining what you know and using some algebra.
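Here is a small sketch of that algebra in Python, assuming we are given two points on the line. Rearranging (y1 - c) / (x1 - 0) = m gives c = y1 - m * x1, which is all the code does. The name `line_through` is made up for this example.

```python
def line_through(p1, p2):
    """Return (m, c) for the line y = m*x + c through p1 and p2."""
    x1, y1 = p1
    x2, y2 = p2
    m = (y2 - y1) / (x2 - x1)  # gradient: change in y over change in x
    c = y1 - m * x1            # offset: rearranged from (y1 - c) / (x1 - 0) = m
    return m, c


m, c = line_through((1, 2), (4, 5))
print(m, c)  # 1.0 1.0
```

For the points <1, 2> and <4, 5> used earlier this gives a gradient of 1 and an offset of 1, so the line is y = x + 1.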

Weird looking graphs

Let’s say you have a graph that looks like this:

How do you calculate the line function of that graph?

We start with some function, we’ll call it F(x)

We choose two places, x = a and x = b and find the values of F(a) and F(b)

Then we use the line connecting <a, F(a)> to <b, F(b)> to “reason about” F(x).

The line joining <−10, F(−10)> to <5, F(5)> doesn’t seem to tell us much about F(x).

It actually looks quite rubbish. Trying to use this straight line to “reason about” the curve is ridiculous.

The line doesn’t tell us anything about the function.

What we’re seeking is how F(x) (the curve) behaves as a line when x = 5.

The “closer” <x, x²> is to <5, 25>, the “better” the quality of information the line between the two points provides about the gradient of the “line touching” the curve at <5, 25>. But we cannot use the line from <5, 25> to <5, 25> itself, because if you attempt to calculate the gradient of this line you get 0 / 0, which is undefined.

We want to know what’s going on at the point <5, 25>. We look at the line between <5, 25> and <x, x²> with x getting “close to” 5. That is, set x = 5 + d and let d become “very very small”. [In general, for a point <x, x²> we look at points <x + d, (x + d)²>.]

In other words, we take the original point and shift its x value by d (but we do not let d = 0), so the second point gets “closer and closer to” the original without them ever becoming the same point.

We do this because we cannot easily reason about the equation of a graph that looks like x². Okay, professional hat off. We cannot reason about the equation of some whacky out of this world graph with lines going every which way and everywhere all at once, so we use a normal line to attempt to represent these weird looking graphs.


A Derivative is a measure of Instantaneous rate of change

Now we want to find a gradient

We want to find the gradient of the line joining <5, 25> to <5 + d, (5 + d)²>. For any 2 points in a plane (the graph) there exists a unique line connecting these 2 points, and <5 + d, (5 + d)²> is a point that is ever so close to the original point.

$$m = \frac{(5 + d)^2 - 25}{(5 + d) - 5}$$

And with a bit of rearranging we get

$$m = \frac{25 + 10d + d^2 - 25}{d} = \frac{10d + d^2}{d} = 10 + d$$

So we know that the nearer d gets to 0, the nearer the gradient of the line gets to 10. This is simply seen in the expression 10 + d. If d was 1, the gradient would be 11. But obviously d is very very small. Minute, almost. As d gets smaller and smaller, the line gets closer and closer to the line touching the curve.
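You can watch this happen with a few lines of Python. This is only a sketch: the function name `secant_gradient_at_5` is my own, and it simply applies the change in y over change in x formula to the points <5, 25> and <5 + d, (5 + d)²>.

```python
def secant_gradient_at_5(d):
    """Gradient of the line joining <5, 25> to <5 + d, (5 + d)^2>."""
    if d == 0:
        raise ValueError("d must not be 0: the gradient would be 0 / 0")
    return ((5 + d) ** 2 - 25) / ((5 + d) - 5)


# Shrink d and watch the gradient head towards 10.
for d in [1, 0.5, 0.1, 0.01, 0.001]:
    print(d, secant_gradient_at_5(d))
```

The printed gradients sit at roughly 10 + d each time (give or take floating point rounding), so they creep towards 10 as d shrinks.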

Let’s try another example.

Let’s say we have 2 points:

<x + d,(x + d)²> and <x, x²>

The gradient formula for any line is the change in y divided by the change in x.

And we know for a fact that there is always a unique line between 2 points.

Then we can use the gradient formula to work it out:

$$\frac{(x + d)^2 - x^2}{(x + d) - x} = \frac{2xd + d^2}{d} = 2x + d$$

The gradient of the line through the two points approaches the value 2x, but it never becomes 2x because of the value d (where d can never be 0). The idea is to learn something about a curve by creating a straight line through a second point that gets ever so close to a point on the curve without ever reaching that exact point.

We do this because it’s easier to reason about straight lines than it is to reason about curved lines.

I highly suggest taking the two points and trying to find the gradient yourself using the formula m = (y2 - y1) / (x2 - x1).


The gradient of the line touching the point <x, x²> on the curve f(x) = x² is ‘best described by’ the value of the function 2x.

Or in other words:

$$\lim_{d \to 0} \frac{(x + d)^2 - x^2}{d} = 2x$$

The term “lim” is just like our notion of choosing a point <x + d, (x + d)²> “closer and closer” to our original point <x, x²>, letting d become as close as we like to 0 (without reaching it).

This is referred to as a “limit”.

Differentiating Functions

For a function the notion of:

“the function for the gradient of the line touching the point <x, f(x)>” is referred to as the first derivative of the function f(x).

So previously we have seen how to calculate the gradient of a line through 2 points, but now we want to calculate the gradient of the line touching the curve of f(x) at a single point <x, f(x)>.

The formula used to calculate that f’(x) = 2x for f(x) = x² can be rewritten to be used by any function like so:

$$f'(x) = \lim_{d \to 0} \frac{f(x + d) - f(x)}{d}$$

If you look, you’ll notice it’s very very similar to what we were playing with earlier. Limits are extremely useful because it’s like saying “ignoring what happens when we get there, look at what happens as we approach there”.

When “we get there” the equation becomes 0 / 0, and you never want to divide anything by 0, let alone 0 divided by 0. But on the way there we can see some cool things. Say that the closer d gets to 0, the closer and closer the value of the expression gets to the number 2. You would expect that when d reaches 0, the expression hits the number 2. But because of the whole 0 divided by 0 thing we can’t actually just calculate that straight off the bat.
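To see “what happens as we approach there” for any function, you can compute the gradient (f(x + d) - f(x)) / d for smaller and smaller d. The sketch below does this for f(x) = x² at x = 3, where we expect the values to approach 2 * 3 = 6. The names `difference_quotient` and `square` are made up for this example, and this is a numerical peek at the limit, not a proof of it.

```python
def difference_quotient(f, x, d):
    """Gradient of the line joining <x, f(x)> to <x + d, f(x + d)>."""
    if d == 0:
        raise ValueError("d must not be 0: the quotient would be 0 / 0")
    return (f(x + d) - f(x)) / d


def square(x):
    return x ** 2


# Watch the quotient approach f'(3) = 2 * 3 = 6 as d shrinks.
for d in [0.1, 0.01, 0.001, 0.0001]:
    print(d, difference_quotient(square, 3, d))
```

Letting d = 0 raises an error on purpose, which mirrors the point above: we only ever look at what happens on the way there.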

To my classmates

Luckily Paul said you don’t need to know any of this, just the general “how to do it”, not any specifics. The derivatives of functions have already been found. In the exam you will be provided with a table that shows you the answers to any questions like these. See slide 29 of 3rd powerpoint on his website for this table.

You do not need to memorise these tools. You only need to learn how to use them.

In short, you make a line through a second point that gets “closer and closer” to a point on the curve without ever reaching the same coordinates. As it does, you’ll start to see the gradient of the line getting “closer and closer” to a particular value. That value is the first derivative at that point.


The first derivative f’(x) of a function f(x) describes the gradient of the line touching the point <x, f(x)> for (the graph or plot of) the function f(x).

f’(x) is itself a function. The first derivative is a function.

The idea of a first derivative raises an extraordinary question. If there is a first, there has to be a second, right?

Turning points, second derivative, maxima and minima values

The line with gradient 0 touches the curve f(x) = x² at <0, 0>.

When x < 0 all lines touching <x, x²> have negative gradients.

When x > 0, all lines touching <x, x²> have positive gradients.

When x = 0 the curve is at a turning point. Its value is a minimum.

So in this image we have 2 functions. One function is a fancy f(x) = x² and the other function is a straight line, y = (2c)x - c².

The value c changes from 5 all the way through to -5. When c > 0, the line touching the curve at <c, c²> has a positive gradient (it is 2c, so every possible gradient from 0 through to 10), and when c < 0 the line has a negative gradient. The line with a gradient of exactly 0 touches the curve at its minimum. You can clearly see that the second line is always touching the x² curve.

In the words of Professor Dunne himself:

What you should notice is that (regardless of what the value c is) these always touch the x² curve (in fact for a given c, the line (2c)x-(c*c) will touch the point (c,c²) ).
You should also see that “in order to keep contact” (bit of a loose way of phrasing it) the slope of the line changes, the “turning point” is precisely the position where the slope goes from decreasing (before) to increasing (after). For this behaviour to happen then at the turning point the slope must be zero (and from the decreasing->increasing pattern is a minimum); if the pattern were (increasing->turn->decreasing) the turning point would be a maximum.
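You don’t have to take the picture’s word for it. Here is a rough Python check that the line y = (2c)x - c² meets the curve x² at <c, c²> and never pokes above it. The function names are my own, and I only test whole-number values of c and x so the arithmetic is exact.

```python
def curve(x):
    return x ** 2


def touching_line(c, x):
    """Value at x of the line y = (2c)x - c^2, which touches the curve at <c, c^2>."""
    return (2 * c) * x - c * c


for c in range(-5, 6):
    # At x = c the line and the curve agree exactly.
    assert touching_line(c, c) == curve(c)
    # Everywhere else the curve sits above the line, because
    # x^2 - (2cx - c^2) = (x - c)^2, which is never negative.
    for x in range(-10, 11):
        assert curve(x) - touching_line(c, x) == (x - c) ** 2
    print(c, "gradient:", 2 * c)
```

The printed gradients run from -10 up to 10 and pass through 0 at c = 0, which is exactly where the turning point is.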

This looks extremely easy, because you just look at the graph, see the line with a gradient of 0 and say “that’s the minimum”. But if it was that easy this wouldn’t be taught. You have to find the minimum of lines and curves that you cannot see or visualise.

Second Derivative

The second derivative, f’’(x), is the derivative of f’(x). The second derivative is the derivative of the first derivative.

The second derivative of x² is a constant which is positive (it is 2). This tells us that 0 (the value of x at which f’(x) = 0, where the line touching <x, x²> is horizontal) is a local minimum for the function. That is: every value of x (with x != 0) is such that f(x) > f(0). In fact (for this function) x = 0 is also a global minimum.

This is extremely confusing, so to simplify:

Compute f’(x), the first derivative.

Find the value(s) of x for which f’(x) = 0.

These give the turning points of f(x), where local minima and maxima occur.

Compute f’’(x), the second derivative.

For each turning point, t, compute the value f’’(t):

  • if f’’(t) < 0 then f(t) is a local maximum
  • if f’’(t) > 0 then f(t) is a local minimum
  • if f’’(t) = 0 then we cannot make any conclusions.
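The steps above can be sketched in Python for polynomials. This is only an illustration under my own assumptions: a polynomial is stored as a list of coefficients (so [0, 0, 1] means x²), and the helper names `poly_eval`, `poly_derivative` and `classify_turning_point` are made up for this example.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial; coeffs[i] is the coefficient of x**i."""
    return sum(a * x ** i for i, a in enumerate(coeffs))


def poly_derivative(coeffs):
    """Differentiate term by term: a * x**i becomes (i * a) * x**(i - 1)."""
    return [i * a for i, a in enumerate(coeffs)][1:]


def classify_turning_point(coeffs, t):
    """Apply the second derivative test at a point t where f'(t) = 0."""
    first = poly_derivative(coeffs)
    second = poly_derivative(first)
    # Small tolerance so that floating point values of t still count.
    if abs(poly_eval(first, t)) > 1e-9:
        raise ValueError("t is not a turning point: f'(t) is not 0")
    value = poly_eval(second, t)
    if value < 0:
        return "local maximum"
    if value > 0:
        return "local minimum"
    return "no conclusion"


print(classify_turning_point([0, 0, 1], 0))  # f(x) = x^2 -> local minimum
```

Trying it on x⁴, written as [0, 0, 0, 0, 1], gives "no conclusion" at 0, because the second derivative is 0 there even though the point really is a minimum. That is the case where the test simply cannot tell you.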

Revisiting Maxima and Minima

Functions can have a minimum value which is the lowest value the graph touches. You’ll notice in the previous graph that the function (x²) does not go below 0. The maximum value is the value it does not go above.

Local maximum and minimum values are specified within a set interval and there can be more than one of each. Global maximum and minimum values are the absolute highest and lowest values of that graph.

Let’s work through an example. Let’s say a ball is thrown into the air. Its height at any time, t, is given by:

$$h = 3 + 14t - 5t^2$$

What is its maximum height?

Well, using derivatives we can work out the slope of this function. The slope is 14 - 10t. How do we know this? Well, we just use a handy derivative table to tell us.

We can use these derivative rules for this specific function:

  • The slope of a constant value (like 3) is 0.
  • The slope of a line like 2x is 2, so 14t has a slope of 14.
  • A square function like (t²) has a slope of 2t, so (5t²) has a slope of 5(2t).
  • And then we add them up: 0 + 14 - 5(2t)

So we get the answer 14 - 5(2t), which is 14 - 10t.

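Now we need to find when the slope is equal to 0. So we simply do this:

$$14 - 10t = 0 \quad\Rightarrow\quad 10t = 14 \quad\Rightarrow\quad t = 1.4$$

The slope is zero at t = 1.4.

And the height at that time (which is just plugging t = 1.4 into the height equation given to us earlier) is:

$$h = 3 + 14(1.4) - 5(1.4)^2 = 3 + 19.6 - 9.8 = 12.8$$

The maximum height of this function is 12.8m at 1.4 seconds.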

This kind of begs the question of “how can we prove it is the maximum (or minimum) value?”

Take the derivative of the slope (the second derivative of the original function): the derivative of 14 - 10t is -10. This means the slope is always decreasing. It starts out positive, goes through 0 and then becomes negative, which means the turning point is a maximum value. This is called the second derivative test.
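Here is the whole ball example as a short Python sketch. I am assuming the height function is h(t) = 3 + 14t - 5t², which matches the slope 14 - 10t and the 12.8m answer above. The names `height`, `slope` and `SECOND_DERIVATIVE` are just labels I picked.

```python
def height(t):
    """Assumed height function: h(t) = 3 + 14t - 5t^2."""
    return 3 + 14 * t - 5 * t ** 2


def slope(t):
    """First derivative of the height: 14 - 10t."""
    return 14 - 10 * t


SECOND_DERIVATIVE = -10  # derivative of 14 - 10t, the same for every t

t_peak = 14 / 10  # solve 14 - 10t = 0
print(t_peak, height(t_peak), slope(t_peak))
print("maximum" if SECOND_DERIVATIVE < 0 else "minimum")
```

The height just before and just after t = 1.4 is lower than the height at t = 1.4, which agrees with what the second derivative test tells us.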

Let’s try another example.

Find the maxima and minima values for:

$$y = 5x^3 + 2x^2 - 3x$$

The derivative is:

$$y' = 15x^2 + 4x - 3$$

Which is quadratic with roots of:

$$x = -\frac{3}{5} \quad \text{and} \quad x = +\frac{1}{3}$$

Could these values be minima or maxima?

The second derivative is y’’ = 30x + 4

At x = -3/5: y’’ = 30(-3/5) + 4 = -14. It is less than 0, so -3/5 is a local maximum.

At x = +1/3: y’’ = 30(1/3) + 4 = 14. It is greater than 0, so 1/3 is a local minimum.
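If you want to double check those numbers, here is a sketch in Python. It assumes the first derivative is y’ = 15x² + 4x - 3, which is the quadratic that goes with y’’ = 30x + 4 and the roots -3/5 and 1/3. The function names are my own.

```python
import math


def quadratic_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 via the quadratic formula."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})


def second_derivative(x):
    """y'' = 30x + 4, as given in the text."""
    return 30 * x + 4


def classify(x):
    """Second derivative test at a turning point x."""
    value = second_derivative(x)
    if value < 0:
        return "local maximum"
    if value > 0:
        return "local minimum"
    return "no conclusion"


# Assumed first derivative: y' = 15x^2 + 4x - 3
for x in quadratic_roots(15, 4, -3):
    print(x, second_derivative(x), classify(x))
```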


An antiderivative of f(x) is any function F(x) with the property F’(x) = f(x).

Antiderivatives are extremely useful in measuring areas under sloping lines. For example, what’s the area under these two functions?

We can measure a smaller area, such as the area between x = 6 and x = 14 for the function f(x) = x².

We can make an estimation of the area by creating rectangles under the curve. As the rectangle bases get smaller and smaller our estimates get better and better, and the difference between our lower and upper estimates gets smaller and smaller.

The upper value is just a sum of rectangle areas: all have base length h and for each k between 1 and N = (b - a)/h there is a rectangle with height f(a + kh). In other words:

$$\text{Upper} = \sum_{k=1}^{N} h \cdot f(a + kh)$$

In order to find the area covered by the function f(x) between the values x = a and x = b it is enough to know the antiderivative of f(x).

If F(x) is the antiderivative of f(x), then the area is F(b) - F(a).
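Here is a rough Python sketch of the rectangle idea for f(x) = x² between x = 6 and x = 14. I pick the number of rectangles n and work out the base length h from it (the text does it the other way round), and I use F(x) = x³ / 3 as the antiderivative, since its derivative is x². The function names are made up for this example, and the lower/upper labels rely on x² increasing over this interval.

```python
def f(x):
    return x ** 2


def antiderivative(x):
    """F(x) = x^3 / 3, since F'(x) = x^2."""
    return x ** 3 / 3


def rectangle_estimates(a, b, n):
    """Lower and upper rectangle sums for an increasing f on [a, b]."""
    h = (b - a) / n
    lower = sum(h * f(a + k * h) for k in range(0, n))      # left edges
    upper = sum(h * f(a + k * h) for k in range(1, n + 1))  # right edges
    return lower, upper


exact = antiderivative(14) - antiderivative(6)
for n in [8, 80, 800]:
    lower, upper = rectangle_estimates(6, 14, n)
    print(n, lower, upper, exact)
```

As n grows, the lower and upper estimates squeeze in on F(14) - F(6) from either side.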

It is a little wearisome to constantly repeat the phrase “antiderivative of f(x)”. The standard notation used is the integral sign. The expression “F(x) is the antiderivative of f(x)” is written as:

$$\int f(x)\,dx = F(x)$$

This is called an indefinite integral since we do not have enough information to define F(x) uniquely. For the area applications when a and b are supplied, we use:

$$\int_a^b f(x)\,dx = F(b) - F(a)$$
