Derivative Rules
This reference covers the derivative rules from the limit definition through higher-order derivatives and applications. Each rule comes with a proof sketch showing why it works and, where relevant, historical context on who discovered it.
1. What is a Derivative
A derivative measures the instantaneous rate of change of a function. If f(x) describes a car's position at time x, then f'(x) is the velocity at that instant.
f'(x) = lim[h->0] (f(x+h) - f(x)) / h
This limit definition says: the derivative is the limit of secant-line slopes, and that limit is the slope of the tangent line. As h approaches 0, the average rate of change between x and x+h converges to the instantaneous rate of change at x.
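The convergence of secant slopes can be watched numerically. The sketch below is only an illustration: the function f(x) = x^2, the point x = 3, and the helper name difference_quotient are assumptions chosen for the example, not part of the definition.

```python
def difference_quotient(f, x, h):
    """Average rate of change of f between x and x + h (the secant slope)."""
    return (f(x + h) - f(x)) / h


def square(x):
    return x * x


# Shrink h and watch the secant slope approach the tangent slope at x = 3.
slopes = [difference_quotient(square, 3.0, 10.0 ** -k) for k in range(1, 6)]
for k, slope in enumerate(slopes, start=1):
    print(f"h = 1e-{k}: secant slope = {slope:.6f}")
```

For x^2 the quotient simplifies to 2x + h, so the printed slopes move toward 2 * 3 = 6 as h shrinks.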
The derivative was independently discovered by Isaac Newton (1665) and Gottfried Wilhelm Leibniz (1684). Newton called them "fluxions" and used them to describe motion and change; Leibniz developed the concept of "differentials" and created the dy/dx notation we still use today. The two engaged in a bitter priority dispute, but modern mathematics recognizes both as independent co-discoverers.
Three Notations
| Notation | Inventor | Form | Best Used When |
|---|---|---|---|
| Leibniz notation | Leibniz (1684) | dy/dx | Chain rule, implicit differentiation; intuitively shows "ratio of infinitesimals" |
| Lagrange notation | Lagrange (1770s) | f'(x) | General function analysis; compact and concise |
| Newton notation | Newton (1665) | y-dot (dot above y) | Physics, especially derivatives with respect to time |
Why do different notations exist? Each shines in specific contexts. Leibniz notation lets the chain rule look like fraction cancellation (dy/dx = dy/du * du/dx) -- extremely intuitive. Lagrange notation is more compact for higher-order derivatives (f''(x) vs d^2y/dx^2). Newton's dot notation is standard in physics for time derivatives (x-dot for velocity, x-double-dot for acceleration).
2. Basic Rules
2.1 Constant Rule
d/dx [c] = 0
Why: A constant never changes, so its rate of change is zero. From the definition: lim[h->0] (c - c)/h = lim[h->0] 0/h = 0.
2.2 Power Rule
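d/dx [x^n] = n*x^(n-1)
Why (sketch for a positive integer n; the rule itself holds for every real exponent): expand f(x+h) with the binomial theorem.
f(x+h) = (x+h)^n = x^n + n*x^(n-1)*h + (n(n-1)/2)*x^(n-2)*h^2 + ... + h^n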
f(x+h) - f(x) = n*x^(n-1)*h + (terms with h^2 and higher)
Divide by h: n*x^(n-1) + (terms with h)
As h -> 0, all terms with h vanish, leaving n*x^(n-1).
2.3 Sum / Difference Rule
d/dx [f +/- g] = f' +/- g'
Why: Linearity of limits. Since lim(A+B) = lim A + lim B whenever both limits exist, and the derivative is itself a limit, differentiation inherits additivity.
2.4 Constant Multiple Rule
d/dx [c*f] = c*f'
Why: Again, linearity of limits. A constant can be pulled out: lim[h->0] c*(f(x+h)-f(x))/h = c * lim[h->0] (f(x+h)-f(x))/h.
Summary Table
| Rule | Formula | Example |
|---|---|---|
| Constant Rule | d/dx [c] = 0 | d/dx [5] = 0 |
| Power Rule | d/dx [x^n] = n*x^(n-1) | d/dx [x^3] = 3x^2 |
| Constant Multiple | d/dx [cf] = cf' | d/dx [5x^2] = 10x |
| Sum/Difference | d/dx [f +/- g] = f' +/- g' | d/dx [x^2 + x] = 2x + 1 |
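The four examples in the table can be spot-checked numerically. The sketch below assumes a central-difference estimate of the derivative and an arbitrary test point x0 = 2; the names central_difference, checks, and errors are illustrative.

```python
def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient; its error shrinks like h**2 for smooth f."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 2.0
# (description, function, derivative predicted by the rule)
checks = [
    ("constant: 5", lambda x: 5.0, lambda x: 0.0),
    ("power: x^3", lambda x: x**3, lambda x: 3 * x**2),
    ("constant multiple: 5x^2", lambda x: 5 * x**2, lambda x: 10 * x),
    ("sum: x^2 + x", lambda x: x**2 + x, lambda x: 2 * x + 1),
]

errors = {}
for name, f, rule in checks:
    numeric = central_difference(f, x0)
    errors[name] = abs(numeric - rule(x0))
    print(f"{name}: numeric = {numeric:.6f}, rule = {rule(x0):.6f}")
```

A numerical match at one point is a sanity check, not a proof; the proofs are the limit arguments given for each rule.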
3. Product and Quotient Rules
3.1 Product Rule
d/dx [fg] = f'g + fg'
Why: Start from the limit definition:
d/dx [fg] = lim[h->0] (f(x+h)g(x+h) - f(x)g(x)) / h
Key step -- add and subtract f(x+h)g(x):
= lim[h->0] (f(x+h)g(x+h) - f(x+h)g(x) + f(x+h)g(x) - f(x)g(x)) / h
= lim[h->0] [ f(x+h) * (g(x+h)-g(x))/h + g(x) * (f(x+h)-f(x))/h ]
= f(x) * g'(x) + g(x) * f'(x)
In the last step, f(x+h) -> f(x) as h -> 0 because a differentiable function is continuous.
Why the trick works: It decomposes "both factors changing simultaneously" into "one factor changes while the other stays fixed" -- two pieces we already know how to handle.
3.2 Quotient Rule
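d/dx [f/g] = (f'g - fg') / g^2, where g(x) != 0
Why: Write f/g = f * g^(-1), then apply the product rule and the chain rule:
d/dx [f * g^(-1)] = f' * g^(-1) + f * (-1) * g^(-2) * g'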
= f'/g - f*g'/g^2
= (f'g - fg') / g^2
Mnemonic: "lo d-hi minus hi d-lo, over lo-lo" -- where "hi" is the numerator f and "lo" is the denominator g.
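The product and quotient rules can be read as bookkeeping instructions: carry each function's value together with its derivative, and combine the pairs. The sketch below does this with a small hypothetical class named Dual (an assumption for illustration), evaluated for f(x) = x^2 and g(x) = x^3 at x = 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A function's value together with its derivative at the same point."""
    val: float
    der: float

    def __mul__(self, other):
        # Product rule: (fg)' = f*g' + g*f'
        return Dual(self.val * other.val,
                    self.val * other.der + other.val * self.der)

    def __truediv__(self, other):
        # Quotient rule: (f/g)' = (f'g - fg') / g^2
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val ** 2)


# At x = 2: f(x) = x^2 has f = 4, f' = 4; g(x) = x^3 has g = 8, g' = 12.
f = Dual(4.0, 4.0)
g = Dual(8.0, 12.0)

product = f * g    # x^5, whose derivative 5x^4 equals 80 at x = 2
quotient = f / g   # 1/x, whose derivative -1/x^2 equals -0.25 at x = 2
print(product, quotient)
```

Both results agree with differentiating the simplified forms x^5 and 1/x directly using the power rule.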
4. Chain Rule -- The Most Important Rule
d/dx [f(g(x))] = f'(g(x)) * g'(x)
In Leibniz notation: dy/dx = (dy/du) * (du/dx), where y = f(u) and u = g(x).
Suppose y depends on u, and u depends on x. If x changes by a tiny amount dx:
- u changes by du ~ (du/dx) * dx
- y changes by dy ~ (dy/du) * du = (dy/du) * (du/dx) * dx
Therefore dy/dx ~ (dy/du) * (du/dx). In the limit, this becomes exact.
Intuition: If temperature rises at 3 degrees C/hour, and metal expands at 0.01 mm/degree C, then the metal expands at 0.03 mm/hour. Rates multiply!
Examples: Simple to Complex
d/dx [sin(x^2)] = cos(x^2) * 2x
Outer function: sin(u), inner function: u = x^2
d/dx [(3x+1)^5] = 5(3x+1)^4 * 3 = 15(3x+1)^4
Outer function: u^5, inner function: u = 3x+1
d/dx [e^(sin(x^2))]
= e^(sin(x^2)) * cos(x^2) * 2x
Three-layer chain rule: outer e^u, middle sin(v), inner v = x^2
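The "rates multiply" idea can be tested on the three-layer composition e^(sin(x^2)): estimate each link's rate separately, multiply them, and compare with the closed form e^(sin(x^2)) * cos(x^2) * 2x. This is a rough numerical sketch; the test point x0 = 0.7 and the names slope, inner, middle, and outer are assumptions for illustration.

```python
import math


def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def inner(x):    # v = x^2
    return x ** 2


def middle(v):   # u = sin(v)
    return math.sin(v)


def outer(u):    # y = e^u
    return math.exp(u)


x0 = 0.7
v0 = inner(x0)
u0 = middle(v0)

# Multiply the link-by-link rates: (dy/du) * (du/dv) * (dv/dx).
product_of_rates = slope(outer, u0) * slope(middle, v0) * slope(inner, x0)
# Differentiate the full composition directly for comparison.
direct = slope(lambda x: outer(middle(inner(x))), x0)
# Closed form given by the chain rule.
closed_form = math.exp(math.sin(x0 ** 2)) * math.cos(x0 ** 2) * 2 * x0

print(product_of_rates, direct, closed_form)
```

Each link is differentiated at the value fed into it (u0, v0, x0), which is exactly what f'(g(x)) means in the chain rule.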
5. Trigonometric Derivatives
5.1 d/dx [sin x] = cos x
d/dx [sin x] = lim[h->0] (sin(x+h) - sin(x)) / h
Using the angle addition formula: sin(x+h) = sin x cos h + cos x sin h
= lim[h->0] (sin x (cos h - 1) + cos x * sin h) / h
= sin x * lim[h->0] (cos h - 1)/h + cos x * lim[h->0] sin h / h
= sin x * 0 + cos x * 1 = cos x
Key limits: lim[h->0] sin(h)/h = 1 (proved via the squeeze theorem) and lim[h->0] (cos h - 1)/h = 0.
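Both key limits are easy to observe numerically, which can make the final step of the proof feel less abstract. The sketch below simply tabulates the two ratios for shrinking h; the list names are illustrative.

```python
import math

hs = [10.0 ** -k for k in range(1, 7)]
sin_ratio = [math.sin(h) / h for h in hs]          # should approach 1
cos_ratio = [(math.cos(h) - 1.0) / h for h in hs]  # should approach 0

for h, s, c in zip(hs, sin_ratio, cos_ratio):
    print(f"h = {h:.0e}: sin(h)/h = {s:.12f}, (cos(h) - 1)/h = {c:.3e}")
```

The table suggests the limits but does not prove them; the squeeze-theorem argument is still what justifies sin(h)/h -> 1.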
5.2 d/dx [cos x] = -sin x
Why: cos x = sin(pi/2 - x). By the chain rule: d/dx [sin(pi/2 - x)] = cos(pi/2 - x) * (-1) = -sin x.
All Six Trigonometric Derivatives
| f(x) | f'(x) | How to Derive |
|---|---|---|
| sin x | cos x | Limit definition + squeeze theorem |
| cos x | -sin x | cos x = sin(pi/2 - x) + chain rule |
| tan x | sec^2 x | tan = sin/cos + quotient rule |
| cot x | -csc^2 x | cot = cos/sin + quotient rule |
| sec x | sec x * tan x | sec = 1/cos + chain rule |
| csc x | -csc x * cot x | csc = 1/sin + chain rule |
d/dx [tan x] = d/dx [sin x / cos x]
= (cos x * cos x - sin x * (-sin x)) / cos^2(x)
= (cos^2(x) + sin^2(x)) / cos^2(x) = 1/cos^2(x) = sec^2(x)
6. Exponential and Logarithmic Derivatives
6.1 d/dx [e^x] = e^x
The constant e ~ 2.71828 was systematically studied by Leonhard Euler (1748). e is the unique base for which d/dx [a^x] = a^x -- that is, e^x is the only exponential function that equals its own derivative. This property makes e ubiquitous in calculus, probability theory, and complex analysis.
Definition: e = lim[n->infinity] (1 + 1/n)^n
6.2 d/dx [a^x] = a^x * ln(a)
Why: Write a^x = e^(x*ln(a)) and apply the chain rule:
d/dx [e^(x*ln(a))] = e^(x*ln(a)) * ln(a) = a^x * ln(a)
6.3 d/dx [ln x] = 1/x
Let y = ln x, so x = e^y. Differentiate both sides with respect to x:
1 = e^y * (dy/dx)
dy/dx = 1/e^y = 1/x
6.4 d/dx [log_a(x)] = 1/(x * ln(a))
Derivation: log_a(x) = ln(x)/ln(a), so d/dx [log_a(x)] = (1/ln(a)) * (1/x) = 1/(x*ln(a)).
Summary Table
| f(x) | f'(x) | Note |
|---|---|---|
| e^x | e^x | Equals its own derivative (as does any constant multiple c*e^x) |
| a^x (a > 0, a != 1) | a^x * ln(a) | Reduces to e^x when a = e |
| ln x | 1/x | x > 0 |
| log_a(x) | 1/(x * ln(a)) | a > 0, a != 1, x > 0 |
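The limit definition of e and the a^x and log_a(x) rows of the table can be checked with a few lines of arithmetic. This is a sketch under assumed values (base a = 2, test point x0 = 1.5) using a central-difference estimate named slope, which is a helper introduced here for illustration.

```python
import math

# e as the limit of (1 + 1/n)^n: the estimate improves as n grows.
e_estimates = {n: (1 + 1 / n) ** n for n in (10, 1_000, 100_000)}


def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


a, x0 = 2.0, 1.5
numeric_exp = slope(lambda x: a ** x, x0)          # d/dx [a^x]
rule_exp = a ** x0 * math.log(a)                   # a^x * ln(a)
numeric_log = slope(lambda x: math.log(x, a), x0)  # d/dx [log_a(x)]
rule_log = 1 / (x0 * math.log(a))                  # 1 / (x * ln(a))

print(e_estimates)
print(numeric_exp, rule_exp)
print(numeric_log, rule_log)
```

The convergence of (1 + 1/n)^n is slow (the error shrinks roughly like 1/n), so even n = 100,000 gives only about four correct decimal places.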
7. Inverse Trigonometric Derivatives
7.1 d/dx [arcsin x] = 1/sqrt(1-x^2)
Let y = arcsin x, so sin y = x. Differentiate both sides with respect to x:
cos y * (dy/dx) = 1
dy/dx = 1/cos y
Since sin^2(y) + cos^2(y) = 1, we get cos y = sqrt(1 - sin^2(y)) = sqrt(1 - x^2)
Therefore dy/dx = 1/sqrt(1 - x^2) (positive root because y is in [-pi/2, pi/2], where cos y >= 0)
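The two forms of the answer, 1/cos y and 1/sqrt(1 - x^2), can be compared at a sample point along with a direct numerical slope of arcsin. The sketch assumes the test point x0 = 0.5 and a central-difference helper named slope.

```python
import math


def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 0.5
y0 = math.asin(x0)                        # y = arcsin x, so sin y = x
via_cos = 1 / math.cos(y0)                # dy/dx = 1 / cos y
closed_form = 1 / math.sqrt(1 - x0 ** 2)  # 1 / sqrt(1 - x^2)
numeric = slope(math.asin, x0)

print(via_cos, closed_form, numeric)
```

Choosing x0 closer to 1 or -1 makes cos y approach 0, and the slope grows without bound, which is why the formula is restricted to |x| < 1.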
Complete Table
| f(x) | f'(x) | Domain |
|---|---|---|
| arcsin x | 1/sqrt(1 - x^2) | |x| < 1 |
| arccos x | -1/sqrt(1 - x^2) | |x| < 1 |
| arctan x | 1/(1 + x^2) | All real numbers |
| arccot x | -1/(1 + x^2) | All real numbers |
| arcsec x | 1/(|x|*sqrt(x^2 - 1)) | |x| > 1 |
| arccsc x | -1/(|x|*sqrt(x^2 - 1)) | |x| > 1 |
Note that the derivatives of arcsin and arccos are negatives of each other. This is because arcsin x + arccos x = pi/2 (a constant), so differentiating both sides gives d/dx [arcsin x] + d/dx [arccos x] = 0.
8. Implicit Differentiation
When y cannot be expressed explicitly as a function of x (e.g., the circle x^2 + y^2 = r^2), we use implicit differentiation: treat y as a function of x, differentiate both sides with respect to x, then solve for dy/dx.
Example: Circle x^2 + y^2 = r^2
2x + 2y * (dy/dx) = 0
dy/dx = -x/y
Geometric meaning: The tangent line at (x, y) on the circle has slope -x/y. At (r, 0) the slope is undefined (vertical tangent); at (0, r) the slope is zero (horizontal tangent) -- exactly matching intuition.
Example: x^3 + y^3 = 6xy
3x^2 + 3y^2 * (dy/dx) = 6y + 6x * (dy/dx)
(3y^2 - 6x) * (dy/dx) = 6y - 3x^2
dy/dx = (6y - 3x^2) / (3y^2 - 6x) = (2y - x^2) / (y^2 - 2x)
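One way to sanity-check this formula is to pick a point on the curve, here (3, 3), solve the equation numerically for y at nearby x values, and compare the resulting slope with (2y - x^2)/(y^2 - 2x). The sketch below assumes Newton's method in y as the solver; the helper names F, solve_y, and implicit_slope are illustrative.

```python
def F(x, y):
    """Left side minus right side of x^3 + y^3 = 6xy."""
    return x ** 3 + y ** 3 - 6 * x * y


def solve_y(x, y_guess, steps=50):
    """Newton's method in y for fixed x, staying on the branch near y_guess."""
    y = y_guess
    for _ in range(steps):
        y -= F(x, y) / (3 * y ** 2 - 6 * x)  # dF/dy = 3y^2 - 6x
    return y


def implicit_slope(x, y):
    """dy/dx from implicit differentiation."""
    return (2 * y - x ** 2) / (y ** 2 - 2 * x)


x0, y0 = 3.0, 3.0  # on the curve: 27 + 27 = 54 = 6 * 3 * 3
h = 1e-5
numeric = (solve_y(x0 + h, y0) - solve_y(x0 - h, y0)) / (2 * h)
formula = implicit_slope(x0, y0)

print(numeric, formula)
```

The formula breaks down where y^2 - 2x = 0, which is also where the Newton step above would divide by zero; those are the points with a vertical tangent.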
9. Higher-Order Derivatives
The derivative of the derivative is called the second derivative, written f''(x) or d^2y/dx^2. It measures the rate of change of the rate of change.
| Order | Lagrange | Leibniz | Meaning |
|---|---|---|---|
| First | f'(x) | dy/dx | Slope / velocity |
| Second | f''(x) | d^2y/dx^2 | Concavity / acceleration |
| Third | f'''(x) | d^3y/dx^3 | Jerk |
| n-th | f^(n)(x) | d^n y/dx^n | -- |
Concavity and Inflection Points
f''(x) > 0: The function is "concave up" (bowl-shaped); the tangent line lies below the curve.
f''(x) < 0: The function is "concave down" (arch-shaped); the tangent line lies above the curve.
Inflection point: A point where the concavity changes. Having f''(x) = 0 is not enough on its own: f(x) = x^4 has f''(0) = 0 but is concave up on both sides, so it has no inflection point there.
Example: f(x) = x^3
f'(x) = 3x^2, f''(x) = 6x
f''(0) = 0 and f'' changes sign at x=0, so x=0 is an inflection point.
10. Applications of Derivatives
10.1 Finding Maxima and Minima (Fermat's Theorem)
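Fermat's theorem: if f has a local maximum or minimum at an interior point c and f'(c) exists, then f'(c) = 0.
Method: Set f'(x) = 0 to find critical points, then apply the second derivative test: f''(c) > 0 means a local minimum, f''(c) < 0 means a local maximum, and f''(c) = 0 is inconclusive (check the sign of f' on either side of c instead).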
Example: f(x) = x^3 - 3x
f'(x) = 3x^2 - 3 = 0 -> x = +/-1
f''(x) = 6x -> f''(1) = 6 > 0 (local min), f''(-1) = -6 < 0 (local max)
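The second derivative test is mechanical enough to write as a small function. The sketch below assumes f(x) = x^3 - 3x, which is consistent with the derivatives f'(x) = 3x^2 - 3 and f''(x) = 6x used above; the function name classify is illustrative.

```python
def f_prime(x):
    return 3 * x ** 2 - 3


def f_double_prime(x):
    return 6 * x


def classify(c):
    """Second derivative test at a critical point c (where f'(c) = 0)."""
    curvature = f_double_prime(c)
    if curvature > 0:
        return "local min"
    if curvature < 0:
        return "local max"
    return "inconclusive"


critical_points = [-1.0, 1.0]  # roots of 3x^2 - 3 = 0
results = {c: classify(c) for c in critical_points}
print(results)
```

The "inconclusive" branch matters in practice: when the second derivative is zero at a critical point, the test says nothing either way.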
10.2 Related Rates
When multiple quantities change over time and are linked by an equation, differentiate both sides with respect to t (implicit differentiation) to find the relationship between their rates.
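Example: A sphere's radius grows at dr/dt = 2 cm/s. How fast is its volume growing at the moment r = 5 cm?
V = (4/3)*pi*r^3 -> dV/dt = 4*pi*r^2 * (dr/dt) = 4*pi*(25)*(2) = 200*pi cm^3/s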
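The same number can be recovered without calculus by letting the radius grow for a very short time and measuring the change in volume. The sketch reads the values in the computation above as r = 5 cm and dr/dt = 2 cm/s; the time step dt and the name sphere_volume are assumptions for illustration.

```python
import math


def sphere_volume(r):
    return 4 / 3 * math.pi * r ** 3


r, dr_dt = 5.0, 2.0                  # cm and cm/s
dV_dt = 4 * math.pi * r ** 2 * dr_dt  # chain rule: dV/dt = 4*pi*r^2 * dr/dt

# Cross-check: step the radius a tiny time dt forward and backward.
dt = 1e-6
dV_dt_numeric = (
    sphere_volume(r + dr_dt * dt) - sphere_volume(r - dr_dt * dt)
) / (2 * dt)

print(dV_dt, dV_dt_numeric, 200 * math.pi)
```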
10.3 Linear Approximation
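Near a point a, a differentiable function is well approximated by its tangent line: L(x) = f(a) + f'(a)*(x - a).
Example: estimate sqrt(4.1) using f(x) = sqrt(x) at a = 4.
f(4) = 2, f'(x) = 1/(2*sqrt(x)), f'(4) = 1/4
sqrt(4.1) ~ 2 + (1/4)(0.1) = 2.025 (actual value ~ 2.02485)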
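The tangent-line estimate of sqrt(4.1) from f(x) = sqrt(x) at a = 4 can be reproduced in a few lines. This is a minimal sketch; the helper name linearize is an assumption introduced here.

```python
import math


def linearize(f, f_prime, a):
    """Return the tangent-line approximation L(x) = f(a) + f'(a) * (x - a)."""
    return lambda x: f(a) + f_prime(a) * (x - a)


L = linearize(math.sqrt, lambda x: 1 / (2 * math.sqrt(x)), 4.0)
estimate = L(4.1)
error = abs(estimate - math.sqrt(4.1))

print(f"estimate = {estimate:.5f}, actual = {math.sqrt(4.1):.5f}, error = {error:.2e}")
```

The error grows as x moves away from a, roughly in proportion to (x - a)^2, so the approximation is only trustworthy close to the point of tangency.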
10.4 L'Hopital's Rule
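If lim f(x)/g(x) has the indeterminate form 0/0 or infinity/infinity, then lim f(x)/g(x) = lim f'(x)/g'(x), provided the limit on the right exists.
Example: lim[x->0] sin(x)/x (form 0/0)
= lim[x->0] cos(x)/1 = cos(0) = 1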
11. Frequently Asked Questions (FAQ)
What is the difference between the product rule and the chain rule?
The product rule gives the derivative of two functions multiplied together, f(x)*g(x); the chain rule gives the derivative of a composition, f(g(x)). For example, sin(x)*x^2 uses the product rule, while sin(x^2) uses the chain rule. The key distinction: are the functions multiplied or nested?
Why can't the power rule be used on e^x?
The power rule d/dx [x^n] = n*x^(n-1) applies when the base is the variable and the exponent is constant. But e^x has a constant base and a variable exponent, which is a different situation. For a general base, the correct approach is to write a^x = e^(x*ln(a)) and apply the chain rule.
When should implicit differentiation be used?
Whenever it is impossible or inconvenient to express y explicitly as a function of x. Classic examples: circles x^2+y^2=r^2, ellipses, and any curve of the form F(x,y) = 0.
What does the second derivative tell you?
The second derivative f''(x) describes the direction of curvature (concavity). f'' > 0 means the curve bends upward (concave up); f'' < 0 means it bends downward (concave down). In physics, if f(t) is position, then f'(t) is velocity and f''(t) is acceleration.
Can L'Hopital's rule be applied more than once?
Yes, as long as each application still yields a 0/0 or infinity/infinity form. You must re-verify the indeterminate form before each application. A common mistake is applying the rule when the expression is no longer indeterminate, which gives incorrect results.