Derivative Rules

This is the complete reference for derivative rules, covering everything from the limit definition to higher-order applications. Each rule includes why it works (proof sketches) and who discovered it (historical context), helping you truly understand the heart of calculus.

1. What is a Derivative

A derivative measures the instantaneous rate of change of a function. If f(x) describes a car's position at time x, then f'(x) is the velocity at that instant.

f'(x) = lim[h->0] (f(x+h) - f(x)) / h

This limit definition says that the derivative is a limit of secant-line slopes. The quotient (f(x+h) - f(x)) / h is the slope of the secant line through (x, f(x)) and (x+h, f(x+h)), that is, the average rate of change between the two points. As h approaches 0, this slope converges to the slope of the tangent line, the instantaneous rate of change at x.
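The convergence can be watched numerically. The following is a minimal Python sketch rather than part of the original reference; the helper name difference_quotient and the sample function x^2 are illustrative choices.

```python
def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def square(x):
    return x * x


# Shrinking h moves the secant slope toward the tangent slope f'(3) = 6.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, difference_quotient(square, 3.0, h))
```

For f(x) = x^2 the quotient simplifies algebraically to 2x + h, so each printed slope sits about h above the limit 6.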

History: Newton and Leibniz
The derivative was developed independently by Isaac Newton (around 1665) and Gottfried Wilhelm Leibniz (who published in 1684). Newton called derivatives "fluxions" and used them to describe motion and change; Leibniz developed the concept of "differentials" and created the dy/dx notation we still use today. The two engaged in a bitter priority dispute, but modern mathematics recognizes both as independent co-discoverers.

Three Notations

Notation           Inventor          Form                 Best Used When
Leibniz notation   Leibniz (1684)    dy/dx                Chain rule, implicit differentiation; intuitively shows "ratio of infinitesimals"
Lagrange notation  Lagrange (1770s)  f'(x)                General function analysis; compact and concise
Newton notation    Newton (1665)     y-dot (dot above y)  Physics, especially derivatives with respect to time

Why do different notations exist? Each shines in specific contexts. Leibniz notation lets the chain rule look like fraction cancellation (dy/dx = dy/du * du/dx) -- extremely intuitive. Lagrange notation is more compact for higher-order derivatives (f''(x) vs d^2y/dx^2). Newton's dot notation is standard in physics for time derivatives (x-dot for velocity, x-double-dot for acceleration).

2. Basic Rules

2.1 Constant Rule

d/dx [c] = 0

Why: A constant never changes, so its rate of change is zero. From the definition: lim[h->0] (c - c)/h = lim[h->0] 0/h = 0.

Example: d/dx [7] = 0, d/dx [pi] = 0

2.2 Power Rule

d/dx [x^n] = n * x^(n-1)
History: The power rule for positive integers was known to several 17th-century mathematicians (including Fermat and Newton), but the first general proof for arbitrary real exponents n is attributed to Leonhard Euler.
Proof sketch (positive integer n, binomial theorem):
f(x+h) = (x+h)^n = x^n + n*x^(n-1)*h + (n(n-1)/2)*x^(n-2)*h^2 + ... + h^n
f(x+h) - f(x) = n*x^(n-1)*h + (terms with h^2 and higher)
Divide by h: n*x^(n-1) + (terms with h)
As h -> 0, all terms with h vanish, leaving n*x^(n-1).
Examples: d/dx [x^5] = 5x^4, d/dx [x^(-2)] = -2x^(-3), d/dx [sqrt(x)] = d/dx [x^(1/2)] = (1/2)x^(-1/2)
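The three examples can be spot-checked numerically. This is a sketch under stated assumptions: central_difference is a hypothetical helper that only approximates the derivative, and the sample point x = 2 is arbitrary (it must be positive for the square-root case). Agreement to several digits is evidence, not a proof.

```python
def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def power_rule(n, x):
    """Derivative of x**n as predicted by the power rule."""
    return n * x ** (n - 1)


x0 = 2.0  # sample point; positive so that x**0.5 is defined nearby
for n in (5, -2, 0.5):
    numeric = central_difference(lambda x: x ** n, x0)
    print(n, numeric, power_rule(n, x0))
```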

2.3 Sum / Difference Rule

d/dx [f(x) +/- g(x)] = f'(x) +/- g'(x)

Why: Linearity of limits. Since lim(A+B) = lim A + lim B, and the derivative is fundamentally a limit, differentiation inherits additivity.

Example: d/dx [x^3 + 2x] = 3x^2 + 2

2.4 Constant Multiple Rule

d/dx [c * f(x)] = c * f'(x)

Why: Again, linearity of limits. A constant can be pulled out: lim[h->0] c*(f(x+h)-f(x))/h = c * lim[h->0] (f(x+h)-f(x))/h.

Example: d/dx [5x^3] = 5 * 3x^2 = 15x^2

Summary Table

Rule               Formula                      Example
Constant Rule      d/dx [c] = 0                 d/dx [5] = 0
Power Rule         d/dx [x^n] = n*x^(n-1)       d/dx [x^3] = 3x^2
Constant Multiple  d/dx [cf] = cf'              d/dx [5x^2] = 10x
Sum/Difference     d/dx [f +/- g] = f' +/- g'   d/dx [x^2 + x] = 2x + 1

3. Product and Quotient Rules

3.1 Product Rule

d/dx [f(x) * g(x)] = f'(x) * g(x) + f(x) * g'(x)
History: The product rule was first published by Leibniz (1684) in his landmark paper Nova Methodus pro Maximis et Minimis.
Proof (the clever add-and-subtract trick):
d/dx [fg] = lim[h->0] (f(x+h)g(x+h) - f(x)g(x)) / h

Key step -- add and subtract f(x+h)g(x):
= lim[h->0] (f(x+h)g(x+h) - f(x+h)g(x) + f(x+h)g(x) - f(x)g(x)) / h
= lim[h->0] [ f(x+h) * (g(x+h)-g(x))/h + g(x) * (f(x+h)-f(x))/h ]
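= f(x) * g'(x) + g(x) * f'(x)

The last step uses f(x+h) -> f(x) as h -> 0, which holds because a differentiable function is continuous.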

Why the trick works: It decomposes "both factors changing simultaneously" into "one factor changes while the other stays fixed" -- two pieces we already know how to handle.
Example: d/dx [x * sin(x)] = 1 * sin(x) + x * cos(x) = sin(x) + x*cos(x)
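A quick numerical check of this example, as a sketch: the helpers central_difference and product_rule are hypothetical names introduced here, and x = 1.3 is an arbitrary sample point.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def product_rule(f, df, g, dg, x):
    """f'(x) * g(x) + f(x) * g'(x), given each factor and its derivative."""
    return df(x) * g(x) + f(x) * dg(x)


x0 = 1.3
numeric = central_difference(lambda x: x * math.sin(x), x0)
by_rule = product_rule(lambda x: x, lambda x: 1.0, math.sin, math.cos, x0)
print(numeric, by_rule)  # the two values should agree to several digits
```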

3.2 Quotient Rule

d/dx [f(x)/g(x)] = (f'(x)*g(x) - f(x)*g'(x)) / [g(x)]^2
Derivation from the product rule: Write f/g = f * g^(-1) (with g(x) != 0) and apply the product rule, using the chain rule for d/dx [g^(-1)] = (-1) * g^(-2) * g':
d/dx [f * g^(-1)] = f' * g^(-1) + f * (-1) * g^(-2) * g'
= f'/g - f*g'/g^2
= (f'g - fg') / g^2
Example: d/dx [x/e^x] = (e^x - x*e^x) / e^(2x) = (1-x)/e^x

Mnemonic: "lo d-hi minus hi d-lo, over lo-lo" -- where "hi" is the numerator f and "lo" is the denominator g.
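The mnemonic translates almost word for word into code. In this sketch the parameter names hi, d_hi, lo, d_lo are illustrative, and the check reuses the x/e^x example at an arbitrary point x = 0.7.

```python
import math


def quotient_rule(hi, d_hi, lo, d_lo, x):
    """(lo * d-hi - hi * d-lo) / (lo * lo), evaluated at x."""
    return (lo(x) * d_hi(x) - hi(x) * d_lo(x)) / lo(x) ** 2


def simplified(x):
    """Hand-simplified derivative of x / e^x from the example."""
    return (1 - x) / math.exp(x)


x0 = 0.7
by_rule = quotient_rule(lambda x: x, lambda x: 1.0, math.exp, math.exp, x0)
print(by_rule, simplified(x0))  # both expressions should give the same value
```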

4. Chain Rule -- The Most Important Rule

(f o g)'(x) = f'(g(x)) * g'(x)

In Leibniz notation: dy/dx = (dy/du) * (du/dx), where u = g(x).

History: The chain rule is implicit in Leibniz's work -- his dy/dx notation makes the chain rule look like fraction cancellation, which is the genius of his notation system. The rigorous proof was refined by later mathematicians.
Why it works -- "rates multiply":
Suppose y depends on u, and u depends on x. If x changes by a tiny amount dx:
- u changes by du ~ (du/dx) * dx
- y changes by dy ~ (dy/du) * du = (dy/du) * (du/dx) * dx
Therefore dy/dx ~ (dy/du) * (du/dx). In the limit, this becomes exact.

Intuition: If temperature rises at 3 degrees C/hour, and metal expands at 0.01 mm/degree C, then the metal expands at 0.03 mm/hour. Rates multiply!

Examples: Simple to Complex

Example 1 (simple): d/dx [sin(x^2)] = cos(x^2) * 2x
Outer function: sin(u), inner function: u = x^2
Example 2 (moderate): d/dx [(3x+1)^5] = 5(3x+1)^4 * 3 = 15(3x+1)^4
Outer function: u^5, inner function: u = 3x+1
Example 3 (nested): d/dx [e^(sin(x^2))]
= e^(sin(x^2)) * cos(x^2) * 2x
Three-layer chain rule: outer e^u, middle sin(v), inner v = x^2
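The nested example is a good candidate for a numerical sanity check, since a dropped factor is easy to miss by hand. A minimal sketch, assuming a hypothetical central_difference helper and an arbitrary sample point x = 0.8:

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def nested(x):
    """e^(sin(x^2)): outer exp, middle sin, inner x^2."""
    return math.exp(math.sin(x * x))


def nested_derivative(x):
    """One factor per layer: e^(sin(x^2)) * cos(x^2) * 2x."""
    return math.exp(math.sin(x * x)) * math.cos(x * x) * 2 * x


x0 = 0.8
print(central_difference(nested, x0), nested_derivative(x0))
```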

5. Trigonometric Derivatives

5.1 d/dx [sin x] = cos x

Proof sketch:
d/dx [sin x] = lim[h->0] (sin(x+h) - sin(x)) / h
Using the angle addition formula: sin(x+h) = sin x cos h + cos x sin h
= lim[h->0] (sin x (cos h - 1) + cos x * sin h) / h
= sin x * lim[h->0] (cos h - 1)/h + cos x * lim[h->0] sin h / h
= sin x * 0 + cos x * 1 = cos x

Key limits: lim[h->0] sin(h)/h = 1 (proved via the squeeze theorem) and lim[h->0] (cos h - 1)/h = 0.
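Both key limits can be observed numerically. This sketch only illustrates the trend for shrinking h; it does not replace the squeeze-theorem argument. The function names are illustrative.

```python
import math


def sin_ratio(h):
    """sin(h) / h, which tends to 1 as h -> 0."""
    return math.sin(h) / h


def cos_ratio(h):
    """(cos(h) - 1) / h, which tends to 0 as h -> 0."""
    return (math.cos(h) - 1) / h


for h in (0.5, 0.1, 0.01, 0.001):
    print(h, sin_ratio(h), cos_ratio(h))
```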

5.2 d/dx [cos x] = -sin x

Why: cos x = sin(pi/2 - x). By the chain rule: d/dx [sin(pi/2 - x)] = cos(pi/2 - x) * (-1) = -sin x.

All Six Trigonometric Derivatives

f(x)   f'(x)            How to Derive
sin x  cos x            Limit definition + squeeze theorem
cos x  -sin x           cos x = sin(pi/2 - x) + chain rule
tan x  sec^2 x          tan = sin/cos + quotient rule
cot x  -csc^2 x         cot = cos/sin + quotient rule
sec x  sec x * tan x    sec = 1/cos + chain rule
csc x  -csc x * cot x   csc = 1/sin + chain rule

Deriving d/dx [tan x] via the quotient rule:
d/dx [tan x] = d/dx [sin x / cos x]
= (cos x * cos x - sin x * (-sin x)) / cos^2(x)
= (cos^2(x) + sin^2(x)) / cos^2(x) = 1/cos^2(x) = sec^2(x)
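All six trigonometric derivatives can be checked against a numerical derivative in one pass. The sketch below assumes hypothetical helpers (central_difference, sec, csc, cot, max_table_error) and sample points away from the singularities of tan, cot, sec, and csc.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def sec(x):
    return 1 / math.cos(x)


def csc(x):
    return 1 / math.sin(x)


def cot(x):
    return math.cos(x) / math.sin(x)


# Each entry pairs a function with its claimed derivative.
TRIG_DERIVATIVES = {
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda x: -math.sin(x)),
    "tan": (math.tan, lambda x: sec(x) ** 2),
    "cot": (cot, lambda x: -csc(x) ** 2),
    "sec": (sec, lambda x: sec(x) * math.tan(x)),
    "csc": (csc, lambda x: -csc(x) * cot(x)),
}


def max_table_error(x):
    """Largest gap between numerical and claimed derivative at x."""
    return max(
        abs(central_difference(f, x) - df(x))
        for f, df in TRIG_DERIVATIVES.values()
    )


print(max_table_error(0.6))  # expected to be tiny if every entry is right
```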

6. Exponential and Logarithmic Derivatives

6.1 d/dx [e^x] = e^x

d/dx [e^x] = e^x
Why is e special?
The constant e ~ 2.71828 was systematically studied by Leonhard Euler (1748). e is the unique base for which d/dx [a^x] = a^x -- that is, e^x is the only exponential function that equals its own derivative. This property makes e ubiquitous in calculus, probability theory, and complex analysis.

Definition: e = lim[n->infinity] (1 + 1/n)^n
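The limit converges slowly, which is easy to see numerically. A minimal sketch; the function name compound is an illustrative choice, echoing the compound-interest reading of (1 + 1/n)^n.

```python
import math


def compound(n):
    """(1 + 1/n)^n, which approaches e as n grows."""
    return (1 + 1 / n) ** n


for n in (1, 10, 1000, 100000):
    print(n, compound(n), math.e - compound(n))
```

The gap to e shrinks roughly like e/(2n), so each extra digit of accuracy costs about ten times more compounding steps.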

6.2 d/dx [a^x] = a^x * ln(a)

Derivation: Write a^x = e^(x*ln(a)) and apply the chain rule:
d/dx [e^(x*ln(a))] = e^(x*ln(a)) * ln(a) = a^x * ln(a)

6.3 d/dx [ln x] = 1/x

d/dx [ln x] = 1/x, x > 0
Proof (inverse function theorem):
Let y = ln x, so x = e^y. Differentiate both sides with respect to x:
1 = e^y * (dy/dx)
dy/dx = 1/e^y = 1/x

6.4 d/dx [log_a(x)] = 1/(x * ln(a))

Derivation: log_a(x) = ln(x)/ln(a), so d/dx [log_a(x)] = (1/ln(a)) * (1/x) = 1/(x*ln(a)).
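The formulas for a general base can be checked the same way. This sketch assumes base a = 2 and sample point x = 1.5, both arbitrary; central_difference is a hypothetical helper, and math.log(x, a) is the standard-library logarithm to base a.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


a, x0 = 2.0, 1.5

# d/dx [a^x] = a^x * ln(a)
exp_numeric = central_difference(lambda x: a ** x, x0)
exp_formula = a ** x0 * math.log(a)

# d/dx [log_a(x)] = 1 / (x * ln(a))
log_numeric = central_difference(lambda x: math.log(x, a), x0)
log_formula = 1 / (x0 * math.log(a))

print(exp_numeric, exp_formula)
print(log_numeric, log_formula)
```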

Summary Table

f(x)                 f'(x)           Note
e^x                  e^x             Equals its own derivative; the only functions with this property are the constant multiples c*e^x
a^x (a > 0, a != 1)  a^x * ln(a)     Reduces to e^x when a = e
ln x                 1/x             x > 0
log_a(x)             1/(x * ln(a))   a > 0, a != 1, x > 0

7. Inverse Trigonometric Derivatives

7.1 d/dx [arcsin x] = 1/sqrt(1-x^2)

Derivation (implicit differentiation):
Let y = arcsin x, so sin y = x. Differentiate both sides with respect to x:
cos y * (dy/dx) = 1
dy/dx = 1/cos y
Since sin^2(y) + cos^2(y) = 1, we get cos y = sqrt(1 - sin^2(y)) = sqrt(1 - x^2)
Therefore dy/dx = 1/sqrt(1 - x^2) (positive root because y is in [-pi/2, pi/2], where cos y >= 0)

Complete Table

f(x)      f'(x)                       Domain
arcsin x  1/sqrt(1 - x^2)             |x| < 1
arccos x  -1/sqrt(1 - x^2)            |x| < 1
arctan x  1/(1 + x^2)                 All real numbers
arccot x  -1/(1 + x^2)                All real numbers
arcsec x  1/(|x|*sqrt(x^2 - 1))       |x| > 1
arccsc x  -1/(|x|*sqrt(x^2 - 1))      |x| > 1

Note that the derivatives of arcsin and arccos are negatives of each other. This is because arcsin x + arccos x = pi/2 (a constant), so the two derivatives must sum to 0.
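The first three table entries, and the identity arcsin x + arccos x = pi/2, can be checked with the standard library, which provides asin, acos, and atan. A sketch at the arbitrary sample point x = 0.3, with central_difference as a hypothetical helper:

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 0.3  # must satisfy |x| < 1 for arcsin and arccos

# name -> (numerical derivative, formula from the table)
checks = {
    "arcsin": (central_difference(math.asin, x0), 1 / math.sqrt(1 - x0 ** 2)),
    "arccos": (central_difference(math.acos, x0), -1 / math.sqrt(1 - x0 ** 2)),
    "arctan": (central_difference(math.atan, x0), 1 / (1 + x0 ** 2)),
}

for name, (numeric, formula) in checks.items():
    print(name, numeric, formula)

# The sum is constant, which is why the two derivatives cancel.
print(math.asin(x0) + math.acos(x0), math.pi / 2)
```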

8. Implicit Differentiation

When y cannot easily be expressed as a single explicit function of x (e.g., the circle x^2 + y^2 = r^2, which needs the two branches y = +/-sqrt(r^2 - x^2)), we use implicit differentiation: treat y as a function of x, differentiate both sides with respect to x, then solve for dy/dx.

Why this is valid: The implicit function theorem guarantees that under certain conditions (F has continuous partial derivatives and the partial derivative of F with respect to y is nonzero at the point), the equation F(x,y) = 0 defines y = y(x) as a function of x near that point, so differentiation with respect to x is justified.

Example: Circle x^2 + y^2 = r^2

Differentiate both sides with respect to x:
2x + 2y * (dy/dx) = 0
dy/dx = -x/y

Geometric meaning: The tangent line at (x, y) on the circle has slope -x/y. At (r, 0) the slope is undefined (vertical tangent); at (0, r) the slope is zero (horizontal tangent) -- exactly matching intuition.

Example: x^3 + y^3 = 6xy

This is the Folium of Descartes. Differentiate both sides:
3x^2 + 3y^2 * (dy/dx) = 6y + 6x * (dy/dx)
(3y^2 - 6x) * (dy/dx) = 6y - 3x^2
dy/dx = (6y - 3x^2) / (3y^2 - 6x) = (2y - x^2) / (y^2 - 2x)
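One way to sanity-check an implicit derivative is to solve the equation for y numerically near a known point and compare slopes. The sketch below assumes the point (3, 3), which lies on the folium since 27 + 27 = 54 = 6*3*3. The helper names and the use of Newton's method are illustrative assumptions, not part of the derivation above.

```python
def folium(x, y):
    """F(x, y) = x^3 + y^3 - 6xy; the curve is F(x, y) = 0."""
    return x ** 3 + y ** 3 - 6 * x * y


def implicit_slope(x, y):
    """dy/dx from implicit differentiation: (2y - x^2) / (y^2 - 2x)."""
    return (2 * y - x ** 2) / (y ** 2 - 2 * x)


def solve_y(x, y_guess, steps=50):
    """Newton's method in y for folium(x, y) = 0 with x held fixed."""
    y = y_guess
    for _ in range(steps):
        y -= folium(x, y) / (3 * y ** 2 - 6 * x)  # divide by dF/dy
    return y


x0, y0 = 3.0, 3.0
h = 1e-5
# Slope of the numerically solved branch y(x) passing through (3, 3).
numeric = (solve_y(x0 + h, y0) - solve_y(x0 - h, y0)) / (2 * h)
print(numeric, implicit_slope(x0, y0))
```

At (3, 3) the formula gives (6 - 9) / (9 - 6) = -1, and the numerically traced branch should have a slope close to that.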

9. Higher-Order Derivatives

The derivative of the derivative is called the second derivative, written f''(x) or d^2y/dx^2. It measures the rate of change of the rate of change.

Order   Lagrange   Leibniz      Meaning
First   f'(x)      dy/dx        Slope / velocity
Second  f''(x)     d^2y/dx^2    Concavity / acceleration
Third   f'''(x)    d^3y/dx^3    Jerk
n-th    f^(n)(x)   d^n y/dx^n   --

Concavity and Inflection Points

f''(x) > 0: The function is "concave up" (bowl-shaped); the tangent line lies below the curve.
f''(x) < 0: The function is "concave down" (arch-shaped); the tangent line lies above the curve.
Inflection point: A point where the concavity changes. There f''(x) is zero or undefined, but f''(x) = 0 alone is not enough: f(x) = x^4 has f''(0) = 0 yet is concave up on both sides of 0.

Example: f(x) = x^3
f'(x) = 3x^2, f''(x) = 6x
f''(0) = 0 and f'' changes sign at x=0, so x=0 is an inflection point.
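The sign change of f'' can be seen with a second difference, the second-order analogue of the difference quotient. A minimal sketch; second_difference is a hypothetical helper and the step size h = 1e-3 is an arbitrary choice.

```python
def second_difference(f, x, h=1e-3):
    """Central second difference, a numerical stand-in for f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2


def cube(x):
    return x ** 3


# f''(x) = 6x: negative left of 0, zero at 0, positive right of 0.
for x in (-0.5, 0.0, 0.5):
    print(x, second_difference(cube, x))
```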

10. Applications of Derivatives

10.1 Finding Maxima and Minima (Fermat's Theorem)

Fermat's Theorem (Pierre de Fermat): If f(x) has a local extremum at c and f is differentiable at c, then f'(c) = 0.

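Method: Find the critical points, where f'(x) = 0 or f'(x) is undefined, then apply the second derivative test at each point c with f'(c) = 0: f''(c) > 0 means local minimum, f''(c) < 0 means local maximum. If f''(c) = 0 the test is inconclusive, and you should check the sign of f' on either side of c instead.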

Example: f(x) = x^3 - 3x
f'(x) = 3x^2 - 3 = 0 -> x = +/-1
f''(x) = 6x -> f''(1) = 6 > 0 (local min), f''(-1) = -6 < 0 (local max)
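The second derivative test is mechanical enough to write down as code. This sketch hard-codes the derivatives of the example f(x) = x^3 - 3x; the function names and the label strings are illustrative.

```python
def f_prime(x):
    """First derivative of f(x) = x^3 - 3x."""
    return 3 * x ** 2 - 3


def f_double_prime(x):
    """Second derivative of f(x) = x^3 - 3x."""
    return 6 * x


def classify(c):
    """Second derivative test at a point c where f'(c) = 0."""
    curvature = f_double_prime(c)
    if curvature > 0:
        return "local min"
    if curvature < 0:
        return "local max"
    return "inconclusive"


for c in (-1.0, 1.0):
    # Confirm c really is a critical point before classifying it.
    print(c, f_prime(c), classify(c))
```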

10.2 Related Rates

When multiple quantities change over time and are linked by an equation, differentiate both sides with respect to t (implicit differentiation) to find the relationship between their rates.

Example: A sphere's radius increases at 2 cm/s. Find the rate of volume change when r = 5 cm.
V = (4/3)*pi*r^3 -> dV/dt = 4*pi*r^2 * (dr/dt) = 4*pi*(25)*(2) = 200*pi cm^3/s
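The related-rates answer can be cross-checked by differentiating V(r(t)) numerically. The sketch assumes a hypothetical radius history r(t) = 5 + 2t, which matches the stated data (r = 5 cm and dr/dt = 2 cm/s at t = 0); all function names are illustrative.

```python
import math


def sphere_volume(r):
    return 4 / 3 * math.pi * r ** 3


def volume_rate(r, dr_dt):
    """dV/dt = 4 * pi * r^2 * dr/dt, from differentiating V with respect to t."""
    return 4 * math.pi * r ** 2 * dr_dt


def radius_at(t):
    """Assumed radius history: 5 cm at t = 0, growing at 2 cm/s."""
    return 5.0 + 2.0 * t


h = 1e-6
# Central difference of V(r(t)) in t at t = 0.
numeric = (sphere_volume(radius_at(h)) - sphere_volume(radius_at(-h))) / (2 * h)
print(numeric, volume_rate(5.0, 2.0), 200 * math.pi)
```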

10.3 Linear Approximation

f(x) ~ f(a) + f'(a) * (x - a)
History: This is the first-order case of the Taylor series, published by Brook Taylor in 1715. The full Taylor series f(x) = sum of f^(n)(a)/n! * (x-a)^n is one of the most powerful tools in calculus.
Example: Estimate sqrt(4.1). Let f(x) = sqrt(x), a = 4:
f(4) = 2, f'(x) = 1/(2*sqrt(x)), f'(4) = 1/4
sqrt(4.1) ~ 2 + (1/4)(0.1) = 2.025 (exact value ~ 2.02485)
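The tangent-line estimate is easy to package as a reusable function. A minimal sketch; linearize and approx_sqrt are hypothetical names, and the derivative is supplied by hand rather than computed.

```python
import math


def linearize(f, df, a):
    """Return the tangent-line approximation of f at the point a."""
    def tangent(x):
        return f(a) + df(a) * (x - a)
    return tangent


approx_sqrt = linearize(math.sqrt, lambda x: 1 / (2 * math.sqrt(x)), 4.0)

estimate = approx_sqrt(4.1)
print(estimate, math.sqrt(4.1), estimate - math.sqrt(4.1))
```

Because sqrt is concave down, the tangent line lies above the curve and the estimate slightly overshoots the true value.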

10.4 L'Hopital's Rule

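If lim f(x)/g(x) has the form 0/0 or inf/inf, f and g are differentiable near the limit point with g'(x) != 0 there, and lim f'(x)/g'(x) exists (or is +/-infinity), then lim f(x)/g(x) = lim f'(x)/g'(x)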
Historical fun fact: This rule is named after Guillaume de l'Hopital, who published it in 1696 in his textbook Analyse des Infiniment Petits, generally regarded as the first textbook on differential calculus. However, the rule was actually discovered by Johann Bernoulli. L'Hopital, a French marquis, paid Bernoulli an annual salary for the right to use his mathematical discoveries.
Example: lim[x->0] sin(x)/x -- this is 0/0 form
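= lim[x->0] cos(x)/1 = cos(0) = 1

This confirms the value but is not an independent proof: the derivative d/dx [sin x] = cos x used here was itself derived from this limit (Section 5.1).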

12. Frequently Asked Questions (FAQ)

Q1: What is the difference between the chain rule and the product rule?

The product rule is for the derivative of two functions multiplied together f(x)*g(x); the chain rule is for composed functions f(g(x)). For example, sin(x)*x^2 uses the product rule, while sin(x^2) uses the chain rule. The key distinction: are the functions multiplied or nested?

Q2: Why is d/dx [e^x] = e^x and not x*e^(x-1)?

The power rule d/dx [x^n] = n*x^(n-1) applies when the base is the variable and the exponent is constant. But e^x has a constant base and variable exponent -- a completely different situation. For a general base, the correct approach is to write a^x = e^(x*ln(a)) and apply the chain rule, which gives a^x * ln(a); with a = e, ln(e) = 1, so the derivative is e^x itself.

Q3: When should I use implicit differentiation?

Whenever y cannot be written explicitly as a function of x, or doing so is awkward. Classic examples: circles x^2+y^2=r^2, ellipses, and any curve of the form F(x,y) = 0.

Q4: What is the geometric meaning of the second derivative?

The second derivative f''(x) describes the direction of curvature (concavity). f'' > 0 means the curve bends upward (concave up), f'' < 0 means it bends downward (concave down). In physics, if f(t) is position, f'(t) is velocity, and f''(t) is acceleration.

Q5: Can L'Hopital's rule be applied repeatedly?

Yes, as long as each application still yields a 0/0 or infinity/infinity form. But you must re-verify the indeterminate form before each application. A common mistake is applying it when the expression is no longer indeterminate, which gives incorrect results.
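As an illustration of repeated application, consider lim[x->0] (1 - cos x)/x^2, an example chosen here rather than taken from the sections above. The original form and the form after one application are both 0/0; after the second application the expression is cos(x)/2, which is no longer indeterminate, so the rule must stop there. The sketch below evaluates all three forms near 0 (function names are illustrative).

```python
import math


def original(x):
    """(1 - cos x) / x^2: 0/0 form at x = 0."""
    return (1 - math.cos(x)) / x ** 2


def after_one_application(x):
    """sin x / (2x): still 0/0 at x = 0, so the rule may be applied again."""
    return math.sin(x) / (2 * x)


def after_two_applications(x):
    """cos x / 2: not indeterminate, so just substitute x = 0."""
    return math.cos(x) / 2


for x in (0.1, 0.01, 0.001):
    print(x, original(x), after_one_application(x), after_two_applications(x))
```

All three columns approach 1/2. Differentiating a third time would give -sin(x)/0, which is meaningless, and shows why the indeterminate form must be re-verified before every step.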