概率论与数理统计 (Spring 2025)/Problem Set 4: Difference between revisions

From TCS Wiki
Jump to navigation Jump to search
Zhangxy (talk | contribs)
Zouzongrui (talk | contribs)
Line 52: Line 52:


<li>
<li>
[<strong>Uniform Distribution (II)</strong>] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on <math>(0,1)</math>.
[<strong>Uniform Distribution (II)</strong>] Let <math>X,Y</math> and <math>Z</math> be independent and uniformly distributed on <math>[0,1]</math>. Find the joint density function of <math>XY</math> and <math>Z^2</math>, and show that <math>\textbf{Pr}[XY<Z^2] = 5/9</math>.
</li>
</li>



Revision as of 13:25, 28 April 2025

  • 每道题目的解答都要有完整的解题过程,中英文不限。
  • 我们推荐大家使用LaTeX, markdown等对作业进行排版。
  • Bonus problem为附加题(选做)。

Assumption throughout Problem Set 4

Without further notice, we are working on probability space [math]\displaystyle{ (\Omega,\mathcal{F},\mathbf{Pr}) }[/math].

Without further notice, we assume that the expectation of random variables are well-defined.

The term [math]\displaystyle{ \log }[/math] used in this context refers to the natural logarithm.

Problem 1 (Continuous Random Variables, 30 points)

  • [Density function] Determine the value of [math]\displaystyle{ C }[/math] such that [math]\displaystyle{ f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R} }[/math] is a probability density function (PDF) for a continuous random variable.
  • [Independence] Let [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] be independent and identically distributed continuous random variables with cumulative distribution function (CDF) [math]\displaystyle{ F }[/math] and probability density function (PDF) [math]\displaystyle{ f }[/math]. Find out the density functions of [math]\displaystyle{ V = \max\{X,Y\} }[/math] and [math]\displaystyle{ U = \min\{X,Y\} }[/math].
  • [Correlation] Let [math]\displaystyle{ X }[/math] be uniformly distributed on [math]\displaystyle{ (-1,1) }[/math] and [math]\displaystyle{ Y_k = \cos(k \pi X) }[/math] for [math]\displaystyle{ k=1,2,\ldots,n }[/math]. Are the random variables [math]\displaystyle{ Y_1, Y_2, \ldots, Y_n }[/math] correlated? independent? You should prove your claim rigorously.
  • [Expectation of random variables (I)] Let [math]\displaystyle{ X }[/math] be a continuous random variable with mean [math]\displaystyle{ \mu }[/math] and cumulative distribution function (CDF) [math]\displaystyle{ F }[/math].
    • Suppose [math]\displaystyle{ X \ge 0 }[/math]. Show that [math]\displaystyle{ \int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx }[/math] if and only if [math]\displaystyle{ a = \mu }[/math].
    • Suppose [math]\displaystyle{ X }[/math] has finite variance. Show that [math]\displaystyle{ g(a) = \mathbb{E}((X-a)^2) }[/math] achieves the minimum when [math]\displaystyle{ a = \mu }[/math].
  • [Expectation of random variables (II)] Let [math]\displaystyle{ X, Y }[/math] be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) [math]\displaystyle{ F }[/math]. Furthermore, [math]\displaystyle{ X,Y \ge 0 }[/math]. Show that [math]\displaystyle{ \mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right) }[/math]
  • [Conditional distribution] Let [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] be two random variables. The joint density of [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] is given by [math]\displaystyle{ f(x,y) = c(x^2 - y^2)e^{-x} }[/math], where [math]\displaystyle{ 0\leq x \lt \infty }[/math] and [math]\displaystyle{ -x\leq y \leq x }[/math]. Here, [math]\displaystyle{ c\in \mathbb{R}_+ }[/math] is a constant. Find out the conditional distribution of [math]\displaystyle{ Y }[/math], given [math]\displaystyle{ X = x }[/math].
  • [Uniform Distribution (I)] Let [math]\displaystyle{ P_i = (X_i,Y_i), 1\leq i\leq n }[/math], be independent, uniformly distributed points in the unit square [math]\displaystyle{ [0,1]^2 }[/math]. A point [math]\displaystyle{ P_i }[/math] is called "peripheral" if, for all [math]\displaystyle{ r = 1,2,\cdots,n }[/math], either [math]\displaystyle{ X_r \leq X_i }[/math] or [math]\displaystyle{ Y_r \leq Y_i }[/math], or both. Find out the expected number of peripheral points.
  • [Uniform Distribution (II)] Let [math]\displaystyle{ X,Y }[/math] and [math]\displaystyle{ Z }[/math] be independent and uniformly distributed on [math]\displaystyle{ [0,1] }[/math]. Find the joint density function of [math]\displaystyle{ XY }[/math] and [math]\displaystyle{ Z^2 }[/math], and show that [math]\displaystyle{ \textbf{Pr}[XY\lt Z^2] = 5/9 }[/math].
  • [Exponential distribution] Let [math]\displaystyle{ X }[/math] have an exponential distribution. Show that [math]\displaystyle{ \textbf{Pr}[X\gt s+x|X\gt s] = \textbf{Pr}[X\gt x] }[/math], for [math]\displaystyle{ x,s\geq 0 }[/math]. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.
  • [Normal distribution(I)] Let [math]\displaystyle{ X,Y\sim N(0,1) }[/math] be two independent and identically distributed normal random variables. Let [math]\displaystyle{ Z = X-Y }[/math]. Find the density function of [math]\displaystyle{ Z }[/math] and [math]\displaystyle{ |Z| }[/math] respectively.
  • [Normal distribution(II)] Let [math]\displaystyle{ X }[/math] have the [math]\displaystyle{ N(0,1) }[/math] distribution and let [math]\displaystyle{ a\gt 0 }[/math]. Show that the random variable [math]\displaystyle{ Y }[/math] given by [math]\displaystyle{ \begin{equation*} Y = \begin{cases} X, & |X|\lt a \\ -X, & |X|\geq a \end{cases} \end{equation*} }[/math] has the [math]\displaystyle{ N(0,1) }[/math] distribution, and find an expression for [math]\displaystyle{ \rho(a) = \textbf{Cov}(X,Y) }[/math] in terms of the density function [math]\displaystyle{ \phi }[/math] of [math]\displaystyle{ X }[/math].
  • [Random process] Given a real number [math]\displaystyle{ U\lt 1 }[/math] as input of the following process, find out the expected returning value.
    Process
    Input: real numbers [math]\displaystyle{ U \lt 1 }[/math];

    initialize [math]\displaystyle{ x = 0 }[/math] and [math]\displaystyle{ count = 0 }[/math];
    while [math]\displaystyle{ x \lt U }[/math] do
    • choose [math]\displaystyle{ y \in (0,1) }[/math] uniformly at random;
    • update [math]\displaystyle{ x = x + y }[/math] and [math]\displaystyle{ count = count + 1 }[/math];
    return [math]\displaystyle{ count }[/math];
  • [Random semicircle] We sample [math]\displaystyle{ n }[/math] points within a circle [math]\displaystyle{ C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\} }[/math] independently and uniformly at random (i.e., the density function [math]\displaystyle{ f(x,y) \propto 1_{(x,y) \in C} }[/math]). Find out the probability that they all lie within some semicircle of the original circle [math]\displaystyle{ C }[/math]. (Hint: you may apply the technique of change of variables, see function of random variables or Chapter 4.7 in [GS])
  • [Stochastic domination] Let [math]\displaystyle{ X, Y }[/math] be continuous random variables. Show that [math]\displaystyle{ X }[/math] dominates [math]\displaystyle{ Y }[/math] stochastically if and only if [math]\displaystyle{ \mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)] }[/math] for any non-decreasing function [math]\displaystyle{ f }[/math] for which the expectations exist.

Problem 2 (Concentration of measure, 10 points)

  • [Tossing coins] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable [math]\displaystyle{ X }[/math] be the number of throws required to obtain a total of [math]\displaystyle{ n }[/math] heads. Show that [math]\displaystyle{ \textbf{Pr}[X \gt 2n + 2\sqrt{n\log n}]\leq O(1/n) }[/math].
  • [Chernoff vs Chebyshev] We have a standard six-sided die. Let [math]\displaystyle{ X }[/math] be the number of times a 6 occurs in [math]\displaystyle{ n }[/math] throws off the die. Compare the best upper bounds on [math]\displaystyle{ \textbf{Pr}[X\geq n/4] }[/math] that you can obtain using Chebyshev's inequality and Chernoff bounds.
  • [[math]\displaystyle{ k }[/math]-th moment bound] Let [math]\displaystyle{ X }[/math] be a random variable with expectation [math]\displaystyle{ 0 }[/math] such that moment generating function [math]\displaystyle{ \mathbf{E}[\exp(t|X|)] }[/math] is finite for some [math]\displaystyle{ t \gt 0 }[/math]. We can use the following two kinds of tail inequalities for [math]\displaystyle{ X }[/math]:

Chernoff Bound

[math]\displaystyle{ \begin{align} \mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}} \end{align} }[/math]

[math]\displaystyle{ k }[/math]th-Moment Bound

[math]\displaystyle{ \begin{align} \mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k} \end{align} }[/math]
  1. Show that for each [math]\displaystyle{ \delta }[/math], there exists a choice of [math]\displaystyle{ k }[/math] such that the [math]\displaystyle{ k }[/math]th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)
  2. Why would we still prefer the Chernoff bound to the (seemingly) stronger [math]\displaystyle{ k }[/math]-th moment bound?
  • [Cut size in random graph] Show that with probability at least [math]\displaystyle{ 2/3 }[/math], the size of the max-cut in Erdős–Rényi random graph [math]\displaystyle{ G(n,1/2) }[/math] is at most [math]\displaystyle{ n^2/8 + O(n^{1.5}) }[/math]. In the [math]\displaystyle{ G(n,1/2) }[/math] model, each edge is included in the graph with probability [math]\displaystyle{ 1/2 }[/math], independently of every other edge.

Problem 3 (Strong Law of Large Numbers, 15 points, Bonus Problem)

    Throughout this problem, we assume [math]\displaystyle{ X_1,X_2,\ldots, }[/math] be jointly independent square-integrable real random variables of mean zero. We will prove the strong law of large numbers by Kolmogorov maximal inequality.
  • [Kolmogorov maximal inequality] Let [math]\displaystyle{ S_n = \sum_{i=1}^n X_i }[/math]. Prove that [math]\displaystyle{ \mathbf{Pr}\left(\max_{1 \le i \le n} |S_i| \ge t \right) \le \frac{n \mathbf{Var}(X_1)}{t^2} }[/math].
  • [Convergence of random series] Suppose [math]\displaystyle{ \sum_{i=1}^{+\infty} \mathbf{Var}(X_i) \lt \infty }[/math]. Prove that the series [math]\displaystyle{ \sum_{i=1}^{+\infty} X_i }[/math] is almost surely convergent.
  • [Strong law of large numbers] Prove the strong law of large numbers using previous propositions.

Problem 4 (LLN and CLT, 15 points + 10 points)

In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.

  • [St. Petersburg paradox] Consider the well-known game involving a fair coin. In this game, if it takes [math]\displaystyle{ k }[/math] tosses to obtain a head, you will win [math]\displaystyle{ 2^k }[/math] dollars as the reward. Despite the game's expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.
    • For each [math]\displaystyle{ n \ge 1 }[/math], let [math]\displaystyle{ X_{n,1}, X_{n,2},\ldots, X_{n,k} }[/math] be independent random variables. Furthermore, let [math]\displaystyle{ b_n \gt 0 }[/math] be real numbers with [math]\displaystyle{ b_n \to \infty }[/math] and [math]\displaystyle{ \widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n} }[/math] for all [math]\displaystyle{ 1 \le k \le n }[/math]. If [math]\displaystyle{ \sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| \gt b_n) \to 0 }[/math] and [math]\displaystyle{ b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0 }[/math] when [math]\displaystyle{ n \to \infty }[/math], then [math]\displaystyle{ (S_n-a_n)/b_n \overset{P}{\to} 0 }[/math], where [math]\displaystyle{ S_n = \sum_{k=1}^n X_{n,k} }[/math] and [math]\displaystyle{ a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}] }[/math].
    • Let [math]\displaystyle{ S_n }[/math] be the total winnings after playing [math]\displaystyle{ n }[/math] rounds of the game. Prove that [math]\displaystyle{ \frac{S_n}{n \log_2 n} \overset{P}{\to} 1 }[/math]. (Therefore, a fair price to play this game [math]\displaystyle{ n }[/math] times is roughly [math]\displaystyle{ n \log_2 n }[/math] dollars)
    • (Bonus problem, 5 points) Let [math]\displaystyle{ S_n }[/math] be the total winnings after playing [math]\displaystyle{ n }[/math] rounds of the game. Prove that [math]\displaystyle{ \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty }[/math] almost surely. (Hint: You may use Borel-Cantelli lemmas)
    • (Bonus problem, 5 points) Let [math]\displaystyle{ X_1,X_2,\ldots,X_n }[/math] are i.i.d. copies of a random variable [math]\displaystyle{ X }[/math] with the standard Cauchy distribution (i.e., the probability density function of [math]\displaystyle{ X }[/math] is [math]\displaystyle{ \frac{1}{\pi} \frac{1}{1+x^2} }[/math]), show that [math]\displaystyle{ \frac{S_n}{n \log n} }[/math] converges in probability to a constant [math]\displaystyle{ \frac{2}{\pi} }[/math] but is almost surely unbounded, where [math]\displaystyle{ S_n = \sum_{i=1}^n |X_i| }[/math].
  • [Monte Carlo integration] Let [math]\displaystyle{ f }[/math] be a continuous function on [math]\displaystyle{ [0,1] }[/math] and [math]\displaystyle{ U_1,U_2,\ldots,U_n \sim U(0,1) }[/math] be independent random variables. Show that [math]\displaystyle{ I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x }[/math] in probability. (Remark: This holds so long as [math]\displaystyle{ f }[/math] is a measurable function on [math]\displaystyle{ [0,1] }[/math] with [math]\displaystyle{ \int_0^1 |f(x)| \mathrm{d}x \lt \infty }[/math])
  • [Stirling's formula] By considering the central limit theorem for the sum of independent Poisson-distributed random variables, show that [math]\displaystyle{ n! \sim \sqrt{2\pi n} \cdot \left(\frac{n}{\mathrm{e}}\right)^n }[/math]