gnuplot で正規分布をσごとに塗りわける

正規分布関数

$$N(\mu, \sigma^2) \equiv \frac{1}{\sqrt{2\pi} \sigma} \exp\left\{ – \frac{(x-\mu)^2}{2\sigma^2}\right\}$$

簡単のために,$\mu = 0, \sigma = 1$ とした標準正規分布 $f(x)$ を使うことにする。

$$f(x) \equiv N(0, 1) = \frac{1}{\sqrt{2\pi}} \exp\left( – \frac{x^2}{2}\right)$$

誤差関数

$$\mbox{erf}(x) \equiv \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt $$

正規分布の積分を誤差関数で表す

\begin{eqnarray}
F(x_1, x_2) &\equiv& \int_{x_1}^{x_2} f(t)\, dt \\
&=& \frac{1}{\sqrt{2\pi}} \int_{x_1}^{x_2}\exp\left( – \frac{t^2}{2}\right)\, dt \\
&& \quad\left(\frac{t}{\sqrt{2}} \equiv x, \ dt = \sqrt{2} dx \right) \\
&=& \frac{1}{\sqrt{\pi}} \int_{\frac{x_1}{\sqrt{2}}}^{\frac{x_2}{\sqrt{2}}} e^{-x^2} \,dx \\
&=& \frac{1}{2} \left(\mbox{erf}\left(\frac{x_2}{\sqrt{2}}\right)-\mbox{erf}\left(\frac{x_1}{\sqrt{2}}\right) \right)
\end{eqnarray}

gnuplot には組み込み関数として,誤差関数 erf(x) があります。

In [1]:
help expressions function erf
Out[1]:
 関数 `erf(x)` は引数の実部の誤差関数の値を返します。引数が複素数の場合
 は虚部は無視されます。以下参照: `erfc`, `inverf`, `norm`。
In [2]:
# 標準正規分布
f(x) = 1.0/sqrt(2*pi) * exp(-x**2/2.0)

# gnuplot の誤差関数 erf(x)
F(x1, x2)=0.5*(erf(x2/sqrt(2)) - erf(x1/sqrt(2)))

標準正規分布を plot

In [3]:
reset
set term svg size 640, 240 font 'Times,16'
set key off

plot [-4:4][0:0.41] f(x) lw 2 lc 'blue'

Out[3]:
Terminal type is now 'svg'
Options are 'size 640,240 fixed enhanced font 'Times,16' butt dashlength 1.0 '

σごとに塗りわける

In [4]:
plot [-4:4][0:0.41] sample \
  [-1: 1] f(x) with filledc y=0 fc '#0000ff', \
  [-2:-1] f(x) with filledc y=0 fc '#1e90ff', \
  [ 1: 2] f(x) with filledc y=0 fc '#1e90ff', \
  [-3:-2] f(x) with filledc y=0 fc '#87ceeb', \
  [ 2: 3] f(x) with filledc y=0 fc '#87ceeb', \
  [-4:-3] f(x) with filledc y=0 fc '#add8e6', \
  [ 3: 4] f(x) with filledc y=0 fc '#add8e6', \
  [-4: 4] f(x) lw 2 lc 'blue'

塗り分けた領域を白線で区切る

In [5]:
# with impulse で縦線を引く座標値のファイル作成
set print "temp.txt"
  do for [i=-3:3]{
    print i, f(i)
  }
set print
In [6]:
# ファイル "temp.txt" の中身を確認
! cat temp.txt
Out[6]:
-3 0.00443184841193801
-2 0.0539909665131881
-1 0.241970724519143
0 0.398942280401433
1 0.241970724519143
2 0.0539909665131881
3 0.00443184841193801
In [7]:
plot [-4:4][0:0.41] sample \
  [-1: 1] f(x) with filledc y=0 fc '#0000ff', \
  [-2:-1] f(x) with filledc y=0 fc '#1e90ff', \
  [ 1: 2] f(x) with filledc y=0 fc '#1e90ff', \
  [-3:-2] f(x) with filledc y=0 fc '#87ceeb', \
  [ 2: 3] f(x) with filledc y=0 fc '#87ceeb', \
  [-4:-3] f(x) with filledc y=0 fc '#add8e6', \
  [ 3: 4] f(x) with filledc y=0 fc '#add8e6', \
  "temp.txt" w impulse lw 2 lc 'white', \
  [-4: 4] f(x) lw 2 lc 'blue'

横軸の目盛ラベルを設定する

In [8]:
set xtics ("μ-4σ" -4, "μ-3σ" -3, "μ-2σ" -2, "μ-σ" -1, "μ" 0, \
           "μ+4σ"  4, "μ+3σ"  3, "μ+2σ"  2, "μ+σ"  1)
set ytics 0.1
replot

塗り分けた領域にラベルをつける

In [9]:
# 各領域のパーセントを表す文字列を配列として作成
array sig[4]
do for [i = 1: 4] {
  sig[i] = sprintf("%5.2f%", F(i-1, i)*100)
  print sig[i]
}
Out[9]:
34.13%
13.59%
 2.14%
 0.13%
In [10]:
# ラベルが見えるように
# 塗りつぶしの上 front に白抜き文字で
set label 1 center at first -0.5,0.2  sig[1] front tc 'white'
set label 2 center at first  0.5,0.2  sig[1] front tc 'white'
set label 3 center at first -1.5,0.05 sig[2] front tc 'white' 
set label 4 center at first  1.5,0.05 sig[2] front tc 'white' 
set label 5 center at first -2.5,0.05 sig[3] 
set label 6 center at first  2.5,0.05 sig[3] 
set label 7 center at first -3.5,0.02 sig[4] 
set label 8 center at first  3.5,0.02 sig[4] 

replot

枠と縦軸目盛を非表示に

In [11]:
# 枠を非表示に
unset border
# 枠だけ消しても目盛が残るのでそれも非表示に
unset ytics
set xtics nomirror
# x 軸のみを表示
set xzeroaxis lt 1 lw 2 lc 'black'
set xtics out

replot

参考:塗りつぶし領域を少し透かす例

set label ... front と設定しないと塗りつぶしの下にラベルが隠れてしまいますが,以下のように,

  • set style fill transparent solid 0.5 (0.5 は好みに調整)

と設定することで,塗りつぶし領域を少し透明にして(背後の)ラベルが見えるようにできます。

In [12]:
unset label
# ラベルが見えるように領域を少し透明化
set style fill transparent solid 0.5

set label 1 center at first -0.5,0.2  sig[1] 
set label 2 center at first  0.5,0.2  sig[1] 
set label 3 center at first -1.5,0.05 sig[2]  
set label 4 center at first  1.5,0.05 sig[2]  
set label 5 center at first -2.5,0.05 sig[3] 
set label 6 center at first  2.5,0.05 sig[3] 
set label 7 center at first -3.5,0.02 sig[4] 
set label 8 center at first  3.5,0.02 sig[4] 

replot

In [13]:
# いらなくなったファイルは削除しておく

! rm temp.txt

-nσ から +nσ までを塗りつぶす

In [14]:
# 各領域のパーセントを表す文字列を配列として作成
array sig2[4]
do for [i = 1: 4] {
  sig2[i] = sprintf("%5.2f%", F(-i, i)*100)
  print sig2[i]
}
Out[14]:
68.27%
95.45%
99.73%
99.99%
In [15]:
unset style fill
unset label

set xtics ("" -4, "" -3, "" -2, "μ-σ" -1, "μ" 0, \
           ""  4, ""  3, ""  2, "μ+σ"  1) out

set label 1 center at first 0,0.15 sig2[1] front tc "white"

plot [-4:4][0:0.41] sample \
  [-1: 1] f(x) with filledc y=0 fc '#0000ff', \
  [-4: 4] f(x) lw 2 lc "blue", \
  [-4:-1] f(x) lw 2 lc "gray", \
  [ 1: 4] f(x) lw 2 lc "gray"

In [16]:
set xtics ("" -4, "" -3, "μ-2σ" -2, "μ-σ" -1, "μ" 0, \
           ""  4, ""  3, "μ+2σ"  2, "μ+σ"  1) out

set label 1 center at first 0,0.15 sig2[2] front tc "white"

plot [-4:4][0:0.41] sample \
  [-2: 2] f(x) with filledc y=0 fc '#1e90ff', \
  [-4: 4] f(x) lw 2 lc "blue", \
  [-4:-2] f(x) lw 2 lc "gray", \
  [ 2: 4] f(x) lw 2 lc "gray"

In [17]:
set xtics ("" -4, "" -3, "μ-2σ" -2, "μ-σ" -1, "μ" 0, \
           ""  4, ""  3, "μ+2σ"  2, "μ+σ"  1) out

set label 1 center at first 0,0.15 sig2[3] front tc "black"

plot [-4:4][0:0.41] sample \
  [-3: 3] f(x) with filledc y=0 fc '#87ceeb', \
  [-4: 4] f(x) lw 2 lc "blue", \
  [-4:-3] f(x) lw 2 lc "gray", \
  [ 3: 4] f(x) lw 2 lc "gray"

In [18]:
set xtics ("μ-4σ" -4, "μ-3σ" -3, "μ-2σ" -2, "μ-σ" -1, "μ" 0, \
           "μ+4σ"  4, "μ+3σ"  3, "μ+2σ"  2, "μ+σ"  1) out

set label 1 center at first 0,0.15 sig2[4] front tc "black"

plot [-4:4][0:0.41] sample \
  [-4: 4] f(x) with filledc y=0 fc '#add8e6', \
  [-4: 4] f(x) lw 2 lc "blue"