Is the difference between uniformly distributed numbers uniformly distributed?


22

We roll a 6-sided die a number of times.

Taking the difference (in absolute value) between each roll and the roll before it, are the differences expected to be uniformly distributed?

To illustrate with 10 rolls:

roll num  result diff
1           1     0
2           2     1
3           1     1
4           3     2
5           3     0
6           5     2
7           1     4
8           6     5
9           4     2
10          4     0

Will the diff values be uniformly distributed?


13
Plot a histogram to at least get a sense of it
senjata


This looks like homework ....
Manu H

@Manu H, I assure you my homework days are far behind me
HeyJude

Answers:


37

No, it is not uniform

You can count the 36 equally likely possibilities for the absolute differences

     second 1   2   3   4   5   6
first                           
1           0   1   2   3   4   5
2           1   0   1   2   3   4
3           2   1   0   1   2   3
4           3   2   1   0   1   2
5           4   3   2   1   0   1
6           5   4   3   2   1   0

which gives the following probability distribution for the absolute differences

0    6/36  1/6
1   10/36  5/18
2    8/36  2/9
3    6/36  1/6
4    4/36  1/9
5    2/36  1/18
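
The table above can be reproduced by brute-force enumeration. The following is a minimal R sketch, assuming a fair six-sided die; the variable names (`faces`, `abs_diff`, `counts`, `probs`) are illustrative and not part of the original answer.

```r
# Enumerate all 36 equally likely (first, second) pairs of a fair die
faces <- 1:6
abs_diff <- abs(outer(faces, faces, "-"))  # 6 x 6 matrix of |first - second|

# Count how many of the 36 pairs give each absolute difference 0..5
counts <- table(abs_diff)
probs <- counts / length(abs_diff)

print(counts)
print(probs)
```

Because every pair is equally likely, unequal counts are enough to show that the distribution is not uniform.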

27
@onurcanbektas The table in this answer clearly contradicts your assertion: for instance, it shows that only one of the possible differences is 5 whereas 6 of them are 0. Since all 36 possibilities are equally probable, that is non-uniform.
whuber

13
@onurcanbektas I invite you once again to contemplate the table. Since it has only two absolute differences of 5, isn't it evident that no more than two differences can equal 5?
whuber

14
@onurcanbektas For simple differences (i.e. with signs, so integers from -5 through +5), the distribution is a symmetric discrete triangular distribution with its mode (most likely value) at 0. For absolute differences, as shown in my answer, the mode is 1.
Henry

2
It may be worth noting that the signed difference modulo 6 is uniformly distributed.
Federico Poloni

2
@FedericoPoloni Isn't this fairly obvious? I mean, I never really thought about it before reading the comment, but it is quite clear that this is indeed true
Cruncher
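
The modulo-6 remark in the comments above can be checked by enumeration. This is a small R sketch assuming a fair six-sided die; `signed_diff`, `mod_diff`, and `mod_counts` are names chosen here for illustration.

```r
# All 36 equally likely pairs of a fair die and their signed differences
faces <- 1:6
signed_diff <- outer(faces, faces, "-")   # first - second, ranging over -5..5

# Reduce modulo 6; with a positive modulus, R's %% returns values in 0..5
mod_diff <- signed_diff %% 6
mod_counts <- table(mod_diff)
print(mod_counts)
```

For any fixed first roll, the six possible second rolls produce six different residues, which is why each residue appears equally often.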

21

Using only the most basic axioms about probabilities and real numbers, one can prove a much stronger statement:

The difference $X-Y$ of any two independent, identically distributed, nonconstant random values never has a discrete uniform distribution.

(An analogous statement for continuous variables is proven at Uniform PDF of the difference of two r.v.)

The idea is that the chance that $X-Y$ is an extreme value must be less than the chance that $X-Y$ is zero, because there is only one way to (say) maximize $X-Y$, whereas there are many ways to make the difference zero, since $X$ and $Y$ have the same distribution and therefore can equal each other. Here are the details.

First observe that the hypothetical two variables $X$ and $Y$ in question can each attain only a finite number $n$ of values with positive probability, because there will be at least $n$ distinct differences and a uniform distribution assigns them all equal probability. If $n$ were infinite, then so would be the number of possible differences having positive, equal probability, whence the sum of their chances would be infinite, which is impossible.

Next, since the number of differences is finite, there will be a largest among them. The largest difference can be attained only when subtracting the smallest value of $Y$ (call it $m$ and suppose it has probability $q=\Pr(Y=m)$) from the largest value of $X$ (call that one $M$, with $p=\Pr(X=M)$). Because $X$ and $Y$ are independent, the chance of this difference is the product of these chances,

$$\Pr(X-Y=M-m)=\Pr(X=M)\Pr(Y=m)=pq \gt 0.\tag{*}$$

Finally, because $X$ and $Y$ have the same distribution, there are many ways their difference can produce the value $0$. Among these ways are the cases where $X=Y=m$ and $X=Y=M$. Because this distribution is nonconstant, $m$ differs from $M$. That shows those two cases are disjoint events, and therefore they must contribute at least an amount $p^2+q^2$ to the chance that $X-Y$ is zero; that is,

$$\Pr(X-Y=0)\ge \Pr(X=Y=m)+\Pr(X=Y=M)=p^2+q^2.$$

Since squares of numbers are not negative, $0\le (p-q)^2$, whence we deduce from $(*)$ that

$$\Pr(X-Y=M-m)=pq\le pq+(p-q)^2=p^2+q^2-pq\lt p^2+q^2\le \Pr(X-Y=0),$$

showing the distribution of $X-Y$ is not uniform, QED.
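
The chain of inequalities in this proof can be checked numerically for a biased die. The sketch below is only an illustration: the probability vector `prob` is a made-up example (any probabilities summing to 1 would do), and all variable names are assumptions rather than part of the original answer.

```r
# Hypothetical biased die: an arbitrary pmf chosen for illustration
faces <- 1:6
prob <- c(0.05, 0.10, 0.15, 0.20, 0.20, 0.30)

# Joint probabilities of independent (X, Y) and the signed differences X - Y
joint <- outer(prob, prob)
signed_diff <- outer(faces, faces, "-")
diff_pmf <- tapply(joint, signed_diff, sum)   # Pr(X - Y = d) for d = -5..5

p <- prob[which.max(faces)]   # p = Pr(X = M), M the largest value
q <- prob[which.min(faces)]   # q = Pr(Y = m), m the smallest value

pr_max  <- as.numeric(diff_pmf["5"])   # Pr(X - Y = M - m), which equals p * q
pr_zero <- as.numeric(diff_pmf["0"])   # Pr(X - Y = 0), at least p^2 + q^2
print(c(pr_max = pr_max, bound = p^2 + q^2, pr_zero = pr_zero))
```

Changing `prob` to any other nonconstant distribution should leave the ordering `pr_max < p^2 + q^2 <= pr_zero` intact, which is what the proof asserts.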

Edit in response to a comment

A similar analysis of the absolute differences $|X-Y|$ observes that because $X$ and $Y$ have the same distribution, the smallest difference $m-M$ is the negative of the largest difference $M-m$, so both extremes produce the same absolute difference. This requires us to study $\Pr(|X-Y|=|M-m|)=2pq$. The same algebraic technique yields almost the same result, but there is the possibility that $2pq=2pq+(p-q)^2$ and $2pq+p^2+q^2=1$. That system of equations has the unique solution $p=q=1/2$, corresponding to a fair coin (a "two-sided die"). Apart from this exception, the result for the absolute differences is the same as that for the differences, and for the same underlying reasons already given: namely, the absolute differences of two iid random variables cannot be uniformly distributed whenever there are more than two distinct differences with positive probability.

(end of edit)
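
The fair-coin exception can be seen directly by enumeration. This is a minimal R sketch; coding the coin's sides as 0 and 1 and the names `coin_pmf` and `three_pmf` are choices made here for illustration.

```r
# Fair coin as a "two-sided die" with values 0 and 1 (illustrative coding)
values <- c(0, 1)
prob <- c(0.5, 0.5)
joint <- outer(prob, prob)
abs_diff <- abs(outer(values, values, "-"))
coin_pmf <- tapply(joint, abs_diff, sum)
print(coin_pmf)    # absolute differences 0 and 1 each have probability 1/2

# A fair three-sided die no longer gives a uniform absolute difference
values3 <- 1:3
prob3 <- rep(1/3, 3)
three_pmf <- tapply(outer(prob3, prob3), abs(outer(values3, values3, "-")), sum)
print(three_pmf)
```

With three sides there are three distinct absolute differences, and their probabilities already differ, in line with the statement above.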


Let's apply this result to the question, which asks about something a little more complex.

Model each independent roll of the die (which might be an unfair die) with a random variable $X_i$, $i=1,2,\ldots,n$. The differences observed in these $n$ rolls are the numbers $\Delta X_i=X_{i+1}-X_i$. We might wonder how uniformly distributed these $n-1$ numbers are. That is really a question about statistical expectations: what is the expected number of $\Delta X_i$ that are equal to zero, for instance? What is the expected number of $\Delta X_i$ equal to $1$? Etc., etc.

The problematic aspect of this question is that the $\Delta X_i$ are not independent: for instance, $\Delta X_1=X_2-X_1$ and $\Delta X_2=X_3-X_2$ involve the same roll $X_2$.

However, this isn't really a difficulty. Since statistical expectation is additive and all differences have the same distribution, if we pick any possible value $k$ of the differences, the expected number of times the difference equals $k$ in the entire sequence of $n$ rolls is just $n-1$ times the expected number of times the difference equals $k$ in a single step of the process. That single-step expectation is $\Pr(\Delta X_i=k)$ (for any $i$). These expectations will be the same for all $k$ (that is, uniform) if and only if they are the same for a single $\Delta X_i$. But we have seen that no $\Delta X_i$ has a uniform distribution, even when the die might be biased. Thus, even in this weaker sense of expected frequencies, the differences of the rolls are not uniform.
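
The expected counts described in this paragraph are easy to compute exactly. The sketch below assumes a fair die and $n=10$ rolls, as in the question's illustration; the names `step_pmf` and `expected_counts` are introduced here for illustration, and `prob` can be replaced with any other probability vector to model a biased die.

```r
n <- 10                  # number of rolls, as in the question's example
faces <- 1:6
prob <- rep(1/6, 6)      # fair die; swap in another pmf to model bias

# Single-step distribution Pr(Delta X_i = k) for k = -5..5
# (rows index the later roll X_{i+1}, columns the earlier roll X_i)
step_pmf <- tapply(outer(prob, prob), outer(faces, faces, "-"), sum)

# By linearity of expectation, the expected number of differences equal to k
# among the n - 1 differences is (n - 1) * Pr(Delta X_i = k)
expected_counts <- (n - 1) * step_pmf
print(round(expected_counts, 3))
```

The dependence between neighbouring differences does not matter here, because linearity of expectation holds without any independence assumption.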


@Michael Good point: I answered the question as asked (which is about "differences"), rather than as illustrated (which clearly refers to absolute differences). The same technique applies--one just has to consider both the max and min differences. In the case where those are the only two possibilities (along with zero), we can get equality, which is where the Bernoulli(1/2) result comes from (showing it's the unique such example).
whuber

Another answer proving a particular version of this is here.
Reinstate Monica

Thanks, @Ben: I had forgotten that thread. Because it's a better reference, I now link directly to it in this answer.
whuber

12

On an intuitive level, a random event can only be uniformly distributed if all of its outcomes are equally likely.

Is that so for the random event in question -- absolute difference between two dice rolls?

It suffices in this case to look at the extremes -- what are the biggest and smallest values this difference could take?

Obviously 0 is the smallest (we're looking at absolute differences and the rolls can be the same), and 5 is the biggest (6 vs 1).

We can show the event is non-uniform by showing that 0 is more (or less) likely to occur than 5.

At a glance, there are only two ways for 5 to occur -- if the first die is 6 and the second 1, or vice versa. How many ways can 0 occur?


1
+1 I think this gets to the heart of the matter. I have posted a generalization of the question that ultimately relies on the same observation.
whuber

5

As presented by Henry, differences of uniformly distributed random variables are not uniformly distributed.

To illustrate this with simulated data, we can use a very simple R script:

# Simulate 10,000 rolls of a fair six-sided die and plot how often each face occurs
barplot(table(sample(x = 1:6, size = 10000, replace = TRUE)))

[Figure: bar plot of the 10,000 simulated rolls]

We see that this indeed produces a uniform distribution. Let's now have a look at the distribution of the absolute differences of two random samples from this distribution.

# Absolute differences between two independent samples of 10,000 rolls each
barplot(table(abs(sample(x = 1:6, size = 10000, replace = TRUE) - sample(x = 1:6, size = 10000, replace = TRUE))))

[Figure: bar plot of the simulated absolute differences]


6
Why does this have anything to do with the CLT, which concerns the asymptotic distribution of means of large numbers of iid values?
whuber

2
I like the connection you originally made with CLT. Let n be the number of samples to be added (or subtracted) from the original uniform distribution. CLT implies that for large n the distribution will tend toward normal. This in turn implies that the distribution cannot remain uniform for any n>1, such as n=2 which is what OP is asking. (If this isn't self-explanatory, consider that if the sum were uniformly distributed when n=2, reindexing would imply that it is also uniform when n=4, etc, including for large n.)
krubo

3
@Krubo The original question asks about the distribution of differences between successive rolls of a die. The CLT has nothing to say about that. Indeed, no matter how many times the die is rolled, the distribution of those differences will not approach normality.
whuber

Does this distribution tend to uniform as the number of die faces tends to infinity? I'm not sure how to go about showing that. Intuitively it feels like it heads in that direction, but I don't know if it gets asymptotically "blocked" somewhere before flattening enough
Cruncher

@Cruncher you can easily change the number of die faces in the R code. The more faces there are, the more apparent the staircase nature of the distribution becomes. '1' is always the peak of that staircase, and for larger differences the probabilities approach zero. Additionally, a difference of '0' is distinctly rarer than '1' (at least if the die's smallest value is '1').
LuckyPal
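
The effect of the number of faces can also be examined exactly rather than by simulation. The sketch below assumes a fair die with faces numbered 1 to `faces`; the helper `abs_diff_pmf` is a hypothetical function written for this illustration. It uses the counting argument from the accepted answer: there are `faces` pairs with difference 0 and `2 * (faces - k)` pairs with absolute difference `k >= 1`.

```r
# Exact pmf of |X - Y| for two independent rolls of a fair die with `faces` sides
abs_diff_pmf <- function(faces) {
  k <- 0:(faces - 1)
  pmf <- ifelse(k == 0, 1 / faces, 2 * (faces - k) / faces^2)
  names(pmf) <- k
  pmf
}

# Compare the probabilities of 0, 1, and the largest difference as faces grow
for (faces in c(6, 20, 100)) {
  pmf <- abs_diff_pmf(faces)
  cat(faces, "faces: P(0) =", pmf[["0"]], " P(1) =", pmf[["1"]],
      " P(max) =", pmf[[as.character(faces - 1)]], "\n")
}
```

Under these assumptions the ratio of P(1) to the probability of the largest difference is `faces - 1`, so the staircase does not flatten as faces are added.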

2

Others have worked through the calculations; I will give you an answer that seems more intuitive to me. You want to study the sum of two uniform r.v. ($Z = X + (-Y)$); the overall distribution is the (discrete) convolution product:

$$P(Z=z)=\sum_{k=-\infty}^{+\infty}P(X=k)\,P(-Y=z-k)$$

This sum is rather intuitive: the probability of getting $z$ is the sum of the probabilities of getting some value with $X$ (denoted $k$ here) and the complement to $z$ with $-Y$.
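
The convolution sum above can be evaluated term by term for a fair six-sided die. This is a minimal R sketch under that assumption; the names `p`, `z_values`, and `conv_pmf` are illustrative.

```r
p <- rep(1/6, 6)    # pmf of a fair die on faces 1..6
z_values <- -5:5    # possible values of Z = X + (-Y)

# P(Z = z) = sum over k of P(X = k) * P(-Y = z - k); note -Y = z - k means Y = k - z
conv_pmf <- sapply(z_values, function(z) {
  k <- 1:6
  y <- k - z
  valid <- y >= 1 & y <= 6           # keep terms where Y = k - z is an actual face
  sum(p[k[valid]] * p[y[valid]])
})
names(conv_pmf) <- z_values
print(round(conv_pmf * 36))          # numerators over 36
```

The numerators rise from 1 to 6 and fall back to 1, which is the discrete triangular shape of the signed difference.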

From signal processing, we know how the convolution product behaves:

  • The convolution product of two uniform functions (two rectangles) gives a triangle. This is illustrated by Wikipedia for continuous functions:

[Figure: convolution of two rectangular functions producing a triangle, from Wikipedia]

  • You can understand what happens here: as z moves up (the vertical dotted line), the common domain of both rectangles moves up and then down, which corresponds to the probability of getting z.

  • More generally, we know that the only functions that are stable under convolution are those of the Gaussian family, i.e. only Gaussian distributions are stable under addition (or, more generally, linear combination). This also means that you don't get a uniform distribution when combining uniform distributions.

As to why we get those results, the answer lies in the Fourier decomposition of those functions: the Fourier transform of a convolution product is the simple product of the Fourier transforms of each function. This gives direct links between the Fourier coefficients of the rectangle and triangle functions.


Please check the validity of your claims and the logic of your answer. The question isn't whether the convolution of two uniform distributions is uniform: it's whether the convolution of some distribution and its reversal can be uniform. And there are far more distributional families than the Gaussian that are stable under convolution (modulo standardization, of course): see en.wikipedia.org/wiki/Stable_distribution
whuber

You are right about stable distributions. For the question, I am pretty sure this is about the difference of two random values with uniform distribution (as indicated by the title). The question whether the convolution of some distribution and its reversal can be uniform is larger than what is asked here.
lcrmorin

1

If $x$ and $y$ are two consecutive dice rolls, you can visualize $|x-y|=k$ (for $k=0,1,2,3,4,5$) as follows, where each color corresponds to a different value of $k$:

[Figure: consecutive dice rolls difference visualization]

As you can easily see, the number of points for each color is not the same; therefore, the differences are not uniformly distributed.


0

Let $D_t$ denote the difference and $X_t$ the value of the roll at time $t$; then $P(D_t=5)=P(X_t=6,X_{t-1}=1)<P((X_t,X_{t-1})\in\{(6,3),(5,2)\})<P(D_t=3)$

So the function $P(D_t=d)$ is not constant in $d$. This means that the distribution is not uniform.

Licensed under cc by-sa 3.0 with attribution required.