Probability

mli · May 22, 2020, 3:45am

http://d2l.ai/chapter_preliminaries/probability.html

zplovekq · July 1, 2020, 1:19pm

I have a question about Exercise 4: why not just run the first test a second time?
I think it is just like rolling a die twice: the two results are independent. So by the same formula, I get P(H = 1 | D1 = 1, D2 = 1) = 0.0015 / 0.00159985 = 93%. This is better.
Looking forward to the discussion!

goldpiggy · July 1, 2020, 9:11pm

Hi @zplovekq, $P(H = 1, D1 = 1, D2 = 1)$ is not equal to 0.0015, since in the second test some true positive cases are tested as negative.

zplovekq · July 2, 2020, 1:49am

Thanks for your reply!~
I computed P(H = 1, D1 = 1, D2 = 1) like this:
P(D1 = 1, D2 = 1 | H = 1) = P(D1 = 1 | H = 1) P(D2 = 1 | H = 1) = 1 * 1 = 1 (because it is just like rolling a die twice, I think the two diagnoses are independent).
Then P(D1 = 1, D2 = 1) = P(D1 = 1, D2 = 1 | H = 0) P(H = 0) + P(D1 = 1, D2 = 1 | H = 1) P(H = 1)
= 0.01 * 0.01 * (1 - 0.0015) + 1 * 0.0015 = 0.00159985
So if both diagnoses are 1, then:
P(H = 1 | D1 = 1, D2 = 1) = $P(D1 = 1, D2 = 1 | H = 1) P(H = 1) / P(D1 = 1, D2 = 1)$
= 1 * 0.0015 / 0.00159985 = 93%
Thus I think using the first test twice is more accurate.
Is there anything wrong?
Thanks!

goldpiggy · July 3, 2020, 4:13pm

P(D2 = 1 | H = 1) is not equal to 1. It is 0.98, as the table says.

zplovekq · July 6, 2020, 3:37am

Hmm… Thanks for the reply!
I mean using the first test twice, so I think P(D2 = 1 | H = 1) = P(D1 = 1 | H = 1) = 1.

goldpiggy · July 6, 2020, 4:47pm

Hi @zplovekq, no problem at all!

HyuPete · August 7, 2020, 2:11pm

I had the same thought as you. I still don’t know the answer to this question.

HyuPete · August 7, 2020, 2:18pm

The question suggests that D2 is the same as D1. So I think P(D2 = 1 | H = 1) = P(D1 = 1 | H = 1) = 1.
Is there anything wrong with my intuition?

goldpiggy · August 7, 2020, 11:43pm

Hi @HyuPete, please see my previous response.

Let me know if it doesn’t make sense to you.

HyuPete · August 8, 2020, 3:10pm

I think I should give more details here so you can help me figure out my problem.
First, let me quote the question here:

In Section 2.6.2.6, the first test is more accurate. Why not just run the first test a second time?

Now here are my thoughts:

In the reading section, it states that D2 is different from D1.

The second test has different characteristics and it is not as good as the first one, as shown in Table 2.6.2.

I totally agree that in the reading section, P(D2 = 1 | H = 1) = 0.98, as you said.

I also want to briefly recall this reading section here:
- First time: run first test D1.
- Second time: run second test D2.
- The probability of the patient having AIDS given both positive tests is:
  P(H = 1 | D1 = 1, D2 = 1) = … = 0.8307.
Going back to the question, it says “run the first test a second time”. So my understanding is that D2 is now the same as D1. I think you misunderstood what I said because of my abuse of notation, so let me correct it:
- First time: run the first test D1, and call this run D^(1)_1.
- Second time: run the first test D1 again, and call this run D^(2)_1.
- P(D^(1)_1 = 1 | H = 1) = P(D^(2)_1 = 1 | H = 1) = 1.
My calculation for this case, “the probability of the patient having AIDS given both positive tests”, gives P(H = 1 | D^(1)_1 = 1, D^(2)_1 = 1) = 0.9376.
Since 0.9376 > 0.8307, I think “run the first test a second time” is the better choice here. But the question asks “Why not just run the first test a second time?”, which seems to conflict with this.

Therefore, I still don’t know how to answer this question. Hope you could guide me to it.
Thanks in advance for your help.
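
For anyone who wants to check the two numbers being compared, here is a minimal sketch in plain Python. The helper name `posterior_two_tests` is made up for illustration, the rates are the ones from the chapter's two tables (prior 0.0015; first test 1 and 0.01; second test 0.98 and 0.03), and the function assumes the two results are conditionally independent given H.

```python
def posterior_two_tests(prior, tp1, fp1, tp2, fp2):
    """P(H = 1 | both tests positive).

    Assumes the two test results are conditionally independent given H.
    tp* = P(D = 1 | H = 1), fp* = P(D = 1 | H = 0).
    """
    joint_sick = tp1 * tp2 * prior           # P(D1 = 1, D2 = 1, H = 1)
    joint_healthy = fp1 * fp2 * (1 - prior)  # P(D1 = 1, D2 = 1, H = 0)
    return joint_sick / (joint_sick + joint_healthy)


prior = 0.0015

# Reading section: first test, then the (different) second test.
first_then_second = posterior_two_tests(prior, 1.0, 0.01, 0.98, 0.03)

# The exercise's scenario as computed in this thread: first test run twice.
first_twice = posterior_two_tests(prior, 1.0, 0.01, 1.0, 0.01)

print(round(first_then_second, 4))  # 0.8307
print(round(first_twice, 4))        # 0.9376
```

The second figure only holds if two runs of the same test really are conditionally independent given H. That is an assumption of this sketch, not something the thread establishes.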

StevenJokes · August 8, 2020, 3:17pm

Help me too.

Prateek_Vyas · August 13, 2020, 5:52pm

np.random.multinomial(10, fair_probs) is not working: it returns array([ 0, 0, 0, 0, 0, 10], dtype=int64). Also, the code counts = np.random.multinomial(1000, fair_probs).astype(np.float32)
counts / 1000 returns array([ 0., 0., 0., 0., 0., 1000.]). There is an error in the np library of MXNet; please check.

StevenJokes · August 14, 2020, 3:47am

@Prateek_Vyas
You’re right. Are you on Windows 10?
Check: https://discuss.mxnet.io/t/probability-np-random-multinomial/5667/6
Issue: https://github.com/apache/incubator-mxnet/issues/15383
Waiting for a fix… @mli

goldpiggy · August 15, 2020, 12:48am

Thanks for reporting @Prateek_Vyas. The fix should be on 2.0 roadmap.
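
In the meantime, a rough way to see what the output should look like is to run the same draw with stock NumPy instead of `mxnet.numpy`. This is only a sanity-check sketch: it swaps in `numpy.random.default_rng`, which is not the API used in the chapter, and the seed is arbitrary.

```python
import numpy as np  # stock NumPy, not mxnet.numpy

rng = np.random.default_rng(0)   # arbitrary seed, only for repeatability
fair_probs = np.ones(6) / 6      # a fair six-sided die

# 1000 rolls: counts should be spread over all six faces, not piled into one.
counts = rng.multinomial(1000, fair_probs)
freqs = counts / 1000

print(counts)
print(freqs)
```

With a fair die, each relative frequency should land near 1/6, and the counts should add up to 1000.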

HyuPete · August 17, 2020, 11:02am

@goldpiggy Please help me.

rammy_vadlamudi · September 14, 2020, 1:11am

@goldpiggy, could anyone shed some light on @HyuPete’s view?

rammy_vadlamudi · September 14, 2020, 2:52am

@HyuPete is using the Table 1 values for both events D1 and D2, whereas the reading section uses the second table to calculate P(H = 1 | D1 = 1, D2 = 1), which brings us to different answers.

So the question is: why does the second table differ from the first, if both are completely independent events?

@goldpiggy

sushmit86 · September 17, 2020, 4:58am

Is there a specific reason to typecast to np.float32, as below?

cum_counts = counts.astype(np.float32).cumsum(axis=0)

Would it matter if we had this instead?
cum_counts = counts.cumsum(axis=0)

astonzhang · September 18, 2020, 6:35pm

This is for float division in
estimates = cum_counts / cum_counts.sum(axis=1, keepdims=True)

Without the typecast, it will be integer division.
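
To make the difference concrete, here is a small sketch in stock NumPy (not `mxnet.numpy`). One caveat: in stock NumPy, `/` is always true division, even on integer arrays, so the sketch uses `//` to show what integer division would produce. Whether `/` on integer arrays in `mxnet.numpy` behaves that way is taken from the reply above and not verified here.

```python
import numpy as np  # stock NumPy, not mxnet.numpy

rng = np.random.default_rng(0)   # arbitrary seed
fair_probs = np.ones(6) / 6

# 500 groups of 10 rolls each: integer array of shape (500, 6).
counts = rng.multinomial(10, fair_probs, size=500)

# With the cast: running totals are float32, so the ratio is a float estimate.
cum_counts = counts.astype(np.float32).cumsum(axis=0)
estimates = cum_counts / cum_counts.sum(axis=1, keepdims=True)

# Integer division on the uncast totals truncates every ratio below 1 to 0.
int_cum = counts.cumsum(axis=0)
floor_estimates = int_cum // int_cum.sum(axis=1, keepdims=True)

print(estimates[-1])        # each entry should be close to 1/6
print(floor_estimates[-1])  # truncated integer ratios
```

Each row of `estimates` sums to 1, while the integer version loses the fractional part entirely.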