Lab 10: Association, Clustering, and Probability

Part 0: RAWR

At this point, you should most definitely NOT be working on Pathfinder anymore, unless you have completed everything else.

You should make sure that, at the bare minimum, the following important basics are complete, in this order, before you move on to anything else:

(1) Lab 6, Part I.

(2) Lab 6, Part III.

(3) Lab 9, Part I.

(4) Lab 9, Part 0, Problem 1.

(5) Parts I and II of this lab.

Don’t forget to submit!

Once these are done, you should be splitting your time between FaceThingy (Lab 7, Part II; Lab 8, Part II; and Lab 9, Part II) and your final project, with an emphasis placed on your project. Don’t hesitate to ask questions!

Part I: Clustering

  1. Show an example of a set of points and 3 initial random centroids where k-means fails to find the optimal clustering.
  2. Show an example of a set of points where k-means fails to find the optimal clustering when the 3 initial centroids are picked one at a time, each being the point farthest from all centroids picked so far.
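
If you want to check a candidate example by machine rather than by hand, here is a minimal sketch of standard k-means (Lloyd's algorithm) in plain Python. It assumes 2-D points given as tuples and takes the initial centroids as an explicit argument, so you can try either initialization from the problems above. The names `assign`, `kmeans`, and `cost` are made up for this sketch and do not come from any lab code.

```python
import math

def assign(points, centroids):
    """Assignment step: put each point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, centroids, max_iters=100):
    """Run k-means from the given initial centroids until nothing moves."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iters):
        clusters = assign(points, centroids)
        # Update step: move each centroid to the mean of its cluster.
        # Assumption: an empty cluster simply keeps its old centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)

def cost(centroids, clusters):
    """Sum of squared distances from each point to its cluster's centroid."""
    return sum(math.dist(p, c) ** 2
               for c, cluster in zip(centroids, clusters)
               for p in cluster)
```

Running `kmeans` on the same points from several different starting centroids and comparing the `cost` values is one way to see whether a particular start got stuck in a worse clustering.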

Part II: Probability

  1. Suppose you work for the World Health Organization (WHO), and three drug companies each present you with an HIV test. Company A has a test with a specificity (the probability that it identifies a non-carrier as not infected) of 0.9999 and a sensitivity (the probability that it identifies an HIV carrier as infected) of 0.9. Company B has a test with a specificity of 0.9 and a sensitivity of 0.9999. Company C has a test with a specificity of 0.99 and a sensitivity of 0.99 as well. Your coworker, who hasn’t seen much statistics, suggests that 0.99 is good enough, so you should just pick the test from Company C to use worldwide. Should you heed her advice? In what situations would each of these three tests be useful?
  2. The Civil Rights Act of 1964 was a huge accomplishment in the history of the United States. At the time, Democrats cited themselves as the main reason it was passed. They used the following facts to support their case: In the House of Representatives, 94% of Democrats from the North (145/154) voted for the bill, while only 85% of Republicans from the North did (138/162). In the South, 7 of the 94 Democrats did, while none of the 10 Republicans voted yes. Similarly, in the Senate, 45 of 46 Democrats from the North voted for the Act, while 27 of 32 Republicans did. In the South, 1 of 21 Democratic Senators voted for the bill, while the single Republican Senator voted “nay.” Is this a valid argument? Why or why not?
  3. (Challenge) Lekan is hosting a hat party! Lots of people (in fact, n people) arrive wearing hats. Oh, and it’s also one of those hat exchange parties. So, at a certain point in the night all n people take off their hats. Then they each select a random hat and put it on. New hats for everyone! Well, hopefully.
    1. What is the probability that no one selects their own hat?
    2. What is the probability that k partygoers select their own hat?
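
For the hat party, a quick simulation is a handy sanity check on whatever formula you derive. The sketch below makes two assumptions that the problem does not spell out: the hats end up in a uniformly random arrangement with one hat per person, and "k partygoers" means exactly k. The function names are made up for this sketch.

```python
import random

def own_hat_count(n, rng):
    """Shuffle n hats and count how many people got their own hat back."""
    hats = list(range(n))
    rng.shuffle(hats)  # hats[i] is the hat that person i ends up wearing
    return sum(1 for person, hat in enumerate(hats) if person == hat)

def estimate_match_distribution(n, trials=100_000, seed=0):
    """Estimate P(exactly k people get their own hat) for k = 0..n."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    counts = [0] * (n + 1)
    for _ in range(trials):
        counts[own_hat_count(n, rng)] += 1
    return [c / trials for c in counts]
```

Entry `k` of the returned list estimates the answer to the second question for that `k`, and entry `0` estimates the answer to the first. Try a few values of `n` and compare against your formula. The estimates are noisy, so expect agreement to only a couple of decimal places.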

Part III: Fun, fun, fun, fun

  1. Reminder: for your project, you do NOT need to code anything. We just require some kind of significant implementation of your own ideas. Whether that means implementing something in code, applying the algorithm to a new problem on paper, modifying the algorithm to improve it in some way, or proving something yourself, all of those count. The most important thing is NOT code. Rather, it is understanding your topic as well as you can, doing something novel or interesting with it, and then being able to talk about it to the class with some ease.
  2. By Monday, you should at least have most of your research complete. Yes, Monday. OMGWTFBBQROFLCOPTER. You should be able to sit down with someone who doesn’t know about the topic, and talk for ten minutes or so to thoroughly explain it just off the top of your head. That will give you three days to solidify your understanding, do some novel work, and prepare your talk.
  3. If you have any interesting code or interesting things you found, just post it on the blog. I’m going to go through all of it this weekend and give comments on anything interesting!
  4. Enjoy your last weekend at EPGY. :(
  5. Celebrate…Friday!


