## Social Network Risks

May 17, 2013

Yesterday’s Washington Post has a report on the concerns raised by parents and child advocates about the use of social networks by pre-teenagers.  The story focuses on the photo sharing service Instagram, but the general issues are relevant to other sites as well: is the site collecting the personal information of susceptible children, and does it do enough to protect them from miscellaneous predators?

The Instagram service is owned by Facebook, the social networking giant, which acquired it in 2012 and has about 1 billion users.  The company’s policy requires users to be at least 13 in order to open an account, but the Instagram site does not even ask the user’s age when (s)he signs up.  (The main Facebook site does make a gesture at verification, asking for the user’s real name and age; however, the effectiveness of this is questionable, since there is no way to check the user’s answers.)  The result is that many children under 13 have set up Instagram accounts.

There is some reason for concern about this; looking at the site (or at Facebook, for that matter, where I have an account) shows that many users post a great deal of what might be regarded as fairly personal information.  Most readers are probably familiar with news stories of people whose employment or other prospects have been damaged by indiscreet posting and photos on Facebook and other social sites.  Even if one grants that adults have a right to behave like complete idiots if they wish to, it seems reasonable that children, who lack both mature judgment (such as it is) and experience, deserve some protection.

However, people need to realize that, outside the realm of science fiction, this is not a problem that has a technological solution.  Even if it were possible to develop a peripheral device that would automagically detect a person’s age, it really wouldn’t solve the problem; all the server on the other end of the transaction can do is to verify that the bit pattern it receives indicates the user is 13 (or 18, or 21).  Were such a device to be developed, I would not expect it to be long before some enterprising teenage hacker produced a “spoofing” device.

Facebook and other social-media sites have said that authenticating age is difficult, even with technology. A Consumer Reports survey in 2011 estimated that 7 million preteens are on Facebook.

It’s not difficult; it’s effectively impossible.

The other thing that all of us, kids and adults, need to remember is how businesses like Facebook work.  It may seem, as you sit perusing your friends’ postings, that you are a customer of the service.  But the customers are actually the advertisers who buy “space” on the service, which has every incentive to provide the customer with as much personal information as possible, in order to make ad targeting more effective, thereby supporting higher ad rates.  When you use Facebook, or other similar “free” services, you are not the customer — you are the product.

## Interview with James Randi

March 28, 2013

I’ve written here before about James Randi, the retired professional magician and skeptic of the occult, and his James Randi Educational Foundation, which investigates paranormal, supernatural, and occult claims.

The self-described “News for Nerds” site, Slashdot, has an interview with Randi, in which he answers questions submitted by readers.  As one might expect, the discussion focuses on the work, by Randi and the Foundation, to combat irrational and magical thinking.  It’s a brief but entertaining read.  The page also contains comments from Slashdot readers, which are worth glancing through: there are some insightful ones, though there is, as usual, a lot of drek as well.

## Happy Pi Day, 2013

March 14, 2013

Today, March 14, is one of the days that is sometimes celebrated as “Pi Day”, in honor of the best-known irrational and transcendental number, the ratio of the circumference of a circle to its diameter, usually written as the Greek letter π (pi).  The date, 3/14, is chosen because the approximate value of π is 3.14159265…   Legend has it that the value was named π because pi is the first letter of the Greek word “περίμετρος”, meaning perimeter.

The New Scientist reports that this year, to observe Pi Day, Professor Marcus du Sautoy of the University of Oxford is sponsoring Pi Day Live, a project to “crowd source” the calculation of π (pi).  The value has, of course, already been calculated to trillions of decimal places; because it is an irrational number, it cannot be represented exactly by any finite decimal number.  (Pi is transcendental, also, of course.)  Pi Day Live is suggesting some relatively easy methods of getting an approximate value for π, including Buffon’s Needle.  I mentioned Buffon’s Needle in a Pi Day post back in 2010.  The New Scientist headline calls it an “ancient” method, which I think is a bit over the top for something described in the 18th century.

That earlier Pi Day post also tells a related story, of the Indiana state legislature’s attempt to set the value of pi by law, one of the all-time great accomplishments of legislative lunacy.

Finally, spare a thought today for the 134th anniversary of the birth of Albert Einstein.

##### Update Thursday, 14 March, 15:35 EDT

I’ve just noticed that there is a rendering error (at least in Firefox) on the “Find Pi” page I linked above.  The equation for the estimated value of pi is a bit garbled (it reads 2L\over xp); the correct equation (using the “Find Pi” variable names) is:

$\pi = \dfrac {2 L}{x p}$

I’ve dropped the site a note with the correction.
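
For anyone who would rather not drop real needles on the floor, here is a minimal Python sketch of the experiment behind that equation.  It rests on some assumptions of mine: that L is the needle length, x is the spacing between the parallel lines, and p is the fraction of drops in which the needle crosses a line; the function name `estimate_pi` and its parameters are made up for illustration, and are not taken from the “Find Pi” page.

```python
import math
import random


def estimate_pi(needle_length, line_spacing, drops, seed=None):
    """Estimate pi by simulating Buffon's Needle.

    Assumes the needle is no longer than the gap between lines,
    which is the case the formula pi = 2L / (x p) covers.
    """
    if needle_length > line_spacing:
        raise ValueError("needle must not be longer than the line spacing")

    rng = random.Random(seed)
    crossings = 0
    for _ in range(drops):
        # Distance from the needle's centre to the nearest line.
        centre = rng.uniform(0, line_spacing / 2)
        # Acute angle between the needle and the lines.  Sampling it
        # uses math.pi, so this is a demonstration of the formula,
        # not an independent way of computing pi.
        angle = rng.uniform(0, math.pi / 2)
        # The needle crosses a line if its half-length, projected
        # perpendicular to the lines, reaches the nearest line.
        if centre <= (needle_length / 2) * math.sin(angle):
            crossings += 1

    if crossings == 0:
        raise ValueError("no crossings observed; try more drops")

    p = crossings / drops  # observed proportion of crossing needles
    return 2 * needle_length / (line_spacing * p)


if __name__ == "__main__":
    print(estimate_pi(needle_length=1.0, line_spacing=2.0, drops=200_000))
```

The estimate converges slowly: even a few hundred thousand simulated drops typically pin down only the first couple of decimal places, which gives some idea of how many real needles a crowd would have to throw.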

## “Security Engineering” Available Online

March 11, 2013

If you have a serious interest in system and network security, one of the best reference books available is Security Engineering, by Ross Anderson.  The first edition of this book was published in 2001, and quickly became a standard text.  Under an agreement with the publishers, the complete text of that book was made freely available online, four years after its publication.  The second edition, which I think is even better, was published in 2008; as with the first edition, the second edition text is now available online.

Dr. Anderson is Professor of Security Engineering at the Computer Laboratory, University of Cambridge, and has for many years been regarded as one of the world’s top security experts.  He received his PhD from Cambridge, and is a Fellow of the Royal Society, the Institute of Physics, and the Royal Academy of Engineering.  His home page (linked above) has an overview of his very extensive research work.

My (paper) copy of the first edition of Security Engineering is one of the handful of reference books I use all the time.  I don’t have a paper copy of the second edition yet, but I’ll certainly be ordering one.  If you have a serious interest in the field, I recommend it without reservation.

## Language and Wikipedia

March 10, 2013

In addition to being, in my view, the finest news magazine published in English, The Economist has a number of interesting and highly literate blogs on its site, covering a wide range of topics.  I think I have occasionally mentioned the “Babbage” blog, which covers science and  technology.  Another of my favorites is “Johnson”, named for the writer of dictionaries, a harmless drudge, Samuel Johnson; it covers the “use and abuse” of language around the world.

A recent post discusses the multi-lingual character of Wikipedia, the Internet encyclopedia that is just over twelve years old.  Most readers probably know that Wikipedia has articles in a number of languages, but might be surprised to learn that there are now official versions of Wikipedia in 285 languages.  There is of course a considerable amount of variation in the number of articles available in different languages.  It is no surprise that English has the most content, with 4,182,130 articles at present.  There are four other languages that have more than 1 million articles.  Three of these are not too surprising: German, French, and Italian.  But the other, Dutch, is a language that, as “Johnson” points out, has only about 20 million native speakers.  The post also points out that virtually every student in The Netherlands studies English; and I can confirm, from business and pleasure trips, that virtually everyone one meets, at least in cities, speaks excellent English.  Perhaps the number of articles in Dutch reflects the availability of a large group of potential translators.

The next group of languages, those which have more than 100,000 Wikipedia articles, presents an interesting assortment.  It includes some obvious candidates, “big” languages like Russian, Spanish, and Japanese; but it also has languages that I, at least, had never heard of, like Cebuano, a language spoken by about 20 million people (yes, about the same as Dutch) in the Philippines, with 273,316 articles.  The “made up” language, Esperanto, makes the cut with 177,002 articles.

There are further listings of languages with 10,000+, 1,000+, 100+, 10+, and 1+ Wikipedia articles.  When you get toward the bottom of the list, I’d wager that most of the entries will be unfamiliar to you, unless you are a professional linguist.  Some are local African languages, some are American Indian, and some come from Pacific islands, for example.  (One handy feature of the listing is that, if you click on the English name of the language, in the second column, you will get the Wikipedia page that describes that language.)

The listing also gives some other interesting statistics on the various Wikipedias, including the number of registered and active users, the number of administrators, and the number of edits.  It also includes a measure called “depth”, defined as:

Edits/Articles × Non-Articles/Articles × Stub-ratio

This gives a rough measure of how frequently articles are updated, and is one aspect of the articles’ quality.  Again, it is hardly surprising that English has the highest depth score, at 749; almost all other languages have scores less than half as large, although a few local languages get relatively high depth scores despite having very few articles (for example, Fijian, with a depth of 451 and only 265 articles).  This probably reflects a small group of “hard core” enthusiasts.  Gothic also shows a high depth score at 394, with 431 articles, despite having no native speakers; the language was effectively extinct by about the ninth century AD.  (There is a considerable extant corpus of written material in Gothic.)
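
To make the arithmetic concrete, here is a minimal Python sketch of the depth formula as quoted above.  The function name and the sample figures are made up for illustration (they are not real Wikipedia statistics), and the stub ratio is simply passed in as a number, since the formula as quoted does not say how it is computed.

```python
def depth(edits, articles, non_articles, stub_ratio):
    """Depth score, computed exactly as the quoted formula writes it:

        Edits/Articles x Non-Articles/Articles x Stub-ratio

    stub_ratio is taken as given; how it is derived is not
    specified here.
    """
    if articles <= 0:
        raise ValueError("articles must be positive")
    return (edits / articles) * (non_articles / articles) * stub_ratio


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the calculation.
    print(depth(edits=1000, articles=100, non_articles=50, stub_ratio=0.5))
```

Because the article count appears twice in the denominator, a small Wikipedia with a handful of very active editors can post a high score, which fits the Fijian and Gothic figures.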

All of this is interesting to browse through, and speculate about; I imagine it could be a useful resource for students of linguistics.  The availability of articles in so many languages is a positive sign that the Internet is doing something useful to spread knowledge around the world.

## Watson Goes to College

March 9, 2013

Back in early 2011, I wrote a number of posts here about IBM’s Watson system, which scored a convincing victory over human champions in the long-running TV game show, Jeopardy!.   Since then, IBM with its partners has launched efforts to employ Watson in a variety of other fields, including marketing, financial services and medical diagnosis, in which Watson’s ability to assimilate a large body of information from natural language sources can be put to good use.

Now, according to a post on the Gigaom blog, Watson will, in a sense, return to its roots in computer science research.  IBM has supplied a Watson system to the Rensselaer Polytechnic Institute [RPI] in Troy, NY.  According to Professor James Hendler, author of the post, and head of the Computer Science department at RPI, one focus of the work with Watson will be expanding the scope of information sources the system can use.

One of our first goals is to explore how Watson can be used in the big data context.  As an example, in the research group I run, we have collected information about more than one million datasets that have been released by governments around the world. We’re going to see what it takes to get Watson to answer questions such as “What datasets are available that talk about crop failures in the Horn of Africa?”.

Some of the research work with Watson will also be aimed at gaining more understanding of the process of cognition, and the interplay of a large memory and sophisticated processing.

By exploring how Watson’s memory functions as part of a more complex problem solver, we may learn more about how our own minds work. To this end, my colleague Selmer Bringsjord, head of the Cognitive Science Department, and his students, will explore how adding a reasoning component to Watson’s memory-based question-answering could let it do more powerful things.

The Watson system is being provided to RPI as part of a Shared University Research Award granted by IBM Research.  It will have approximately the same capacity as the system used for Jeopardy!, and will be able to support ~20 simultaneous users.  It will be fascinating to see what comes out of this research.

The original IBM press release is here; it includes a brief video from Prof. Hendler.