Friday, April 29, 2011

Interdisciplinary Computing Meeting Number 2: Day 1, Part 1

Back in January I reported on a meeting on Interdisciplinary Computing I attended in San Diego. I am at a follow-up meeting in Tucson - we had our first jam-packed, mind-stretching day today.

The group of people attending this meeting is extremely diverse, which makes things interesting. We started off the day with an entire table full of people involved with creating a new area called computational journalism, and we have quite a few people at the intersection of physics and computer science. There are other interdisciplinary areas represented here as well, but perhaps one of the most interesting observations I made today is that we have such a strong contingent from the arts and humanities. This has injected some fascinating perspectives into the conversation. The computational journalism table soon split up and spread around, but it was impressive to walk in the door and see all the journalists and media people! I knew this was going to get interesting.

We covered a lot of ground. Here are a few of the highlights from today - things that really jumped out at me:

We had several breakout sessions to discuss topics including exemplars of Interdisciplinary Computing (IC), lessons learned, what is it that motivates faculty (and industry professionals as well) to pursue IC, and given what we revealed among ourselves, what are effective supports for faculty to pursue IC? Action Items in other words.

One of the surprises for me right off the bat, was to learn that the field of Journalism (not "Computational Journalism", but "Journalism") is currently having serious discussions within the community about how to define themselves. Now, in CS, we have been having this discussion for a long time and the discussion evolves about as fast as the field. But Journalism - I would not have guessed. It is not a new field; it was an eye opener for me, and others I believe, to hear that another field is wrestling with the same question: "what does it mean to be a journalist?" ("what does it mean to be a computer scientist?"). There is something to be learned from this shared definitional wrestling for meaning.

After pondering the issue all day I asked a question about it at our end of afternoon discussion. The most interesting part of the response, for me, was when one of the faculty heavily involved in developing the area of computational journalism opined that this common "problem" was one of the reasons he felt his field and computer science were able to work together. Because (I'm paraphrasing here) both fields are working to define an identity (or redefine, or refine, choose your pov), they are able to come together and form something new and original. I took that to mean there was a fluidity and flexibility supporting interdisciplinary collaboration, in part because the disciplinary boundaries were not so rigid; those in each field who were interested in IC used the identity challenge as an opportunity to break new ground.

In another breakout session (discussing what motivates the faculty who are already doing IC) the table I was sitting at spoke about how for many people "it is in our blood" and those people will do IC whether they are new faculty, established faculty or somewhere in between. Our table at least, had universal agreement on this point, countering the point I have heard (and others have often made) that the people who do IC are either 1) those who are brand new and have nothing to lose - i.e. they want to break new ground and are full of enthusiasm or 2) those who are secure in tenured positions and feel that it is now "safe" to pursue this passion.

It was interesting to hear a table full of people focus not on funding as a primary motivator (although everyone agrees that funding is needed and critical) but on personality - passion, interest, determination, being a maverick (someone tossed out that phrase). It was nice, I have to admit, to be surrounded by a table full of people who felt as I do, that it is "in our blood" (I was not the one who popped out with that phrase but it certainly feels accurate to me).

A question of course comes from this last point: how to support those who want to pursue IC but who are not in any of the three categories: new and fearless, established and "safe", or ordained by virtue of personality. How to support those who will perform IC given a conducive environment? Because it became clear to all of us, I would hazard to say, that a conducive, supportive environment is absolutely vital for achieving critical mass and developing self-sustaining IC initiatives and programs.

These are some of the strong impressions that leapt out at me today as I left the indoors for the first time at 5:30 pm to unwind some kinks in the outdoor hotel pool. A good place to let things soak. In my next post, tomorrow hopefully, I'll talk about some of the actionable ideas we came up with.

Monday, April 25, 2011

Succeeding in Socially Beneficial Activities Requires Some Special Qualities

As I work my way through the final edits of my book on socially beneficial uses of computing I have been reviewing chapters one after another. Some of these chapters I had not looked at for several months. Reading them back to back like this I am struck by some common themes - technical and not technical.

First, perhaps the least surprising: passion. Computing professionals working for social and environmental causes (and do not assume that none of these people work in traditional profit-seeking corporations) are passionate about what they do. They get up in the morning ready to apply their computer science experience to something they truly care about beyond themselves.

Second: Many work on problems where there will never be (or is highly unlikely to ever be) "THE solution" or indisputable "proof" about the data and results they work with every day. These computing professionals have a flexibility and tolerance for ambiguity not often sufficiently lauded in the curricula I have experienced. When there is a monocular focus on proving things, using the scientific method to accept or reject hypotheses, it is possible to lose sight of the impact we can have on seemingly intractable real-world problems.

Third: I would be remiss not to include a technical / process issue. Perhaps this one will not surprise you so much; it didn't surprise me. However it did surprise me to learn, to experience, just how much achieving successful impact relies on this: not cutting corners. There are a million ways to cut corners in hardware, software and system development. It is true the culture of many institutions of learning doesn't sufficiently reward validation and verification. However, learning just how much it matters tells me we need to work harder at instilling the importance of these activities when future and new computer scientists are in their most formative stages.

These are just a few of the themes I have encountered; any one of them could make for a lengthy discussion. I must point out that the many projects I have investigated and researched all embody these themes; else they would not have succeeded in their socially beneficial activities and thus attracted my attention. 

I don't know what you think but I feel honored to have had the opportunity to meet (sometimes only over the phone, Skype or email) the people who have taught me so much about computing "doing good".

Tuesday, April 19, 2011

IFF Computing == Cabbage

I am periodically asked "what is it like to write a book?" If you have written a book you may be having a small laugh to yourself all of a sudden. A sort of evil chuckle. Because you know, .... well, I don't need to tell you. It takes you over in strange and wonderful and unexpected ways.

You probably also know that no matter what you say it may not convince the conversant that conceiving cognitively comprehensible and convivial concepts takes considered concentration. Even if you say "it is Computer Science!" (people often assume fiction for reasons I know not why) there is ofttimes a clinging to the concept that you must spend much of your time continuously cavorting. Come again? If we waited to compose for when we felt suitably inspired by The Muse ... my Editor could conceivably consider calling out the Cosa Nostra (uh, just kidding...right, Randi and John?)

Perhaps my more-often-than-usual far away look, prompted not by my new glasses so much as by a looming deadline, brings on the question with greater frequency lately. However, I find myself contemplating a different question: "How do you know when you are really and truly becoming one with interdisciplinary computing?" (with which topic my book most certainly is concerned).

I have the answer. It came to me this evening as I took a break and sat looking out over a neighborhood canyon just breathing calmly. Even in stillness, everywhere I looked I saw things that reminded me of a computer. "Canyon - how lovely...oh, Canyon starts with the same letter as Computer." "Critter poo on the sidewalk leading up to the bench....Critter reminds me of Computer". "I Cannot see the stars because there are Clouds in the sky. Clouds? Computer!" "Concentrate on your breathing ...Concentrate. Concentrate. Computer Computer Computer".

A Colleague gave me a red Cabbage from their garden. "Oh my gosh that was good - a fresh Cabbage tastes like Candy Compared to Cabbage from the grocery store." Candy? Compared? More "C" words! Cabbage is an excellent source of Vitamin C and beta-Carotene. Consuming large amounts of Cabbage reduces the Chances of getting Colon Cancer because it Contains Chemicals that protect Cells against free radicals. All those words Commence with the same letter as COMPUTER! Perhaps I have passed the threshold and am now officially (Crazy?) Coalescing with interdisciplinary Computing and Computer science and Computational thinking.

The Cabbage Convinced me. Computers are truly everywhere. All you have to do is Consider it.

Friday, April 15, 2011

Computing and Helping Sea Turtles with Social Media

The Annual Symposium on Sea Turtle Biology and Conservation was held here in San Diego this week. I was fortunate enough to be able to spend a few hours speaking with someone attending the conference. I have been doing research into computing used in sea turtle conservation work for some time as part of my book project so this was a really nice opportunity to share ideas and compare notes in person.

One of the messages that comes through loud and clear: there is an enormous opportunity for the computing discipline to work hand in hand with environmental groups. There are highly technical computing activities such as the use of GPS technology (tracking animal movement for example) and modeling (projections of complex scenarios and how different factors might affect outcomes) and then there is a sometimes overlooked opportunity for the effective use of social media. We hear about social media playing a critical role in reporting international political events, but we do not hear so much about leveraging social media as a serious and ongoing computing career option.

When the Deepwater Horizon oil spill occurred last spring, how did the public get their information? Some people listened to television and traditional media. However, many went elsewhere. Scientists of all kinds and other content experts, working round the clock, who had more to say than what fit in a 30 second sound bite, used social networking outlets. For example, last summer a blog maintained by the Sea Turtle Conservancy posted statistics about turtles impacted by the oil spill; the same organization used their website, Facebook and other online venues to distribute solid data-supported information.

Professional computing skills can contribute enormously to the effective use of social networking. When to post, what to post, how to target the post. In the case of reporting on sea turtle issues this means computer science personnel can use their software engineering skills. These skills include principles of good UI design, and requirements gathering to provide a target audience the information they are interested in, in the most effective way online.

Additional computing skills are needed to build an effective public education and outreach website. HTML? Cascading Style Sheets? A content management system? SharePoint pages? Connections to databases of organized information - how should all that data be stored and revealed to the users through different media? ... There are many ways to build a site and it is critical to understand what approach makes the most sense and how to leverage technical capabilities in alignment with content goals. The biologists know their sea turtles. They know what the critical information is and what actions need to be taken, be it with the public, government entities, other organizations or stakeholders.

There is another item worth mentioning. A computer scientist who works with an environmental group such as those supporting sea turtle conservation is most definitely not going to spend her or his life sitting in a cubicle all alone. She (or he) might well find herself (or himself) on a remote beach late at night watching for rare migrating turtles looking to crawl up on a beach to lay eggs. So I learned from my research - travel to interesting locations around the world and immersion in the natural environment come with the job.

Are you a computer scientist who wants to work for an environmental cause? Do you like to travel? Does the idea of getting dirty, muddy or bug bitten sound like not a big deal? This could be your calling.

Sunday, April 10, 2011

Security Mechanism Can Enhance Historical Record Keeping

Recently a friend showed me an incredibly creative application of computing technology with all sorts of possible uses. I am a language buff (that means Big Fan) so I immediately grabbed onto historical language issues. Now, this technology project has been around a little while so it isn't completely hot off the press. But I didn't know about it so you may not either.

The reCAPTCHA project. You probably (?) know what a CAPTCHA is: those squiggly tortured hard to read words you have to type in sometimes to gain access to a site, sign up for an online account or post a comment on a blog. If you don't type the word correctly you don't get in. Fortunately you get additional chances with new words. This technology exists because people can usually read the words but computers cannot. This keeps automated spamming out.

Here's the cool upgrade. Have you seen one of those where you have to type in TWO words next to one another? Why two? Isn't one enough? Well, this is what is going on. There are projects underway (which you probably *are* familiar with) to digitize books, newspapers, archival texts. The really old ones, the hand written ones, the faded or slightly crumpled ones - the OCR (Optical Character Recognition) software being used to scan and translate those texts cannot read many of the words.

Fortunately the computer identifies words it doesn't understand. Many of those words are being placed into those CAPTCHA screens. So you get two words: one is a known word, made to look hard to read, and the second is a word that was hard to read to start with (perhaps made more so, just to be safe :). That same unknown word is paired off and given to many many many people in CAPTCHA boxes. If you get the known word right, you get in (at least that is my understanding). If the vast majority of people get the known word right AND provide the unknown word with the same answer, the odds are the unknown word has been correctly identified and it is plugged into its spot in the original scanned text.
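For the programmers reading along, here is a minimal sketch of that agreement idea as I understand it. It is not the reCAPTCHA project's actual code: the function name, the vote threshold and the agreement ratio are all made up for illustration.

```python
from collections import Counter


def resolve_unknown_word(submissions, known_answer, min_votes=3, min_agreement=0.75):
    """Return the agreed transcription of the unknown word, or None.

    submissions: list of (known_guess, unknown_guess) pairs, one per person.
    min_votes and min_agreement are illustrative assumptions, not real settings.
    """
    # Only count people who typed the known (control) word correctly.
    trusted = [
        unknown.strip().lower()
        for known, unknown in submissions
        if known.strip().lower() == known_answer.strip().lower()
    ]
    if len(trusted) < min_votes:
        return None  # not enough trustworthy answers yet

    # Accept the most common answer only if enough trusted people agree on it.
    word, count = Counter(trusted).most_common(1)[0]
    if count / len(trusted) >= min_agreement:
        return word
    return None


submissions = [
    ("morning", "upon"),
    ("morning", "upon"),
    ("morning", "upon"),
    ("mourning", "open"),  # missed the known word, so this answer is ignored
    ("morning", "upan"),
]
print(resolve_unknown_word(submissions, "morning"))
```

Ignoring anyone who misses the known word is what keeps a careless (or automated) typist from polluting the scanned text.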

I realize the book scanning projects are controversial in some spheres, especially in regard to recent works. What captures (oh ow ow pun) my imagination is the thought of ancient Medieval texts written by monks with all those flourishes being scanned and translated. Ok, they are usually in Latin. Providing a Latin word in a CAPTCHA screen might be a giveaway, but the idea of being able to digitize previously inaccessible texts by harnessing the power of millions of people passing through online security checks is impressive. For now we may have to stick with Victorian English or Elizabethan English or whatever the native language is of the country where a site exists, but eventually, I suspect, computer scientists will find a way to put this to work with languages that are no longer in common use.

Meanwhile, and this might already be in the works, languages like Latin or other special-use languages might be applied to sub-populations of sites where they make sense. To get into a repository of ancient Sanskrit resources you have to type a two word Sanskrit CAPTCHA for example? Ancient texts can be made publicly accessible much faster than they would have been otherwise.

More information about the project: http://www.google.com/recaptcha/learnmore

Sunday, April 3, 2011

Internet Voting: Security, Networking, Politics and Some Heat

With social upheavals happening in countries across the Middle East and northern Africa, the media and politicians have been talking a lot about democracy and the rights of individuals to freely express themselves politically. Although we are not in a similar situation here in the US - we are not risking civil war or overturning the government - we do have an issue that provokes heated debate and widely divergent perspectives among computer scientists: Internet Voting. The US Constitution guarantees the right to vote for its citizens. Most of us take this right for granted I suspect, whether we act on it or not. It amazes me when I hear how many people do not vote regularly, given that we can do so safely and without worrying about repercussions, but that is another issue.

For the past several years various groups and organizations have been debating, proposing legislation for, and running pilot projects using the Internet to vote. Why? A big motivator is to increase access. American citizens living overseas (civilian, military, government employees) often want to vote but if they have to rely on the postal services of several countries for ballot requests and return, or if they have to travel great distances to get to an approved polling place it can be a significant burden. State and county agencies in this country wonder if they can improve reliability and efficiency while reducing costs by implementing online voting. Given the current fiscal situation in this country that is a huge motivator. The problems of access, reliability and budgetary pressures just scream for consideration of a computing based solution. Maybe it will work, maybe it won't, but if it is going to happen it is going to be computer scientists who make it so (A Captain Picard reference in case you didn't pick up on it).

There are pros and cons and this is where opinions run hot. Let's take just one example: the need to guarantee the security and privacy of such sensitive data if it is sent over the Internet.

Why is a "guarantee" of correctness on the Internet so difficult? As many of you know, the Internet was originally designed by the US Department of Defense for use by a select number of trusted sites for the purpose of guaranteeing communications in the event of a national emergency such as a nuclear attack. Therefore, redundancy of data was the primary concern, not security of data. If Washington DC was attacked, multiple copies of sensitive data would be accessible in repositories across the country. This legacy lives on today, as the Internet has grown into a global network of diverse systems connected with everything from telephone lines to satellites, connecting state of the art computers and mobile devices to 25 year old legacy mainframes. As a result, it seems that we can communicate with almost anyone anywhere at anytime and because of built-in redundancy systems, data almost always gets through - unless someone intentionally interferes. And therein lies the problem. There will always be people who try to steal, or at least read and exploit, data that is not "theirs". Although not unique to electronic voting systems, the vulnerabilities of the Internet bring added attention to this old problem. We do *not* want anyone intercepting electronic ballots, or compromising a voting web site. How good is "good enough" when it comes to accountability and reliability?

How are these issues currently being tackled? I'm going to get technical for two paragraphs in order to provide a flavor of what computer scientists are currently doing in the realm of computer security for sensitive data.

The focus of much data security work is at the application and system levels. Onion routing is an interesting technique that supports anonymity across complex networks. As data is passed from an original source to a final destination it passes through many intermediate nodes (which may reach into the hundreds). Data that is “onion routed” has layers of encryption that are successively “peeled off” at each node, assuming the node in question has the correct cryptographic technique and knowledge to do so. Thus many successive security checks take place and a suspicious event at any node along the path will raise a warning flag. The action then taken is application dependent, and can include halting the process, rerouting the data or other customized response. Two other well known network level security technologies include strategic placement of firewalls and the use of Virtual Private Networks (VPNs). Various methods of authentication, encryption, verification, digital signature and hash functions find their way into this work. How well do state of the art data security efforts protect data and provide an audit trail?
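To make the layering idea concrete, here is a toy sketch. Everything in it is an assumption for illustration: the function names, the node keys, and above all the "cipher", which is a home-made XOR keystream and not real cryptography. A real onion router uses vetted encryption and each layer also tells the node where to send the data next. The integrity tag on each layer stands in for the check that lets a node notice a wrong key or altered data.

```python
import hashlib
import hmac

TAG_LEN = 32  # length in bytes of a SHA-256 HMAC tag


def _keystream(key, length):
    """Stretch a key into `length` pseudo-random bytes (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def _xor(key, data):
    """Toy cipher: XOR data with the keystream. Applying it twice undoes it."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def add_layer(key, payload):
    """Encrypt the payload and prepend an integrity tag for this layer."""
    body = _xor(key, payload)
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def peel_layer(key, onion):
    """Check this layer's tag, then remove the layer. Raises if the check fails."""
    tag, body = onion[:TAG_LEN], onion[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed: wrong key or altered data")
    return _xor(key, body)


def wrap_onion(message, node_keys):
    """Wrap for the last node first, so the first node's layer is outermost."""
    onion = message
    for key in reversed(node_keys):
        onion = add_layer(key, onion)
    return onion


def route(onion, node_keys):
    """Pass the onion along the path, each node peeling exactly one layer."""
    for key in node_keys:
        onion = peel_layer(key, onion)
    return onion


keys = [b"node-a-key", b"node-b-key", b"node-c-key"]  # hypothetical node keys
onion = wrap_onion(b"sealed ballot", keys)
assert route(onion, keys) == b"sealed ballot"
```

If a node tries to peel a layer that is not its own, or a single byte is changed in transit, `peel_layer` raises an error. What happens next (halt, reroute, alert) is up to the application.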

Although opinions differ on how well current techniques protect data, computer security experts will generally agree that software alone will never be sufficient to detect or prevent tampering. Hardware level security checks are also needed. It is common practice in security intensive systems to include a Trusted Platform Module (TPM). This non-rewritable chip contains check code to search for tampering. First the boot loader is checked, which then checks the Operating System, which then checks the application(s) for an integrity breach.
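The chain of checks can be sketched in a few lines. This is only an illustration of the idea, not how a real hardware module is programmed: the stage names, the byte strings standing in for software images, and the function names are all hypothetical. Each stage is hashed and compared against a known-good digest before the next stage is trusted.

```python
import hashlib


def digest(blob):
    """SHA-256 fingerprint of a software image, as a hex string."""
    return hashlib.sha256(blob).hexdigest()


def first_tampered_stage(stages, trusted_digests):
    """Walk the boot chain in order and return the first stage that fails.

    stages: ordered list of (name, image_bytes), e.g. boot loader, OS, app.
    trusted_digests: known-good digests, assumed to live in protected storage.
    Returns None when every stage matches its trusted digest.
    """
    for name, blob in stages:
        if digest(blob) != trusted_digests.get(name):
            return name  # stop here: nothing after this stage can be trusted
    return None


# Hypothetical images standing in for real software.
stages = [
    ("bootloader", b"boot loader v1"),
    ("os", b"operating system v1"),
    ("app", b"voting application v1"),
]
trusted = {name: digest(blob) for name, blob in stages}

tampered = [stages[0], ("os", b"operating system v1 + rootkit"), stages[2]]
print(first_tampered_stage(stages, trusted))    # None
print(first_tampered_stage(tampered, trusted))  # os
```

The order matters: a compromised Operating System could lie about the application above it, which is why the check starts at the lowest level and works upward.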

There are high level issues to be addressed as well. Equally important requirements. An Internet voting system for US voters needs flexibility (to accommodate different state requirements), convenience for the user, training and education of both staff workers and voters. Particularly important is timeliness – the most flexible, friendly, convenient electronic voting system is pointless if the ballot arrives too late to be counted or has to be discarded due to suspicions of fraud.

There are always tricky issues when science and public policy intersect. Computing is particularly complicated because the field is so new, and technology changes so quickly. By the time consensus is reached on a contentious issue, the point may be moot. The technical issues alone for Internet voting are complex, but often come down to one issue: risk assessment. What degree of risk is tolerable in order to achieve a societal right guaranteed by the Constitution? In some cases entrenched opinion comes from holding a philosophical stance about whether or not the Internet can ever be an acceptable medium for any voting, overseas or otherwise.  Proponents argue that it is only a matter of time until Internet voting becomes reality, and also that the subject is a matter of morals, so we must address the problem directly. Critics counter that it is not possible with current technology to implement Internet voting at an acceptable level of security and privacy, so the risks of trying it outweigh the potential benefits.

If we can be successful, the payoffs are tangible: morally, ethically, fiscally. If we don't succeed the risks are serious: from disenfranchisement of citizens to interference with our democratic process in the worst case scenario.

What do we do? Move forward (how?), or sit it out and accept the current situation as good enough for now? What do you think?