Monday, June 04, 2007

Great Divides in the Internet

It's been almost a year since my last blog post. I've been busy programming OpenEpi, Version 2, and writing a book on how to use and program Epi Info and OpenEpi. I plan to make drafts of the book available, as it develops, sometime this summer (US or Dominican) on my website, http://www.epiinformatics.com/. OpenEpi, Version 2, is already available at http://www.openepi.com/ .

But, most likely, you are here to read about the Great Divides or barriers to Internet use that lie like fallen trees across the road to informatics heaven. These include:
1. The gulf between browser content and the local operating system and hard disk. Determined users can manage to save HTML pages to disk, or download selected files, after agreeing that they know how stupid this is, but there is no easy, non-proprietary method for an Internet program to operate both from the hard disk and from the Internet. We explore solutions below.
2. The reason for the Great Divide between Internet and Local Computer, or the Wall Around the Browser, is human evil, or, to put it more delicately, concern for security. Experience has shown that the anonymity of the Internet offers perfect refuge for both malware hackers and advertisers to invade one's computer without conscience or penalty. Hence the browser has been designed more or less like an isolation bubble, protected even from its owner. If everyone who interacts with your computer could be correctly identified, as, for example, those who come to your front door, or write you checks (well, OK, better than that), many more possibilities for safe computing would arise.

Hence, I'm describing two "Great Divides," the Wall Around the Browser, and the Myth That Anonymity Is a Good Thing. In a public health data system, it is not a Good Thing, and we need simple, painless, and effective means of identifying each other when on the Net. As I mentioned in another blog, facial recognition seems to be a leading candidate, but I am not particular as long as it works better than the cumbersome machinations of most Internet banking and purchasing systems. As a busy professional, I don't have time to call Utah or Atlanta every time my card expires, or I lose a card or the 24th password I have acquired this year. I might be willing to do so if my face expires, due to injury or plastic surgery.

OpenEpi works in all major browsers in Windows, the Macintosh, and Linux. For saving data, however, we had only limited choices:
1) Cookies--limited to small chunks of data, and not portable between browsers
2) A Microsoft-only version of web pages called HTA, which lets JavaScript in Microsoft Internet Explorer access the FileSystemObject in Windows, and, with quite a bit of code, read and write files. This is the method we provide with OpenEpi in the program called OpenEpiSave to save output files. It works, but only in Windows, and only with IE.
3) Provide a web server, either remotely, or on the user's own computer. We don't really want to run a server for the world, but there are some early resources for doing this kind of work, such as Google's Docs and Spreadsheets, and Amazon S3. Docs and Spreadsheets is limited to particular kinds of data and S3 charges for storage; both are rapidly evolving. I've installed several servers on my laptop, and there are some systems that make installation more or less painless, but understanding the server and keeping it from serving your disk to the world are important.

In the past few days, I have been reading news articles about Google Gears, a browser plugin that allows Internet applications to operate either via the Internet and a server, or, when the Internet is not connected, with stored data on the local computer. Google is making the code available to the world, and this sounds like it might be what is needed to put data storage and access on a solid footing in OpenEpi while preserving its cross-browser, cross-platform goals. Apparently what Google Gears does is provide a way to save and retrieve data in a protected "cache" on the local computer and also in a small database that it installs, called SQLite. In other words, the "browser bubble" now includes some local disk, and a credible database.
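To make the "work offline, catch up later" idea concrete, here is a minimal sketch of the pattern using Python's built-in sqlite3 module. This is not the Google Gears API, which I have not yet used; it only illustrates what a small local SQLite store buys an application. The table layout, the function names, and the "TwoByTwo" module name are all hypothetical, invented for illustration.

```python
import json
import sqlite3


def open_local_store(path=":memory:"):
    """Open (or create) the local store. Pass a file path to persist to disk."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "module TEXT NOT NULL, "
        "payload TEXT NOT NULL, "
        "synced INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def save_result(conn, module, data):
    """Save one result locally as JSON; works with or without a connection."""
    cur = conn.execute(
        "INSERT INTO results (module, payload) VALUES (?, ?)",
        (module, json.dumps(data)),
    )
    conn.commit()
    return cur.lastrowid


def pending_results(conn):
    """Return the rows that have not yet been sent to a server, oldest first."""
    rows = conn.execute(
        "SELECT id, module, payload FROM results WHERE synced = 0 ORDER BY id"
    ).fetchall()
    return [(row_id, module, json.loads(payload)) for row_id, module, payload in rows]


def mark_synced(conn, row_id):
    """Flag a row as uploaded once the (hypothetical) server has accepted it."""
    conn.execute("UPDATE results SET synced = 1 WHERE id = ?", (row_id,))
    conn.commit()
```

An application built this way would call save_result whenever the user stores output, and, when the Internet is available again, walk through pending_results, send each row to the server, and call mark_synced on success. Unlike cookies, the store is not limited to small chunks of data.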

Time will tell what Google Gears means for epidemiology, but time passes rapidly, in Internet years. Since Google has released the source code and documentation to the public, it certainly cannot be regarded as the "Microsoft Killer" that the Press, from their gladiatorial perspective, would find most welcome. It may be possible that innovation in informatics does not require the "death" of a rival company, but that both companies may continue to fight ignorance, isolation, and some of the other worldwide problems that computing can help solve.

So it appears that one large chasm has been bridged, but we still need painless, complete, and continuous authentication/identification systems so that those online are held responsible for their actions, and only those who should be are accessing sensitive data.

