Electronic Publishing


John Unsworth

Institute for Advanced Technology in the Humanities
University of Virginia

Information about Electronic Publishing Elsewhere on the Web

Terms Defined and Acronyms Unpacked:

For the purposes of this discussion, "Electronic Publishing" refers to the dissemination of information, whether text-only or multimedia, via the internet or through some hybrid of local media and networked archives. Networked electronic publishing can be the basis for producing these other media, and indeed for producing print, but in this document I won't be discussing the production of stand-alone Compact Discs, Videodiscs, or other forms of electronic publishing that have no networked component. In particular, I will be focusing on what one can do with the World-Wide Web as a delivery medium, both through the native capabilities of the browsers and the markup language, and with server-side extensions that use the CGI (Common Gateway Interface) facilities of most Web servers. A server is any software program out on the internet which provides information to the client, on request (to make life a little more confusing, the machines on which such software runs are also often called servers). The client is another piece of software (such as Netscape or Mosaic or Lynx) which requests that information and interprets it for display. There are other kinds of client/server protocols aside from the Web, but Web clients can handle many of these other protocols (ftp, telnet, gopher) as well. In order to be an information consumer on the Web, all you need is the browser (though you'll probably want some helper applications too, for playing or displaying material that the browser can't show you by itself). If you want to be an information provider, you need access to space on an internet host that runs a World-Wide Web server. The markup language for the Web is, as most readers will already know, called HTML (HyperText Markup Language). Hypertext is a term that refers to computer-based information organized into a series of linked nodes--and hypermedia is mixed-media hypertext.
Hypertext is not the name of a particular software product, nor is it the invention of a particular individual, nor is it something that can be created using only one set of tools or one data standard. HTML is defined by a DTD (document type definition) written in SGML (Standard Generalized Markup Language). The other acronyms that will come up in this discussion refer, for the most part, to data formats: .pdf (Acrobat) and PostScript are page-formatting languages; GIF, JPEG, and TIFF are still image formats; MPEG and QuickTime are motion video formats; .au, .aiff, and .wav are audio formats; ISMAPs are a special class of inline images in Web documents, images which have hypertext links "mapped" to various areas within them, so that a user can click on a portion of the image itself and get a result.
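
As a rough illustration of how the "mapping" in an ISMAP might work, the sketch below looks up a click position in a list of rectangular regions and returns the link for that area. The region table, the URLs, and the function name are all hypothetical; a real server reads its regions from a map file in its own format.

```python
# Hypothetical image-map lookup. The regions and URLs are invented for
# illustration and do not follow any particular server's map-file format.
from typing import List, Tuple

# Each region is (left, top, right, bottom, url), in pixel coordinates.
Region = Tuple[int, int, int, int, str]

REGIONS: List[Region] = [
    (0, 0, 99, 49, "/texts/index.html"),
    (100, 0, 199, 49, "/images/index.html"),
]
DEFAULT_URL = "/index.html"


def resolve_click(x: int, y: int, regions: List[Region] = REGIONS) -> str:
    """Return the URL of the first region containing the click, else the default."""
    for left, top, right, bottom, url in regions:
        if left <= x <= right and top <= y <= bottom:
            return url
    return DEFAULT_URL
```

A click that falls outside every listed region gets the default link, so the user always receives some result.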

Frequently Asked Questions:

At this point, you can use the Web to publish formatted text, still images, short film clips (with or without soundtrack), music or spoken word, and even information contained in and provided by other programs, as long as those other programs can be addressed on the command line (rather than through the graphical user interface). Any Web browser with a current implementation of the HTML standard will support fill-out forms (pages on which you can type or select options), tables (groups of cells that retain their physical layout on the page, and that can contain text, images, or hypertext links), and ismaps (images with embedded hypertext links). Some browsers (especially Netscape) go beyond the standard to offer backgrounds (colors or images behind the text of the page), blinking text, and other idiosyncratic features. At the beginning (way back in 1991), the technology of the Web was such that a file had to be entirely received before it could be played or displayed by the software at the receiving end. Netscape users will be familiar with some current exceptions to this rule, for example the gradual dithering-in of images as they are being received; also, most browsers now will display the first parts of a text page while later parts are still coming in. It is now becoming possible to do what's called "streaming" of data through the Web: with programs like RealAudio, for example, ten or fifteen seconds of the file are received and then playback begins, while the rest of the material is still arriving. Adobe's PostScript and Acrobat formats offer absolute control over page layout (though at the price of slower transmission of textual information). Macromedia Director--a widely used programming environment for CD-based multimedia--is soon to be incorporated as a native format in some Web browsers, and Macromedia is working on its own Web-oriented players and compression techniques.

No. Any Web server that you use to distribute your information, or that your service-provider uses, will allow you to restrict access to individual files, whole directories, or entire web sites, based on the network domain or userid and password of the user. This means that, with current technology, you can implement both institutional site-license and individual subscription models of cost recovery. Financial transactions themselves are still best conducted offline, but there are secure servers which, when they are communicating with the right sort of browser, provide a level of protection comparable at least to the security you have in a telephone transaction. Pay-per-use models can also be instituted with current technology, though these systems at the moment tend to rely on the user's establishing an account with the information provider, against which account his or her activities are charged. More immediate and more transparent charging and payment mechanisms, including secure electronic credit card transactions and secure electronic checking, are not far off. Having said that, though, I should add that the best current advice, supported by what empirical evidence there is to date, suggests that it makes sense to give away at least part of what you provide on the net, in order to attract users and in order to give them a sense of what the product actually is, and whether it's worth buying. It's also worth pointing out that, in the history of the computer and software industry, there are a number of counter-intuitive examples of business success founded on the free distribution of intellectual property: for example, Netscape giving away its browser software, Adobe making its PostScript language public, IBM allowing the cloning of its PCs. Even in the area of information, those who have attempted to charge for everything, or to charge for things in this medium as they might in print or other media, have generally failed for lack of customers.
One other thing worth bearing in mind is that the Web can easily be used to create unexpected marketing synergies: free distribution of information in electronic form over the network might well increase sales of that same information in print.
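
The access restrictions described above (by network domain for site licenses, by userid and password for individual subscriptions) can be sketched roughly as follows. This is only an illustration of the logic: the domain suffix, the account table, and the function name are hypothetical, and an actual Web server is configured through its own access-control files rather than code like this.

```python
# Hypothetical access check. The licensed domain and the subscriber account
# are invented for illustration; a real server would not keep plain passwords.
from typing import Dict, Iterable, Optional

LICENSED_DOMAINS = (".example.edu",)                 # institutional site licenses
SUBSCRIBERS: Dict[str, str] = {"reader": "secret"}   # individual subscriptions


def is_allowed(host: str,
               user: Optional[str] = None,
               password: Optional[str] = None,
               domains: Iterable[str] = LICENSED_DOMAINS,
               accounts: Dict[str, str] = SUBSCRIBERS) -> bool:
    """Grant access if the host is in a licensed domain or the login matches."""
    host = host.lower()
    if any(host.endswith(suffix) for suffix in domains):
        return True
    # Otherwise fall back to an individual subscription login.
    return user in accounts and accounts[user] == password
```

For example, `is_allowed("lab.example.edu")` succeeds on the strength of the site license alone, while a reader connecting from an unlicensed host must supply a matching userid and password.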

You have two options: you can rent space on a Web server that someone else runs, or you can set up and run your own. To run an effective Web site, you will need a machine that stays on the internet all the time, so that it can have its own permanent address. This means that you need a connection other than the kind you can make with a modem. The options open to you for this higher-speed line will depend on your local telecommunications and/or internet service providers. The cost of setting up such a connection may be quite high, depending on several factors: the speed of the line, whether or not you have to buy your own router, and the method of calculating service costs. If you are setting up your own site, you'll also need to register an internet domain with InterNIC, and if you're registering a domain that could have other domains underneath it, then you'll need to provide InterNIC with evidence that you can responsibly administer and maintain this higher-level domain. Remember, too, that if you are maintaining your own internet host, you may need to hire someone to keep this new equipment running. By comparison to those costs, the actual server hardware is likely to seem cheap: you can run Web server software on Macs, Windows NT machines, and Unix workstations. My own recommendation, especially if you expect your site to grow, would be to use a Unix machine as the server: at the low end, these machines will not cost more than a high-range PC, and they are much better at serving several requests at once. Second to Unix, I would recommend Windows NT as the operating system of choice for running Web server software. You may well decide that you don't want to set up your own site: every major city in the U.S. has a growing number of I.S.P.s (Internet Service Providers) who can rent you space on a Web server and provide you with a range of services that may include markup, programming, and other things. The I.S.P. business is a very new thing, so be sure that you are comfortable with the provider's abilities, services, and reliability. Also bear in mind that many of the more customized functions of the Web depend on custom CGI programs--often Perl scripts or C programs.
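
To give a sense of what such a CGI program involves, here is a minimal sketch, written in Python rather than Perl or C. It assumes the server passes the request's query string in the QUERY_STRING environment variable and expects a header, a blank line, and then the document on standard output; the "name" form field and the greeting are hypothetical.

```python
# Minimal CGI-style program. The "name" form field and the greeting text are
# hypothetical; only the header / blank line / body layout is the point here.
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(query_string: str) -> str:
    """Build a complete CGI response: header line, blank line, then the HTML body."""
    fields = parse_qs(query_string)
    name = fields.get("name", ["reader"])[0]
    # Escape user input so it cannot inject markup into the page.
    body = "<html><body><p>Hello, {}.</p></body></html>".format(html.escape(name))
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # A CGI-capable server supplies the query string in this environment variable.
    sys.stdout.write(build_response(os.environ.get("QUERY_STRING", "")))
```

If you rent space from a provider, it is worth asking whether you will be allowed to install programs of this kind, since they run on the provider's machine.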

There are a number of software tools for creating HTML pages. Some of these are shareware HTML editors, such as HTMLAssistant (for DOS/Windows machines) or [what's it called?] (for the Mac). WordPerfect offers a free module for WP 6.1 called Internet Publisher, as well as a more elaborate SGML Edition of WP 6.1. Netscape Gold is a reasonably priced authoring package for the Web, available for Mac and Windows machines (Unix?).

Some shareware helper applications such as LView Pro (a Windows image viewer/editor) can also be useful in editing the graphical content of your pages--for example, in establishing a transparent background for inline images (the effect which allows images to appear transparently overlaid on the page, instead of appearing in a box). Many image-editing programs will allow you to produce interlaced GIFs (the kind of inline graphic that dithers into view as it is received) and JPEG images, which are the most economical format for color images (Netscape, but not Mosaic, can display inline JPEGs). If you are trying to present an enlargeable image of print or handwritten text, TIFF is a good format, though for most other purposes TIFFs hold more information (and thus take up more disk space and transfer time) than is really necessary. Adobe's Photoshop is probably most people's tool of choice for creating and refining Web graphics.

For sound files, shareware players like Wham (for Windows) or SoundStudio (for Macs) will offer some editing capabilities: you can produce audio files with any combination of hardware and software that can capture audio input and save it in a few standard formats (the Sun .au format and the Windows .wav format are the most commonly used on the Web). If you're working with audio for the Web, remember that an 8 kHz or 11 kHz sampling rate is almost always sufficient, and mono playback is all that most people will be able to produce: higher settings in the software that creates the file will substantially increase the file-size without providing the end-user with an appreciable increase in quality. The same goes for the soundtrack on QuickTime or MPEG-2 video clips. As for the video content, fifteen frames per second provides a good compromise between file-size and video quality--and the smaller the frame size, the smaller the resulting file will be. Adobe Premiere is a good (Mac-based) tool for video capture, but there are a number of others. A 30-second QuickTime clip, with a mono soundtrack, 15 fps, and a frame size of about 200 x 300 pixels will, after maximum compression (using something like Cinepak) come out to about a 3-4 MB file--about the largest file size that's practical on the Web. MPEG-1 will provide superior compression, and much smaller file size, because it offers no audio; MPEG-2 files do have audio, and will still be smaller than QuickTime files, but MPEG-2 is hardware-dependent, meaning that the end-user will need an MPEG-2 video board in his or her computer in order to be able to play back the clip. For that reason, and because there are now QuickTime players for Windows and Unix machines, QuickTime is currently a very popular video format on the Web.
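
The effect of the audio settings on file size is simple arithmetic: for uncompressed audio, size is the sampling rate times the bytes per sample times the number of channels times the duration. The sketch below works this out for a 30-second clip. The specific settings compared (11,025 Hz, 8-bit, mono against 44,100 Hz, 16-bit, stereo) are assumed values chosen for illustration, and the function name is hypothetical.

```python
# Size of uncompressed (PCM) audio. The two settings compared below are
# assumed, illustrative values, not figures taken from any particular tool.
def pcm_bytes(sample_rate_hz: int, bits_per_sample: int,
              channels: int, seconds: float) -> int:
    """Return the size in bytes of uncompressed audio with these settings."""
    bytes_per_sample = bits_per_sample // 8
    return int(sample_rate_hz * bytes_per_sample * channels * seconds)


low = pcm_bytes(11025, 8, 1, 30)    # modest settings: 330,750 bytes
high = pcm_bytes(44100, 16, 2, 30)  # CD-style settings: 5,292,000 bytes
ratio = high // low                 # the second file is 16 times larger
```

Under these assumptions the higher settings multiply the size of the same thirty seconds of sound by sixteen, which is why the modest settings are usually the better choice for network delivery.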

Elements of Successful Web Publishing

Every Web page, almost without exception, should have a few basic features. One is an anchor pointing back to the home page for the person, department, business, or other institution which sponsors the document. Another is a date tag at the bottom of the page, showing (automatically) when the page was last modified. Before it is parsed on the way out by the server, this tag looks like this: Last Modified: <!--#echo var="LAST_MODIFIED" -->
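
To make the parsing step concrete, the sketch below imitates what a server does with this tag: it finds the directive in the page and substitutes the file's modification time. This is an approximation for illustration, not the server's actual implementation; the function name and the date format are assumptions.

```python
# Rough imitation of server-side parsing of the LAST_MODIFIED echo directive.
# The output date format is an assumption; servers let you configure it.
import re
import time

ECHO_DIRECTIVE = re.compile(r'<!--#echo\s+var="LAST_MODIFIED"\s*-->')


def expand_last_modified(page: str, mtime: float) -> str:
    """Replace each LAST_MODIFIED directive with a formatted timestamp (UTC)."""
    stamp = time.strftime("%A, %d-%b-%Y %H:%M:%S", time.gmtime(mtime))
    return ECHO_DIRECTIVE.sub(lambda _match: stamp, page)
```

In use, the modification time would come from the file itself, for example with `os.path.getmtime(path)`, so the date shown to readers changes whenever the page is edited, with no effort on the author's part.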