
The OCAP Digital Video Recorder Specification

Building DVR applications

LINDEN DECARMO

The instant a DVR is released, hordes of hackers descend on it to figure out how to add additional hard-drive capacity and tweak the software. Hacking DVRs is popular because manufacturers don't use open APIs and discourage software tweaking of their DVRs. Fortunately, CableLabs (the R&D consortium for the cable television industry) has published the OCAP Digital Video Recorder (DVR) specification that defines an open API that should minimize the need to hack future DVRs based on this technology. In this article, I examine the OCAP DVR spec and show how you use it to create DVR applications.

Monopoly Versus Competition

The software in virtually all cable DVR platforms is proprietary and highly secretive (the exception being TiVo's HME; see "Building on TiVo," DDJ, March 2005). To create applications for these proprietary platforms, you usually must sign a Non-Disclosure Agreement (NDA) and pay DVR vendors tens of thousands of dollars before obtaining an SDK. Hackers typically don't have thousands of dollars to spend and often won't sign NDAs as a matter of principle. Consequently, they resort to other means of tweaking the software.

Linden is a consultant engineer at Pace Micro Technologies Americas, where he works on DVR software architecture. He is the author of Core Java Media Framework (Prentice-Hall, 1999). He can be contacted at lindend@mindspring.com.

At the same time, the cable industry is trying to transition from legacy cable boxes with proprietary APIs to a Java-based API set called the "OpenCable Application Platform" (OCAP); see my article "The OpenCable Application Platform" (DDJ, June 2004). While the initial OCAP specification has a rich API set that can control High Definition (HD) and advanced captioning features found in modern set-top boxes, it is sorely lacking Video Recording (VR) functionality. Fortunately, this was not an intentional omission. The OCAP spec was released before DVRs became mainstream devices, and CableLabs did not try to cram a preliminary DVR specification into the standard before it was solidified.

In the years since the original OCAP spec was published, HD DVRs have exploded onto the market and become key revenue generators for many cable vendors. CableLabs realized that the addition of a Java-based DVR API was essential if OCAP was to win widespread acceptance. Consequently, it published the OCAP Digital Video Recorder (DVR) specification (http://opencable.com/downloads/specs/OC-SP-OCAP-DVR-I02-050524.pdf).

Bare Minimum

CableLabs has three goals for OpenCable:

• Define the feature set of next-generation hardware.
• Foster competition based on open standards.
• Enable boxes to be sold at retail locations.


To further these goals, CableLabs created OCAP — a middleware API that is operating system, hardware, and network neutral. By eliminating all proprietary operating systems and conditional access technologies, OCAP ensures competition by defining an open standard and allowing vendors to innovate based on their particular expertise.

CableLabs has adopted a similar approach with the OCAP DVR extension. Rather than taking an authoritarian approach and forcing all vendors to create clone DVR products, the DVR spec defines the minimum set of standards that any OCAP DVR platform must support. Vendors can differentiate themselves via enhanced functionality, price, or time to market. CableLabs only insists that each OCAP DVR implement the API it defines and provide the hardware capabilities it requires. This ensures that OCAP-certified DVR applications run on any OCAP-compliant DVR platform.

Each OCAP DVR box has at least one tuner, one time-shift buffer per tuner, local storage for digital video playback/recording, and local storage for a general-purpose filesystem. As I described in "Inside Digital Video Recorders" (DDJ, July 2005), a tuner is used to obtain live content from the satellite or cable network, and the time-shift buffer lets you perform trick operations (fast forward or pause, for instance) on the live content.

CableLabs has also wisely differentiated file operations on a general-purpose filesystem (NTFS, ext2, ext3, and the like) from recording and playing back content on a storage device. On some DVRs, a general-purpose filesystem (GPFS) is used to store both content and data files. By contrast, other DVRs use a specialized filesystem for recording and playing back content. Typically, these filesystems are highly optimized for large block reads and writes, and shouldn't be cluttered with small data files (some may not even let you create small files or use traditional file I/O APIs). Therefore, the OCAP DVR spec offers APIs to detect whether a medium is capable of general-purpose file I/O, content storage and retrieval, or both.

In addition to these basic hardware features, every OCAP DVR is capable of recording live content, playing or watching TV while recording, obtaining a listing of all available recordings, performing resource management, attaching permissions to content, and enforcing rights management and copy protection.

Inverse Evolution

The first wave of OCAP specifications tweaked existing European-based Multimedia Home Platform (MHP) Globally Executable MHP (GEM) specifications for North American cable products (http://www.mhp.org/mhp_technology/other_mhp_documents/tam0927-pvr-pdr-dvr-for-www-mhp-org.pps). By contrast, the initial iteration of the OCAP DVR spec was specifically designed for North American products and was not based on an existing European standard.

At the same time, DVB was adding its own flavor of DVR functionality to GEM. Because both efforts were based on MHP and contained many common elements, CableLabs and MHP decided to merge the core functionality in both DVR working groups to create the shared standard "Digital Recording Extension to Globally Executable MHP" (MHP document number A088; see http://www.mhp.org/mhp_technology/other_mhp_documents/mhp_a088.zip).

All A088 classes and interfaces are found in the org.ocap.shared namespace. However, if you try to create a Java-based DVR application using only A088, you'll be bitterly disappointed when the compiler spews out an avalanche of errors. This is because A088 provides common DVR interfaces and abstract classes that could be used on satellite, cable, or terrestrial products. It is missing key network and resource management classes needed to compile and link.

Figure 1: Timelines map Normal Play Times to an OCAP DVR environment. In this illustration, even though commercials have been added to the content, the NPT still triggers at the appropriate time.

CableLabs and MHP intentionally avoided these topics because this functionality varies dramatically between European (DVB) and North American cable environments. Consequently, A088 must be supplemented with concrete classes to create a viable DVR solution.

In OCAP, these concrete classes are located in the org.ocap.dvr namespace and are defined in the OCAP DVR spec. OCAP offers classes to handle features such as filesystem, resource, and network management. MHP's implementation is found in the A087 spec (http://www.mhp.org/mhp_technology/other_mhp_documents/mhp_a087.zip). It permits DVR applications to access European-oriented DVB Service Information (SI) and TVAnytime Java classes (TVAnytime defines features such as metadata to describe content).

Storage Search

Given that DVRs focus on playing back content, the first thing a DVR application typically does is detect the storage devices connected to the platform. This information resides in the StorageManager, a system object that monitors the availability of all storage-related devices. To obtain a listing of mounted filesystems, you call StorageManager.getStorageProxies( ). This returns an array of StorageProxy objects.

StorageProxies implement a variety of interfaces, the most interesting of which are DetachableStorageOption, LogicalStorageVolume, and MediaStorageVolume. The DetachableStorageOption interface is implemented by a StorageProxy that can be hot-plugged (or dynamically added or removed). These types of StorageProxy objects will be found on DVR platforms that have IEEE 1394, SATA, or USB 2.0 port(s). It is important to test if the StorageProxy implements DetachableStorageOption before attempting file I/O on it because the filesystem may need to be mounted before it is used. The filesystem may be mounted by calling DetachableStorageOption.makeReady( ) and unmounted via DetachableStorageOption.makeDetachable( ).
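As a concrete illustration of this startup sequence, here is a minimal sketch that enumerates the storage devices and mounts any detachable ones before use. It assumes the org.ocap.storage classes supplied by the receiver's OCAP stack; the getName( ) call is included only for illustration, so verify it against your platform's API documentation.

import org.ocap.storage.DetachableStorageOption;
import org.ocap.storage.StorageManager;
import org.ocap.storage.StorageProxy;

public class StorageScan {
    // Enumerate every storage device; mount detachable ones before file I/O.
    public static void mountAll() throws Exception {
        StorageProxy[] proxies =
            StorageManager.getInstance().getStorageProxies();
        for (int i = 0; i < proxies.length; i++) {
            // Hot-pluggable devices may need mounting before they are usable.
            if (proxies[i] instanceof DetachableStorageOption) {
                ((DetachableStorageOption) proxies[i]).makeReady();
            }
            System.out.println("Found volume: " + proxies[i].getName());
        }
    }
}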

When a storage device is hot-plugged, the StorageManager generates events to all interested parties that have attached a listener via the addStorageManagerListener( ) method. These events are divided into three categories (a listener sketch follows the list):

• STORAGE_PROXY_ADDED. A storage device was added to the DVR.
• STORAGE_PROXY_CHANGED. A storage device changed state (for example, the filesystem may be mounted or unmounted).
• STORAGE_PROXY_REMOVED. A storage device was removed from the DVR.
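A hedged sketch of wiring up such a listener follows. The notifyChange( ) callback name and the StorageManagerEvent accessors reflect my reading of the OCAP storage API, so treat the exact signatures as assumptions to verify against the spec.

import org.ocap.storage.StorageManager;
import org.ocap.storage.StorageManagerEvent;
import org.ocap.storage.StorageManagerListener;

public class HotplugMonitor implements StorageManagerListener {
    // Register for hot-plug notifications from the platform.
    public void register() {
        StorageManager.getInstance().addStorageManagerListener(this);
    }
    // Called by the stack whenever a storage device comes, goes, or changes.
    public void notifyChange(StorageManagerEvent ev) {
        switch (ev.getEventType()) {
            case StorageManagerEvent.STORAGE_PROXY_ADDED:
                // New device: rescan for recordable volumes.
                break;
            case StorageManagerEvent.STORAGE_PROXY_CHANGED:
                // A filesystem was mounted or unmounted.
                break;
            case StorageManagerEvent.STORAGE_PROXY_REMOVED:
                // Device gone: drop any cached StorageProxy references.
                break;
        }
    }
}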


The second interface surfaced by StorageProxies is LogicalStorageVolume. Again, some DVR storage devices may not permit you to perform general-purpose file I/O on them. If the StorageProxy implements LogicalStorageVolume, then you can get path information via the LogicalStorageVolume.getPath( ) and set file attributes with LogicalStorageVolume.setFileAccessPermissions( ).
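For instance, a sketch along these lines would gate general-purpose file I/O on the volume type. The FileAccessPermissions class comes from DVB's org.dvb.io.persistent package; the order of its six constructor flags is an assumption here and should be checked against the spec.

import java.io.File;
import java.io.FileOutputStream;
import org.dvb.io.persistent.FileAccessPermissions;
import org.ocap.storage.LogicalStorageVolume;
import org.ocap.storage.StorageProxy;

public class VolumeFiles {
    // Write a small data file, but only on volumes that allow file I/O.
    public static void writeNote(StorageProxy proxy) throws Exception {
        if (!(proxy instanceof LogicalStorageVolume)) {
            return; // content-only device: no general-purpose filesystem
        }
        LogicalStorageVolume vol = (LogicalStorageVolume) proxy;
        // Flag order assumed: world, organization, application (read/write).
        vol.setFileAccessPermissions(
            new FileAccessPermissions(true, false, true, false, true, true));
        FileOutputStream out =
            new FileOutputStream(new File(vol.getPath(), "notes.txt"));
        out.write("hello".getBytes());
        out.close();
    }
}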

The third critical interface exposed by a StorageProxy is MediaStorageVolume. Storage devices that host digital video and multimedia content implement the MediaStorageVolume interface. It offers methods to report the total space available to record content and the maximum bitrate of content it can record or play back. This is vital information because you don't want to record a 20-MBps High Definition stream onto a USB thumb drive that is only capable of writing 11 MBps.

Once you know all the active storage devices that can store content, the next step is to obtain a listing of available content. This is done by obtaining an instance of the RecordingManager from the RecordingManager.getInstance( ) method. For OCAP platforms, this returns an org.ocap.dvr.OcapRecordingManager object (DVB platforms will return an org.dvb.tvanytime.pvr.CRIDRecordingManager).

The OcapRecordingManager has these responsibilities:

• Managing recorded services.
• Managing resources.
• Scheduling recordings.
• Resolving resource and recording conflicts.
• Starting and stopping recordings.

The first responsibility of the OcapRecordingManager is to maintain a list of recordings (or recorded services) that can be replayed. To obtain this listing of content, you call the OcapRecordingManager.getEntries( ) method. This returns a RecordingList object that can be navigated with a RecordingListIterator.

When you arrive at the content you wish to play, you call the RecordingListIterator.getEntry( ) method to obtain a RecordingRequest object (for OCAP platforms, this will be an OcapRecordingRequest). At first, it may seem confusing to have to navigate a list of RecordingRequest objects just to play content. However, DVR programming requires a change in mindset from playing traditional digital video files. Normally, in non-DVR environments, you must wait until the recording completes before you can play video files. By contrast, DVRs let you play content that is still actively being recorded. This is how DVR applications can perform trick operations such as pause, rewind, and fast-forward on live TV.

Nestled within the RecordingRequest is a RecordedService. To obtain this service, you call RecordingRequest.getService( ). Once you have a RecordedService, you can play it with the Java Media Framework (JMF). Simply feed the locator returned from RecordedService.getMediaLocator( ) into Manager.createPlayer( ) and the content is presented like any other MPEG content.
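Pulled together, the playback path looks roughly like this. The method names (getEntries( ), getEntry( ), getService( ), getMediaLocator( )) follow the description above; the package placement, the iterator creation call, and the cast are assumptions to verify against the spec.

import javax.media.Manager;
import javax.media.Player;
import org.ocap.dvr.OcapRecordingManager;
import org.ocap.shared.dvr.RecordedService;
import org.ocap.shared.dvr.RecordingList;
import org.ocap.shared.dvr.RecordingListIterator;
import org.ocap.shared.dvr.RecordingManager;
import org.ocap.shared.dvr.RecordingRequest;

public class PlayFirstRecording {
    // Locate the first recorded service and hand it to a JMF player.
    public static Player playFirst() throws Exception {
        OcapRecordingManager mgr =
            (OcapRecordingManager) RecordingManager.getInstance();
        RecordingList list = mgr.getEntries();
        RecordingListIterator it = list.createRecordingListIterator();
        RecordingRequest req = it.getEntry(); // first entry in the list
        RecordedService svc = (RecordedService) req.getService();
        // A RecordedService plays like any other MPEG content under JMF,
        // even while the recording is still in progress.
        Player player = Manager.createPlayer(svc.getMediaLocator());
        player.start();
        return player;
    }
}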

While you can use traditional JMF APIs to play DVR content, you can't create a robust solution without taking advantage of OCAP's DVR-specific JMF extensions. These enhancements are defined in the org.ocap.shared.media classes and enable your applications to obtain time-shift buffer attributes, monitor timelines, and receive DVR-specific events.

Time-shift buffers are circular buffers that a DVR uses to enable trick modes (see "Inside Digital Video Recorders," DDJ, July 2005). A TimeShiftControl represents a moving window in the overall content where you can perform trick operations on live content, and it offers methods to query the size and starting and ending positions of the time-shift buffer.

Timelines integrate Normal Play Times (NPT) into JMF's media time concept (that is, the playback duration of the content). An NPT is a bookmark in the content, and this bookmark will be valid no matter how the content is edited. For example, in Figure 1, an application wants to display a menu on the screen exactly when the movie credits are being played (say, two hours into the movie). Even if the broadcaster inserts commercials into the content, the NPT triggers exactly when the credits are displayed (in this case, 150 minutes into the movie). When content is edited (that is, information is added or removed), an NPT discontinuity is generated, and each span of content between discontinuities is represented by a single timeline. In OCAP, a JMF media time is the summation of all timelines in the content. In Figure 1, even though commercials have been added to the content, the NPT still triggers at the appropriate time.


The third JMF enhancement provided by the OCAP DVR spec is a set of additional events and exceptions for monitoring the state of the time-shift and timeline attributes. For instance, if you rewind to the beginning of the content, you will receive a BeginningOfContentEvent. Similarly, if you fast forward or play past the end of the content, you'll get a RateChangeEvent. In addition, if you're playing back live content and fast forward past the end, you will not only receive a RateChangeEvent, but you may also get an EnteringLiveModeEvent (see Listing One). EnteringLiveModeEvent lets you know that you forwarded past the end of the time-shift buffer and are now displaying live video at normal speed.

Digital Video Recordings

While playback supervision is a critical element of the RecordingManager’s functionality, its primary charter is to supervise recordings. The RecordingManager’s responsibilities involve scheduling recordings, resource management, and resolving conflicts.

You initiate a recording by calling RecordingManager.record( ). This method takes one parameter, a RecordingSpec. A RecordingSpec is an abstract base class that describes the content you wish to record. Because it's an abstract class, you must pass in a class that inherits from it to record content. For example, if you wish to record the content that is currently being time shifted, you can use the ServiceContextRecordingSpec class. The ServiceContextRecordingSpec lets you specify when to start recording and how long to record the content. Typically, your application would call this API if the user hits the record button while watching TV.

Unfortunately, if the viewer decides to tune away from the current channel, the RecordingManager terminates any recording that was initiated by a ServiceContextRecordingSpec. If you want to ensure that your recording isn't aborted by a channel change, then you must use a ServiceRecordingSpec. A ServiceRecordingSpec lets the RecordingManager know that you want to record content associated with a specific channel (service) and is not tied to what program the viewer is watching at the time the recording is initiated. This type of RecordingSpec is useful when you want to schedule a recording ahead of time (say, if you want to record a specific football game or a concert).

Resource Monitor

First-generation cable and satellite DVRs had a single tuner (or source of content). This meant users couldn't watch one program while recording another, or simultaneously record two programs. More modern DVRs offer at least two tuners that let viewers make two simultaneous recordings, or surf on one tuner while recording on another. Alas, although a dual-tuner product has fewer recording restrictions than a single-tuner solution, there are still limitations (for instance, it isn't capable of three or more simultaneous recordings).

Thus, no matter how many tuners are in the box, users will eventually request too many recordings, and something must perform arbitration (that is, an object must decide which recording requests will be accepted or rejected based on hardware resources). This conflict resolution process is very network-specific and, consequently, is not defined in org.ocap.dvr.shared. Rather, in an OCAP DVR, the OcapRecordingManager cooperates with the Multiple Service Operator-specific monitor application to resolve conflicts.

Recall that in "The OpenCable Application Platform" (DDJ, June 2004), I mentioned that the monitor application has access to privileged network resources and is responsible for resolving all resource conflicts on the platform. The OCAP DVR spec extends the monitor application's responsibilities to include resolution of DVR resource conflicts. For instance, if users have scheduled two recordings at 8:00 PM and an Emergency Alert System (EAS) broadcast arrives at 8:02 PM, the OcapRecordingManager alerts the monitor application of the conflict, and the monitor application then aborts the lower priority recording so that the EAS broadcast has access to the tuner.

Conclusion

The OpenCable Application Platform has long promised that it would break the stranglehold that proprietary platforms have on the U.S. cable market. However, since it didn't offer the DVR capabilities that viewers crave, cable companies have been forced to use proprietary DVR solutions to satisfy consumer demand. Thankfully, the release of the OCAP Digital Video Recorder specification removes the last hindrance to widespread acceptance of OCAP. Finally, we will be able to write Java applications to time-shift DVR content, record our favorite TV programs, and play these programs back with trick controls. Clearly, the release of this specification and the boxes that will soon follow is a significant milestone in the evolution of interactive television.

DDJ

Listing One

import org.ocap.shared.media.*;
import org.ocap.dvr.*;

// Sample controllerUpdate() processing for DVR applications.
// All JMF applications implement a controllerUpdate() method.
// The OCAP DVR specification adds new events that DVR applications
// should listen to. In this illustration, the listener
// monitors EnteringLiveModeEvent.
public synchronized void controllerUpdate(ControllerEvent event)
{
    // This event will be received when the DVR JMF player is playing live
    // content from a tuner. Typically, this event will be received when
    // the user does a trick operation (such as a fast forward) that causes
    // the player to run out of recorded digital video and automatically
    // start playing live content.
    if (event instanceof EnteringLiveModeEvent)
    {
        // your application work would be done here...
    }
    // This event will be received when the DVR JMF player is
    // playing recorded content. This typically
    // is generated when the application performs a trick operation
    // (i.e., pause or rewind) on live content.
    else if (event instanceof LeavingLiveModeEvent)
    {
        // your application work would be done here...
    }
}

DDJ


XML-Binary Optimized Packaging

XML and nontext data can work together

ANDREY BUTOV

For several years now, the development community has followed the emergence of XML-based solutions to common data-representation problems. As a metalanguage, XML is a success. A cursory search reveals several of the more popular custom languages utilizing XML as a metalanguage, including MathML for representing mathematical concepts [1], Scalable Vector Graphics (SVG) for representing two-dimensional vector and raster graphics [2], and Really Simple Syndication (RSS) [3] for distributing news headlines and other relatively frequently updated data such as weblog postings, for consumption by, among other things, RSS aggregators.

While most XML-derived languages are still used mostly for specialized representations of textual data, not all data domains are suitable for text-based representation. Consequently, there is a growing need to embed binary data into XML documents — in most cases, mixing binary data with plain-text XML tags for contextual description. While there is no standard for doing this as of yet, several approaches have had more success than others. In particular, SOAP [4] has long been used as a means of exchanging structured information, but there are numerous other proprietary implementations of technologies meant to overcome the binary-data dilemma.

In January 2005, the World Wide Web Consortium released a document presenting the latest version of its recommendation of a convention called "XML-binary Optimized Packaging" (XOP). The recommendation presents a way of achieving binary data inclusion within an XML document. There are several common arguments against pursuing the path laid out in the recommendation that are worthy of exploration; however, in its recommendation of XOP, the W3C presents a valid and interesting piece of technology worthy of examination.

Andrey is a software developer in the Fixed Income division of Goldman Sachs & Company. He can be contacted at andreybutov@yahoo.com.

The Current State of Things

The W3C’s XOP proposal stems from a long-standing need to encode binary data in XML documents, which are, in nature, text-based animals. Before exploring the W3C proposal itself, take a look at some of the problems surrounding the inclusion of binary data in XML documents, and how these problems are currently being circumvented.

XML is usually presented as a way of describing text data within the context of a well-defined document containing metainformation (which is also text based) meant to bring some structure and form to that text data. There are, however, several domains that do not lend themselves nicely to being represented with textual data only. Most of these domains include the need to represent sound or video data in one form or another. In fact, the W3C itself has outlined several use cases that display a need to encode binary data in an XML document [5].

Assume that you want to somehow include Figure 1 in a transmission of an XML file describing a context in which the picture plays some role. You cannot simply place the binary data composing this picture in between two tags, as in Example 1(a), then transmit the XML file. The problem stems from the "text-centric" nature of XML. Binary data such as this can contain not only null characters (0x00 bytes), but other string sequences, such as </ (0x3C, 0x2F), which can be detrimental to XML specifically. Both the null byte and the string sequence representing an XML closing tag would cause the XML parser on the other side of the transmission to improperly interpret the file. If you're lucky, the parser would catch the error because the XML file itself would no longer be properly formed; otherwise, the binary data itself could be improperly interpreted, causing a relatively easy-to-spot problem of improper syntax to turn into an issue dealing with data integrity (such as in the case of transmitting binary financial data rather than photographs).


By the way, the 2-byte sequences previously mentioned are only two culprits. They are enough to make the point, but there are others.

There are, in fact, several approaches to circumventing the binary inclusion problem. One popular (albeit naïve) approach is to place the binary segment into a <![CDATA[ ]]> section [6]. Developers new to XML tend to think of this construct as a panacea, when in reality, using CDATA only mitigates the problem caused by directly embedding binary data between two arbitrary tags. In this case, while it is true that the data inside the CDATA section will not be validated as XML, CDATA sections cannot be nested, and thus, instead of crossing your fingers and hoping your binary data does not contain a null byte, you cross your fingers and hope that the binary data does not contain the sequence ]]>. (Early in my career, I was part of a group of developers who used this exact method to implement a piece of technology responsible for delivering binary financial data inside an XML document via a TCP connection to be displayed in a Flash applet through a web browser. At the risk of nullifying my point, I do not remember ever having an incident of the XML parser throwing a fit over an accidental ]]> sequence — but are you willing to take the chance with your application?)

Another approach is to encode the binary data into some string representation. In fact, the XML Schema [7] defines the base64Binary type that can be used for this purpose, as in Example 1(b). This approach is widely used, not only because it effectively circumvents the "bad bytes" problem of directly embedded binary data, but also because of the simplicity of implementation. There are many free and shareware utilities that can be used to Base64-encode data. It does, however, have its drawbacks.

Figure 1: Cleo the puppy.

(a)

<puppy>

<name>Cleo</name>

<color>Black</color>

<photo>

q^@/0?%5</??

...

0t????

</photo>

</puppy>

(b)

<puppy>

<name>Cleo</name>

<color>Black</color>

<photo xsi:type="base64Binary"> 6f9cd3e5(...)

</photo>

</puppy>

(c)

<puppy>

<name>Cleo</name>

<color>Black</color>

<photo>

http://www.somehost.com/cleo.jpg

</photo>

</puppy>

Example 1: (a) Bad idea; (b) base64Binary encoding; (c) referencing the binary data.

One drawback deals with space, and one with time (and all this time you thought sacrificing one would bring benefits from the other). Encoding binary data in base64Binary typically produces output that is about one-third larger than the original binary data [8], and at the same time, it takes time to both encode the data at the source and decode the data at the destination, although the cost there depends on a number of factors.
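To make the overhead concrete, here is a small, self-contained demonstration. It uses the java.util.Base64 class from modern Java (JDK 8 and later) purely as an illustration; it is not part of the XOP recommendation or of the era's tooling discussed here.

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public class Base64Overhead {
    public static void main(String[] args) throws Exception {
        byte[] raw = Files.readAllBytes(Paths.get(args[0])); // e.g., a JPEG
        String encoded = Base64.getEncoder().encodeToString(raw);
        // Base64 maps every 3 input bytes to 4 output characters, which is
        // the roughly one-third size increase cited above.
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("encoded chars: " + encoded.length());
    }
}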

Yet another approach to the binary inclusion problem is to simply not include the binary data at all. Instead, a link (typically in the form of a URI reference) takes the place of the binary data, while the data itself is kept at the referenced location. Example 1(c) presents an example of this. This approach not only bypasses all the issues of direct binary data embedding and the encoding size/time performance hits, it also allows the binary data itself to be modified at the referenced source without retransmitting the XML file containing the reference (although this could also be a bad thing, depending on context).

Indeed, there are various current technologies that implement one or, more typically, a combination of these approaches to enable inclusion of binary data in XML documents. Probably the most common technology that addresses this issue is Simple Object Access Protocol (SOAP) [4], or more accurately, SOAP with Attachments (SwA) [9], which utilizes a system of references in conjunction with MIME to deal with binary data in XML. An example of another similar technology is Microsoft’s Direct Internet Message Encapsulation (DIME) [10]. Both of these technologies warrant pages (books?) of additional description, and fall outside the scope of this article.

W3C Recommendation

The first document in the chain of XOP-related publications was a Working Draft published in February 2004. Public comments on the first W3C Working Draft of XOP were requested at the time of publication [11], and a W3C public archive of the related messages is available for browsing at http://lists.w3.org/Archives/Public/xml-dist-app/. Subsequent comments on the topic resulted in one more Working Draft [12], a Candidate Recommendation [13], a Proposed Recommendation [14], and finally, the W3C Recommendation serving as the topic of this article [15].

The recommended convention proposes a more efficient way to package XML documents for purposes of serialization and transmission. An XML document is placed inside an XOP package (see Figure 2). Any and all portions of the XML document that are Base64 encoded are extracted and optimized. For each of these chunks of extracted and optimized data, an xop:Include element is placed to reference its new location in the package. The recommendation makes a note that the encoding and eventual decoding of the data can be part of managing the XOP package — meaning that upon encoding, the internal links are automatically generated, and upon extraction, the links are automatically resolved, and the data can even be presented in its original intended format (sound, picture, and so on) [15].

Figure 2: XOP Overview.

At its core, the recommendation proposes a rather simple idea. Binary data can be included alongside plain-text XML data, and the XOP package processing would take care of the optimization (which promises to result in a much smaller dataset than the equivalent Base64-encoded data), the internal references would be resolved, and all levels of software above the XOP package management would not have to worry about managing the binary data on either the encoding or the decoding side.

An additional benefit, which I believe would drive this technology forward in comparison with DIME or SwA, is that existing XML technologies would continue to work with the XOP package data. This includes already popular XML-based technologies such as XQuery [16] for managing collections of data, XSLT [17] for content transformation, and even XML Encryption. In fact, XML Encryption [18] is referenced in the XOP recommendation itself as the solution concerning issues of security in XOP.

Community Feedback

The W3C recommendation (as well as its predecessors) raised some common arguments in the development community, including one stemming from a misunderstanding of the recommendation:

What the W3C recommendation is NOT. The recommendation does not propose a binary format for XML documents. Although the W3C XML Binary Characterization Working Group does exist for the purpose of exploring the benefits of forming a binary protocol for XML, the recommendation of XOP focuses on a different issue entirely. XOP is merely a way of serializing XML in a way that would make it more efficient to include Base64-encoded data.

What about the purity of XML? This argument is akin to people arguing that ANSI C ruined the purity of K&R C. A programming language (and in this case, metalanguage) remains alive partially due to its ability to adapt to the needs of its users. Any piece of technology that remains too rigid to evolve with those needs eventually dries up. It is precisely the flexibility of XML that has put it in the place it stands today.

54

Dr. Dobb’s Journal, December 2005

http://www.ddj.com

What about a simpler solution? Gzip anyone? Using gzip or some other compression scheme on XML data only has the potential of mitigating the size of the data. However, with this approach, the data is no longer human/parser readable, as would be the case with XOP, where only the Base64-encoded data is optimized and stored away, leaving the rest of the content of the XML well formed, self describing (in as much as XML is self describing), and in a way still able to be manipulated by various XML-focused tools. A compression solution would also not allow large chunks of binary data to be placed at the end of the file, leaving the parser with no choice but to deal with the entire data set upon decompression.

Optimization of Base64 types only? This is actually a valid point. The recommendation does imply that XOP is currently enabled to optimize only Base64-encoded data. If the recommendation does evolve into some sort of a formal specification, and the development community picks it up as a practical solution, it would not be at all surprising if demand grows for XOP to remove this limitation. This remains to be seen.

Conclusion

The existence of SwA, DIME, and countless proprietary technologies is evidence that there is a need to solve a relatively long-standing problem in the development community. The W3C Recommendation attempts to address this problem in a way that would put the burden of dealing with binary data inclusion into the specification of XOP rather than keep it at the application level, which in the long run results in various proprietary solutions, such as the case with SwA and DIME, and produces many forked roads. XML-binary Optimized Packaging is a sound, well thought-out idea that, in spite of common arguments, deserves consideration.

References

[1] David Carlisle, Patrick Ion, Robert Miner, and Nico Poppelier, Editors. Mathematical Markup Language (MathML) Version 2.0 (Second Edition). World Wide Web Consortium, W3C Recommendation, October 21, 2003. http://www.w3.org/TR/MathML2/.

[2] Jon Ferraiolo, Fujisawa Jun, and Dean Jackson, Editors. Scalable Vector Graphics (SVG) 1.1 Specification. World Wide Web Consortium, W3C Recommendation, January 14, 2003. http://www.w3.org/TR/SVG11/.

[3] Gabe Beged-Dov, Dan Brickley, Rael Dornfest, Ian Davis, Leigh Dodds, Jonathan Eisenzopf, David Galbraith, R.V. Guha, Ken MacLeod, Eric Miller, Aaron Swartz, and Eric van der Vlist. RDF Site Summary (RSS) 1.0, 2001. http://web.resource.org/rss/1.0/spec.

[4] Martin Gudgin, Marc Hadley, Noah Mendelsohn, Jean-Jacques Moreau, and Henrik Frystyk Nielsen, Editors. SOAP Version 1.2 Part 1: Messaging Framework. World Wide Web Consortium, W3C Recommendation, June 24, 2003. http://www.w3.org/TR/soap12.

[5] Mike Cokus and Santiago Pericas-Geertsen, Editors. XML Binary Characterization Use Cases. World Wide Web Consortium, W3C Working Draft, February 24, 2005. http://www.w3.org/TR/xbc-use-cases/.

[6] Tim Bray, Jean Paoli, C.M. Sperberg-McQueen, Eve Maler, and François Yergeau, Editors. Extensible Markup Language (XML) 1.0 (Third Edition). World Wide Web Consortium, W3C Recommendation, February 4, 2004. http://www.w3.org/TR/REC-xml/.

[7] Paul V. Biron, Kaiser Permanente, and Ashok Malhotra, Editors. XML Schema Part 2: Datatypes Second Edition. World Wide Web Consortium, W3C Recommendation, October 28, 2004. http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/datatypes.html#base64Binary.

[8] Sameer Tyagi. "Patterns and Strategies for Building Document-Based Web Services." Sun Technical Articles, September 2004. http://java.sun.com/developer/technicalArticles/xml/jaxrpcpatterns/index.html.

[9] John J. Barton, Satish Thatte, and Henrik Frystyk Nielsen, Editors. SOAP Messages with Attachments. World Wide Web Consortium, W3C Note, December 11, 2000. http://www.w3.org/TR/SOAP-attachments.

[10] Jeannine Hall Gailey. "Sending Files, Attachments, and SOAP Messages Via Direct Internet Message Encapsulation," MSDN Magazine, December 2002. http://msdn.microsoft.com/msdnmag/issues/02/12/DIME/default.aspx.

[11] Noah Mendelsohn, Mark Nottingham, and Hervé Ruellan, Editors. XML-binary Optimized Packaging. World Wide Web Consortium, W3C Working Draft, February 9, 2004. http://www.w3.org/TR/2004/WD-xop10-20040209/.

[12] Noah Mendelsohn, Mark Nottingham, and Hervé Ruellan, Editors. XML-binary Optimized Packaging. World Wide Web Consortium, W3C Working Draft, June 8, 2004. http://www.w3.org/TR/2004/WD-xop10-20040608/.

[13] Martin Gudgin, Noah Mendelsohn, Mark Nottingham, and Hervé Ruellan, Editors. XML-binary Optimized Packaging. World Wide Web Consortium, W3C Candidate Recommendation, August 26, 2004. http://www.w3.org/TR/2004/CR-xop10-20040826/.

[14] Martin Gudgin, Noah Mendelsohn, Mark Nottingham, and Hervé Ruellan, Editors. XML-binary Optimized Packaging. World Wide Web Consortium, W3C Proposed Recommendation, November 16, 2004. http://www.w3.org/TR/2004/PR-xop10-20041116/.

[15] Martin Gudgin, Noah Mendelsohn, Mark Nottingham, and Hervé Ruellan, Editors. XML-binary Optimized Packaging. World Wide Web Consortium, W3C Recommendation, January 25, 2005. http://www.w3.org/TR/xop10/.

[16] Scott Boag, Don Chamberlin, Mary F. Fernandez, Daniela Florescu, Jonathan Robie, and Jérôme Siméon, Editors. XQuery 1.0: An XML Query Language. World Wide Web Consortium, W3C Working Draft, February 11, 2005. http://www.w3.org/TR/xquery/.

[17] James Clark, Editor. XSL Transformations (XSLT) Version 1.0. World Wide Web Consortium, W3C Recommendation, November 16, 1999. http://www.w3.org/TR/xslt.

[18] Joseph Reagle. XML Encryption Requirements. World Wide Web Consortium, W3C Note, March 4, 2002. http://www.w3.org/TR/xml-encryption-req.

DDJ


A Mac Text Editor Migrates to Intel

Porting a commercial Mac application

TOM THOMPSON

In June of this year, Apple Macintosh developers discovered that they had a formidable challenge set before them. The bad news was that to continue in the Macintosh software market, they were going to have to migrate their Power PC-based applications to an Intel x86-based Macintosh. The good news was that the operating system, Mac OS X, had already been ported and was running on prototype x86-based systems. Furthermore, Apple Computer offered cross-compiler Xcode tools that would take an application's existing source code and generate a "universal binary" file of the application. The universal binary file contains both Power PC and x86 versions of the program, and would therefore execute on both the old and new Mac platforms. (For more details on how Apple has engineered this transition, see my article "The Mac's Move to Intel," DDJ, October 2005.)

Apple provided developer reports that describe the migration of an existing Mac application to the new platform as relatively quick and easy. I don't dispute those reports. There are situations where a particular application's software design dovetails nicely with the target platform's software. For example, if the application is written to use Cocoa, an object-oriented API whose hardware-independent frameworks implement many of Mac OS X's system services, the job is straightforward.

However, not all Mac applications are in such an ideal position. Many commercial Mac applications can trace their roots back to when the Mac OS API was the only choice (Mac OS X offers four), and most code was written in a procedural language, instead of an object-oriented one. When migrating an application to a new platform, developers are loath to discard application code that's field-tested and proven to work. Using such "legacy" code potentially reduces the cost and time to port an application because the process starts with stable software. The downside to this strategy is that code idiosyncrasies or subtle differences in the implementation of the legacy APIs on the new platform may hamper the porting process. So, for the majority of Mac developers, the move to Intel might look more like a leap of faith than an easy transition.

Tom is a technical writer providing support for J2ME wireless technologies at KPI Consulting Inc. He can be contacted at tom_thompson@lycos.com.

This brings us to the $64,000 (or more) question: For most existing Mac applications, how difficult and costly is it to migrate a Power PC Mac application to the x86-based Mac platform? To see where the truth lies, I present in this article a case study of the porting process for a commercial Mac application.

An Editor with Rich Features and History

The application in question is Bare Bones Software’s BBEdit, an industrial-strength text editor (http://www.barebones.com/). “Text editor” is perhaps a bit of a misnomer here, because over the years, BBEdit’s text-processing capabilities have evolved to suit the needs of several categories of user. Besides being a powerful programming editor for writing application code, web wizards use it to write HTML web pages, while professional writers use BBEdit to produce all sorts of text documents, ranging from manuals to novels.

BBEdit offers a wide variety of features and services to meet the demands of these three disparate groups of users. For programming work, the editor displays the source code of C, C++, Objective-C, Java, Perl, Python — in total, about 20 programming languages with syntax coloring. It features integration with the Absoft tools and interoperates with the Xcode tools as an external editor. For HTML coding, menu choices let you quickly generate HTML tags. The HTML code is also syntax colored, and syntax checks can be run on the HTML to spot coding errors. Furthermore, BBEdit can render the HTML that you just wrote so that you can preview the results immediately and make changes. When the web page's code is ready, BBEdit lets you save it to the server using an FTP/SFTP connection. For just plain writing, the editor is fast, allowing you to quickly search, change, and scroll through large documents consisting of 7 million words or larger. A built-in spellchecker lets everyone — programmers, web wizards, and writers alike — check for spelling errors. Finally, it provides a plug-in interface so that new features and tools can be added to the application in the future.

The current 8.2 version of BBEdit, while changed in many ways from the original version (the World Wide Web didn't even exist when the first version of BBEdit was written), still owes a lot to the code design of its ancestors. To appreciate how the Bare Bones Software team managed the current transition to the x86 platform, it helps to review the editor's rich and complex history: the program has been revised numerous times and completely rewritten several times. As you'll see, certain design decisions, some made years ago, profoundly affected the current (and third) transition to the x86 platform.

BBEdit is a classic Mac application in every sense of the word, being 15 years old. It was written in 1990 to provide a lightweight — yet fast — code editor for the then-extant 68K Mac platform. The bulk of BBEdit was originally written in Object Pascal, while performance-critical sections of the program were written in C and 68K assembly. Mac old-timers will recall that the preferred programming language for Mac applications was Pascal, and that you accessed the Mac APIs with Pascal stack-based calling conventions. C compilers of that era used glue routines to massage the arguments of a C-based API call so that they resembled a Pascal stack call.

The first major code change BBEdit underwent was when Apple transitioned the Mac platform from the Motorola 68K processor to the Motorola-IBM Power PC processor in 1994. The software engineers considered what was required to port the program, and the decision was made to revamp portions of the editor, in particular using C to replace the 68K assembly code.

During the 1999–2000 timeframe, BBEdit was rewritten entirely in C++. The reason for the programming language change was that since the early ’90s, Pascal was on the wane, as many tool vendors and universities embraced C and C++ as the programming languages of choice. The shift to C++ was made due to the dwindling support for Pascal compilers.

The compilers of that transition era (notably Metrowerks' CodeWarrior) were able to take an application's source code and generate a "fat binary," which was an application that contained both 68K code and Power PC code. Such an application, while larger in size, could run at native speeds on either a 68K- or Power PC-based platform during the transition period. The current universal binary scheme is similar in concept.

The next code change occurred when BBEdit was migrated from Mac OS 9 (which is considered the "Classic" environment today) to Mac OS X. Work began in 1999, and was completed by Mac OS X's launch in 2001. Although the new OS offered several different APIs (Cocoa, POSIX UNIX, and Java), Apple had to support the Classic APIs, or else the platform would lose a huge body of existing Mac OS software. To this end, Apple provided a fourth set of APIs, the Carbon APIs. On the front end, Carbon resembles the Classic Mac OS APIs. On the back end, Carbon provided existing applications access to all of Mac OS X's system services, such as memory protection and preemptive multitasking. However, Carbon lacked a handful of the Classic APIs, which caused compatibility problems, and some of the capabilities of the retained APIs were modified slightly so that they would work within Mac OS X's frameworks.

For the migration to Mac OS X, C, C++, and Carbon were used to handle the core application services, such as text manipulation. However, where deemed appropriate, the other Mac OS X APIs were used to implement specific application features. For example, Cocoa provided access to the Webkit framework, which renders the HTML for BBEdit. These sections of the application were thus written in Objective-C and Objective-C++. Other services, such as SFTP, utilized the POSIX UNIX API. Table 1 shows how BBEdit draws on different Mac OS X APIs to implement its features.

For the move to Mac OS X, a complete rewrite of BBEdit to Cocoa was considered. Given the company's finite resources, it was decided that BBEdit's customers would be best served by adding features that they wanted, rather than rewriting the application.

Endian Issues

The infamous "Endian issue" occurs because the Power PC and Intel processors arrange the byte order of their data differently in memory. The Power PC processor is "Big Endian" in that it stores a data value's MSB in the lowest byte address, while the Intel is "Little Endian" because it places the value's LSB in the lowest byte address. There is no technical advantage to either byte-ordering scheme. When you access data by its natural size (say, using a 16-bit access for a 16-bit variable), how the bytes are arranged doesn't matter and programs operate as they should. However, when you access multibyte variables through overlapping unions, or access data structures with hard-coded offsets, then the position of bytes within the variable does matter. Although the program's code executes flawlessly, the retrieved data is garbled because of where the processor stores the bytes in memory. Stated another way, code written for a Big-Endian processor accesses the wrong memory locations on a Little-Endian processor. Consequently, the wrong values are fetched, which produces disastrous results.

The Endian issue manifests itself another way when multibyte data is accessed piecewise as bytes and then reassembled into larger data variables or structures. This often occurs when an application reads data from a file or through a network. The bad news is that any application that performs disk I/O on binary data (such as reading a 16-bit audio stream), or uses network communications (such as e-mail or a web browser), can be plagued by this problem. The good news is that each set of Mac OS X APIs provides built-in methods that can reorganize the bytes (this rearrangement is termed "byte swapping") for you. For more information on these routines, see the Apple Core Endian Reference document (http://developer.apple.com/documentation/Carbon/Reference/CoreEndianReference/index.html).

—T.T.
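The sidebar's point can be demonstrated in a few lines. The snippet below uses Java's java.nio.ByteBuffer only as a neutral illustration — the sidebar's context is C code and the Core Endian routines — to show the same four bytes read under each byte order, plus the byte swap that reconciles them.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    public static void main(String[] args) {
        byte[] bytes = {0x12, 0x34, 0x56, 0x78};
        // The same four bytes, interpreted under each byte order.
        int big = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getInt();
        int little = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
        System.out.printf("big-endian:    0x%08X%n", big);    // 0x12345678
        System.out.printf("little-endian: 0x%08X%n", little); // 0x78563412
        // "Byte swapping" converts one interpretation into the other.
        System.out.printf("swapped:       0x%08X%n", Integer.reverseBytes(big));
    }
}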

 

 

Best Programming Practice Rules

1. Follow the guidelines. Apple provides documents that describe safe coding practices for both Mac OS X and for tailoring applications to be universal binaries. You'll save yourself a lot of effort by studying and following the coding recommendations that these documents provide.

2. No warning messages discipline. When you build your application, the compiler shouldn't generate warning messages. While some of the warnings may seem trivial, they also hint of trouble lurking in the code. All problematic code should be examined carefully and revised to eliminate the warning messages. The BBEdit engineers have a strict policy of not allowing any warning messages during a code build.

3. When compiling, more is better. Like a patient seeking a second opinion for a subtle malady, often a developer will seek a second opinion on the quality of an application's code by building it with a second (and even third) compiler. The other compiler, through its warning messages, can point out subtle coding problems overlooked by the first compiler. Rich Siegel puts it concisely: "More compilers result in higher quality code."

4. Examine the code, line by line. This is the hardest process to follow, but it is probably the most important. During each migration of BBEdit, the engineers scrutinized every line of code in the application. Such careful examination of the code not only identifies what code needs to be revised to support the port, but it also discovers subtle code issues not uncovered by rules two and three. It's also valuable for winnowing code whose purpose has been taken over by a new API or system service. The change in BBEdit to save its preferences using an API rather than a custom resource was the result of this examination.

—T.T.

http://www.ddj.com

Dr. Dobb’s Journal, December 2005

57

According to Rich Siegel, BBEdit's creator, "New capabilities will be added to the editor as needed, and in doing so, we'll use the right tool for the right job." In other words, if the Carbon APIs offered the best way to implement the features, then the feature code would use the Carbon APIs.

 

 

 

Feature/Function      API Used    Programming Language
Editor                Carbon      C/C++
Screen display        Carbon      C/C++
FTP transfer          Carbon      C/C++
OSA scripting         Carbon      C/C++
Webkit framework      Cocoa       Objective-C/Objective-C++
Spelling framework    Cocoa       Objective-C/Objective-C++
SFTP                  POSIX       C/C++
Authentication        POSIX       C/C++
UNIX scripting        POSIX       C/C++

Table 1: Mac OS X APIs that BBEdit uses to provide features.


The Final Frontier: The Shift to Intel

As has become obvious, BBEdit draws upon many of the APIs available in Mac OS X to implement its feature set. The result is that the application consists of an amalgam of C/C++ code and Objective-C/Objective-C++ code. On the surface, this mix of APIs and programming languages might complicate the shift to the x86 platform. However, the earlier port of BBEdit to the Power PC version of Mac OS X worked in BBEdit's favor here, because it limited the problems the team had to deal with to just those brought about by the OS's behavior on the new platform.

First, with Mac OS X 10.4, the OS itself was no longer a moving target. In earlier releases of the OS, its underlying architecture — notably the kernel APIs — was in a state of flux and subject to change. And change these low-level APIs they did, as Apple refined the kernel and underlying frameworks to make improvements. As a consequence, each new release of the OS left broken applications in its wake, an unpleasant outcome that dissuaded many Mac users from switching to Mac OS X. Mac users, after all, want to get work done, not futz with the software.

In 10.4, the kernel programming interfaces (KPIs) have been frozen, and a versioning mechanism lets drivers and other low-level software handle those situations when the KPIs are changed to support new features. The result is an underlying infrastructure for the OS that’s stable and consistent across different platforms. This, in turn, makes the porting process manageable.

According to the BBEdit engineers, Mac OS X 10.4 does a good job at hiding the hardware details, while still providing low-level services (such as disk I/O). In addition, its APIs are mostly platform neutral, which means no special code is required to counter side-effects when invoking the APIs on each platform. Put another way, the code to call an API is identical for both platforms, and the results of the API call are identical; no glue code is necessary.

The one side-effect of the x86 processor that Mac OS X can't counter is the Endian issue. The Endian issue arises because of how the data bytes that represent larger values (such as 16- and 32-bit integers, plus floating-point numbers) are organized in memory. (For details, see the accompanying textbox entitled "Endian Issues.")

According to Siegel, the Endian issues “were minor, but present.” An earlier design decision actually side-stepped what could have been a major problem for the editor in this area. In 2000, BBEdit 6.0 began using Unicode to support the manipulation of multibyte text for Asian, Arabic, and certain European languages. Unicode is a standard for encoding complex language characters and describing their presentation. For example, Unicode not only specifies the characters in Arabic text, it
