C++ and Objective C – Section 3: The Objective C Language

This is a hypertext version of chapter 14 of the book Handbook of Software for Engineers and Scientists, published by CRC Press in 1996. Please read the Introduction section to understand the context in which this was written.

3.0 The Objective C Language

Objective C is an object-oriented extension to the ANSI C language developed by Brad Cox at The Stepstone Corporation [Cox, 1991]. The singular design goal for Objective C was to add support for object-oriented programming to C. This is in marked contrast to the multiple goals for C++ (see earlier section). The inspiration for the OO support in Objective C was Smalltalk. The influence of Smalltalk can be seen in fundamental ways, such as the dynamic nature of the language, and in less important ways, such as the messaging and class definition syntax.

The differences between Objective C and C++ can be summarized very simply. Compared to C++, Objective C is:

less feature rich, less complex
more dynamic

These differences have both good and bad implications. For instance, a language which is simple is also easier to learn and use. On the other hand, a rich language is more likely to have exactly the one feature which happens to be a perfect solution to a particular programming problem. Philosophically, the largest difference between the two is the static nature of C++ versus the dynamic nature of Objective C. Both philosophies have their strengths and weaknesses. The final section of this chapter directly addresses the question of C++ versus Objective C.

There are currently four Objective C compilers commonly available: the original product from Stepstone, which includes a library of foundation classes; the compiler and class libraries for NEXTSTEP development from NeXT Computer Inc. [NeXT, 1993]; the Free Software Foundation's GNU C compiler, which is also an Objective C compiler; and a product from the Berkeley Productivity Group which can be used with the Borland C compiler to support Objective C in a Microsoft Windows environment. There is no standardizing effort underway for the Objective C language, and each compiler supports different features. The NeXT and GNU compilers are very similar, as NeXT started from GNU C, then returned their Objective C extensions to the Free Software Foundation. A summary of the differences between compilers can be found in the advanced features section below.

3.1 Relation to ANSI C

ANSI C is a subset of Objective C. Nothing special need be done to use ANSI C functions within Objective C programs. Objective C does not improve upon ANSI C in any way other than adding support for OO programming.

3.2 Objective C Support for OO Programming

For a C programmer familiar with OO concepts, learning Objective C is a very simple matter. For a C programmer without much OO experience, Objective C is considerably easier to learn than C++, as it minimizes the number of new language features needed for OO programming. The extensions Objective C makes to C are in two areas: support for messaging, and a syntax for class definition.

3.2.1 Classes

The concept of class is the principal means of data encapsulation and abstraction in Objective C. The relation of class to object is that of a data type to its variable. A class consists of members: data and methods. Instance variables (sometimes abbreviated as "ivars") are the data members of a class. Objective C methods may be of two types, class or instance. This section will use a common convention for the names of instance variables, methods, objects, and classes. Names are concatenated, descriptive words. In the case of classes, the first letter of each word is capitalized, like Employee, GraphNode, Animal. For ivars, methods, and objects the first letter of each word except the first word is capitalized, like printFirstName, fedTaxWithholding, and newNode.

Objective C class definitions consist of two parts, an interface section, and an implementation section. In general, the interface to a class is stored in a header file, while the implementation is in a separate file with a .m extension. Following the example used in the C++ section of this chapter, here's our minimal Employee class interface declaration:

@interface Employee : Object
{
 int empId;
 char* fName;
 char* lName;
}
@end

Objective C compiler directives begin with an @ symbol. Since every Objective C class is part of a hierarchy with the root class known as Object, custom classes which are "standalone" (i.e. not subclasses of some other class in the hierarchy) are subclasses of Object. We declare the interface to the functionality of our Employee class by adding prototypes for the methods of the class.

@interface Employee : Object
{
 int empId;
 char* fName;
 char* lName;
 id hireDate; // Date object used by init: and lengthOfService:
}
- init: (int)eid;
- (int) empId;
- (int) lengthOfService: date;
- (void) printName;
@end

The prototype of each instance method is preceded by a "-". Much like the default type in C is int, the default type for Objective C is id. The id data type is used to hold the identity of an Objective C object. Methods which don't return type id, or have arguments which are not of type id, must be explicitly denoted as such with a type cast notation.
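As a minimal sketch of the id default, the hypothetical Pair class below (it is not part of the original chapter) declares one method which relies on the default return type, one which states id explicitly, and one which must state its int return type. The sketch assumes, as the examples in this section do, a root class Object which supplies alloc, init, and free.

```objc
#import <objc/Object.h>

// Hypothetical class, used only to illustrate the id default.
@interface Pair : Object
{
    int first;
    int second;
}
- setFirst: (int)a second: (int)b;   // no return type given: returns id
- (id) swap;                         // same effect as writing "- swap;"
- (int) sum;                         // a non-id return type must be stated
@end

@implementation Pair
- setFirst: (int)a second: (int)b
{
    first = a;
    second = b;
    return self;                     // returning self allows nesting
}
- (id) swap
{
    int tmp = first;
    first = second;
    second = tmp;
    return self;
}
- (int) sum
{
    return first + second;
}
@end

// Builds a Pair, nests three messages, and returns the sum.
int pairSum(int a, int b)
{
    id pair = [[Pair alloc] init];
    int total = [[[pair setFirst: a second: b] swap] sum];
    [pair free];
    return total;
}
```

The helper function pairSum is also an assumption of this sketch; it can be called from main or from any other method.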

The implementation file for this Employee class would have four methods defined in it, and would include the header file for the class, as well as any other header files the implementation required. Objective C pre-processors support a smart version of #include called #import which won't include a file twice. This means the programmer doesn't have to worry about and guard against this possibility.

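#import "Employee.h"
#import <stdio.h>
// Date and hire_date_from_DB() are assumed to be declared elsewhere,
// and hireDate is assumed to be an ivar (id hireDate;) of Employee.
@implementation Employee
- init: (int)eid
{
 [super init];
 empId = eid;
 hireDate = [[Date alloc] init];
 [hireDate setDate: hire_date_from_DB(eid)];
 return self;
}
- (int) empId
{
 return empId;
}
- (int) lengthOfService: date
{
 // no service has accrued if date precedes the hire date
 if( [date isBefore: hireDate] )
  return 0;
 else
  return [hireDate differenceInDays: date];
}
- (void) printName
{
 printf("%s, %s\n", lName, fName);
}
@end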

Instance methods have full access to the instance variables of the class. Control over instance variable access is provided via the @public, @protected, and @private directives. Instance variables declared with the @protected directive are accessible to instances of the class and subclasses of the class. This is the default level of protection. Instance variables which are declared @private are accessible only within the class which declares them; subclass objects still contain them, but methods defined in a subclass cannot access them directly. Instance variables declared @public are accessible to the world. Since @public variables defeat the idea of encapsulation, they are almost never used. The following declaration of the Employee class interface would make empId public, and would hide fName and lName from subclasses of Employee.

@interface Employee : Object
{
@public
 int empId;
@private
 char* fName;
 char* lName;
}
@end

Objective C does not support class data members (i.e. data shared by all objects of the same class); however, they can be emulated through the use of the static modifier. For example, declaring a static int count variable in the implementation file of the Employee class would result in all Employee objects sharing a single integer variable called count. This is in contrast to instance variables, where each object has its own personal copy. Such a static variable provides a means of having a "class global" variable while preserving data encapsulation.

Objective C does support class methods. Each class has a class object associated with it which is constructed by the compiler. The class object does not have instance variables, but can be messaged just like any other object. Class methods, also known as factory methods, may be part of a message sent to the class object, rather than the instances of a class. Since the main role of a class object is to create instances of its class type, class methods most often have to do with the creation of objects. The declaration of class methods in a class interface is distinguished from instance methods by being preceded by a "+", instead of a "-".
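The static-variable emulation and the "+" syntax for class methods can be sketched together. In the sketch below, the class method name instanceCount is an assumption made for illustration, the interface and implementation are shown in one file for brevity, and Object is assumed to supply alloc and init as elsewhere in this section.

```objc
#import <objc/Object.h>

// File-scope static: visible only inside this implementation file and
// shared by every Employee object (emulates a class data member).
static int count = 0;

@interface Employee : Object
{
    int empId;
}
+ (int) instanceCount;   // class (factory) method, note the "+"
- init: (int)eid;        // instance method, note the "-"
- (int) empId;
@end

@implementation Employee
+ (int) instanceCount
{
    return count;        // the class object has no ivars, but it can
}                        // read the shared static variable
- init: (int)eid
{
    [super init];
    empId = eid;         // each object has its own empId ...
    count++;             // ... but all objects share one count
    return self;
}
- (int) empId
{
    return empId;
}
@end
```

After two objects have been created with [[Employee alloc] init: 1] and [[Employee alloc] init: 2], the message [Employee instanceCount] is sent to the class object itself and returns 2.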

3.2.2 Objects

Classes play the role of data type to objects. In Objective C, however, classes are not passive; they may be messaged to perform such tasks as object creation. Even with class methods and active classes, a class doesn't get all your work done. Objects are active. An object is instantiated from, or is an instance of, its class. Objects get your work done. Objective C added the data type id to represent objects. A variable of type id can hold the identity of an object of any class.

Objects get work done by messaging each other. Each message has at least two components: a receiver and a method. The method must be a member of the receiver's class definition, or it must be inherited from a superclass. The optional third component of a message is the set of parameters required by the method being invoked. The syntax for messaging is:

[receiver method: arg];

where receiver is the identity of an object, method is the name of a method that the receiver responds to, and arg is the parameter supplied to method.

Parameters to the method, if any, are separated by colons with optional descriptive names for each parameter. The colons are part of the actual method name, and may distinguish like-named methods from each other. For example, a method init with no arguments is distinct from a method init: which requires one parameter. In the examples below, myObject, window, employee, and square are all variables of type id which have been initialized to objects:

[window display];
[myObject aMethod: 1 andSecondArgument: "two"];
[employee calculateEarningsFor: 1995];
[square moveTo: 0.0 :0.0];

Messaging in Objective C has a clean syntax because of the id data type and dynamic binding. In effect, id is equivalent to the void* type in ANSI C; it can hold a pointer to an object of any type. The class of the receiver of a message is not determined until run-time. Messages may be nested, provided the return value of all but the outermost message is the identity of an object:

[[square moveTo: 0.0 :0.0] display];

The following example shows a typical message sent to a class object. The class method alloc asks the Employee class to allocate a new Employee object:

[Employee alloc];

The GNU and NeXT compilers do not allow for static object allocation; the Stepstone compiler does. All Objective C compilers support dynamic object allocation, most often using the class method alloc. The return value of the alloc method is the identity of the newly allocated object. This return value is then used as the receiver in a second message, where the init method initializes the new Employee object. Since an un-initialized object is a dangerous object, it is safer to nest the init message.

The init method is responsible for initializing the instance variables of the object. Since an object inherits instance variables from its superclass, the first line of code in an init method is usually a message to the superclass init method:

- init
{
 [super init];
 // initialize ivars here
 return self;
}
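A minimal sketch of the nested alloc/init idiom and of the chain of init messages up the hierarchy follows. The classes Account and SavingsAccount and the function newSavingsAccount are hypothetical, and Object is assumed to supply alloc, init, and free.

```objc
#import <objc/Object.h>

@interface Account : Object
{
    int balance;
}
- init;
- (int) balance;
@end

@interface SavingsAccount : Account
{
    int ratePercent;
}
- init;
- (int) ratePercent;
@end

@implementation Account
- init
{
    [super init];          // let Object initialize its part first
    balance = 100;         // then initialize this class' ivars
    return self;
}
- (int) balance { return balance; }
@end

@implementation SavingsAccount
- init
{
    [super init];          // runs Account's init, which runs Object's
    ratePercent = 5;
    return self;
}
- (int) ratePercent { return ratePercent; }
@end

// Nested form: the un-initialized object returned by alloc is never
// stored in a variable, so it cannot be used by mistake.
id newSavingsAccount(void)
{
    return [[SavingsAccount alloc] init];
}
```

An object returned by newSavingsAccount has both balance and ratePercent set, because each init method runs its superclass' init before touching its own instance variables.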

Objects can refer to themselves with the special instance variable self. This allows an object to message itself, which facilitates the decomposition of complex methods into smaller pieces. In this example, the parsing of a line of input (oftentimes ugly, detailed C code), is isolated in another method to simplify the takeActionFor: method and possibly share the parsing algorithm with other methods in the class:

- takeActionFor: (const char*)line
{
 int command;
 command = [self parseLine: line];
 // take action for command
 return self;
}

To facilitate the nesting of messages, methods often return self when they don't have anything else to return. Objects may also invoke methods in their superclasses, even when their own class has overridden an inherited method. An object does this by messaging super, for example:

[super draw];

Unlike self, super is not an instance variable, and thus cannot be changed at run-time. Super is a flag to the compiler which changes where the run-time search begins for a method to execute.

Objective C objects can be of method, class, or global scope. For compilers which support them, statically allocated objects may be scoped to the method in which they are defined. Such objects are created automatically when a method is entered, and destroyed when the method returns. As with all instance variables, objects declared as class members have the scope of that class. The dynamic creation of objects allows for objects of global scope; they are visible in any portion of the program at which their identity is available, and they are not automatically destroyed when the method in which they were created returns. Objects created dynamically must be explicitly freed in order to return their memory to the free store:

id date = [[Date alloc] init]; // new Date object
[date free]; // return memory to free store

The NeXT Foundation Kit class library provides a means of semi-automatic garbage collection via reference counting. This scheme represents a compromise between performance and programmer convenience.
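The idea behind reference counting can be sketched in a few lines. This is not the Foundation Kit implementation; the class name Counted, the method names retain, release, and refCount, and the liveObjects bookkeeping variable are all assumptions of the sketch, which again relies on Object supplying alloc, init, and free.

```objc
#import <objc/Object.h>

static int liveObjects = 0;   // demo bookkeeping only

@interface Counted : Object
{
    unsigned refCount;        // number of outstanding references
}
- init;
- retain;
- release;
- (unsigned) refCount;
- free;
@end

@implementation Counted
- init
{
    [super init];
    refCount = 1;             // the creator holds the first reference
    liveObjects++;
    return self;
}
- retain
{
    refCount++;               // another client now depends on the object
    return self;
}
- release
{
    if (--refCount == 0) {    // last reference given up
        [self free];
        return nil;
    }
    return self;
}
- (unsigned) refCount
{
    return refCount;
}
- free
{
    liveObjects--;
    return [super free];      // return the memory to the free store
}
@end
```

Each client which keeps the identity of the object sends retain, and sends release when it is done; the object frees itself when the last reference is released, so no single client has to know when it is safe to send free.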

3.2.3 Inheritance

One of the most characteristic means of representing relationships between abstractions in an OO program is with inheritance. Inheritance means forming subclass/superclass relationships between classes. A subclass inherits members from its superclass. Many classes together form an inheritance, or a class, hierarchy. Classes at the top of a hierarchy represent the data and functionality common between the classes which inherit from them. Classes at the top tend to be more abstract; objects are usually not instantiated from such classes. Classes at the bottom of the hierarchy are more concrete, and are more likely to be instantiated into objects. Classes which inherit from only one superclass exhibit single inheritance. Classes which inherit from more than one superclass exhibit multiple inheritance. In keeping with its clean and simple design, Objective C does not support multiple inheritance, though features have been developed to replace some of the functionality provided by multiple inheritance (see run-time section below).

The root of the Objective C class hierarchy is the Object class. While this name causes endless confusion in teaching neophyte programmers about OO programming, it is descriptive of the primary responsibilities of this class. At first blush, the idea of a common class from which all classes inherit is odd. After all, what could possibly be common to every class ever defined? The answer lies in the name: support for creating, initializing, releasing, copying, and comparing objects, and the interface of objects to the run-time system are the main responsibilities of this class.

The superclass of a class defines where in the hierarchy a class fits in. The syntax for declaring a class' superclass is simply a colon, followed by the name of the superclass in the class interface declaration. The skeleton declarations shown in Figure 6 are illustrative:
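Figure 6 is not reproduced in this version. A sketch along the same lines, using a hypothetical Person, Employee, and Manager hierarchy which is not taken from the original figure, might look like this:

```objc
#import <objc/Object.h>

// Hypothetical hierarchy: Object <- Person <- Employee <- Manager

@interface Person : Object        // Person's superclass is Object
{
    char *name;
}
@end

@interface Employee : Person      // Employee's superclass is Person
{
    int empId;
}
@end

@interface Manager : Employee     // Manager's superclass is Employee
{
    int reports;
}
@end

// Empty implementations, so that the skeleton compiles and links.
@implementation Person
@end

@implementation Employee
@end

@implementation Manager
@end
```

A Manager object contains name, empId, and reports, since instance variables accumulate down the hierarchy. Assuming the Object class of the compiler in use provides isKindOf:, as the NeXT Object class does, a Manager object answers YES to [aManager isKindOf: [Person class]].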

All class and instance methods of a class are inherited by subclasses of that class. Unlike with the compiler directives for instance variables, there is no control over the inheritance of methods. One convention for private methods is simply to not declare them in the interface of the class. Such "private by convention" methods may still be used by other methods of the class (i.e. internally), but they will not be visible to the programmer clients who use the class, either directly as an object, or indirectly via subclassing. If, however, a client learns the names of these private methods, the client may use them. This form of access control is not enforced by the compiler or run-time; it is merely a statement that a method is not considered part of the public interface of the class.
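The "private by convention" idea can be sketched as follows. The class Vault, its methods, and the function peekAtCombination are hypothetical; Object is assumed to supply alloc, init, and respondsTo:.

```objc
#import <objc/Object.h>

@interface Vault : Object
- (int) open;                  // the only method in the public interface
@end

@implementation Vault
// Not declared in the interface: private by convention only.
- (int) combination
{
    return 1234;
}
- (int) open
{
    // Internal use of the undeclared method is unrestricted.
    return [self combination] == 1234;
}
@end

// A client which has learned the name can still send the message;
// neither the compiler nor the run-time prevents it.
int peekAtCombination(id vault)
{
    return [vault combination];
}
```

Because the run-time makes no distinction between declared and undeclared methods, a Vault object also answers YES to [aVault respondsTo: @selector(combination)].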

3.2.4 Polymorphism

Polymorphism is the idea that the code which is executed when a message is sent to an object depends on both the receiver's class and the name of the method in the message. In traditional procedural languages, the code which is executed by a function call is determined by the name of the function only. Objective C's support for polymorphism is simple: the type of the receiver of a message is not determined until run-time, so the method which is executed is not decided until run-time, and that decision is based on the receiver's class.

For example, suppose Window is a subclass of View, and they both implement a method called flush.

id window = [[Window alloc] init];
id view = [[View alloc] init];
id anObject;
[view flush]; // the flush method in View
[window flush]; // the flush method in Window
anObject = view;
[anObject flush]; // the flush method in View
anObject = window;
[anObject flush]; // the flush method in Window

Since the compiler does not try to statically bind methods to messages, the run-time system must decide which code gets executed for a given message. The decision is based on the type of the receiver when the message is sent. The "universal object identity" type, id, allows this code to compile without warning; the compiler does not try to check that the type of the receiver matches the method being invoked (static typing). It is important to note that in Objective C, dynamic binding is not limited to classes which are related to each other. In other words, the example above will work even if View and Window have no hierarchical relationship. For example, suppose you had a class Toilet, which also defined a method flush. Then we can extend the example above by adding:

id toilet = [[Toilet alloc] init];
anObject = toilet;
[anObject flush]; // the flush method in Toilet

and the correct method is still invoked.

The id data type in Objective C is a good match to dynamic binding, since it can represent the identity of objects of any class, and the compiler doesn't attempt to statically bind code to messages. In this scheme the programmer loses the advantage of having the compiler check that the method being sent to an object is one which the object responds to. Performing this check is known as static type checking. Objective C can support static type checking by declaring variables which represent objects to be of type pointer to the object's class, rather than id. For example, in the following code fragment, the compiler could check that the method display was understood by the Window class:

Window *help;
[help display];

If the variable help had been defined as id type, the compiler could not do this static type check. With help declared as a pointer to Window, the compiler expects it to be assigned only the identity of a Window object, or of an object whose class is a subclass of Window. However, the compiler generates only a warning, not an error, when help is assigned a statically typed object of some other class. If the variable help is assigned the value of an id variable, the compiler will not (cannot) check that the types are compatible. Static typing does not affect dynamic binding, so if help was assigned the identity of an object of a subclass of Window at run-time, the code for the subclass' display method would still be executed.
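The interplay of static typing and dynamic binding can be sketched with the View and Window classes used earlier. The integer return codes and the function flushThrough are assumptions made so that the result can be checked; Object is assumed to supply alloc and init.

```objc
#import <objc/Object.h>

enum { VIEW_FLUSH = 1, WINDOW_FLUSH = 2 };

@interface View : Object
- (int) flush;
@end

@interface Window : View
- (int) flush;            // overrides the flush inherited from View
@end

@implementation View
- (int) flush
{
    return VIEW_FLUSH;
}
@end

@implementation Window
- (int) flush
{
    return WINDOW_FLUSH;
}
@end

// The parameter is statically typed, so the compiler can check that
// View declares flush. Which flush runs is still decided at run-time,
// from the class of the object actually passed in.
int flushThrough(View *aView)
{
    return [aView flush];
}
```

Passing a Window object to flushThrough yields WINDOW_FLUSH even though the parameter is declared as View *, which is the behavior described above for a statically typed variable holding a subclass object.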

Objective C does not support the operator overloading form of polymorphism.

3.2.5 Run-time Features

The run-time system in Objective C is a critical part of the language. The run-time provides the behavior of dynamic binding, as well as some other very powerful language features, such as dynamically loading classes into a running program, providing for persistence of objects, and supporting some of the features of multiple inheritance. The capabilities and implementation of the run-time component of Objective C tends to vary between compilers more than other features of the language.

Dynamic type checking means that it is possible for a message to be sent to an object with a method which the object does not implement. The run-time system detects when this occurs, and instead sends a forward:: message to the object. The forward:: method is part of the Object class, and is thus understood by all classes. The two arguments to forward:: are the name of the method originally invoked, and the arguments to that method. By default, the implementation in Object class of the forward:: method is simply to print an error message and exit. But a programmer can override this behavior and do any of a number of things. For example, all messages sent to an object with an unknown method may be caught and sent to an error handling object. Another use of forwarding is the ability to have one object perform some action for another. This can be used to mimic multiple inheritance. For example, suppose a Professor class object is sent a method which it does not implement, say writeRealCode. A Professor object may know of an ex-student Programmer object working in industry who writes real code and is willing to help him out. Objects of the Programmer class implement the writeRealCode method. The Professor object can then implement a forward:: method to have a Programmer object execute the method on its behalf. The Professor forward:: method might look like this:

- forward:(SEL)aMethod :(marg_list)args
{
 if( [exStudent respondsTo: aMethod] )
 return [exStudent performv: aMethod : args];
 else
 return [super forward: aMethod :args];
}

As implemented above, the forward:: method first checks to see if the exStudent object responds to the writeRealCode method. It does this by asking the exStudent object about itself. The respondsTo: method is one of several methods in Object which are concerned with the capabilities of the class of an object. If this particular Professor object has no exStudent to turn to, exStudent will be nil, the respondsTo: message is a no-op which returns a zero value, and the test fails. If exStudent is not nil, and if it responds to the writeRealCode method, then exStudent will perform the writeRealCode method for the Professor object. The performv:: method has exStudent execute the writeRealCode method just as if it had been messaged directly. The Professor class has used the writeRealCode method from the Programmer class just as it is defined in the Programmer class. If a particular Professor object does not have a willing Programmer object in the guise of exStudent, then it will instead invoke the inherited behavior for forward::, which, if not overridden in a superclass, would be the default implementation in Object. Professors without anywhere to turn to write real code would print an error message and exit.

How does forwarding replace multiple inheritance? Suppose Programmer and Professor are both subclasses of Person, the Professor class has a single method, teach and Programmer has a single method writeRealCode. Figure 7 illustrates the inheritance hierarchy. They both inherit common functionality from Person.

Suppose we want to give some Professor objects the ability to write real code, in addition to the ability to teach. One way to do this would be to have the Professor class inherit from the Programmer class. But this violates the "isa" relationship between super and subclass. If not every professor is a programmer, then it is not true that every instance of Professor "isa" Programmer. Another way to have Professor objects know how to write real code would be to use multiple inheritance; a Professor could inherit both from Person and Programmer, as in Figure 8a. This might make Professor a complex class, since it would inherit everything (instance variables and methods) defined by Programmer. Additionally, there is immediate ambiguity concerning the methods and ivars which would be inherited by Professor from Person via two paths. For example, the ivar char *name declared in Person would be inherited twice by Professor. Multiple inheritance would make the otherwise simple Objective C language complex. A third strategy is required. The forward:: method of the Object class, and support for it in the Objective C run-time, provide this third way. With forwarding, the Professor class need only use what functionality it wants from the Programmer class. Note that, as written above, the Professor class is not selective about what it forwards to Programmer; all unknown methods would be forwarded. Figure 8b illustrates the forwarding solution.

The multiple inheritance solution suggested for Professor objects who can code requires that every Professor "isa" Programmer. The forwarding solution allows for some Professor objects to employ the services of a Programmer object, or for some Professor objects to also be programmers, via composition.
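The decision made inside the forward:: method can also be written as explicit delegation, which avoids any dependence on a particular run-time's forwarding hook. The sketch below is not the forwarding mechanism itself; the methods setExStudent:, handOff:, and programsWritten are hypothetical, and Object is assumed to supply alloc, init, respondsTo:, and perform:.

```objc
#import <objc/Object.h>

@interface Programmer : Object
{
    int programsWritten;
}
- writeRealCode;
- (int) programsWritten;
@end

@interface Professor : Object
{
    id exStudent;              // nil when there is nobody to turn to
}
- setExStudent: aProgrammer;
- teach;
- handOff: (SEL)aMethod;
@end

@implementation Programmer
- writeRealCode
{
    programsWritten++;
    return self;
}
- (int) programsWritten
{
    return programsWritten;
}
@end

@implementation Professor
- setExStudent: aProgrammer
{
    exStudent = aProgrammer;
    return self;
}
- teach
{
    return self;
}
// Same test as in the forward:: method: ask the helper object whether
// it can do the work, and if so have it perform the method.
- handOff: (SEL)aMethod
{
    if (exStudent != nil && [exStudent respondsTo: aMethod])
        return [exStudent perform: aMethod];
    return nil;                // nobody available to do the work
}
@end
```

A Professor object with an exStudent can be asked to handOff: @selector(writeRealCode) and the Programmer object does the work; a Professor without one returns nil. Only those Professor objects which have been given a Programmer gain the ability, which is the composition relationship described above.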

One consequence of the dynamic nature of Objective C is the ability to dynamically load classes at run-time. This capability offers incredible flexibility to programmers, as they can construct programs which use classes not even anticipated when the program is written. The original programmer provides a framework, and subsequent programmers can supply classes which are dynamically loaded, extending the framework in some unforeseen fashion. The Preferences application in NEXTSTEP is a good example of dynamic class loading. The Preferences application is used to customize various environmental factors to suit a particular user. There are modules in Preferences.app to tailor the speed of the mouse, the flavor of the keyboard, the color of the screen background, etc. When launched, the Preferences application searches in known locations for dynamically loadable modules. If any are found, they are seamlessly integrated with the application. The custom classes dynamically loaded from a module are no different at run-time than the classes compiled into the Preferences.app itself. The author of Preferences specified only an interface; subsequent programmers added functionality to Preferences.app never anticipated by the author by extending the Objective C class hierarchy.

In the NeXT compiler, object persistence is supported by the run-time capability of archiving an object's instance variables to a stream. This allows the state of an object to be saved and later restored. The Object class defines read: and write: methods to restore an object from an archive, and to write an object to an archive. Archiving is also useful for the exchange of objects via a pasteboard, or a network. The read:/write: methods must be overridden for custom classes, since Object obviously can't anticipate the form of the ivars in every class. Each class which declares instance variables (and wishes to be archivable) must implement read: and write:. These methods first invoke their superclass' corresponding method, so that archiving actually happens from the top of the class hierarchy down to the class of the object being archived.

3.2.6 Defeating OO Principles

As a hybrid OO language, Objective C inherits features from its base language which can be used to subvert various OO principles. In contrast to C++, however, there are no features (like friend functions) which are purposeful additions to the language and can be subversive to OO programming. Instead these features are artifacts of its hybrid nature.

Instance variables may be declared in an @public section of a class. This allows statically typed references to objects of such a class to access the instance variables directly, with the C structure pointer dereference operator. The only possible motivation for this sacrifice of data hiding is the increased efficiency of access to the ivar; no messaging overhead is incurred.

@interface MyClass : Object
{
@public
 int wideOpen;
}
@end

MyClass *myObject = [[MyClass alloc] init];
myObject->wideOpen = 1; // no messaging, no hiding

Functions not part of the class hierarchy may be used within methods. Using the C library is a common example of such free functions. Nothing special needs to be done to compile or link standard C functions. While these functions don't directly attack any OO principles, they are also not part of the object-orientedness of a solution.

The final two techniques, getting and using the address of a function, and getting and using the data structure behind a class, can be used to defeat dynamic binding and encapsulation. The syntax for these techniques is tortuous and is not covered here.

3.2.7 Advanced Features

Since there is not a standard for the Objective C language, there are differences, some small, others not so, between the compilers. The features discussed in this section are highly compiler dependent. A summary of differences is shown in Table 3.

Distributed objects

The NeXT and GNU compilers support, in a mutually incompatible fashion, what are known as distributed objects. A distributed object is one which can be messaged by another object outside of its address space. Lifting this restriction on address space means that an object can message another object residing in a different application, perhaps on a different computer altogether. Distributed objects are a means for extending OO programming to client/server applications. Objects associated with user interface elements can reside in the client, while objects associated with database resources, or heavy computational demands, can be placed on special purpose servers. Rather than connecting the two sets of objects via a traditional interprocess communication mechanism, e.g. sockets or RPCs, distributed objects allow for the standard OO messaging model to work between the client and server objects. Distributed objects also provide the infrastructure for applications to provide services to other applications via application program interfaces (APIs). For example, a data graphing application could provide the service of graphing and presenting collections of data for custom programs via a distributed object interface.

Protocols

Inheritance allows for the reuse of interface and implementation. Single inheritance limits this reuse to a tree structure. Every class can appear at only one location in the class hierarchy. That location defines what is inherited by the class. The inheritance hierarchy is not good at imposing interface requirements on a group of unrelated classes. Protocols allow for an interface specification (a set of methods) to be defined which can be adopted and adhered to by otherwise unrelated classes. Protocols have nothing to do with sharing implementation; they strictly define an interface. A formal protocol is declared with the @protocol directive. Consider a protocol to define the interaction between people over the telephone.

@protocol TelephoneEtiquette
- saySomething: (char *)msg;
- respond: (char *)msg;
- initiateGoodbye;
- acknowledgeGoodbye;
@end

A class could adopt this protocol by including it in the @interface declaration for the class. Suppose objects of the Customer class should be able to talk on the phone:

@interface Customer : Object <TelephoneEtiquette>
{
 // ivars
}
// methods
@end

The compiler will now warn you if you fail to implement all of the methods in the protocol. Clients of the Customer class are guaranteed to be able to send the messages defined in the protocol to Customer class objects. There is no guarantee about the implementation of the protocol methods, simply that an implementation is provided. Protocols are natural means of documenting APIs, as they define the methods a class implements without revealing anything else about the class. In the reverse direction, a protocol can serve to document the requirements of a custom class to interact with a given object. For example, a module in the NeXT Preferences application (see earlier discussion in the section on run-time) can be developed by anybody, providing they agree to meet a protocol which the Preferences application defines for communication with the modules.
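A short sketch of a protocol in use follows. The utterances counter and the function holdConversation are assumptions added for illustration; the protocol itself is the TelephoneEtiquette protocol declared above, and Object is assumed to supply alloc and init.

```objc
#import <objc/Object.h>

@protocol TelephoneEtiquette
- saySomething: (char *)msg;
- respond: (char *)msg;
- initiateGoodbye;
- acknowledgeGoodbye;
@end

@interface Customer : Object <TelephoneEtiquette>
{
    int utterances;     // instance variables start out zeroed
}
- (int) utterances;
@end

@implementation Customer
- saySomething: (char *)msg { utterances++; return self; }
- respond: (char *)msg      { utterances++; return self; }
- initiateGoodbye           { utterances++; return self; }
- acknowledgeGoodbye        { utterances++; return self; }
- (int) utterances          { return utterances; }
@end

// Accepts objects of any class, related or not, so long as the class
// adopts the protocol. The compiler can check each message against
// the protocol even though the parameters are otherwise of type id.
void holdConversation(id <TelephoneEtiquette> caller,
                      id <TelephoneEtiquette> callee)
{
    [caller saySomething: "Hello?"];
    [callee respond: "Hello!"];
    [caller initiateGoodbye];
    [callee acknowledgeGoodbye];
}
```

The function holdConversation depends only on the protocol, not on Customer, so any other class which adopts TelephoneEtiquette could take part in the same conversation without sharing a superclass other than Object.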

Mixed Objective C and C++

The NeXT compiler supports the simultaneous use of C++ and Objective C. Features of the two languages may be used together in almost any manner. For example, C++ functions can send messages to Objective C objects. Objective C methods can create and message C++ objects. C++ retains its strong typing and static binding; Objective C its dynamic nature.

Defining Objective C Terms

protocol: The declaration of a set of methods implemented by a class which is independent of class hierarchy.
category: A means of extending the functionality of a class by adding methods.
distributed object: An object running in a different address space which can be messaged.
dynamic binding: Code executed by a message is determined at run-time, based on class of receiver.
interface: The declaration of the interface of an Objective C class. Starts with @interface directive.
implementation: The implementation of an Objective C class. Starts with @implementation directive.
id: A data type which can represent the identity of an object of any class.
Object: The root class of the Objective C class hierarchy.
self: An object's self-referential instance variable of id type.
super: Reference to a class' superclass.
