Thread started by davidlu on Tuesday, April 01, 2014.

Hungarian Notation

Hungarian Notation has greatest utility in large software projects that might define 1000 or more different data types and 100,000 or more procedure or class method definitions. Charles Simonyi, while a researcher at Xerox PARC, introduced the notation in an attempt to describe how a Software Factory might work.

The Microsoft Applications Group, where Simonyi was lead developer throughout the 1980s, utilized the form of Hungarian Notation described below in the development of Microsoft Word, Multiplan, Chart, File, Excel, and Access.

That development organization, the predecessor of today's Information Worker Business Unit which is responsible for the development of the Microsoft Office products, could be viewed as a kind of software factory similar to that imagined in Simonyi's thesis.

When Hungarian is used in such a project, a unique 2-4 character alphanumeric tag is coined for every data type defined in the project. That tag name, spelled in all capitals, is used to name the type in its type definition. Any data instances of that type used in the project are named using that tag name spelled in lower case.

There are predefined primitive tags in Hungarian that are used in code that performs generic processing in a system such as memory management code, but more frequently more specialized tags are coined to distinguish the hundreds or thousands of different ways an integer, an array, or a hashtable might be used in a large programming system, such as Microsoft Word, Excel or a Microsoft Office component.

Hungarian's primitive tags correspond to the primitive types provided by a particular programming language. The primitive tags commonly used in Hungarian are frequently one or two character abbreviations of the type names used in the C language (eg. b, ch, w, l).

Primitive tags ARE predominantly employed in code that performs generic processing on data where more detailed type information is obscured (heap managers, string handling routines, etc).

HOWEVER, when primitive types are invoked in more restrictive contexts (eg. short integers used to encode screen pixel coordinates, color palette indexes, indexes to arrays of a particular type, file identification numbers, document identification numbers) more restrictive Hungarian tags are coined that describe the meaning and usage of that more restrictive type (eg. xp and yp for horizontal and vertical pixel coordinates, co for a color palette index, ifoo for the index to an array of type FOO, FN for a file ID, DOC for a document id).

If it is necessary to reimplement a restrictive type using a different primitive type implementation (eg. change type declaration from short to long in C) this is done with no great clamor or consequence.
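A minimal sketch of this idea in C. The tags XP, YP and CO come from the text above; the typedef layout and the XpCenter routine are hypothetical and only illustrate the point. Because the rest of the code names the restrictive type rather than the primitive, changing short to long would touch only the typedef lines.

```c
#include <assert.h>

/* Restrictive types: each typedef is the only place that names the
   primitive type, so short could become long here and nowhere else. */
typedef short XP;   /* horizontal pixel coordinate */
typedef short YP;   /* vertical pixel coordinate */
typedef short CO;   /* color palette index */

/* Hypothetical routine: returns the XP midway between two XP values. */
XP XpCenter(XP xpLeft, XP xpRight)
{
    assert(xpLeft <= xpRight);
    return (XP)((xpLeft + xpRight) / 2);
}
```

A caller that writes XP xp = XpCenter(xpLeft, xpRight); never mentions short, which is why the reimplementation is painless.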

When a record or class type is defined in the project, a unique Hungarian tag is coined for it. In a large project, usage of these record/class tags nearly always predominates over that of the primitive tags and their restrictions.

When Hungarian is used in a multi-programmer development project, the Hungarian tags become the nouns that the programmers use when they describe their algorithms in specifications or in conversation with other project members. The tag names are pronounced as words when they are pronounceable and otherwise are spelled out.

The Wikipedia HN article talks about lower-case mnemonics being used to express the type or purpose of an instance variable, without mentioning the role that composable prefixes play in the notation. Prefixes are used to label different compositions of data types and to generate a small number of standard derived types from an existing type.

Most modern programming languages provide pointer and array constructs that allow programmers to traverse links from one data structure to another and to pick out a particular data item aggregated into an array. Hungarian notation predefines prefix types that can be added to Hungarian type tags to describe the meanings of pointers and arrays defined within a project.

Common composable prefixes:

p - pointer to data of a particular type. A pfoo instance would be a pointer to data of type FOO.

pp - pointer to a pointer that points to data of a particular type. A ppfoo instance would be a pointer, which points to a pointer that points to data of type FOO.

h - a pp pointer that resides in a memory managed heap.

rg - an unstructured array, called a range in Hungarian. An rgfoo instance would be an array which stores instances of type FOO. An rgfoo would be indexed via an instance named ifoo (an index to foo).

mp - a specialized array which maps instances of one type into instances of another type. For instance, if the DOC Hungarian type were a small integer doc number, an mpdochdod type indexed by a particular doc would produce an hdod, a handle to a document descriptor that corresponds to that doc number.

These prefixes can be composed and prefixed to a Hungarian type tag to name data instances that are the starting points of complex data structure traversals.

Example:

An rghrgpfoo would be a range (rg) of handles (h), each of which contains a range (rg) of pointers (p) to data items of type FOO.
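The prefix compositions above could be written out in C roughly as follows. This is only a sketch under stated assumptions: FOO and DOD are given made-up one-field layouts, a handle is modeled as a plain pointer to a pointer (no real heap manager is involved), and the routine name FPrefixesCompose is hypothetical.

```c
#include <assert.h>

typedef struct { int wValue; } FOO;   /* hypothetical record type */
typedef struct { int cfoo; } DOD;     /* hypothetical document descriptor */
typedef short DOC;                    /* small integer document number */

enum { ifooMax = 3, docMax = 2 };

/* Builds one instance of each composition and checks each traversal. */
int FPrefixesCompose(void)
{
    FOO   rgfoo[ifooMax] = { {10}, {20}, {30} };   /* rg: range of FOO */
    FOO  *pfoo  = &rgfoo[1];                       /* p: pointer to a FOO */
    FOO **ppfoo = &pfoo;                           /* pp: pointer to a pfoo */

    DOD   rgdod[docMax]  = { {5}, {7} };
    DOD  *rgpdod[docMax] = { &rgdod[0], &rgdod[1] };          /* master pointers */
    DOD **mpdochdod[docMax] = { &rgpdod[0], &rgpdod[1] };     /* mp: doc -> hdod */
    DOC   doc  = 1;
    DOD **hdod = mpdochdod[doc];                   /* h: handle to a DOD */

    FOO   *rgpfoo[2]    = { &rgfoo[0], &rgfoo[2] };  /* range of pointers to FOO */
    FOO  **prgpfoo      = rgpfoo;                    /* master pointer to that range */
    FOO ***rghrgpfoo[1] = { &prgpfoo };              /* range of handles to rgpfoo */

    assert(pfoo->wValue == 20);
    assert((*ppfoo)->wValue == 20);
    assert((*hdod)->cfoo == 7);
    assert((*rghrgpfoo[0])[1]->wValue == 30);
    return 1;
}
```

Reading the name left to right gives the traversal order: index the range, dereference the handle, index the inner range, then follow the pointer to the FOO.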

Common prefixes for derived types:

i - ifoo would be an index in an array rgfoo that contains elements of type FOO

c - cfoo would be a count of elements of type FOO

d - dfoo would be the difference between two elements of type FOO

b - bfoo would be the relative offset to an item of type FOO. This is used for field displacements in a data structure with variable size fields.

cb - cbFOO would be the size of an item of type FOO measured in bytes

cw - cwFOO would be the size of an item of type FOO measured in words
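A short C sketch of how these derived-type prefixes might appear together. The FOO layout, the 16-bit word assumed for cw, the use of offsetof for a b offset, and the DxpTotal routine are all illustrative assumptions rather than anything prescribed by the notation.

```c
#include <assert.h>
#include <stddef.h>

typedef short XP;                                /* horizontal pixel coordinate */
typedef struct { XP xpLeft; XP xpRight; } FOO;   /* hypothetical record type */

enum {
    cbFOO    = sizeof(FOO),                  /* cb: size of a FOO in bytes */
    cwFOO    = sizeof(FOO) / sizeof(short),  /* cw: size in words (16-bit word assumed) */
    bxpRight = offsetof(FOO, xpRight)        /* b: byte offset of a field within FOO */
};

/* Sums the widths (dxp) of the first cfoo entries of rgfoo. */
long DxpTotal(const FOO *rgfoo, int cfoo)
{
    long dxpTotal = 0;
    int ifoo;                                /* i: index into rgfoo */

    assert(cfoo >= 0);                       /* c: count of FOO elements */
    for (ifoo = 0; ifoo < cfoo; ifoo++) {
        /* d: difference between two values of the same type (XP) */
        XP dxp = (XP)(rgfoo[ifoo].xpRight - rgfoo[ifoo].xpLeft);
        dxpTotal += dxp;
    }
    return dxpTotal;
}
```

The prefixes make a mismatch visible at a glance: adding a dxp to an xp yields an xp, while adding two xp values together would look wrong on the page.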

Modifying descriptive suffixes can be added to the end of Hungarian names to distinguish different usages of the same type within a system.

In a large routine, several or many different instances may be used to keep track of data that shares the same Hungarian type. Suffixes are added to distinguish and document the importance of the different instances used.

Descriptive suffixes in Hungarian can invoke a developer-coined hint, which describes how the instance may be used in a routine.

eg. In the Hungarian hplcbteChp, the Chp suffix indicates that this structure might be queried to produce data elements of type CHP.

Hungarian provides a set of standard modifying suffixes that can be used to document invariants which restrict the usage of an instance.

Some of the standard suffixes which express invariants are:

Max - added to an index data instance which records the actual size of an array or data block.

eg. the declaration of an array of type FOO in C would be: FOO rgfoo[ifooMax];

When ifoo is any index of the rgfoo array, it is invariant that ifoo < ifooMax.

Mac - added to an index data instance to indicate the current limit of actual usage within an array.

Mac stands for current maximum.

When a Mac suffix is affixed to an index name, ifoo, for an array named rgfoo, it is invariant that ifooMac <= ifooMax.

ifooMac == ifooMax is the condition which is true when all entries within an array are in use.

First - added to an index or pointer which designates the first element within a range that may be validly processed.

When ifoo is any index of the rgfoo array, it is invariant that ifoo >= ifooFirst.

Last - added to an index or pointer which designates the last entry within a range that may be validly processed.

When ifoo is any index of the rgfoo array, it is invariant that ifoo <= ifooLast.

Lim - is an abbreviation for limit.

When ifoo is any index of the rgfoo array, it is invariant that ifoo < ifooLim.

eg. An ifooLim is equal to ifooLast + 1 and designates the location where a new foo item could be recorded in an array or data block.

Min - affixed to an index or pointer name, which designates the minimum element within a range of entries.

When ifoo is any index of the rgfoo array, it is invariant that rgfoo[ifoo] >= rgfoo[ifooMin].
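The Max, Mac, First, Last and Lim invariants can be sketched as assertions over a fixed-capacity array. The FOO layout and the routines FAppendFoo and LSumFooRange are hypothetical names chosen for illustration; only the suffixes and their inequalities come from the text above.

```c
#include <assert.h>

typedef struct { int wValue; } FOO;   /* hypothetical record type */

enum { ifooMax = 4 };                 /* Max: allocated size of rgfoo */

static FOO rgfoo[ifooMax];
static int ifooMac = 0;               /* Mac: current limit of actual usage */

/* Appends foo to rgfoo. Returns 1 on success, 0 when the array is full. */
int FAppendFoo(FOO foo)
{
    assert(ifooMac <= ifooMax);       /* invariant: ifooMac <= ifooMax */
    if (ifooMac == ifooMax)           /* all entries are in use */
        return 0;
    rgfoo[ifooMac++] = foo;
    return 1;
}

/* Sums wValue over the entries ifooFirst through ifooLast inclusive. */
long LSumFooRange(int ifooFirst, int ifooLast)
{
    int ifooLim = ifooLast + 1;       /* Lim: one past the last valid entry */
    long lSum = 0;
    int ifoo;

    assert(ifooFirst >= 0 && ifooLim <= ifooMac);
    for (ifoo = ifooFirst; ifoo < ifooLim; ifoo++) {
        assert(ifoo >= ifooFirst && ifoo <= ifooLast);
        lSum += rgfoo[ifoo].wValue;
    }
    return lSum;
}
```

Looping with ifoo < ifooLim and looping with ifoo <= ifooLast are equivalent here; the suffix tells the reader which comparison operator to expect.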

Hungarian Rules for Procedure and Function Names

The capitalization scheme for procedure names is that each word or type declaration used in the name is written with its initial letter capitalized. The name of the procedure defined in the previous example has been given a well-formed Hungarian name adhering to this rule. The capitalization rule used is different from what is used for type definitions and data instance declarations.

If the routine is a function that returns a value, the type of the returned data is placed first in the function's name. A routine which returns the truth or falsity of a predicate would be named so that the function's name begins with the F (boolean) Hungarian tag followed by a description of the meaning of the predicate.

Example:

Boolean FCpLineFirst(DOC doc, CP cp)
{
    /* body omitted */
}

might be the declaration of a predicate (function that returns true or false) that tests whether the CP (character position) passed in for the given doc (document number) marks the beginning of a text line that will be displayed onscreen.

Functions rather frequently are given names that include the Hungarian tags of the data passed into the function as the last words within the function name.

The declaration

DL DlDisplayedForDocCp(DOC doc, CP cp);

might name a function that, when passed the character position (CP) of a particular character within a specified document (DOC), returns the index of the display line (DL) upon which that character would be displayed on the display terminal.

If a procedure is being defined (function with no return values), the words used in the procedure name might generally describe the operation that is being performed without specifically mentioning the parameters passed in.

The declaration

void FormatLine(DOC doc, CP cp);

might describe a routine which calculates the horizontal display position of each character within a line of text to be displayed, where the line resides within a specified document (doc) and begins at a specified character position (cp).
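To make the naming rules concrete, here is a toy C sketch of the two functions named above. It is not how a word processor would implement them: it assumes a made-up model in which every display line holds exactly ccpPerLine characters, the document number is ignored, and the typedefs for DOC, CP, DL and Boolean are guesses at plausible primitive types.

```c
#include <assert.h>

typedef short DOC;       /* document number (assumed primitive type) */
typedef long  CP;        /* character position (assumed primitive type) */
typedef short DL;        /* display line index (assumed primitive type) */
typedef int   Boolean;

enum { ccpPerLine = 80 };   /* hypothetical: every display line holds 80 characters */

/* F tag first: returns true when cp begins a display line in the toy model. */
Boolean FCpLineFirst(DOC doc, CP cp)
{
    (void)doc;              /* the toy model has a single layout for all documents */
    assert(cp >= 0);
    return cp % ccpPerLine == 0;
}

/* Dl tag first, parameter tags (Doc, Cp) last: returns the display line
   on which cp would be shown in the toy model. */
DL DlDisplayedForDocCp(DOC doc, CP cp)
{
    (void)doc;
    assert(cp >= 0);
    return (DL)(cp / ccpPerLine);
}
```

At a call site such as dl = DlDisplayedForDocCp(doc, cp); the tags on both sides of the assignment line up, so a type mismatch is visible without looking up the declaration.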

The author of the article, David Luebbert, is a software developer who worked under Simonyi's direction on the Macintosh Word project for 5 years (1984-1989) and who has employed Hungarian Notation in his development work for more than 20 years.
