Techniques for detecting coding incompatibilities
Described are techniques for detecting incompatibilities. A first contents of a data item is determined in accordance with a first set of conventions associated with a first processor architecture. A second contents of said data item is determined in accordance with a second set of conventions associated with a second processor architecture and including at least one convention that is not included in said first set. An actual difference between the first contents and the second contents is determined. It is determined whether the actual difference is expected. If the actual difference is not expected, the data item is flagged as an incompatibility candidate. Code referencing the data item is examined to determine any coding incompatibilities due to coding dependencies.
1. Technical Field
This application generally relates to code and processor architectures, and more particularly to techniques used in connection with detecting incompatibilities and coding dependencies.
2. Description of Related Art
A computer system may include one or more central processing units (CPUs) coupled to a memory and other components, such as I/O devices. A computer system may be used to perform a variety of processing tasks and operations. Binary images or machine executable programs may include instructions and data used in connection with performing a particular task. The instructions may be executed by the CPU and may cause the CPU to access the data at one or more locations. The instructions and/or data associated with an executable program may be produced specifically for use with a particular CPU architecture or family of processors. The CPU architecture may also follow certain conventions, for example, when handling memory storage such as accessing the data.
The executable program may be produced from source code written in a programming language. The source code may be produced by a programmer or other automated coding technique and used in connection with generating a first machine executable program for execution on a first CPU architecture. The first CPU architecture may operate in accordance with a first set of conventions. The source code may be written in such a way that there are dependencies on one or more of the first set of conventions. Problems may arise when the same source code is used to produce a second machine executable program for execution on a second CPU architecture having a different second set of conventions. The coding dependencies upon the first set of conventions may result in the first machine executable program operating as expected for the first CPU architecture and associated conventions, but may result in the second machine executable program, associated with the second CPU architecture and conventions, operating in an incompatible manner and producing unexpected results.
Thus, it may be desirable to detect such occurrences of incompatibilities with different architectures and/or conventions as may be associated with different computing environments.
SUMMARY OF THE INVENTION
In accordance with one aspect of the invention is a method for detecting incompatibilities comprising: determining a first contents of a data item in accordance with a first set of conventions associated with a first processor architecture; determining a second contents of said data item in accordance with a second set of conventions associated with a second processor architecture and including at least one convention that is not included in said first set; determining an actual difference between said first contents and said second contents; determining whether said actual difference is expected; and if said actual difference is not expected, determining said data item as an incompatibility candidate. The method may also include: determining an expected difference using one of said first contents or said second contents; and comparing said expected difference to said actual difference. The first set of conventions may include at least a first convention specifying that data is stored in a memory in accordance with a first byte ordering and said second set of conventions includes at least a second convention specifying that data is stored in a memory in accordance with a second different byte ordering. The first convention may specify that a most significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item. The second convention may specify that a least significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item. The method may also include: determining a source code statement including at least one reference to said data item wherein said source code statement includes code written in accordance with one of said first convention or said second convention causing said actual difference to vary from said expected difference. 
The first processor architecture may be included in a component of a first type in a first data storage system, and said second processor architecture may be included in a component of said first type in a second data storage system. The method may also include determining at least one of a first address associated with a first memory location of said first contents or a second address associated with a second memory location of said second contents using debug symbol table information. The method may also include: preparing a first code set including debug information for execution by said first processor architecture; and preparing a second code set including debug information for execution by said second processor architecture, said first and second code sets being produced using at least a same portion of source code, said portion of source code including at least one source code statement referencing said data item, said at least one source code statement being written in accordance with a first convention included in only one of said first or said second sets of conventions, said at least one source code statement causing said actual difference to be unexpected.
In accordance with another aspect of the invention is a system comprising: a first data storage system including a first processor architecture operating in accordance with a first set of conventions; a second data storage system including a second processor architecture operating in accordance with a second set of conventions including at least one convention that is not included in said first set; a host comprising code that: determines an actual difference between a first contents of a data item stored in said first data storage system and a second contents of said data item stored in said second data storage system; determines whether said actual difference is expected; if said actual difference is not expected, determining said data item as an incompatibility candidate. The first set of conventions may include a first convention specifying that a most significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item, and said second set of conventions may include a second convention specifying that a least significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item, and said host may further comprise code that: determines an expected difference using one of said first contents or said second contents and compares said expected difference to said actual difference.
In accordance with another aspect of the invention is a computer program product that detects incompatibilities comprising code that: determines a first contents of a data item in accordance with a first set of conventions associated with a first processor architecture; determines a second contents of said data item in accordance with a second set of conventions associated with a second processor architecture and including at least one convention that is not included in said first set; determines an actual difference between said first contents and said second contents; determines whether said actual difference is expected; and if said actual difference is not expected, determines said data item as an incompatibility candidate. The computer program product may also include code that: determines an expected difference using one of said first contents or said second contents; and compares said expected difference to said actual difference. The first set of conventions may include at least a first convention specifying that data is stored in a memory in accordance with a first byte ordering and said second set of conventions may include at least a second convention specifying that data is stored in a memory in accordance with a second different byte ordering. The first convention may specify that a most significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item. The second convention may specify that a least significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item. The computer program product may further comprise code that: determines a source code statement including at least one reference to said data item wherein said source code statement includes code written in accordance with one of said first convention or said second convention causing said actual difference to vary from said expected difference. 
The first processor architecture may be included in a component of a first type in a first data storage system, and said second processor architecture may be included in a component of said first type in a second data storage system. The computer program product may also include code that determines at least one of a first address associated with a first memory location of said first contents or a second address associated with a second memory location of said second contents using debug symbol table information. The computer program product may also include code that: prepares a first code set including debug information for execution by said first processor architecture; and prepares a second code set including debug information for execution by said second processor architecture, said first and second code sets being produced using at least a same portion of source code, said portion of source code including at least one source code statement referencing said data item, said at least one source code statement being written in accordance with a first convention included in only one of said first or said second sets of conventions, said at least one source code statement causing said actual difference to be unexpected.
Features and advantages of the present invention will become more apparent from the following detailed description of exemplary embodiments thereof taken in conjunction with the accompanying drawings in which:
Referring now to
Each of the host systems 14a-14n and the data storage system 12 included in the computer system 10 may be connected to the communication medium 18 by any one of a variety of connections as may be provided and supported in accordance with the type of communication medium 18. The processors included in the host computer systems 14a-14n may be any one of a variety of proprietary or commercially available single or multi-processor system, such as an Intel-based processor, or other type of commercially available processor able to support traffic in accordance with each particular embodiment and application.
It should be noted that the particular examples of the hardware and software that may be included in the data storage system 12 are described herein in more detail, and may vary with each particular embodiment. Each of the host computers 14a-14n and data storage system may all be located at the same physical site, or, alternatively, may also be located in different physical locations. Examples of the communication medium that may be used to provide the different types of connections between the host computer systems and the data storage system of the computer system 10 may use a variety of different communication protocols such as SCSI, Fibre Channel, iSCSI, and the like. Some or all of the connections by which the hosts, management component(s), and data storage system may be connected to the communication medium may pass through other communication devices, such as a Connectrix or other switching equipment that may exist such as a phone line, a repeater, a multiplexer or even a satellite.
Each of the host computer systems may perform different types of data operations in accordance with different types of tasks. In the embodiment of
Referring now to
Each of the data storage systems, such as 20a, may include a plurality of disk devices or volumes, such as the arrangement 24 consisting of n rows of disks or volumes 24a-24n. In this arrangement, each row of disks or volumes may be connected to a disk adapter (“DA”) or director responsible for the backend management of operations to and from a portion of the disks or volumes 24. In the system 20a, a single DA, such as 23a, may be responsible for the management of a row of disks or volumes, such as row 24a.
The system 20a may also include one or more host adapters (“HAs”) or directors 21a-21n. Each of these HAs may be used to manage communications and data operations between one or more host systems and the global memory. In an embodiment, the HA may be a Fibre Channel Adapter or other adapter which facilitates host communication.
One or more internal logical communication paths may exist between the DA's, the remote adapters (RA's), the HA's, and the memory 26. An embodiment, for example, may use one or more internal busses and/or communication modules. For example, the global memory portion 25b may be used to facilitate data transfers and other communications between the DA's, HA's and RA's in a data storage system. In one embodiment, the DAs 23a-23n may perform data operations using a cache that may be included in the global memory 25b, for example, in communications with other disk adapters or directors, and other components of the system 20a. The other portion 25a is that portion of memory that may be used in connection with other designations that may vary in accordance with each embodiment.
The particular data storage system as described in this embodiment, or a particular device thereof, such as a disk, should not be construed as a limitation. Other types of commercially available data storage systems, as well as processors and hardware controlling access to these particular devices, may also be included in an embodiment.
Also shown in the storage system 20a is an RA 40. The RA may be hardware including a processor used to facilitate communication between data storage systems, such as between two of the same or different types of data storage systems.
Host systems provide data and access control information through channels to the storage systems, and the storage systems may also provide data to the host systems also through the channels. The host systems do not address the disk drives of the storage systems directly, but rather access to data may be provided to one or more host systems from what the host systems view as a plurality of logical devices or logical volumes (LVs). The LVs may or may not correspond to the actual disk drives. For example, one or more LVs may reside on a single physical disk drive. Data in a single storage system may be accessed by multiple hosts allowing the hosts to share the data residing therein. The HAs may be used in connection with communications between a data storage system and a host system. The RAs may be used in facilitating communications between two data storage systems. The DAs may be used in connection with facilitating communications to the associated disk drive(s) and LV(s) residing thereon.
The DA performs I/O operations on a disk drive. In the following description, data residing on an LV may be accessed by the DA following a data request in connection with I/O operations that other directors originate.
Referring now to
The representation of
Referring back to
Referring now to
As known to those of ordinary skill in the art, Big Endian and Little Endian describe an ordering or sequence in which multi-byte data is stored in memory. Byte order storage may impact the compatibility between devices within and outside of a system. The order in which the data is stored into memory, such as memory 52 of a particular DA or other component in the data storage system, may vary in accordance with the particular hardware. Little Endian formatting specifies that the least significant byte is stored in the lowest memory address. Examples of Little Endian processor architectures include the IA32 and IA64 architectures, and the like, used by Intel, AMD and other CPU vendors. In contrast, Big Endian formatting takes the most significant byte and stores it in the lowest memory address. Examples of Big Endian processor architectures include the PowerPC and MIPS architectures, used by IBM, Motorola, PMC, and other CPU vendors.
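The two byte orderings can be made concrete with a small C sketch. The function names below (store_be32, store_le32, host_is_little_endian) are illustrative assumptions and do not appear in the description; the sketch only shows where each byte of a 32-bit value lands in memory under each convention.

```c
#include <stdint.h>
#include <string.h>

/* Writes value into out[] most significant byte first (Big Endian order). */
void store_be32(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)(value);
}

/* Writes value into out[] least significant byte first (Little Endian order). */
void store_le32(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)(value);
    out[1] = (uint8_t)(value >> 8);
    out[2] = (uint8_t)(value >> 16);
    out[3] = (uint8_t)(value >> 24);
}

/* Returns 1 if the executing CPU keeps the least significant byte at the
 * lowest address, 0 otherwise. The in-memory bytes are copied out with
 * memcpy so that no pointer cast is needed. */
int host_is_little_endian(void)
{
    uint32_t probe = 0x12345678u;
    uint8_t bytes[sizeof probe];
    memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0x78;
}
```

For the value 0x12345678, store_be32 places 0x12 at the lowest address and store_le32 places 0x78 there.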
Referring now to
As previously described, it may be the case that a processor architecture of a first data storage system operates in accordance with a Big Endian representation for handling memory storage and a second different data storage system may operate in accordance with a Little Endian byte ordering when accessing locations in memory.
A problem may arise, for example, when code written to execute in accordance with assumptions made for a Big Endian format is ported for execution and use in an environment which operates in accordance with the Little Endian format. Code written in accordance with assumptions or dependencies for a Little Endian environment may operate inconsistently when executed by a processor architecture that operates in accordance with the Big Endian environment. Similarly, code written in accordance with assumptions or dependencies for the Big Endian format may operate inconsistently when executed in a Little Endian environment. It may be desirable to detect such inconsistencies associated with handling memory storage associated with code which operates in a Big Endian environment and a Little Endian environment. What will now be described are techniques that may be used in detecting data incompatibilities associated with code written in accordance with a set of dependencies or assumptions causing the code to operate properly only in one of the Big Endian or Little Endian environments. Thus, when the code is ported to operate in the other of the Big Endian or Little Endian environment, the code and data accesses may not operate as expected due to these coding dependencies or assumptions.
In one embodiment as will be illustrated herein, a first data storage system may operate in accordance with a Big Endian architecture and a second data storage system may operate in accordance with a Little Endian architecture. It may be desirable to have a common set of source code modules used to produce both a first set of executable code for execution in the Big Endian environment as well as a second set of executable code for execution in the Little Endian environment. The techniques that will now be described may be used in connection with detecting data anomalies or incompatibilities when comparing the data accesses for a same data item in the Big Endian and Little Endian environments.
It should be noted that although the techniques described herein refer to two data storage systems each operating in accordance with one of a Big Endian and Little Endian architecture, the techniques described herein may be used to identify data incompatibilities for processor architectures included in components other than data storage systems.
Referring now to
The techniques described herein examine and compare the contents of memory used by the first DA of data storage system 20a with the contents of memory used by the second DA of data storage system 20b. For a particular data item, a first address of that data item in 20a and a second address of that data item in 20b are determined. The contents of the first address are compared to the contents of the second address to determine if any data incompatibility exists. In other words, a determination is made as to whether the difference between the contents of both locations is an expected difference in accordance with the Big Endian and Little Endian data formatting. If the difference is as expected, then the source code associated with accessing this data item is not a candidate for a coding incompatibility.
What will now be described is a representation of the expected difference between a data item accessed in the Little Endian environment and the same data item accessed in the Big Endian environment. If LEM represents the particular data item representation in the Little Endian environment, then the expected format of that data item in the Big Endian environment may be represented as BEM (expected) so that generally the following should hold true:
f(LEM(actual))^{-1} = BEM(expected)
where f(x)^{-1} represents the byte swap of the data element x. In other words, if a first actual data item is in the Little Endian format (e.g., LEM (actual)), the first data item's byte ordering may be swapped to determine what the value of the first data item is expected to be in accordance with a Big Endian representation (e.g., BEM (expected)). The data value corresponding to the foregoing expected result (e.g., BEM (expected)) can be compared to another data value of the first data item actually read from the memory associated with a Big Endian architecture (e.g., BEM (actual)). If the two values (e.g., BEM (actual) and BEM (expected)) are not the same, then the current data item is flagged as an incompatibility candidate. The source code statement associated with the current data access of the data item may be examined based on this detected data incompatibility to determine if the source code represents a coding incompatibility. In other words, the associated source code may be written in accordance with data dependencies or assumptions which are not valid in both the Big and Little Endian environments. Thus, the null hypothesis, H0, may represent the instance where there is no incompatibility associated with a current data access and associated code and the following holds true:
f(LEM)^{-1} = BEM(expected) and
BEM(expected) = BEM(actual)
wherein
“BEM (expected)” is the expected data value produced from the actual Little Endian formatted data value read from the memory of data storage system 20b, and
“BEM (actual)” is the actual Big Endian formatted data value as may be read from data storage system 20a.
H1 may represent the instance where H0 evaluates to false such that a possible incompatibility is detected.
It should be noted that the following also holds true:
f(BEM)^{-1} = LEM(expected) and
LEM(expected) = LEM(actual)
wherein
“LEM (expected)” is the expected data value produced from the actual Big Endian formatted data value read from the memory of data storage system 20a, and
“LEM (actual)” is the actual Little Endian formatted data value as may be read from data storage system 20b.
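The null hypothesis H0 can be sketched in C as follows. This is a minimal illustration that assumes the data item is a single scalar whose raw bytes have already been read from each system; the names byte_swap and h0_holds are hypothetical and are not taken from the description.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte swap: writes the size bytes at src into dst in reverse order.
 * Applied to LEM(actual), the result is BEM(expected). */
void byte_swap(const uint8_t *src, uint8_t *dst, size_t size)
{
    for (size_t i = 0; i < size; i++)
        dst[i] = src[size - 1 - i];
}

/* Returns 1 when H0 holds, i.e. the byte swap of LEM(actual) equals
 * BEM(actual); returns 0 when H1 holds and the data item should be
 * flagged as an incompatibility candidate. */
int h0_holds(const uint8_t *lem_actual, const uint8_t *bem_actual, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (lem_actual[size - 1 - i] != bem_actual[i])
            return 0;
    }
    return 1;
}
```

Because the byte swap is its own inverse, the same check also covers the symmetric case in which LEM(expected) is derived from BEM(actual).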
The host 12a may execute code which controls the detection of data and coding incompatibilities. As will be described in more detail in following paragraphs, the host 12a may perform processing which controls the execution of code in the data storage systems 20a and 20b and the examination of the contents of a particular data item in both the Big Endian and Little Endian environments. Although not explicitly stated in connection with the following description, communications may be made between the host 12a and each of the data storage systems 20a and 20b in order to transmit commands from the host to the data storage systems to control the execution of the code on each of the data storage systems. Data may also be transmitted from the data storage systems to the host, for example, in order to examine a value of a data item as may be stored within each of the data storage systems. In one example illustration, the techniques described herein may be used in connection with detecting data incompatibilities associated with code executed by a DA in 20a and a DA in 20b. An incompatibility candidate may be determined by examining the contents of memory associated with each DA, such as a memory element 52 that may be local to each of the DAs included in 20a and 20b.
Data incompatibilities may result from coding as may be associated with, for example, type casting as may be performed in C and C++. The following represents what may be characterized as one example of coding causing a data incompatibility between the Big Endian and Little Endian environments because the same source code will produce different results in each environment:
int *p;
int j;
p = &j;
*(short *)p = 0x1234;          /* stores into the first 16 bits of j */
*((short *)p + 1) = 0xABCD;    /* stores into the next 16 bits of j */
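The effect of the foregoing type casting can be modeled without depending on the host CPU. The sketch below assumes that the two statements are meant to store 0x1234 into the first 16-bit half of j and 0xABCD into the second half, that short is 16 bits, and that int is 32 bits; the enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum { ORDER_BIG, ORDER_LITTLE } byte_order;

/* Stores a 16-bit value at mem[0..1] using the given byte ordering. */
void store16(uint8_t *mem, uint16_t v, byte_order order)
{
    if (order == ORDER_BIG) {
        mem[0] = (uint8_t)(v >> 8);
        mem[1] = (uint8_t)(v & 0xFF);
    } else {
        mem[0] = (uint8_t)(v & 0xFF);
        mem[1] = (uint8_t)(v >> 8);
    }
}

/* Loads a 32-bit value from mem[0..3] using the given byte ordering. */
uint32_t load32(const uint8_t *mem, byte_order order)
{
    if (order == ORDER_BIG)
        return ((uint32_t)mem[0] << 24) | ((uint32_t)mem[1] << 16) |
               ((uint32_t)mem[2] << 8)  |  (uint32_t)mem[3];
    return ((uint32_t)mem[3] << 24) | ((uint32_t)mem[2] << 16) |
           ((uint32_t)mem[1] << 8)  |  (uint32_t)mem[0];
}

/* Models the example: two consecutive 16-bit stores into the storage of
 * j, followed by a 32-bit read of j on the same architecture. */
uint32_t simulate_cast_example(byte_order order)
{
    uint8_t j[4] = {0, 0, 0, 0};
    store16(&j[0], 0x1234, order);
    store16(&j[2], 0xABCD, order);
    return load32(j, order);
}
```

Under these assumptions j reads back as 0x1234ABCD in the Big Endian model and as 0xABCD1234 in the Little Endian model. The memory images are 12 34 AB CD and 34 12 CD AB respectively, so the byte swap of one does not equal the other and the data item would be flagged.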
Following are some additional code examples causing data incompatibilities and different results on Big Endian and Little Endian architectures.
The following example illustrates an incompatibility caused by the coding dependency for reading or writing only part of a number:
UINT32 value;
UINT16 hi, lo;
value=0x12345678;
hi=((UINT16*) &value) [0];
lo=((UINT16*) &value) [1];
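One possible Endian-independent rewrite of the foregoing extracts the halves arithmetically rather than through a pointer cast. This is a sketch only; the function names are hypothetical, and uint32_t and uint16_t stand in for UINT32 and UINT16.

```c
#include <stdint.h>

/* Returns the most significant 16 bits of value, regardless of how the
 * CPU lays the value out in memory. */
uint16_t high_half(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

/* Returns the least significant 16 bits of value. */
uint16_t low_half(uint32_t value)
{
    return (uint16_t)(value & 0xFFFFu);
}
```

Shifts and masks operate on the value rather than on its storage, so the results do not depend on byte ordering.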
The following example illustrates an incompatibility caused by code that may read or write multiple numbers at once:
UINT16 block_range[2];
*((UINT32*) block_range)=0x00080010;
The following example illustrates an incompatibility caused by code that may read or write a struct as an integer:
struct {
UINT8 cmd;
UINT8 flags;
UINT16 dev;
} rec;
*((UINT32*) &rec)=0x28004567;
The following example illustrates an incompatibility caused by code that may read or write values in protocol structures or device registers:
UINT8 cdb [32];
*((UINT16*) &cdb[0])=lun;
*((UINT16*) &cdb[2])=siz;
*((UINT32*) &cdb[4])=block_number;
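A protocol structure of this kind may instead be filled one byte at a time so that the layout in memory is the same on every architecture. The sketch below assumes, for illustration only, that the protocol defines each field most significant byte first; the helper names put_be16, put_be32, and fill_cdb are hypothetical.

```c
#include <stdint.h>

/* Stores v at dst[0..1], most significant byte first. */
void put_be16(uint8_t *dst, uint16_t v)
{
    dst[0] = (uint8_t)(v >> 8);
    dst[1] = (uint8_t)(v & 0xFF);
}

/* Stores v at dst[0..3], most significant byte first. */
void put_be32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)(v);
}

/* Fills the same offsets as the example above (0, 2, and 4) without
 * casting the byte array to a wider integer type. */
void fill_cdb(uint8_t cdb[32], uint16_t lun, uint16_t siz, uint32_t block_number)
{
    put_be16(&cdb[0], lun);
    put_be16(&cdb[2], siz);
    put_be32(&cdb[4], block_number);
}
```

Writing single bytes also avoids any alignment requirement on the destination address.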
The following example illustrates an incompatibility caused by code that has a dependency on sizes of different types in an architecture. Additionally, language processors, such as compilers processing C or C++ code, may also vary sizes associated with certain data types. As an example, the following code may produce different results in accordance with the sizes of the data types that may vary with processor architecture and/or the selections made by a particular compiler or other processor of code:
typedef struct {
USHORT device;
USHORT target_number;
ULONG record_offset;
ULONG record_size;
} T_RECORD_INFO;
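The size of the foregoing struct may vary with processor architecture and/or language processor. For example, if data types of int, long, and all pointers are 32 bits, the C sizeof operator yields 12. If the data type of int is 32 bits and long and pointer are 64 bits, then sizeof yields 24 (assuming USHORT is a 16-bit type and ULONG is an unsigned long), because each ULONG field then occupies 8 bytes and 4 bytes of padding precede record_offset to satisfy its alignment.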
As another example, the size of a pointer variable may vary as well as whether data is aligned, the particular alignment boundary requirements, and the like.
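The size dependency can be observed, and avoided, with a short C sketch. It assumes that USHORT corresponds to unsigned short and ULONG to unsigned long, which the description does not state; the type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors T_RECORD_INFO using the compiler's native types. Its size
 * follows the width and alignment of unsigned long on the target. */
typedef struct {
    unsigned short device;
    unsigned short target_number;
    unsigned long  record_offset;
    unsigned long  record_size;
} native_record_info;

/* Variant using fixed-width types from <stdint.h>, so each field has the
 * same width on every architecture that provides these types. */
typedef struct {
    uint16_t device;
    uint16_t target_number;
    uint32_t record_offset;
    uint32_t record_size;
} fixed_record_info;

size_t native_record_size(void) { return sizeof(native_record_info); }
size_t fixed_record_size(void)  { return sizeof(fixed_record_info); }
```

On common ABIs fixed_record_size() is 12 everywhere, while native_record_size() is 12 where long is 32 bits and 24 where long is 64 bits. Fixed-width types remove the size dependency but not a byte-ordering dependency.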
It should be noted that coding dependencies may be dependent on one or more aspects of a computer architecture making the code non-portable. Although Big Endian vs. Little Endian formatting (e.g., byte ordering) is an example of one such aspect of a processor architecture described herein in more detail, it should be noted that CPU architectures may also vary in accordance with other aspects such as, for example, different word sizes, alignment requirements, and the like, some of which are illustrated above. The techniques described herein may be used in connection with detecting coding dependencies made in accordance with one or more of these or any other aspects as may exist in code.
Techniques described in following paragraphs can be used in connection with flagging data items which have unexpected differences in the Big Endian environment and the Little Endian environment, and examining the code where the data items are referenced, such as when the data items are being initialized or otherwise assigned values.
The techniques described herein may be used in connection with detecting data incompatibilities by examining the data value associated with a particular data item in two different environments, such as the Big Endian and the Little Endian environment described herein. The actual difference between the data items in the Big Endian and Little Endian environments is compared to an expected difference of the particular data item. In the event that the expected difference is not the same as the actual difference of a data item, the data item may be characterized as a data incompatibility candidate. The one or more source code statements at which this particular data item is referenced, such as, for example, where a variable may be initialized or otherwise assigned a value, may be examined. The particular source code statements corresponding to the data item flagged as a data incompatibility candidate may be examined to determine if the source code includes a coding incompatibility due to the source code being written in accordance with assumptions or dependencies of one particular environment. The source code written in accordance with the dependencies may cause the resulting executable code for each of the two environments to produce unexpected differences. Accordingly, such source code statements may be flagged and examined to determine if such statements should be rewritten to be Endian independent.
Referring now to
At step 304, both the Big Endian and Little Endian data storage systems may be configured such that there is preferably only a difference related to the CPU architecture and its associated conventions. In other words, the two data storage systems upon which the two code versions will be executed should differ as little as possible. Preferably, the only difference should be related to the CPU architecture upon which the code executes. Accordingly, differences such as data incompatibilities may be attributed to the CPU architectural differences. At step 306, the debug versions of the symbol tables for both the Big Endian and Little Endian code versions are parsed and used to produce symbol table analysis information for data items such as variables and data structures. It should be noted that in connection with step 306, one embodiment may have the host 12a request information in connection with the debug symbol tables from each of data storage systems 20a and 20b. In an alternate embodiment, a copy of the debug symbol table information may be made available to the code currently executing on the host 12a using other techniques. The symbol table information used in connection with producing symbol table analysis information of the step 306 is described in more detail elsewhere herein. Data obtained from the debug symbol table information may include, for example, data item names, addresses, data type and/or size information, references to other data items used to determine addresses, and the like. As known to those of ordinary skill in the art, an address of a data item may be determined once values for the symbols referenced in connection with the address are known. The foregoing name-to-address binding for a data item may occur at a variety of different times in accordance with what types of address expressions are allowed, when forward referencing is resolved, and the like. 
The name-to-address binding may occur, for example, at compile time, load time, or runtime/execution time. The symbol table analysis information may include information used in connection with resolving the address of each data item as may be allowed within a particular embodiment. At step 308, the host system 12a may issue commands, such as, for example, in connection with a debugger to execute corresponding code on each of the Big Endian and Little Endian data storage systems. In one embodiment, the code executed on each of the data storage systems in connection with step 308 may exercise a large number of logical code paths through a same set of module or modules on each of the data storage systems. Both of the data storage systems may have their code execution stop at a same point in order to examine memory contents of each of the data storage systems. At step 310, any run time information needed to complete runtime address resolution for any data items may be determined. The code execution on each of the data storage systems may be stopped after a particular point in time. The values of different data items on each of the data storage systems 20a and 20b may be examined by traversing each of the data elements as specified in the symbol table analysis information. The symbol table analysis information as described in connection with other figures includes an entry for each data item or variable. At step 312, current data item is assigned the next data item as identified in accordance with the symbol table analysis information. At step 314, a determination is made as to whether all data items have been examined. If so, processing stops. Otherwise control proceeds to step 316 to read the values for the current data item from each of the data storage systems stored in accordance with both the Big Endian and Little Endian data formats. At step 318, a determination is made as to whether the difference between the actual data values is an expected difference. 
If not, control proceeds to step 320 to store information about the particular incompatibility detected and control proceeds to step 312 to examine the next data item. In the event that no incompatibility is detected, control proceeds from step 318 directly to step 312. It should be noted in step 320 that the information stored about a particular incompatibility detected may include, for example, the entry and associated information for the data item in the symbol table analysis information, the expected difference, and the like.
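The loop of steps 312 through 320 may be sketched in Python as follows. This is an illustrative sketch only: the `DataItem` and `Incompatibility` records, the `find_candidates` function, and the two memory-reading callables are hypothetical stand-ins for the symbol table analysis information and for whatever debugger or host commands an embodiment uses to read memory on each data storage system. The expected difference is assumed to be a pure byte swap of a scalar value.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class DataItem:
    """Hypothetical entry taken from the symbol table analysis information."""
    name: str
    big_endian_address: int
    little_endian_address: int
    size: int


@dataclass
class Incompatibility:
    """Information stored at step 320 for a flagged data item."""
    item: DataItem
    big_endian_image: bytes
    little_endian_image: bytes
    expected_little_endian_image: bytes


# A reader takes (address, size) and returns the raw bytes at that location.
ReadMemory = Callable[[int, int], bytes]


def find_candidates(items: Iterable[DataItem],
                    read_big: ReadMemory,
                    read_little: ReadMemory) -> List[Incompatibility]:
    candidates = []
    for item in items:                                        # steps 312 and 314
        big = read_big(item.big_endian_address, item.size)    # step 316
        little = read_little(item.little_endian_address, item.size)
        expected = big[::-1]                                  # assumed expected difference
        if little != expected:                                # step 318
            candidates.append(
                Incompatibility(item, big, little, expected))  # step 320
    return candidates


# Simulated memory images standing in for the two stopped systems.
big_memory = bytes.fromhex("12345678" "0001")
little_memory = bytes.fromhex("78563412" "0001")

items = [DataItem("counter", 0, 0, 4), DataItem("flags", 4, 4, 2)]
found = find_candidates(
    items,
    lambda addr, size: big_memory[addr:addr + size],
    lambda addr, size: little_memory[addr:addr + size],
)
assert [c.item.name for c in found] == ["flags"]
```

In this simulated run, "counter" differs only by byte order and is passed over, while "flags" has the same byte image on both systems and is recorded together with its expected image.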
Referring now to
Each entry 410 may include the following information about a particular data item: name 412, type information 414, address information 416, and other information 418. A name 412 may be, for example, a programmer specified variable name such as may be included in the source code. Type information 414 may include, for example, data type information. The particular data types and associated sizes of each may vary in accordance with an embodiment. Address information 416 may include the actual addresses on both data storage systems which result from address resolution and binding. An address may be represented, for example, by an address expression as illustrated in entries 420 and 422 of the table 400. Entry 420 indicates that the address of data item “A” is the value of the symbol “LOC1”. In the event that LOC1 may be determined at load time, for example, the entry 420 may include a numeric value representing the address of LOC1. Entry 422 includes information about the data item “a.b.c” which may correspond, for example, to a field in a C structure. The address of “a.b.c” may be represented by the address expression “LOC2+10”. If the value of LOC2 is not known until a particular point at runtime, the address field of 422 may include a representation of the expression illustrated in
Data included in the other information field 418 may be used in connection with, for example, address resolution, linking together entries including references to a same data item, and the like, and may vary with each embodiment. For example, as known to those of ordinary skill in the art, address resolution may be performed in one or more passes over the table 400 and may depend, for example, on whether forward-referencing is allowed or in accordance with the complexity of the particular expressions that may be used in forming an address 416.
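Multi-pass address resolution over such a table may be sketched in Python as follows. The sketch assumes, for illustration only, that every unresolved address is a single "symbol + offset" expression and that symbols and data items share one table. The numeric values, the definition of LOC2 in terms of LOC1, and the `resolve_addresses` function are hypothetical.

```python
from typing import Dict, Tuple, Union

# An address is either a resolved integer or an unresolved (symbol, offset) pair.
AddressExpr = Union[int, Tuple[str, int]]


def resolve_addresses(table: Dict[str, AddressExpr],
                      max_passes: int = 10) -> Dict[str, int]:
    """Resolve address expressions in repeated passes until nothing is pending."""
    resolved = {n: a for n, a in table.items() if isinstance(a, int)}
    pending = {n: a for n, a in table.items() if not isinstance(a, int)}
    for _ in range(max_passes):
        if not pending:
            break
        progress = False
        for name, (symbol, offset) in list(pending.items()):
            if symbol in resolved:          # referenced symbol is now known
                resolved[name] = resolved[symbol] + offset
                del pending[name]
                progress = True
        if not progress:                    # remaining entries can never resolve
            break
    if pending:
        raise ValueError(f"unresolved entries: {sorted(pending)}")
    return resolved


table = {
    "A": ("LOC1", 0),         # address of "A" is the value of LOC1
    "a.b.c": ("LOC2", 10),    # address expression LOC2+10
    "LOC1": 0x1000,           # hypothetical value known at load time
    "LOC2": ("LOC1", 0x200),  # hypothetical forward reference, needs a second pass
}
addresses = resolve_addresses(table)
assert addresses["A"] == 0x1000
assert addresses["a.b.c"] == 0x120A
```

Here "a.b.c" cannot be resolved in the first pass because LOC2 is itself still pending, which is the kind of forward reference that makes more than one pass necessary.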
The execution of the steps of flowchart 300 of
The processing of flowchart 500 of
It should be noted that the processing steps of flowchart 500 of
The foregoing describes a technique for determining data incompatibilities between two different environments for handling memory accesses. In the example described herein, the incompatibility may be related to data byte ordering caused by code written in accordance with coding dependencies particular to one environment. However, the incompatibility may be related to other computing environmental differences.
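A coding dependency of the kind described above may be illustrated with a small Python simulation. It is a sketch under stated assumptions, not code from any embodiment: the `struct` module is used to mimic how each environment lays out memory, and the two `run_*` functions are hypothetical stand-ins for a source statement that derives a 16-bit data item from a 32-bit word.

```python
import struct


def run_dependent(byte_order: str) -> bytes:
    """Endian-dependent statement: reinterpret the two lowest-addressed bytes
    of a 32-bit word as a 16-bit value, as a pointer cast in C might do."""
    prefix = ">" if byte_order == "big" else "<"
    word = struct.pack(prefix + "I", 0x11223344)      # word as laid out in memory
    (half,) = struct.unpack(prefix + "H", word[:2])   # reads whatever lies first
    return struct.pack(prefix + "H", half)            # memory image of the result


def run_independent(byte_order: str) -> bytes:
    """Endian-independent rewrite: select the upper half arithmetically."""
    prefix = ">" if byte_order == "big" else "<"
    half = 0x11223344 >> 16
    return struct.pack(prefix + "H", half)


# Dependent code: the Big Endian system stores 0x1122, the Little Endian
# system stores 0x3344, so the images differ by more than a byte swap.
assert run_dependent("big")[::-1] != run_dependent("little")

# Independent code: both systems store 0x1122 and the images are byte swaps.
assert run_independent("big")[::-1] == run_independent("little")
```

The data item produced by the dependent statement would be flagged as an incompatibility candidate, and examining the statement would show that it relies on which half of the word is stored at the lowest address.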
While the invention has been disclosed in connection with preferred embodiments shown and described in detail, modifications and improvements thereon will become readily apparent to those skilled in the art. Accordingly, the spirit and scope of the present invention should be limited only by the following claims.
Claims
1. A computer-implemented method for detecting incompatibilities comprising:
- determining a first contents of a data item, wherein said first contents is a first formatted data value of the data item assigned to the data item by a statement of a program during execution of the program by a first processor having a first processor architecture, said first formatted data value having a representation in accordance with a first set of conventions associated with the first processor architecture;
- determining a second contents of said data item, wherein said second contents is a second formatted data value of the data item assigned to the data item by a statement of the program during execution of the program by a second processor having a second processor architecture, said second formatted data value having a representation in accordance with a second set of conventions associated with the second processor architecture and including at least one convention that is not included in said first set;
- determining an expected data value using one of said first contents and said second contents;
- determining whether said expected data value and another of said first contents and said second contents are different; and
- if said expected data value and said another are different, determining said data item as an incompatibility candidate, wherein said determining an expected data value includes swapping an ordering of bytes of said one of said first contents and said second contents, and said determining whether said expected data value and another of said first contents and said second contents are different includes comparing said expected data value and said another of said first contents and said second contents.
2. The method of claim 1, wherein said first set of conventions includes at least a first convention specifying that data is stored in a memory in accordance with a first byte ordering and said second set of conventions includes at least a second convention specifying that data is stored in a memory in accordance with a second different byte ordering.
3. The method of claim 2, wherein said first convention specifies that a most significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item.
4. The method of claim 3, wherein said second convention specifies that a least significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item.
5. The method of claim 4, further comprising:
- determining a source code statement including at least one reference to said data item wherein said source code statement includes code written in accordance with one of said first convention or said second convention causing said another of said first contents and said second contents to vary from said expected data value.
6. The method of claim 1, wherein said first processor architecture is included in a component of a first type in a first data storage system, and said second processor architecture is included in a component of said first type in a second data storage system.
7. The method of claim 1, further comprising:
- determining at least one of a first address associated with a first memory location of said first contents and a second address associated with a second memory location of said second contents using debug symbol table information.
8. A computer-implemented method for detecting incompatibilities comprising:
- determining a first contents of a data item in accordance with a first set of conventions associated with a first processor architecture;
- determining a second contents of said data item in accordance with a second set of conventions associated with a second processor architecture and including at least one convention that is not included in said first set;
- determining an actual difference between said first contents and said second contents;
- determining whether said actual difference is expected; and
- if said actual difference is not expected, determining said data item as an incompatibility candidate;
- determining at least one of a first address associated with a first memory location of said first contents and a second address associated with a second memory location of said second contents using debug symbol table information;
- preparing a first code set including debug information for execution by said first processor architecture; and
- preparing a second code set including debug information for execution by said second processor architecture, said first and second code sets being produced using at least a same portion of source code, said portion of source code including at least one source code statement referencing said data item, said at least one source code statement being written in accordance with a first convention included in only one of said first or said second sets of conventions, said at least one source code statement causing said actual difference to be unexpected.
9. A system comprising:
- a first data storage system including a first processor architecture operating in accordance with a first set of conventions;
- a second data storage system including a second processor architecture operating in accordance with a second set of conventions including at least one convention that is not included in said first set;
- a host comprising code that: determines an expected data value using one of a first contents of a data item stored in said first data storage system and a second contents of said data item stored in said second data storage system, wherein said first contents is a first formatted data value of the data item assigned to the data item by a statement of the program during execution of the program on the first data storage system, said first formatted data value having a representation in accordance with the first set of conventions associated with the first processor architecture, wherein said second contents is a second formatted data value of the data item assigned to the data item by a statement of the program during execution of the program on the second data storage system, said second formatted data value having a representation in accordance with the second set of conventions associated with the second processor architecture; determines whether said expected data value and another of said first contents and said second contents are different; and if said expected data value and said another are different, determines said data item as an incompatibility candidate, and wherein the code that determines an expected data value includes code that swaps an ordering of bytes of said one of said first contents and said second contents, and the code that determines whether said expected data value and another of said first contents and said second contents are different includes code that compares said expected data value and said another of said first contents and said second contents.
10. The system of claim 9, wherein said first set of conventions includes a first convention specifying that a most significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item, and said second set of conventions includes a second convention specifying that a least significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item.
11. A computer readable medium comprising code stored thereon that detects incompatibilities, the computer readable medium comprising code stored thereon that:
- determines a first contents of a data item, wherein said first contents is a first formatted data value of the data item assigned to the data item by a statement of a program during execution of the program by a first processor having a first processor architecture, said first formatted data value having a representation in accordance with a first set of conventions associated with the first processor architecture;
- determines a second contents of said data item, wherein said second contents is a second formatted data value of the data item assigned to the data item by a statement of the program during execution of the program by a second processor having a second processor architecture, said second formatted data value having a representation in accordance with a second set of conventions associated with said second processor architecture and including at least one convention that is not included in said first set;
- determines an expected data value using one of said first contents and said second contents;
- determines whether said expected data value and another of said first contents and said second contents are different; and
- if said expected data value and said another are different, determines said data item as an incompatibility candidate, and wherein the code that determines an expected data value includes code that swaps an ordering of bytes of said one of said first contents and said second contents, and the code that determines whether said expected data value and another of said first contents and said second contents are different includes code that compares said expected data value and said another of said first contents and said second contents.
12. The computer readable medium of claim 11, wherein said first set of conventions includes at least a first convention specifying that data is stored in a memory in accordance with a first byte ordering and said second set of conventions includes at least a second convention specifying that data is stored in a memory in accordance with a second different byte ordering.
13. The computer readable medium of claim 12, wherein said first convention specifies that a most significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item.
14. The computer readable medium of claim 13, wherein said second convention specifies that a least significant byte of data of said data item is stored in a lowest memory address of a storage location associated with said data item.
15. The computer readable medium of claim 14, further comprising code stored thereon that:
- determines a source code statement including at least one reference to said data item wherein said source code statement includes code written in accordance with one of said first convention or said second convention causing said another of said first contents and said second contents to vary from said expected data value.
16. The computer readable medium of claim 11, wherein said first processor architecture is included in a component of a first type in a first data storage system, and said second processor architecture is included in a component of said first type in a second data storage system.
17. The computer readable medium of claim 11, further comprising code stored thereon that:
- determines at least one of a first address associated with a first memory location of said first contents and a second address associated with a second memory location of said second contents using debug symbol table information.
18. A computer readable medium comprising code stored thereon that detects incompatibilities, the computer readable medium comprising code stored thereon that:
- determines a first contents of a data item in accordance with a first set of conventions associated with a first processor architecture;
- determines a second contents of said data item in accordance with a second set of conventions associated with a second processor architecture and including at least one convention that is not included in said first set;
- determines an actual difference between said first contents and said second contents;
- determines whether said actual difference is expected;
- if said actual difference is not expected, determines said data item as an incompatibility candidate;
- determines at least one of a first address associated with a first memory location of said first contents and a second address associated with a second memory location of said second contents using debug symbol table information;
- prepares a first code set including debug information for execution by said first processor architecture; and
- prepares a second code set including debug information for execution by said second processor architecture, said first and second code sets being produced using at least a same portion of source code, said portion of source code including at least one source code statement referencing said data item, said at least one source code statement being written in accordance with a first convention included in only one of said first or said second sets of conventions, said at least one source code statement causing said actual difference to be unexpected.
19. The method of claim 1, further comprising:
- executing a portion of code on said first processor architecture, wherein said first contents are produced as a result of said executing said portion of code on said first processor architecture; and
- executing said portion of code on said second processor architecture, wherein said second contents are produced as a result of said executing said portion of code on said second processor architecture, wherein said determining an expected data value and said determining whether said expected data value and another of said first contents and said second contents are different are performed after said executing a portion of code on said first processor architecture and after said executing said portion of code on said second processor architecture.
20. The computer readable medium of claim 11, wherein one of the first processor architecture and the second processor architecture uses little endian data formatting and another of the first processor architecture and the second processor architecture uses big endian data formatting.
Type: Grant
Filed: May 3, 2005
Date of Patent: May 11, 2010
Assignee: EMC Corporation (Hopkinton, MA)
Inventors: Ofer E. Michael (Newton, MA), Josef Ezra (Ashland, MA), Dar S. Efroni (Ashland, MA)
Primary Examiner: Wei Y Zhen
Assistant Examiner: Phillip H Nguyen
Attorney: Muirhead and Saturnelli, LLC
Application Number: 11/120,602
International Classification: G06F 9/44 (20060101);