DECODING DEVICE AND COMPUTER PROGRAM PRODUCT
According to an embodiment, a decoding device includes: a holding unit including a first level cache to hold binary data together with a result of decoding the binary data into a structured document by a decoder, and a second level cache to hold partial data pieces into which the binary data is divided in predetermined units of events of the structured document and the result of decoding corresponding to the partial data pieces; and a retention determiner to divide the binary data in the predetermined units of events, and store the partial data pieces and the result of decoding corresponding to the partial data pieces into the second level cache. When the binary data that is input includes a part matching a partial data piece held by the second level cache in the predetermined units of events, the decoder outputs the decoding result corresponding to the matching partial data piece.
This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2013-060587, filed on Mar. 22, 2013; the entire contents of which are incorporated herein by reference.
FIELD
Embodiments described herein relate generally to a decoding device and a computer program product.
BACKGROUND
The amount of data in structured documents such as XML documents has been increasing, and such documents are therefore not well suited to high-speed data processing or to processing that handles a large number of XML documents. Efficient XML Interchange (EXI) has therefore been proposed as a standard for efficient, high-speed data processing. EXI converts an XML document into an EXI stream, a binarized representation produced according to the XML schema. Because binarized data is dramatically smaller in volume, this can contribute to efficient data communication and processing.
Furthermore, to check data binarized in this way, a user inputs the EXI stream to a decoding device that has the same logic as the state machine used to binarize the XML document, and the decoding device outputs the original XML document. Since the output XML document is human-readable text, the user can check its content.
The EXI stream is encoded or decoded bit-by-bit. Typically, reading and writing data bit-by-bit imposes a heavy load and tends to decrease the processing speed. For a decoding device installed in a server that receives all EXI streams output from numerous devices, or for a decoding device installed in a device with low processing speed, processing that reads and writes data bit-by-bit may not be fast enough.
According to an embodiment, a decoding device includes a decoder, a holding unit, and a retention determiner. The decoder decodes binary data into a structured document according to a state machine that has been used to convert the structured document into binary data. The holding unit includes a first level cache and a second level cache. The first level cache holds the binary data together with the result of decoding the binary data into the structured document by the decoder. The second level cache holds partial data pieces, into which the binary data held by the first level cache is divided in predetermined units of events of the structured document, and the result of decoding that corresponds to the partial data pieces. The retention determiner generates the partial data pieces by dividing the binary data held by the first level cache in the predetermined units of events, and stores the generated partial data pieces and the corresponding result of decoding into the second level cache. When the binary data that is input includes a part matching a partial data piece held by the second level cache in the predetermined units of events, the decoder outputs the result of decoding corresponding to the matching partial data piece held by the second level cache.
First Embodiment
An embodiment in which a decoding device is embodied as a smart meter will be described below.
The search proceeds as follows. First, a search is made for a partial EXI stream having a state that matches the current state of the state machine of the EXI stream decoding unit 301. If such a partial EXI stream is found, the binary data constituting the partial EXI stream is compared with the received binary data, and the partial EXI stream is determined to match only if the binary data match in all bits. A partial EXI stream is binary data that is part of an EXI stream and has been extracted from the EXI stream. Partial EXI streams are obtained by dividing an EXI stream held by the holding unit 302 in predetermined units of events. The unit of events is the data width of a code corresponding to a transition of the state machine used for binarization and decoding of a structured document, or the data width of a content of the structured document.
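The two-stage search described above (state match first, then a bit-for-bit comparison) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `PartialEntry` and `find_match`, the dictionary keyed by start state, and the representation of bits as a string of '0' and '1' characters are all hypothetical choices made for readability.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical (Type, State) pair naming a position in the state machine.
State = Tuple[str, str]


@dataclass(frozen=True)
class PartialEntry:
    bits: str                 # partial EXI stream, written as '0'/'1' characters
    events: Tuple[str, ...]   # decoding result held for those bits
    start_state: State        # state at which the bits begin
    end_state: State          # state reached after the bits are consumed


def find_match(cache: Dict[State, List[PartialEntry]],
               current_state: State,
               remaining_bits: str) -> Optional[PartialEntry]:
    """Return a cached entry usable at the current position, or None."""
    # Stage 1: consider only entries whose start state equals the current state.
    for entry in cache.get(current_state, []):
        # Stage 2: every bit of the entry must match the head of the input.
        if entry.bits and remaining_bits.startswith(entry.bits):
            return entry
    return None
```

Indexing the cache by start state keeps the bit comparison limited to candidates that could legally occur at the current position of the state machine.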
If there is a matching partial EXI stream in the second level cache of the holding unit 302 (step S401: Yes), the EXI stream decoding unit 301 skips the decoding process for the partial EXI stream, uses the decoding result held by the holding unit 302, and resumes decoding of the remaining EXI stream from the state corresponding to the decoding end position (step S402). If there is no matching partial EXI stream (step S401: No), the EXI stream decoding unit 301 performs normal decoding (step S404). The EXI stream decoding unit 301 then determines whether or not decoding of the whole EXI stream is completed (step S403), and terminates the process if it is completed.
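The flow of steps S401 to S404 can be sketched as a loop. This is a hedged illustration only: `decode_with_cache` and the `decode_normally` callback are hypothetical names, the cache layout (start state mapped to a list of tuples) is an assumption, and the normal decoder is left abstract because its rules depend on the schema.

```python
def decode_with_cache(bits, cache, state, decode_normally):
    """Decode `bits`, reusing held decoding results where possible.

    cache: dict mapping a start state to a list of
           (partial_bits, events, end_state) tuples (assumed layout).
    decode_normally: hypothetical callback (bits, pos, state) ->
           (events, new_pos, new_state); it must consume at least one bit.
    """
    pos, output = 0, []
    while pos < len(bits):                        # S403: stream not finished yet
        # S401: look for a held partial stream that starts in the current
        # state and matches the input bit-for-bit at the current position.
        hit = next((c for c in cache.get(state, [])
                    if c[0] and bits.startswith(c[0], pos)), None)
        if hit is not None:                       # S402: reuse the held result
            partial_bits, events, state = hit     # resume from the end state
            pos += len(partial_bits)
        else:                                     # S404: normal decoding
            events, pos, state = decode_normally(bits, pos, state)
        output.extend(events)
    return output
```

On a cache hit the loop advances by the full length of the held partial stream and continues from its recorded end state, so the bit-by-bit decoder is never invoked for those bits.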
Next, decoding of the EXI stream “1000 0000 1000 0000 1000 0000” is described, followed by decoding of the EXI stream “1000 0000 1000 0000 1100 0000”. Since the holding unit 302 holds no data when the first EXI stream is decoded, the EXI stream decoding unit 301 performs normal decoding from the start to the end according to the decoding rule.
If an XML document obtained by decoding “1000 0000 1000 0000 1000 0000” is:
<A><B>0</B></A>,
the event and the content that are the decoding result will be:
StartElement(A) StartElement(B) Character(Boolean) Value(0) EndElement EndElement EndDocument.
The EXI stream decoding unit 301 provides the EXI stream and the decoding result, together with the information on the state machine for the decoding rule having events as transitions, to the holding unit 302. The holding unit 302 holds the information in the first level cache. The retention determining unit 303 checks at a predetermined timing whether data is present in the holding unit 302. At this point, data is present only in the first level cache.
The retention determining unit 303 divides the data held in the first level cache into two partial EXI streams as follows:
Partial EXI stream: 10000000100000001
Decoding result: StartElement(A) StartElement(B) Character(Boolean)
Start position: Type=Document, State=init
End position: Type=B, State=Term1
Number of references: 0
Partial EXI stream: 0000000
Decoding result: Value(0) EndElement EndElement EndDocument
Start position: Type=B, State=Term1
End position: Type=Document, State=Term2
Number of references: 0.
The retention determining unit 303 then holds the two partial EXI streams obtained by the division in the second level cache of the holding unit 302. In this case, the data obtained by the division may overlap in such a manner as:
StartElement(A) StartElement(B) Character(Boolean) StartElement(B) Character(Boolean).
To limit the number of data pieces held, a condition may be imposed that a partial EXI stream whose length is equal to or shorter than a threshold is not held. Since the processing load of decoding a short partial EXI stream is not very heavy, this reduces the capacity required of the second level cache while maintaining the processing efficiency. It is assumed here that there is no second data to be held. “Type”, “State” and the values of the start position and the end position are names used to represent positions in the state machine for the decoding rule, and any names may be used as long as the positions can be identified. In this example, the stream is divided immediately before a content.
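The division rule and the length threshold can be sketched as follows. This is an illustration under stated assumptions: `split_before_content` is a hypothetical name, each decoded event is assumed to carry its own bit width and a flag marking it as a content, and the start and end state positions are omitted for brevity.

```python
def split_before_content(bits, decoded, min_len=0):
    """Divide a decoded stream immediately before each content.

    bits:    the EXI stream as a string of '0'/'1' characters.
    decoded: list of (event, width_in_bits, is_content) tuples in stream
             order (assumed bookkeeping produced by the normal decoder).
    min_len: pieces whose length is equal to or shorter than this are dropped.
    Returns a list of (partial_bits, events) pairs.
    """
    pieces, start, pos, events = [], 0, 0, []
    for event, width, is_content in decoded:
        # Cut immediately before a content, unless nothing precedes it.
        if is_content and events:
            pieces.append((bits[start:pos], tuple(events)))
            start, events = pos, []
        events.append(event)
        pos += width
    if events:  # the tail after the last cut
        pieces.append((bits[start:pos], tuple(events)))
    # Short pieces are cheap to decode again, so they are not held.
    return [(b, e) for b, e in pieces if len(b) > min_len]
```

Applied to the first EXI stream with illustrative per-event widths of 8, 8, 1, 1, 2, 2, and 2 bits (an assumption; only the 17-bit and 7-bit totals follow from the example), the function yields the two partial EXI streams listed above. With `min_len=7`, only the 17-bit piece is kept.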
Subsequently, the EXI stream decoding unit 301 decodes the second EXI stream. The EXI stream decoding unit 301 searches whether or not data matching the current decoding position (the EXI stream and the state machine for the decoding rule) is present in the second level cache of the holding unit 302. Out of the second EXI stream “1000 0000 1000 0000 1100 0000”, data up to “1000 0000 1000 0000 1” matches.
The EXI stream decoding unit 301 thus refers to data:
Partial EXI stream: 10000000100000001
Decoding result: StartElement(A) StartElement(B) Character(Boolean)
Start position: Type=Document, State=init
End position: Type=B, State=Term1
Number of references: 0
as matching data in the second level cache and obtains the data as the decoding result.
Subsequently, since there is no decoded partial EXI stream corresponding to the remaining EXI stream “1000000” in the second level cache, the EXI stream decoding unit 301 decodes the remaining EXI stream from the end position.
If the event and the content resulting from decoding the remaining EXI stream “1000000” are:
Value(1) EndElement EndElement EndDocument,
the entire decoding result including the referred data will be:
StartElement(A) StartElement(B) Character(Boolean) Value(1) EndElement EndElement EndDocument.
With the decoding device according to the present embodiment as described above, a result of decoding a partial EXI stream is used for decoding an EXI stream if a decoded partial EXI stream having the same bits is present, which allows redundant processing in decoding to be skipped. It is therefore possible to decode an EXI stream more efficiently.
Second Embodiment
Next, an embodiment in which a decoding device is installed as a home server 102 will be described.
Specifically, it is assumed that the results of decoding three EXI streams:
100000001000000010000000
100000010000000010000000
1000000010000000100000001000000011000000
are:
StartElement(A) StartElement(B) Character(Boolean) Value(0) EndElement EndElement EndDocument
StartElement(A) StartElement(C) Character(Boolean) Value(0) EndElement EndElement EndDocument
StartElement(A) StartElement(B) Character(Boolean) Value(0) EndElement StartElement(B) Character(Boolean) Value(1) EndElement EndElement EndDocument.
In this case, the state corresponding to <B> </B> in an XML document occurs three times. If the threshold is three, partial data containing this state out of the divided EXI stream is held in the second level cache. Alternatively, counting may be performed before dividing the EXI streams and the EXI streams may be divided according to the counting results.
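The counting step can be sketched as follows. This is a simplified illustration, not the method of the embodiment: `frequent_transitions` is a hypothetical name, and counting StartElement events in the decoding results is an assumption used here as a stand-in for counting transitions of the state machine.

```python
from collections import Counter


def frequent_transitions(decoded_results, threshold):
    """Count StartElement events over all decoding results and return
    those whose count is equal to or larger than the threshold."""
    counts = Counter(
        event
        for result in decoded_results      # one list of events per EXI stream
        for event in result
        if event.startswith("StartElement(")
    )
    return {event: n for event, n in counts.items() if n >= threshold}
```

For the three decoding results above and a threshold of three, StartElement(B) is counted three times and is selected, while StartElement(C) is counted once and is not. Under this simplified counting StartElement(A) also reaches three, since it occurs once in each document.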
According to the present embodiment, a partial EXI stream that frequently occurs is selectively held, which allows efficient decoding while reducing the cache capacity.
The decoding device according to the embodiments described above includes a control device such as a CPU, a storage device such as a read only memory (ROM) and a random access memory (RAM), an external storage device such as an HDD or a CD drive, a display device such as a display, and an input device such as a keyboard and a mouse, which is a hardware configuration utilizing a common computer system.
Programs to be executed by the decoding device according to the embodiments described above are recorded on a computer readable recording medium such as a CD-ROM, a flexible disk (FD), a CD-R, and a digital versatile disk (DVD) in a form of a file that can be installed or executed, and provided therefrom.
Alternatively, the programs in the embodiments described above may be stored on a computer system connected to a network such as the Internet, and provided by being downloaded via the network. Still alternatively, the programs to be executed by the decoding device according to the embodiments described above may be provided or distributed through a network such as the Internet. Still alternatively, the programs in the embodiments described above may be embedded on a ROM or the like in advance and provided therefrom.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Claims
1. A decoding device comprising:
- a decoder configured to decode binary data into a structured document according to a state machine that has been used to convert the structured document into binary data;
- a holding unit including a first level cache and a second level cache, the first level cache being configured to hold a result of decoding the binary data into the structured document by the decoder with the binary data, and the second level cache being configured to hold partial data pieces into which the binary data held by the first level cache is divided in predetermined units of events of the structured document and the result of decoding that corresponds to the partial data pieces; and
- a retention determiner configured to generate the partial data pieces by dividing the binary data held by the first level cache in the predetermined units of events, and storing the generated partial data pieces and the result of decoding corresponding to the partial data pieces into the second level cache, wherein
- when the binary data that is input includes a part matching a partial data piece held by the second level cache in the predetermined units of events, the decoder outputs the result of decoding corresponding to the matching partial data piece held by the second level cache.
2. The device according to claim 1, wherein the holding unit holds, in the second level cache, the partial data piece containing a data width of a code corresponding to transition of the state machine or a bit string that is part of the binary data obtained by dividing the binary data according to a data width of a content in the structured document, the result of decoding corresponding to the bit string, a state of the state machine corresponding to start of the bit string, and a state of the state machine corresponding to end of the bit string.
3. The device according to claim 1, wherein the retention determiner generates the partial data pieces by dividing the binary data held by the first level cache in units of events of the structured document, and sets the state of the state machine corresponding to the start of the bit string to be an index when holding the partial data pieces with the decoding result in the second level cache.
4. The device according to claim 1, wherein the retention determiner generates the partial data pieces by dividing the binary data held by the first level cache in such a way that a bit at a position immediately after a content is a start point and a bit at a position immediately before a content is an end point in the result of decoding, and holds the partial data pieces with the result of decoding, and position information containing the start points and the end points in the second level cache.
5. The device according to claim 1, wherein
- the decoder stores a count obtained by counting the number of transitions between states of the state machine used to binarize the structured document and decoding the binary data in the first level cache when decoding the binary data, and
- the retention determiner stores the partial data pieces whose count of the states is a predetermined threshold or larger in the second level cache.
6. The device according to claim 1, wherein the retention determiner prevents the holding unit from holding a partial data piece having a bit string that is smaller than a predetermined threshold.
7. The device according to claim 1, wherein the retention determiner stores the partial data pieces in the second level cache when the device is in an idle state.
8. The device according to claim 1, wherein the retention determiner deletes the partial data pieces by using a predetermined cache algorithm when a remaining capacity of the first level cache or the second level cache becomes less than a predetermined threshold.
9. A computer program product comprising a computer-readable medium containing a computer program that causes a computer to function as:
- a decoder configured to decode binary data into a structured document according to a state machine that has been used to convert the structured document into binary data;
- a holding unit including a first level cache configured to hold a result of decoding the binary data into the structured document by the decoder with the binary data, and a second level cache configured to hold partial data pieces into which the binary data held by the first level cache is divided in predetermined units of events of the structured document and the result of decoding that corresponds to the partial data pieces; and
- a retention determiner configured to generate the partial data pieces by dividing the binary data held by the first level cache in the predetermined units of events, and storing the generated partial data pieces and the result of decoding corresponding to the partial data pieces into the second level cache, wherein
- when the binary data that is input includes a part matching a partial data piece held by the second level cache in the predetermined units of events, the decoder outputs the decoding result corresponding to the matching partial data piece held by the second level cache.
Type: Application
Filed: Jan 3, 2014
Publication Date: Sep 25, 2014
Applicant: KABUSHIKI KAISHA TOSHIBA (Tokyo)
Inventors: Yumiko SAKAI (Kawasaki-shi), Yusuke DOI (Yokohama-shi)
Application Number: 14/146,856
International Classification: G06F 17/30 (20060101);