Patents Assigned to Andes Technology Corporation
-
Publication number: 20240004647
Abstract: A vector processor with a vector reduction method and an element reduction method is provided. The vector processor includes a vector register file and first and second lanes. In the vector reduction method, the first lane loads a first operand and a first part of a second operand based on a first state parameter and performs a first reduction operation on the first operand and the first part of the second operand to generate a first part of a first reduction result. The second lane loads a second part of the second operand based on the first state parameter and uses the second part of the second operand as a second part of the first reduction result. One of the first lane or the second lane performs a second reduction operation on the first and second parts of the first reduction result to generate a second reduction result.
Type: Application
Filed: July 1, 2022
Publication date: January 4, 2024
Applicant: ANDES TECHNOLOGY CORPORATION
Inventor: Chia-Wei Hsu
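Illustration (not taken from the patent): a minimal Python sketch of the two-step reduction the abstract describes, assuming each "part" of the second operand is a single value and ignoring the state parameter. The function names and the choice of addition as the default operation are hypothetical.

```python
import operator

def lane0_first_reduction(first_operand, second_part0, op):
    # Lane 0 folds the first operand into its part of the second operand.
    return op(first_operand, second_part0)

def lane1_first_reduction(second_part1):
    # Lane 1 passes its part through unchanged as its share of the first result.
    return second_part1

def vector_reduce(first_operand, second_operand, op=operator.add):
    # second_operand is assumed to hold exactly two parts, one per lane.
    part0, part1 = second_operand
    r0 = lane0_first_reduction(first_operand, part0, op)
    r1 = lane1_first_reduction(part1)
    # One lane then combines both partial results (the second reduction).
    return op(r0, r1)
```

With addition, `vector_reduce(10, (3, 4))` combines the accumulator 10 with both parts; passing `max` as `op` gives a max-reduction instead.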
-
Publication number: 20230418614
Abstract: A processor, an operation method, and a load-store device are provided. The processor is adapted to access a memory. The processor includes a vector register file (VRF) and the load-store device. The load-store device is coupled to the VRF. The load-store device performs a strided operation on the memory. In a current iteration of the strided operation, the load-store device reads a plurality of first data elements at a plurality of discrete addresses in the memory and writes the first data elements into the VRF, or the load-store device reads a plurality of second data elements from the VRF and writes the second data elements into a plurality of discrete addresses in the memory during the current iteration of the strided operation.
Type: Application
Filed: June 22, 2022
Publication date: December 28, 2023
Applicant: ANDES TECHNOLOGY CORPORATION
Inventor: Chia-Wei Hsu
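Illustration (not taken from the patent): a minimal Python sketch of the address pattern behind a strided load and store, assuming memory is a flat list and the discrete addresses are `base + i * stride`. The function names are hypothetical, and the sketch does not model how many elements the hardware handles per iteration.

```python
def strided_load(memory, base, stride, count):
    # Gather `count` elements from discrete addresses into a register-like list.
    return [memory[base + i * stride] for i in range(count)]

def strided_store(memory, base, stride, elements):
    # Scatter register elements back to discrete addresses, in place.
    for i, value in enumerate(elements):
        memory[base + i * stride] = value
```

For example, loading three elements with base 1 and stride 4 touches addresses 1, 5, and 9.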
-
Patent number: 11797442
Abstract: An integrated circuit and a method for executing a cache management operation are provided. The integrated circuit includes a master interface, a slave interface, and a link. The link is connected between the master interface and the slave interface, and the link includes an A-channel, a B-channel, a C-channel, a D-channel, and an E-channel. The A-channel is configured to transmit a cache management operation message of the master interface to the slave interface, and the cache management operation message is configured to manage data consistency between different data caches. The D-channel is configured to transmit a cache management operation acknowledgement message of the slave interface to the master interface.
Type: Grant
Filed: October 18, 2021
Date of Patent: October 24, 2023
Assignee: ANDES TECHNOLOGY CORPORATION
Inventors: Zhong-Ho Chen, Yu-Lin Hsiao, Hsin Ming Chen
-
Patent number: 11687347
Abstract: A microprocessor and a method for issuing a load/store instruction are introduced. The microprocessor includes a decode/issue unit, a load/store queue, a scoreboard, and a load/store unit. The scoreboard includes a plurality of scoreboard entries, in which each scoreboard entry includes an unknown bit value and a count value, wherein the unknown bit value or the count value is set when instructions are issued. The decode/issue unit checks for WAR, WAW, and RAW data dependencies from the scoreboard and dispatches load/store instructions to the load/store queue with the recorded scoreboard values. The load/store queue is configured to resolve the data dependencies and dispatch the load/store instructions to the load/store unit for execution.
Type: Grant
Filed: May 25, 2021
Date of Patent: June 27, 2023
Assignee: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Publication number: 20230122423
Abstract: An integrated circuit and a method for executing a cache management operation are provided. The integrated circuit includes a master interface, a slave interface, and a link. The link is connected between the master interface and the slave interface, and the link includes an A-channel, a B-channel, a C-channel, a D-channel, and an E-channel. The A-channel is configured to transmit a cache management operation message of the master interface to the slave interface, and the cache management operation message is configured to manage data consistency between different data caches. The D-channel is configured to transmit a cache management operation acknowledgement message of the slave interface to the master interface.
Type: Application
Filed: October 18, 2021
Publication date: April 20, 2023
Applicant: ANDES TECHNOLOGY CORPORATION
Inventors: Zhong-Ho Chen, Yu-Lin Hsiao, Hsin Ming Chen
-
Publication number: 20220382546
Abstract: The mask data corresponding to each data element of the issued instruction may be handled by a mask queue, where only the valid mask data are stored to the mask queue. The mask data of multiple vector instructions may be stored in the mask queue. The corresponding mask data may be accessed from the mask queue when the vector instruction(s) is dispatched from the execution queue to the functional unit for execution. In the case where 512-bit wide mask data is needed, the issuing of the vector instruction from the decode/issue unit to the execution queue may be stalled until the mask queue is available. In some embodiments, one mask queue may be dedicated to one execution queue. Alternatively, one mask queue may be shared between two different execution queues. In the disclosure, resources are conserved without dedicating additional storage space for handling mask data of the vector instruction.
Type: Application
Filed: May 31, 2021
Publication date: December 1, 2022
Applicant: ANDES TECHNOLOGY CORPORATION
Inventors: Thang Minh Tran, Chia-Wei Hsu
-
Publication number: 20220382547
Abstract: A microprocessor and a method for issuing a load/store instruction are introduced. The microprocessor includes a decode/issue unit, a load/store queue, a scoreboard, and a load/store unit. The scoreboard includes a plurality of scoreboard entries, in which each scoreboard entry includes an unknown bit value and a count value, wherein the unknown bit value or the count value is set when the instructions are issued. The decode/issue unit checks for WAR, WAW, and RAW data dependencies from the scoreboard and dispatches the load/store instructions to the load/store queue with the recorded scoreboard values. The load/store queue is configured to resolve the data dependencies and dispatch the load/store instructions to the load/store unit for execution.
Type: Application
Filed: May 25, 2021
Publication date: December 1, 2022
Applicant: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Patent number: 11467841
Abstract: A microprocessor that includes a shared functional unit, a first execution queue and a second execution queue is introduced. The first execution queue includes a plurality of entries, wherein each entry of the first execution queue includes a first count value which is decremented until the first count value reaches 0. The first execution queue dispatches the first-type instruction to the shared functional unit when the first count value reaches 0. The second execution queue includes a plurality of entries, wherein each entry of the second execution queue comprises a second count value which is decremented until the second count value reaches 0. The second execution queue dispatches the second-type instruction to the shared functional unit when the second count value reaches 0.
Type: Grant
Filed: June 1, 2021
Date of Patent: October 11, 2022
Assignee: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Patent number: 11281586
Abstract: The invention provides a processor including a prediction table, a prediction logic circuit, and a prediction verification circuit. The prediction table has a plurality of sets respectively corresponding to a plurality of cache sets of a cache memory in the cache system, each of the sets has a plurality of confidence values, and the prediction table provides the confidence values of a selected set according to the index. The prediction logic circuit receives the confidence values of the selected set, and generates a prediction result by judging whether each of the confidence values of the selected set is larger than a threshold value or not. The prediction verification circuit receives the prediction result, generates correct/incorrect information according to the prediction result, and generates update information according to the correct/incorrect information. The prediction verification circuit updates the confidence values of the prediction table according to the update information.
Type: Grant
Filed: May 9, 2017
Date of Patent: March 22, 2022
Assignee: ANDES TECHNOLOGY CORPORATION
Inventors: Kun-Ho Liu, Chieh-Jen Cheng, Chuan-Hua Chang, I-Cheng Kevin Chen
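Illustration (not taken from the patent): a minimal Python sketch of threshold-based way prediction over one set's confidence values. The abstract only says the values are compared with a threshold and later updated; the saturating increment/decrement policy, the `max_value` of 3, and the function names below are assumptions.

```python
def predict_ways(confidences, threshold):
    # A way is predicted (opened) when its confidence exceeds the threshold.
    return [c > threshold for c in confidences]

def update_confidences(confidences, hit_way, max_value=3):
    # Assumed policy: saturating increment for the way that actually hit,
    # saturating decrement for every other way in the set.
    return [
        min(c + 1, max_value) if way == hit_way else max(c - 1, 0)
        for way, c in enumerate(confidences)
    ]
```

Comparing the predicted ways with the way that actually hit gives the correct/incorrect signal that would drive the update step.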
-
Patent number: 11263013
Abstract: A processor that includes a register file, a read shifter, a decode unit and a plurality of functional units is introduced. The register file includes a read port. The read shifter includes a plurality of shifter entries and is configured to shift out a shifter entry among the plurality of shifter entries every clock cycle. Each of the plurality of shifter entries is associated with a clock cycle and each of the plurality of shifter entries comprises a read value that indicates an availability of the read port of the register file for a read operation in the clock cycle. The decode unit is coupled to the read shifter and is configured to decode and issue an instruction based on the read values included in the plurality of shifter entries of the read shifter. The plurality of functional units is coupled to the decode unit and the register file and is configured to execute the instruction issued by the decode unit and perform the read operation to the read port of the register file.
Type: Grant
Filed: April 7, 2020
Date of Patent: March 1, 2022
Assignee: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
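Illustration (not taken from the patent): a minimal Python sketch of a read shifter for a single read port, assuming one boolean per future clock cycle (True meaning the port is already reserved). The class and method names are hypothetical.

```python
from collections import deque

class ReadShifter:
    def __init__(self, depth):
        # entries[k] describes the read port k cycles from now.
        self.entries = deque([False] * depth)

    def is_available(self, cycles_ahead):
        return not self.entries[cycles_ahead]

    def reserve(self, cycles_ahead):
        # Decode reserves the port only if that future cycle is still free.
        if self.entries[cycles_ahead]:
            return False
        self.entries[cycles_ahead] = True
        return True

    def tick(self):
        # Each clock cycle the oldest entry shifts out and a free one shifts in.
        self.entries.popleft()
        self.entries.append(False)
```

A failed `reserve` is the point where a decode unit would hold the instruction back rather than issue it.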
-
Patent number: 11204770
Abstract: A microprocessor using a counter in a scoreboard is introduced to handle data dependency. The microprocessor includes a register file having a plurality of registers mapped to entries of the scoreboard. Each entry of the scoreboard has a counter that tracks the data dependency of each of the registers. The counter decrements for every clock cycle until the counter resets itself when it counts down to 0. With the implementation of the counter in the scoreboard, the instruction pipeline may be managed according to the number of clock cycles a previously issued instruction takes to access the register, which is recorded in the counter of the scoreboard.
Type: Grant
Filed: April 1, 2020
Date of Patent: December 21, 2021
Assignee: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
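Illustration (not taken from the patent): a minimal Python sketch of a scoreboard with one down-counter per register, assuming the counter is loaded with the issuing instruction's latency in cycles. The class and method names are hypothetical.

```python
class Scoreboard:
    def __init__(self, num_registers):
        # One counter per register; 0 means no pending access.
        self.counters = [0] * num_registers

    def issue(self, dest_reg, latency):
        # Record how many cycles until the issued instruction accesses dest_reg.
        self.counters[dest_reg] = latency

    def tick(self):
        # Every clock cycle each non-zero counter counts down toward 0.
        self.counters = [max(c - 1, 0) for c in self.counters]

    def cycles_until_ready(self, reg):
        return self.counters[reg]
```

A dependent instruction can read `cycles_until_ready` at decode time to learn how long to wait instead of polling a busy bit.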
-
Publication number: 20210389979
Abstract: A data processing system includes a priority scheduler and execution queue between an instruction decode unit and a functional unit. The priority scheduler determines whether a source operand data specified by an instruction issued by the instruction decode unit is ready or not. The priority scheduler prioritizes the decoding instruction having all of the source operand data ready over the ready instruction from the execution queue to send to the functional unit. The decoding instruction having a data dependency is placed into the execution queue.
Type: Application
Filed: June 15, 2020
Publication date: December 16, 2021
Applicant: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Publication number: 20210374070
Abstract: A microprocessor includes a translation look-aside buffer (TLB) having a plurality of TLB entries addressable by a branch address and having a branch target buffer (BTB), including a plurality of BTB entries addressable by the branch address. Each TLB entry includes a virtual address. Each BTB entry includes a branch tag-way data and a target tag-way data. To perform a branch prediction, the BTB and TLB are accessed, where the TLB way associative data representing one of N sets of TLB entries is used to determine BTB hit or BTB miss. On a BTB hit, the branch target address of the branch address may be obtained by accessing the TLB using target tag-way data in the BTB, or by using the branch page address when a same page bit in the hit BTB entry is set.
Type: Application
Filed: June 1, 2020
Publication date: December 2, 2021
Applicant: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Patent number: 11188478
Abstract: A microprocessor includes a translation look-aside buffer (TLB) having a plurality of TLB entries addressable by a branch address and having a branch target buffer (BTB), including a plurality of BTB entries addressable by the branch address. Each TLB entry includes a virtual address. Each BTB entry includes a branch tag-way data and a target tag-way data. To perform a branch prediction, the BTB and TLB are accessed, where the TLB way associative data representing one of N sets of TLB entries is used to determine BTB hit or BTB miss. On a BTB hit, the branch target address of the branch address may be obtained by accessing the TLB using target tag-way data in the BTB, or by using the branch page address when a same page bit in the hit BTB entry is set.
Type: Grant
Filed: June 1, 2020
Date of Patent: November 30, 2021
Assignee: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Patent number: 11188468
Abstract: A processor includes a prediction table, a prediction logic circuit, and a prediction verification circuit. The prediction table has a plurality of sets, each of the sets has a hot way number, at least one warm way number, and at least one confidence value corresponding to the at least one warm way number. The prediction logic circuit generates a prediction result by predicting if the at least one warm way number is an opened way. The prediction verification circuit generates a correct/incorrect information according to the prediction result, and generates an update information according to the correct/incorrect information. The prediction verification circuit updates the hot way number, the at least one warm way number and the at least one confidence value of the at least one warm way number according to the update information.
Type: Grant
Filed: June 15, 2020
Date of Patent: November 30, 2021
Assignee: ANDES TECHNOLOGY CORPORATION
Inventors: Kun-Ho Liu, Chieh-Jen Cheng, Chuan-Hua Chang, I-Cheng Kevin Chen
-
Patent number: 11163582
Abstract: In the disclosure, the microprocessor resolves the conflicts in the decode stage and schedules the instruction to be executed at a future time. The instruction is issued to an execution queue until the scheduled time in the future when it is dispatched to a functional unit for execution. The disclosure uses a counter for the functional unit to track when the resource is available in the future to accept the next instruction. The disclosure also tracks the future N cycles when the register file read and write ports are scheduled to read and write operand data.
Type: Grant
Filed: April 20, 2020
Date of Patent: November 2, 2021
Assignee: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Publication number: 20210326141
Abstract: In the disclosure, the microprocessor resolves the conflicts in the decode stage and schedules the instruction to be executed at a future time. The instruction is issued to an execution queue until the scheduled time in the future when it is dispatched to a functional unit for execution. The disclosure uses a counter for the functional unit to track when the resource is available in the future to accept the next instruction. The disclosure also tracks the future N cycles when the register file read and write ports are scheduled to read and write operand data.
Type: Application
Filed: April 20, 2020
Publication date: October 21, 2021
Applicant: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Publication number: 20210311741
Abstract: A processor that includes a register file, a read shifter, a decode unit and a plurality of functional units is introduced. The register file includes a read port. The read shifter includes a plurality of shifter entries and is configured to shift out a shifter entry among the plurality of shifter entries every clock cycle. Each of the plurality of shifter entries is associated with a clock cycle and each of the plurality of shifter entries comprises a read value that indicates an availability of the read port of the register file for a read operation in the clock cycle. The decode unit is coupled to the read shifter and is configured to decode and issue an instruction based on the read values included in the plurality of shifter entries of the read shifter. The plurality of functional units is coupled to the decode unit and the register file and is configured to execute the instruction issued by the decode unit and perform the read operation to the read port of the register file.
Type: Application
Filed: April 7, 2020
Publication date: October 7, 2021
Applicant: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Publication number: 20210311743
Abstract: A microprocessor using a counter in a scoreboard is introduced to handle data dependency. The microprocessor includes a register file having a plurality of registers mapped to entries of the scoreboard. Each entry of the scoreboard has a counter that tracks the data dependency of each of the registers. The counter decrements for every clock cycle until the counter resets itself when it counts down to 0. With the implementation of the counter in the scoreboard, the instruction pipeline may be managed according to the number of clock cycles a previously issued instruction takes to access the register, which is recorded in the counter of the scoreboard.
Type: Application
Filed: April 1, 2020
Publication date: October 7, 2021
Applicant: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran
-
Publication number: 20210303305
Abstract: A processor that includes a register file, a latency shifter, a decode unit and a plurality of functional units is introduced. The register file includes a write port. The latency shifter includes a plurality of shifter entries and shifts out a shifter entry among the shifter entries every clock cycle. Each of the shifter entries is associated with a clock cycle and each of the shifter entries includes a writeback value that indicates whether the write port of the register file is available for a writeback operation in the associated clock cycle. The decode unit is configured to decode an instruction and issue the instruction according to the writeback value of the latency shifter. The functional units are coupled to the decode unit and the register file and are configured to execute the instruction issued by the decode unit and perform the writeback operation to the write port of the register file.
Type: Application
Filed: March 31, 2020
Publication date: September 30, 2021
Applicant: ANDES TECHNOLOGY CORPORATION
Inventor: Thang Minh Tran