Taiwan patent TW201303736A: Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines

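Abstract:
A system for executing instructions using a plurality of memory fragments for a processor. The system includes a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks. The system further includes a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, and wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors. A plurality of memory fragments are coupled to the partitionable engines to provide data storage.
Publication No.: TW201303736A
Application No.: TW101110092
Filing date: 2012-03-23
Publication date: 2013-01-16
Inventor: Mohammad Abdallah
Applicant: Soft Machines Inc
Main IPC class: G06F9-00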

Description:
Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines
This application claims the benefit of co-pending, commonly assigned U.S. Provisional Patent Application No. 61/467,940, titled "MEMORY FRAGMENTS FOR SUPPORTING CODE BLOCK EXECUTION BY USING VIRTUAL CORES INSTANTIATED BY PARTITIONABLE ENGINES", filed by Mohammad A. Abdallah on March 25, 2011, which is incorporated herein in its entirety.
[Cross-Reference to Related Applications]
This application is related to co-pending, commonly assigned U.S. Patent Application Publication No. 2009/0113170, titled "APPARATUS AND METHOD FOR PROCESSING AN INSTRUCTION MATRIX SPECIFYING PARALLEL AND DEPENDENT OPERATIONS", filed by Mohammad A. Abdallah on April 12, 2007, which is incorporated herein in its entirety.
This application is related to co-pending, commonly assigned U.S. Patent Application Publication No. 2010/0161948, titled "APPARATUS AND METHOD FOR PROCESSING COMPLEX INSTRUCTION FORMATS IN A MULTITHREADED ARCHITECTURE SUPPORTING VARIOUS CONTEXT SWITCH MODES AND VIRTUALIZATION SCHEMES", filed by Mohammad A. Abdallah on November 14, 2007, which is incorporated herein in its entirety.
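The present invention relates generally to digital computer systems, and more particularly to a system and method for selecting the instructions that make up an instruction sequence.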
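Processors are required to handle multiple tasks that are either dependent on one another or totally independent. The internal state of such a processor usually consists of registers that may hold different values at each particular instant of program execution. At each instant of program execution, this internal state image is called the architecture state of the processor.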
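When code execution is switched to run another function (e.g., another thread, process or program), the state of the machine/processor has to be saved so that the new function can use the internal registers to build its new state. Once the new function terminates, its state can be discarded, and the state of the previous context is restored and execution resumes. Such a switch is called a context switch and usually takes tens or hundreds of cycles, especially with modern architectures that employ a large number of registers (e.g., 64, 128, 256) and/or out-of-order execution.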
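In thread-aware hardware architectures, it is common for the hardware to support multiple context states for a limited number of hardware-supported threads. In this case, the hardware duplicates all architecture state elements for each supported thread. This eliminates the need for a context switch when executing a new thread. However, this still has multiple drawbacks, namely the area, power and complexity of duplicating all architecture state elements (i.e., registers) for each additional thread supported in hardware. In addition, if the number of software threads exceeds the number of explicitly supported hardware threads, the context switch must still be performed.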
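This becomes common as parallelism is needed on a fine granularity basis that requires a large number of threads. Hardware thread-aware architectures with duplicated context-state hardware storage do not help non-threaded software code and only reduce the number of context switches for software that is threaded. However, those threads are usually constructed for coarse-grain parallelism and incur heavy software overhead for initiation and synchronization, leaving fine-grain parallelism, such as function calls and parallel loop execution, without efficient thread initiation/auto-generation. These overheads are compounded by the difficulty of automatically parallelizing such code with state-of-the-art compiler or user parallelization techniques when the software is not explicitly or easily parallelized/threaded.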
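In one embodiment, the present invention is implemented as a system for executing instructions using a plurality of memory fragments for a processor. The system includes a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks. The system further includes a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, and wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors. A plurality of memory fragments are coupled to the partitionable engines to provide data storage.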
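Other embodiments of the present invention utilize a common scheduler, a common register file and a common memory subsystem to implement fragmented address spaces for multiple partitionable engines of a processor. The partitionable engines can be used to implement a plurality of virtual cores. Fragmentation enables the scaling of microprocessor performance by allowing additional virtual cores to cooperatively execute instruction sequences. The fragmentation hierarchy is the same across each cache hierarchy (e.g., L1 cache, L2 cache, and the common register file). The fragmentation hierarchy can divide the address space into fragments using address bits, where the address bits are chosen so that the fragments lie above cache line boundaries and below page boundaries. Each fragment can be configured to utilize a multiport bank structure for storage.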
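The foregoing is a summary and thus necessarily contains simplifications, generalizations and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.
Although the present invention has been described in connection with one embodiment, the invention is not intended to be limited to the specific forms set forth herein. On the contrary, it is intended to cover such alternatives, modifications, and equivalents as can be reasonably included within the scope of the invention as defined by the appended claims.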
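In the following detailed description, numerous specific details such as specific method orders, structures, elements, and connections have been set forth. It is to be understood, however, that these and other specific details need not be utilized to practice embodiments of the present invention. In other circumstances, well-known structures, elements, or connections have been omitted, or have not been described in particular detail, in order to avoid unnecessarily obscuring this description.
References within this specification to "one embodiment" or "an embodiment" are intended to indicate that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. The appearances of the phrase "in one embodiment" in various places within the specification are not necessarily all referring to the same embodiment, nor are they separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for others.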
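Some portions of the detailed description which follows are presented in terms of procedures, steps, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. A procedure, computer-executed step, logic block, process, etc., is here, and generally, conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals of a computer-readable storage medium and are capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as "processing", "accessing", "writing", "storing", "replicating" or the like refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories and other computer-readable media into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.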
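Embodiments of the present invention utilize a common global front end scheduler, a plurality of segmented register files, and a memory subsystem to implement fragmented address spaces for multiple cores of a multicore processor. In one embodiment, fragmentation enables the scaling of microprocessor performance by allowing additional virtual cores (e.g., soft cores) to cooperatively execute instruction sequences comprising one or more threads. The fragmentation hierarchy is the same across each cache hierarchy (e.g., L1 cache, L2 cache, and the common register file). The fragmentation hierarchy can divide the address space into fragments using address bits, where the fragments are identified by bits that are above cache line boundaries and below page boundaries. Each fragment is configured to utilize a multiport bank structure for storage. Embodiments of the present invention are further described with reference to FIG. 1A and FIG. 1B below.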
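FIG. 1A shows an overview diagram of a processor in accordance with one embodiment of the present invention. As depicted in FIG. 1A, the processor includes a global front end fetch and scheduler 10 and a plurality of partitionable engines 11-14.
FIG. 1A shows an overview of the manner in which the global front end generates code blocks and inheritance vectors to support the execution of code sequences on their respective partitionable engines. Each of the code sequences 20-23 can belong to the same logical core/thread or to different logical cores/threads, depending upon the particular virtual core execution mode. The global front end fetch and scheduler processes the code sequences 20-23 to generate code blocks and inheritance vectors. These code blocks and inheritance vectors are allocated to the particular partitionable engines 11-14 as shown.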
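The partitionable engines implement virtual cores in accordance with a selected mode. A partitionable engine includes a segment, a fragment and a number of execution units. The resources within the partitionable engines can be used to implement virtual cores that have multiple modes. As provisioned by the virtual core mode, one soft core or many soft cores can be implemented to support one logical core/thread. In the FIG. 1A embodiment, depending on the selected mode, the virtual cores can support one logical core/thread or four logical cores/threads. In an embodiment where the virtual cores support four logical cores/threads, the resources of each virtual core are spread across each of the partitionable engines. In an embodiment where the virtual cores support one logical core/thread, the resources of all the engines are dedicated to that core/thread. The engines are partitioned such that each engine provides a subset of the resources that comprise each virtual core. In other words, a virtual core comprises a subset of the resources of each of the engines 11-14. Communication between the resources of each of the engines 11-14 is provided by a global interconnect structure 30 in order to facilitate this process. Alternatively, the engines 11-14 can be used to implement a physical mode in which the resources of the engines 11-14 are dedicated to supporting the execution of a dedicated core/thread. In this manner, the soft cores implemented by the engines comprise virtual cores whose resources are spread across each of the engines. The virtual core execution modes are further described with reference to the subsequent figures below.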
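The spreading of a virtual core across all engines can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each engine exposes a count of interchangeable resource units and that the units are split evenly; the function name, the unit counts and the even split are hypothetical and are not taken from the specification.

```python
def build_virtual_cores(engine_resources, logical_cores):
    """Return, per virtual core, the resource share it owns in each engine."""
    cores = []
    for _ in range(logical_cores):
        share = {}
        for engine, units in engine_resources.items():
            if units % logical_cores:
                raise ValueError("resources must divide evenly in this sketch")
            # Every virtual core takes a slice of every engine.
            share[engine] = units // logical_cores
        cores.append(share)
    return cores


# Engine ids 11-14 follow FIG. 1A; 8 units per engine is an assumed figure.
engines = {11: 8, 12: 8, 13: 8, 14: 8}
four_thread_mode = build_virtual_cores(engines, 4)
single_thread_mode = build_virtual_cores(engines, 1)
```

In the four-thread mode each virtual core holds a quarter of every engine, while in the single-thread mode one virtual core holds all resources of all engines.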
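It should be noted that in a conventional core implementation, the resources within one core/engine are allocated solely to one logical thread/core. In contrast, in embodiments of the present invention, the resources of any engine/core can be partitioned to instantiate, collectively with partitions of other engines/cores, a virtual core that is allocated to one logical thread/core. Additionally, embodiments of the present invention can implement multiple virtual execution modes in which those same engines can be partitioned to support many dedicated cores/threads, many dynamically allocated cores/threads, or an embodiment where all of the resources of all engines support the execution of a single core/thread. These embodiments are further described in the descriptions below.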
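FIG. 1B shows an overview diagram of partitionable engines and their components for a multicore processor in accordance with one embodiment of the present invention, including segmented schedulers and register files, global interconnects and a fragmented memory subsystem. As depicted in FIG. 1B, four fragments 101-104 are shown. The fragmentation hierarchy is the same across each cache hierarchy (e.g., L1 cache, L2 cache, and the load store buffer). Data can be exchanged between each of the L1 caches, each of the L2 caches and each of the load store buffers via the memory global interconnect 110a.
The memory global interconnect comprises a routing matrix that allows a plurality of cores (e.g., the address calculation and execution units 121-124) to access data that may be stored at any point in the fragmented cache hierarchy (e.g., L1 cache, load store buffer and L2 cache). FIG. 1B also depicts the manner in which each of the fragments 101-104 can be accessed by the address calculation and execution units 121-124 via the memory global interconnect 110a.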
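The execution global interconnect 110b similarly comprises a routing matrix that allows the plurality of cores (e.g., the address calculation and execution units 121-124) to access data that may be stored in any of the segmented register files. Thus, the cores have access to data stored in any of the fragments and to data stored in any of the segments via the memory global interconnect 110a or the execution global interconnect 110b. Additionally, it should be noted that in one embodiment, another global interconnect exists between each of the common partition fetch and schedulers. This is shown by the horizontal arrows connecting each common partition fetch and scheduler.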
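FIG. 1B further shows a global front end fetch and scheduler 150, which has a view of the entire machine and which manages the utilization of the register file segments and the fragmented memory subsystem. Address generation comprises the basis for fragment definition. The global front end fetch and scheduler functions by allocating instruction sequences to each segment's partition scheduler. The common partition scheduler then dispatches those instruction sequences for execution on the address calculation and execution units 121-124.
It should be noted that in one embodiment, the functionality of the common partition fetch and schedulers can be incorporated into the global front end scheduler 150. In such an embodiment, the segments would not include respective common partition fetch and schedulers, and there would be no need for an interconnect between them.
Additionally, it should be noted that the partitionable engines shown in FIG. 1A can be nested in a hierarchical way. In such an embodiment, a first-level partitionable engine would include a local front end fetch and scheduler and multiple secondary partitionable engines connected to it.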
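FIG. 2 shows a scheduler flow diagram in accordance with one embodiment of the present invention. As depicted in FIG. 2, a bucket buffer is shown that includes speculative thread bucket pointers, bucket sources and destination lists. The scheduler and execution buckets include a bucket dispatch selector and the virtual register match and read, including the possibility of a register hierarchy and a register cache. The back end is where executed buckets are logged and exception ordering is enforced before retirement. The register hierarchy/cache also serves as intermediate storage for the executed bucket results until they are non-speculative and can update the architecture state. The following discloses one possible implementation of the front end, the dispatch stage and the back end where executed buckets are logged.
FIG. 2 shows the manner in which the concept scales from a bucket buffer managing a small number of closely coupled threads to hardware circuits that manage multiple bucket buffers and threads. Those circuits that can be expanded to process larger numbers of threads that might have less close interaction are described as a global front end (e.g., the global front end scheduler 150 shown in FIG. 1B).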
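The process starts by fetching a new thread matrix/bucket/block, and then the new thread bucket is assigned to a vacant bucket slot in the bucket buffer. Each of the thread allocation pointers in the thread allocation pointer array 852 defines an interval of buckets into which the thread is physically allowed to place its blocks/buckets of instructions. Each of those threads keeps allocating buckets into the bucket buffer array inside its corresponding interval of contiguous space in round-robin fashion. The buckets/blocks inside each thread space are assigned a new number 852 that is incremented each time a new bucket/block is assigned. Each valid source in the bucket 850 has a valid read bit "Rv" indicating that this source is needed by the instructions inside this bucket. In the same manner, each destination register that is to be written back by instructions in this bucket has a valid bit "Wv" set in the bucket and has a field in a destination inheritance vector 853. When a new bucket is to be fetched into the bucket buffer, it inherits the destination inheritance vector from the previously allocated bucket pointed at by the thread bucket allocation pointer 852. The inheritance vector is copied from the previously allocated bucket, and then the valid destination fields that correspond to the registers to be updated by the bucket's instructions are overwritten. The valid destinations are labeled with the current bucket number, while the invalid destinations are copied from the corresponding inheritance vector inside the bucket. Then the thread bucket pointer is updated for the newly fetched bucket by incrementing its pointer (which wraps around within its interval).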
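The copy-then-overwrite step for the destination inheritance vector can be sketched in Python as follows. This is only a simplified model under stated assumptions: the register count, the `Bucket` class and the `allocate_bucket` helper are hypothetical names, and `None` stands for "no producer recorded yet".

```python
from dataclasses import dataclass, field

NUM_ARCH_REGS = 8  # assumed register count, for illustration only


@dataclass
class Bucket:
    number: int                 # bucket number assigned at allocation
    dest_regs: frozenset        # registers whose valid write bit "Wv" is set
    inheritance: list = field(default_factory=list)


def allocate_bucket(number, dest_regs, previous_vector):
    """Copy the previous vector, then overwrite the fields this bucket writes."""
    vector = list(previous_vector)
    for reg in dest_regs:
        vector[reg] = number    # valid destinations carry the current bucket number
    return Bucket(number, frozenset(dest_regs), vector)


initial = [None] * NUM_ARCH_REGS            # no producer recorded yet
b0 = allocate_bucket(0, {1, 2}, initial)
b1 = allocate_bucket(1, {2, 5}, b0.inheritance)
```

After the second allocation, register 1 still names bucket 0 as its latest producer, while registers 2 and 5 name bucket 1; a later bucket reading register 1 therefore depends on bucket 0.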
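In the bucket dispatch and execute stage, whenever a bucket is executed without any exception handling, the bucket execution flag (containing the bucket number) 854 is set and broadcast throughout the bucket buffer, and is latched/monitored within each bucket that has a source with that bucket number as its producer. It is also possible to pass other related information along with the bucket number, such as information about virtual register locations. When all the execution flags of the source buckets are set within a bucket, the bucket ready bit 855 is set and the bucket is ready to be dispatched and executed. When the bucket executes without any exception and is ready to update the architecture state in the sequential order of the program, the bucket is retired and the retirement thread pointer 857 is incremented to the next bucket in the array. The retired bucket location can be assigned to a new bucket.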
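The broadcast-and-latch wakeup described above can be modeled with a short Python sketch. The class and function names are hypothetical, and the model tracks only which producer buckets a waiting bucket is still missing; exceptions and retirement are left out.

```python
class WaitingBucket:
    def __init__(self, number, source_buckets):
        self.number = number
        self.pending = set(source_buckets)  # producers not yet executed

    @property
    def ready(self):
        # Models the bucket ready bit: set once every source flag has arrived.
        return not self.pending


def broadcast_executed(bucket_number, buffer):
    """Broadcast an executed bucket number; every waiting bucket latches it."""
    for bucket in buffer:
        bucket.pending.discard(bucket_number)
    return [b.number for b in buffer if b.ready]


buffer = [WaitingBucket(2, {0, 1}), WaitingBucket(3, {1})]
after_first = broadcast_executed(1, buffer)   # bucket 3 becomes ready
after_second = broadcast_executed(0, buffer)  # bucket 2 becomes ready as well
```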
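The closely related threads can all coexist inside the matrix/bucket/block buffer; each thread occupies an interval of consecutive buckets that belongs to that thread. The allocation pointer of that thread moves inside this interval of buckets in round-robin fashion, fetching new instruction buckets and allocating them inside the thread interval in the described round-robin fashion. With such interval sectioning, the whole bucket buffer is divided dynamically into intervals of different or equal lengths.
The concept of the inheritance vector introduced here applies to the instruction bucket as well as to the thread. Each instruction matrix/block/bucket writes into particular registers among the architectural registers. Each new bucket, at the allocation stage, updates this inheritance vector by writing its own thread and bucket number into the vector, leaving the fields for the registers that it does not write unchanged. This bucket inheritance vector B_iv 856 is forwarded from each bucket to the next one in program order. In FIG. 2, each matrix writes its own number into the architecture destination registers if the instructions in that matrix write into those registers; otherwise it inherits the value from the B_iv of the previous bucket in that thread.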
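FIG. 3 shows a diagram of exemplary hardware circuits in accordance with one embodiment of the present invention, depicting a segmented register file that stores operands and results, together with an interconnect. FIG. 3 shows an operand result buffer coupled via the execution global interconnect to a plurality of execution units.
FIG. 4 shows a diagram of a global front end scheduler in accordance with one embodiment of the present invention. The global front end scheduler is configured to process larger numbers of threads that might have less close interaction (e.g., the global front end in the scheduler 150 shown in FIG. 1B). This diagram shows how a sequence of instructions from one logical core is distributed across many virtual cores. This process would be repeated for each logical core present in the machine. It should be noted that the "engine" of FIG. 4 comprises the components of a virtual core, where the register file is explicitly depicted to show aspects of inter-virtual-core communication at the register file level.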
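For example, as depicted in FIG. 4, the global front end scheduler can process a thread header 902 without needing the actual instructions within the thread in order to enforce the dependency check across those distant threads. The header of the thread and the sub-headers of its buckets contain only information about the architecture registers that those threads and buckets write into (the destination registers of those instructions); there is no need to include in those headers the actual instructions or the sources of those instructions. In fact, it is enough to list those destination registers, or to provide a bit vector in which an individual bit is set for each register that is a destination of an instruction. The header does not need to be physically placed as a header of the instructions; it can be any formatted packet or compact representation of the destination registers of the instructions within the threads, which may or may not be stored with the rest of the instruction information.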
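This global front end fetches only the headers of the threads/blocks in program order and generates dynamic thread and/or bucket inheritance vectors 901 (Tiv and/or Biv). Each time a new thread is allocated, those inheritance vectors are forwarded by keeping the old fields that the current thread bucket will not write to or update, as shown by 903. Those inheritance vectors are distributed to a large number of engines/cores or processors 904, each of which might include a local front end and a fetch unit (which fetches and stores the actual instructions and produces the dependency vector for each bucket) and a local matrix/block/bucket buffer with local register files 905. The local front ends then fetch the actual instructions and use the information from the inheritance vectors obtained from the global front end to fill in the dependency information for the instruction sources of the instructions that are brought into those engines for execution. FIG. 4 illustrates a global front end implementation and the way it disseminates the inheritance vectors to the different engines 904 using only concise information about the instructions (e.g., just the registers that those instructions write into). Other information that is helpful to place in the header is information about changes in the control path within or across the threads. A global branch predictor can be used to predict the flow of control across those threads, so such headers can include the branch destinations and offsets. In addition to the branch predictor determining control flow, the hardware/compiler can decide to dispatch independent threads across the two control paths of a branch. In that case, the execution of those two paths is later merged using the inheritance vector. FIG. 4 also shows the forwarding process when a header of a new thread is fetched by the global front end; for example, thread 2 (906) updates the corresponding inheritance vector 901 that is forwarded to it, resulting in vector 910 in which registers 1, 2, 3, 4, 6, 0 and 7 are updated with T2 labels. Note that in 910, register 5 was not written by the T2 buckets and thus its label was inherited from a previous inheritance vector.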
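The thread 2 forwarding example can be reproduced with a minimal Python sketch that works from header information alone (the set of destination registers). The function name is hypothetical, and the incoming vector is assumed, for illustration only, to carry "T1" labels in every field; the specification does not state what the previous labels were.

```python
def forward_inheritance(previous, thread_tag, header_dest_regs):
    """Relabel the registers named in the header; keep every other field."""
    return [thread_tag if reg in header_dest_regs else old
            for reg, old in enumerate(previous)]


previous_vector = ["T1"] * 8                 # assumed incoming labels
thread2_header = {1, 2, 3, 4, 6, 0, 7}       # destination registers of thread 2
vector_910 = forward_inheritance(previous_vector, "T2", thread2_header)
```

Only register 5 keeps its inherited label, because the thread 2 header does not list it as a destination.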
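One interesting observation is that the register files allow cross communication among the cores/engines. An early request (to reduce the access latency) for the registers that are needed across engines can be placed as soon as the instruction buckets of the thread are fetched and allocated in the local bucket buffer, at which time the source dependency information is populated, so that cross-engine thread references can be issued possibly long before the actual instructions are dispatched for execution. In any case, the instruction will not be dispatched until the cross-referenced source has been forwarded and has arrived. This cross-referenced source can be stored in the local multithreaded register file or register cache. Alternatively, this cross-referenced source can be stored in a buffer similar to the load store buffer (it can reuse the load store buffer physical storage and dependency check mechanisms, but as a register load instead of a memory load). Many topologies can be used to connect the register files across the engines/cores, such as a ring topology, a crossbar topology or a mesh routed interconnect.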
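The following discussion illustrates how register file segmentation can be used inside an engine and also across engines. When the bucket is dispatched, its sources are sent (simultaneously or sequentially) to both the register file and the register cache. If the register file is physically unified and has direct support for threading, then the operand is read directly from the corresponding thread register section. If the register file is a virtual register, including a physically segmented register file that uses tags, then a tag match has to be done as part of the virtual register read. If the tag matches, then the read happens from the segmented register file.
Disclosed is a register architecture that supports software threads, hardware-generated threads, VLIW execution, SIMD and MIMD execution, as well as emulation of out-of-order superscalar execution. Although it is physically segmented, it looks like a unified architecture resource. This segmented register is part of the virtual register file, which might include a register hierarchy and a register cache as well as mechanisms to store and check register tags. The tag access can be eliminated by using a location-based scheme that takes advantage of the dependency inheritance vector. The scheme works such that when the executed bucket number is broadcast during the dispatch stage, all the sources of subsequent instructions perform a CAM (content addressable match) that compares their source buckets with the just dispatched/executed bucket in order to set the ready flag for that source. Here the physical location where that bucket executed can also be propagated along with the register number so that any ambiguity is resolved.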
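For example, consider an implementation with four register file segments, each containing 16 registers. Upon dispatching a bucket #x to segment 2, the bucket number x is broadcast to the bucket buffer, and segment #2 is broadcast with it, so that all sources that have a dependency on bucket x record that it wrote all its registers in segment 2. When the time comes to dispatch those instructions, they know that they need to read their register from segment 2 and not from any other segment, even if the same register number exists in the other segments. This also applies to the register cache, to avoid using tags. This concept can be extended to the global front end, where, in addition to the thread information, the inheritance vector can specify in which engine the instruction bucket writing to this register was allocated.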
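The four-segment example can be sketched in Python as below. The `Source` class, the helper names and the stored values are hypothetical; the sketch only shows how a source that matches the broadcast bucket number records the segment and later reads from that segment without a tag lookup.

```python
NUM_SEGMENTS, REGS_PER_SEGMENT = 4, 16
register_file = [[0] * REGS_PER_SEGMENT for _ in range(NUM_SEGMENTS)]


class Source:
    def __init__(self, reg, producer_bucket):
        self.reg = reg
        self.producer_bucket = producer_bucket
        self.segment = None     # filled in when the producer is broadcast
        self.ready = False


def broadcast_dispatch(bucket_number, segment, sources):
    """Broadcast (bucket number, segment); matching sources record the location."""
    for src in sources:
        if src.producer_bucket == bucket_number:   # the CAM-style comparison
            src.segment = segment
            src.ready = True


def read_source(src):
    # The recorded segment selects the register file; no tag match is needed.
    return register_file[src.segment][src.reg]


register_file[2][7] = 42    # value written by bucket 5 in segment 2 (made up)
register_file[0][7] = 99    # same register number in another segment
src = Source(reg=7, producer_bucket=5)
broadcast_dispatch(5, 2, [src])
value = read_source(src)
```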
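FIG. 5 shows an alternative implementation of the distribution of instructions across many virtual cores in accordance with one embodiment of the present invention. FIG. 5 shows a runtime optimizer scheduler 550 that functions by distributing inheritance vector encoded segments to the virtual cores. In one embodiment, the optimizer looks at a number of code blocks of instructions and reschedules instructions across all of the code blocks to create code segments and inheritance vectors. The goal of the optimizer is to maximize the execution efficiency of the overlapped execution of the code segments on their respective virtual cores.
FIG. 6 shows a plurality of register segments with a corresponding plurality of register files and operand result buffers in accordance with one embodiment of the present invention. As depicted in FIG. 6, the execution global interconnect connects each register segment to a plurality of address calculation and execution units.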
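The register segments of FIG. 6 can be used to implement one of three execution modes: the matrices are grouped together by the compiler/programmer to form a MIMD super instruction matrix; each matrix is executed independently in a threaded mode in which separate threads execute simultaneously on each of the four hardware sections; or, as the last possible mode, four different instruction matrices from a single thread are executed dynamically, using a hardware dependency check to ensure that no dependency exists between the different matrices that execute simultaneously on the four different hardware sections.
The register files in FIG. 6 may be alternately configured depending upon the execution mode. In one mode, the register files are viewed either as a MIMD sectioned register file serving a MIMD width of four sections, or as four individual register files, each serving a separate thread. The register files can also support a dynamic execution mode in which the four sections are one unified register file, where data written to any register in a particular section is accessible by all units in the other sections. Switching between those modes can be seamless, as different execution modes can alternate between individual thread baseline instruction matrices and MIMD super instruction matrix threads.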
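In a multithread execution mode, each register file and its execution unit that executes a thread is totally independent of the other register files and their threads. This is similar to each thread having its own register state. However, dependencies between those threads can be specified. Each matrix that belongs to a thread executes in the execution unit of that thread's register file. If only one thread or a non-threaded single program is executed on the hardware, then the following method is used to allow parallel matrices belonging to that single thread/program to access the results written into the registers in the other sections. This is done by allowing any matrix that writes results into any one of the four register file sections to generate copies of those registers in the other register file sections. Physically, this is done by extending the write ports of each section into the remaining sections. However, this does not scale, since an efficient register file cannot be built with each memory cell having as many as four times the write ports needed for one section alone. A mechanism is presented whereby the register file is built such that it is not impacted by such a single-thread register-broadcast extension.
It should be noted that additional aspects of the register segments used in embodiments of the present invention can be found in U.S. Patent Application Publication No. 2010/0161948, titled "APPARATUS AND METHOD FOR PROCESSING COMPLEX INSTRUCTION FORMATS IN A MULTITHREADED ARCHITECTURE SUPPORTING VARIOUS CONTEXT SWITCH MODES AND VIRTUALIZATION SCHEMES", filed by Mohammad A. Abdallah on November 14, 2007.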
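FIG. 7 shows a more detailed diagram of a fragmented memory subsystem for a multicore processor in accordance with one embodiment of the present invention. FIG. 7 shows a comprehensive scheme and implementation of the synchronization scheme among threads and/or among loads and stores in general. The scheme describes a preferred method for synchronization and disambiguation of memory references across load/store architectures and/or across memory references and/or the memory accesses of threads. FIG. 7 shows multiple segments of register files (address and/or data registers), execution units, address calculation units, and fragments of level 1 caches and/or load store buffers and level 2 caches, together with address register interconnects 1200 and address calculation unit interconnects 1201. Those fragmented elements can be constructed within one core/processor by fragmenting and distributing its centralized resources into several engines, or they can be constructed from elements of different cores/processors in multicore/multiprocessor configurations. One of those fragments 1211 is shown in the figure as fragment number 1; the fragments can be scaled to a large number (in general, to N fragments as shown in the figure).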
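This mechanism also serves as a coherency scheme for the memory architecture among those engines/cores/processors. The scheme starts with an address request from one of the address calculation units in one fragment/core/processor. For example, assume the address is requested by fragment 1 (1211). It can obtain and calculate its address using address registers that belong to its own fragment and/or registers across other fragments using the address interconnect bus 1200. After calculating the address, it creates the reference address, either a 32-bit or a 64-bit address, that is used to access caches and memory. This address is usually divided into a tag field and a set and line field. This particular fragment/engine/core stores the address into its load store buffer and/or L1 and/or L2 address arrays 1202, and at the same time it creates a compressed version of the tag (with a smaller number of bits than the original tag field of the address) by using a compression technique.
Furthermore, the different fragments/engines/cores/processors use the set field or a subset of the set field as an index to identify which fragment/core/processor the address is maintained in. This indexing of the fragments by the address set field bits ensures exclusiveness of ownership of the address in a particular fragment/core/engine, even though the memory data that corresponds to that address can live in one or more other fragments/engines/cores/processors. Even though the address CAM/tag arrays 1202/1206 are shown in each fragment as coupled with the data arrays 1207, they might be coupled only through physical proximity of placement and layout, or even only by the fact that both belong to a particular engine/core/processor; there is no relation between the addresses kept in the address arrays and the data in the data arrays inside one fragment.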
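FIG. 8 shows a diagram depicting how bits of an address can be used by address generation to enumerate fragments in accordance with one embodiment of the present invention. In the present embodiment, fragments are defined by the address bits that are above cache line boundaries and below page boundaries, as depicted in FIG. 8 and consistent with the fragmentation hierarchy described above. Staying below the page boundary avoids causing TLB misses during the translation from virtual addresses to physical addresses. Staying above the cache line boundary keeps complete cache lines together so that they fit correctly within the hardware cache hierarchy. For example, in a system that employs 64-byte cache lines, the fragment boundary would avoid the last six address bits. In comparison, in a system that employs 32-byte cache lines, the fragment boundary would avoid the last five bits. Once defined, the fragmentation hierarchy is the same across all cache hierarchies of the processor.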
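The bit selection in the 64-byte and 32-byte examples can be illustrated with a small Python sketch. Several points are assumptions made for illustration and are not stated in the specification: the fragment bits are taken immediately above the cache line offset, sizes are powers of two, and the page size is 4 KiB. The function name is hypothetical.

```python
def fragment_index(address, line_bytes=64, num_fragments=4, page_bytes=4096):
    """Pick the fragment from the address bits just above the line offset."""
    line_bits = line_bytes.bit_length() - 1      # 64-byte lines skip 6 bits
    frag_bits = num_fragments.bit_length() - 1   # 4 fragments need 2 bits
    page_bits = page_bytes.bit_length() - 1      # 4 KiB pages give 12 offset bits
    if line_bits + frag_bits > page_bits:
        raise ValueError("fragment bits would cross the page boundary")
    return (address >> line_bits) & (num_fragments - 1)


same_line = (fragment_index(0x40), fragment_index(0x7F))  # one 64-byte line
next_line = fragment_index(0x80)
```

Because the low six bits are skipped, every byte of a 64-byte cache line maps to the same fragment, and consecutive lines rotate across the fragments.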
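FIG. 9 shows a diagram depicting how loads and stores are handled by embodiments of the present invention. As depicted in FIG. 9, each fragment is associated with its own load store buffer and store retirement buffer. For any given fragment, loads and stores that designate an address range associated with that fragment or another fragment are sent to that fragment's load store buffer for processing. It should be noted that they may arrive out of order, since the cores execute instructions out of order. Within each core, the core has access not only to its own register file but also to the register files of each of the other cores.
Embodiments of the present invention implement a distributed load store ordering system. The system is distributed across multiple fragments. Within a fragment, local data dependency checking is performed by that fragment. This is because the fragment loads and stores only within the store retirement buffer of that particular fragment. This limits the need to look to other fragments in order to maintain data coherency. In this manner, data dependencies within a fragment are enforced locally.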
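With respect to data consistency, the store dispatch gate enforces store retirement in accordance with strict in-program-order memory consistency rules. Stores arrive at the load store buffers out of order. Loads also arrive at the load store buffers out of order. Concurrently, the out-of-order loads and stores are forwarded to the store retirement buffers for processing. It should be noted that although stores are retired in order within a given fragment, they can be out of order across the multiple fragments as they arrive at the store dispatch gate. The store dispatch gate enforces a policy that ensures that even though stores may reside out of order across the store retirement buffers, and even though the buffers may forward stores to the store dispatch gate out of order with respect to the stores of other buffers, the dispatch gate forwards them to fragment memory strictly in order. This is because the store dispatch gate has a global view of stores retiring and only allows stores to leave to the globally visible side of memory in order across all the fragments (i.e., globally). In this manner, the store dispatch gate functions as a global observer that makes sure stores ultimately return to memory in order across all fragments.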
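The gate policy can be modeled with a minimal Python sketch. It assumes, for illustration only, that every store carries a global program-order sequence number starting at 0 and that each store retirement buffer is already in order internally; the tuple layout and function name are hypothetical.

```python
from collections import deque


def drain_in_program_order(retirement_buffers):
    """Release stores to memory strictly by global sequence number.

    Each buffer is a deque of (seq, address, value) tuples that is in order
    within its own fragment but not with respect to the other buffers.
    """
    next_seq = 0
    released = []
    progress = True
    while progress:
        progress = False
        for buf in retirement_buffers:
            # Only the store that is next in program order may leave the gate.
            if buf and buf[0][0] == next_seq:
                released.append(buf.popleft())
                next_seq += 1
                progress = True
    return released


fragment_a = deque([(0, 0x100, 1), (3, 0x140, 4)])
fragment_b = deque([(1, 0x200, 2), (2, 0x240, 3)])
order = [seq for seq, _, _ in drain_in_program_order([fragment_a, fragment_b])]
```

Store 3 sits at the head of the first buffer while stores 1 and 2 are still pending in the second, yet the gate holds it back until its turn.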
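FIG. 10 shows the manner in which fragments can be split into two or more domains in accordance with one embodiment of the present invention. FIG. 10 shows the manner in which a single fragment can be split into multiple domains. Domain splitting can be implemented via the address generation process. Domain splitting changes the manner in which load store checks have to be done within a fragment, since in this case they only have to be done per domain, as opposed to across the entire fragment. Domain splitting is also advantageous in that it can enable single-ported memory to behave like multiport memory, where the single port is accessed for a different domain.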
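FIG. 11 shows a mode of operation of the processor in accordance with one embodiment of the present invention, in which the hardware resources of the partitionable engines are used to function like logical cores in executing applications. In this embodiment, the hardware resources of the engines of the virtual cores are configured as physical cores. In the FIG. 11 mode, each physical core is configured to function as a logical core. Multithreaded applications and multithreaded functionality depend on the threaded programmability of the application's software.
FIG. 12 shows a mode of operation of the processor in accordance with one embodiment of the present invention, in which soft cores are used to function like logical cores in executing applications. In this embodiment, the partitionable engines of the virtual cores support a plurality of soft cores. In the FIG. 12 mode, each soft core is configured to function as a logical core. Multithreaded applications and multithreaded functionality depend on the threaded programmability of the application's software.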
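FIG. 13 shows a mode of operation of the processor in accordance with one embodiment of the present invention, in which the soft cores are used to function like a single logical core in executing applications. In the FIG. 13 mode, the soft cores are configured to function together as a single logical core. In such an implementation, a single-threaded application has its instruction sequence divided up and allocated among the virtual cores, where the pieces are cooperatively executed to achieve high single-threaded performance. In this manner, single-threaded performance can scale with the addition of soft cores.
A number of strategies can be used in choosing the operating mode of the processor. For a processor having a large number of engines (e.g., 8 engines, 12 engines, etc.), a number of soft cores can be configured to function as a single logical core while the remaining cores operate in the other modes. This attribute allows an intelligent partitioning of resources to ensure maximum utilization of the hardware and/or minimal wasted power consumption. For example, in one embodiment, cores (e.g., soft or logical) can be allocated on a per-thread basis depending upon the type of application that is executing.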
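FIG. 14 shows an exemplary implementation of fragment segmentation used to support logical core and virtual core functionality in accordance with one embodiment of the present invention. As described above, the fragment segmentation allows the processor to be configured to support the different virtual core execution modes.
The global interconnect allows the cores' threads to access any of the ports 1401. It should be noted that the term "thread" as used herein refers to either a representation of instruction sequences from different logical cores, instruction sequences from the same logical core, or some mixture of the two.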
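The manner in which the threads utilize one of the ports 1401 to access the load store buffer is adjustable in accordance with the policies of the arbiters, as shown. Thus, a thread using any one of the ports 1401 can have a greater or lesser amount of access to the load store buffer via the ports 1402. The size of the allocation and the manner in which the allocation is managed are controlled by the arbiter. The arbiter can dynamically allocate access to the ports in accordance with the demands of a particular thread.
The load store buffer is configured to have a plurality of entries spread across the ports. Access to the load store buffer is controlled by the arbiter. In this manner, the arbiter can dynamically allocate entries in the load store buffer to the different threads.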
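FIG. 14 also shows arbiters on the ports between the load store buffer and the L1 cache. Thus, as with the load store buffer described above, a thread using any one of the ports 1403 can have a greater or lesser amount of access to the L1 cache via the ports 1404. The size of the allocation and the manner in which the allocation is managed are controlled by the arbiter. The arbiter can dynamically allocate access to the ports in accordance with the demands of a particular thread.
The L1 cache is configured to have a plurality of ways spread across the ports. Access to the L1 cache is controlled by the arbiter. In this manner, the arbiter can dynamically allocate entries in the L1 cache to the different threads.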
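In one embodiment, the arbiters are configured to function with a plurality of counters 1460 that provide tracking functionality and a plurality of threshold limit registers 1450 that provide a limiting function. The limiting function specifies the maximum resource allocation percentage for a given thread. The tracking function tracks the actual resources allocated to a given thread at any given time. These tracking and limiting functions affect the allocation of the number of per-thread entries, ways or ports for the load store buffer, the L1 cache, the L2 cache or the global interconnects. For example, the total number of entries in the load store buffer allocated to each thread can be dynamically checked against a variable threshold. This variable threshold can be updated in accordance with a given thread's forward progress. For example, in one embodiment, threads that are slowed down (e.g., by a large number of L2 misses) are quantified as making slow forward progress, and thus their respective resource allocation thresholds are lowered, including the entries thresholds, the ways thresholds and the ports thresholds.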
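The interplay of counters and threshold limit registers can be sketched in Python as follows. This is a simplified model under stated assumptions: the class and method names are hypothetical, limits are expressed as fractions of the total entries, and the halving policy in `penalize` is an illustrative choice rather than anything specified in the text.

```python
class Arbiter:
    def __init__(self, total_entries, limits):
        self.total = total_entries
        self.limits = dict(limits)              # thread -> maximum fraction (limit registers)
        self.counters = {t: 0 for t in limits}  # thread -> entries held (counters)

    def request(self, thread):
        """Grant one entry if the thread is under its limit and space remains."""
        used = sum(self.counters.values())
        cap = int(self.limits[thread] * self.total)
        if used < self.total and self.counters[thread] < cap:
            self.counters[thread] += 1
            return True
        return False

    def release(self, thread):
        if self.counters[thread] > 0:
            self.counters[thread] -= 1

    def penalize(self, thread, factor=0.5):
        # Lower the threshold of a thread that makes slow forward progress.
        self.limits[thread] *= factor


arb = Arbiter(total_entries=8, limits={"A": 0.5, "B": 0.5})
grants_a = [arb.request("A") for _ in range(5)]   # the fifth request exceeds the cap
arb.penalize("B")                                 # B's cap drops from 4 entries to 2
grants_b = [arb.request("B") for _ in range(3)]
```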
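FIG. 14 also shows a shared L2 cache. In the present embodiment, the shared L2 cache has a fixed port arrangement without any arbitration between accesses coming from the L1 cache. Threads executing on the processor all share access to the L2 cache and the resources of the L2 cache.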
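FIG. 15 shows fragmented memory of an exemplary four-fragment processor implementing a many physical cores to many logical cores mode in accordance with one embodiment of the present invention.
One example logical core and its relationship with the resources of the processor is shown by the shading in FIG. 15. In the FIG. 11 mode of operation, the many physical cores to many logical cores mode, in which the physical cores are used to function like logical cores in executing applications, each logical core is configured to have a fixed ratio of the resources of the load store buffer and the L1 cache. The ports can be specifically assigned to each thread or core. Entries in the load store buffer can be specifically reserved per thread or core. Ways within the L1 cache can be specifically reserved per thread or core. Multithreaded applications and multithreaded functionality depend on the threaded programmability of the application's software. This is shown by the one logical core having an allocated port and an allocated portion of the load store buffer and the L1 cache of each of the fragments. In this manner, the logical core comprises a fixed allocated slice of the resources of each fragment.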
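In one embodiment, in the many physical cores to many logical cores mode, the four fragments can be partitioned in accordance with the number of ports (e.g., ports 1401) that access each fragment. For example, in an embodiment where there are six ports per fragment, the resources of each fragment, and hence the resources of each partitionable engine, can be divided in such a way as to support six physical cores across the four fragments and the four partitionable engines. Each partition can be allocated its own port. Similarly, the resources of the load store buffer and the L1 cache would be allocated in such a way as to support six physical cores. For example, in an embodiment where the load store buffer has 48 entries, the 48 entries can be allocated such that there are 12 entries per physical core to support a mode in which four physical cores are implemented, or they can be allocated such that there are eight entries per physical core in a case where six physical cores are implemented.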
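The entry arithmetic in this example is a fixed, even division, which the following minimal Python sketch checks. The function name is hypothetical, and rejecting uneven splits is an assumption of the sketch rather than a rule from the specification.

```python
def entries_per_core(total_entries, physical_cores):
    """Evenly divide load store buffer entries among physical cores."""
    if total_entries % physical_cores:
        raise ValueError("entries do not divide evenly among the cores")
    return total_entries // physical_cores


four_core_share = entries_per_core(48, 4)   # four physical cores
six_core_share = entries_per_core(48, 6)    # six physical cores
```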
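FIG. 16 shows fragmented memory of an exemplary four-fragment processor implementing a many physical cores to many logical cores mode in accordance with an alternative embodiment of the present invention.
As with FIG. 15, one example logical core and its relationship with the resources of the processor is shown by the shading in FIG. 16. In the FIG. 11 mode of operation, the many physical cores to many logical cores mode, an entire partitionable engine is dedicated to supporting the execution of a single logical core. This is shown by the shading in FIG. 16. The physical resource engine is used to function like a logical core in executing applications.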
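FIG. 17 shows fragmented memory of an exemplary four-fragment processor implementing a many soft cores to many logical cores mode in accordance with one embodiment of the present invention.
One example logical core and its relationship with the resources of the processor is shown by the shading in FIG. 17. In the FIG. 12 mode of operation, the many soft cores to many logical cores mode, in which virtual cores are used to function like logical cores in executing applications, the size of the allocation of the resources of the load store buffer and the manner in which the allocation is managed are controlled by the arbiter. The arbiter can dynamically allocate access to the ports in accordance with the demands of a particular thread or core. Similarly, the size of the allocation of the resources of the L1 cache and the manner in which the allocation is managed are controlled by the arbiter. The arbiter can dynamically allocate access to the ports in accordance with the demands of a particular thread or core. Thus, at any given instance, the logical thread/core (e.g., the shaded portion) can use different arbiters and different ports.
In this manner, access to the resources of the load store buffer and access to the resources of the L1 cache can be more policy driven and can be based more on the needs of the individual threads or cores as they make forward progress. This is shown by the one logical core having a dynamically allocated port and a dynamically allocated portion of the load store buffer and the L1 cache of each of the fragments. In this manner, the logical core comprises a non-fixed, dynamically allocated slice of the resources of each fragment.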
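FIG. 18 shows fragmented memory of an exemplary four-fragment processor implementing a many soft cores to one logical core mode in accordance with one embodiment of the present invention.
In the FIG. 13 mode of operation, the many soft cores to one logical core mode, in which the soft cores are used to function like a single logical core in executing applications, each of the soft cores is configured to function cooperatively with the other soft cores as a single logical core. A single thread or core has all the resources of the load store buffers and all the resources of the L1 caches. In such an implementation, a single-threaded application has its instruction sequence divided up and allocated among the soft cores, where the pieces are cooperatively executed to achieve high single-threaded performance. In this manner, single-threaded performance can scale with the addition of soft cores. This is shown in FIG. 18, where the one example logical core and its relationship with the resources of the processor is shown by the shading of all of the resources of the processor.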
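FIG. 19 shows the address calculation and execution units, operand/result buffers, threaded register files and common partition schedulers of an exemplary four-fragment processor implementing a many physical cores to many logical cores mode in accordance with one embodiment of the present invention.
One example logical core and its relationship with the resources of the processor is shown by the shading in FIG. 19. In the FIG. 11 mode of operation, the many physical cores to many logical cores mode, in which the physical cores are used to function like logical cores in executing applications, each logical core is configured to have a fixed ratio of the resources of the address calculation units, operand/result buffers, threaded register files and common partition schedulers. Multithreaded applications and multithreaded functionality depend on the threaded programmability of the application's software. This is shown by the one logical core having an allocated address calculation and execution unit, an allocated threaded register file and an allocated common partition scheduler. In this manner, the logical core comprises a fixed allocated segment. However, in one embodiment, in this mode of operation the address calculation and execution units can still be shared (meaning, for example, that each of the address calculation and execution units would be un-shaded).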
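FIG. 20 shows an alternative implementation of the address calculation and execution units, operand/result buffers, threaded register files and common partition schedulers of an exemplary four-fragment processor implementing a many physical cores to many logical cores mode in accordance with one embodiment of the present invention.
One example logical core and its relationship with the resources of the processor is shown by the shading in FIG. 20. In the FIG. 20 embodiment, however, the resources of a physical core are spread across each of the segments and each of the partitionable engines. This is shown by the one logical core having an allocated portion of the address calculation and execution units, an allocated portion of the threaded register files and an allocated portion of the common partition schedulers across each of the segments. Additionally, FIG. 20 shows how the one logical core is allocated a portion of the resources of each of the address calculation and execution units. In this manner, the logical core comprises a fixed allocated portion of each of the segments.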
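FIG. 21 shows the address calculation and execution units, register files and common partition schedulers of an exemplary four-fragment processor implementing a many soft cores to many logical cores mode in accordance with one embodiment of the present invention.
One example logical core and its relationship with the resources of the processor is shown by the shading in FIG. 21. In the FIG. 12 mode of operation, the many soft cores to many logical cores mode, in which the soft cores are used to function like logical cores in executing applications, each logical core is configured to have shared access to any one of the address calculation units and to a dynamically allocated portion of the operand/result buffers, threaded register files and common partition schedulers. Multithreaded applications and multithreaded functionality depend on the threaded programmability of the application's software.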
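FIG. 22 shows the address calculation and execution units, register files and common partition schedulers of an exemplary four-fragment processor implementing a many soft cores to one logical core mode in accordance with one embodiment of the present invention.
One example logical core and its relationship with the resources of the processor is shown by the shading in FIG. 22. In the FIG. 13 mode of operation, the many soft cores to one logical core mode, in which the soft cores are used to function like a single logical core in executing applications, each logical core is configured to have shared access to all of the address calculation units and to all of the operand/result buffers, threaded register files and common partition schedulers. In such an implementation, a single-threaded application has its instruction sequence divided up and allocated among the virtual cores, where the pieces are cooperatively executed to achieve high single-threaded performance. In this manner, single-threaded performance can scale with the addition of soft cores.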
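FIG. 23 shows a diagram of an exemplary microprocessor pipeline 2300 in accordance with one embodiment of the present invention. The microprocessor pipeline 2300 includes a fetch module 2301 that implements the functionality of the process for identifying and extracting the instructions comprising an execution, as described above. In the FIG. 23 embodiment, the fetch module is followed by a decode module 2302, an allocation module 2303, a dispatch module 2304, an execution module 2305 and a retirement module 2306. It should be noted that the microprocessor pipeline 2300 is just one example of a pipeline that implements the functionality of the embodiments of the present invention described above. One skilled in the art will recognize that other microprocessor pipelines can be implemented that include the functionality of the decode module described above.
For purposes of explanation, the foregoing description has been given with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, thereby enabling others skilled in the art to best utilize the invention and various embodiments with various modifications as may be suited to the particular use contemplated.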
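10‧‧‧Global front end fetch and scheduler
11-14‧‧‧Partitionable engines
20-23‧‧‧Code sequences
30‧‧‧Global interconnect structure
101-104‧‧‧Fragments
110a‧‧‧Memory global interconnect
110b‧‧‧Execution global interconnect
121-124‧‧‧Address calculation and execution units
150‧‧‧Global front end fetch and scheduler
550‧‧‧Runtime optimizer scheduler
850‧‧‧Bucket
852‧‧‧Thread allocation pointer array
852‧‧‧Thread bucket allocation pointer
853‧‧‧Destination inheritance vector
854‧‧‧Bucket execution flag
855‧‧‧Bucket ready bit
856‧‧‧Bucket inheritance vector B_iv
857‧‧‧Retirement thread pointer
901‧‧‧Bucket inheritance vector
902‧‧‧Thread header
903‧‧‧Forwarding
904‧‧‧Engines/cores or processors
905‧‧‧Local register file
906‧‧‧Thread 2
910‧‧‧Vector
1200‧‧‧Address register interconnect
1200‧‧‧Address interconnect bus
1201‧‧‧Address calculation unit interconnect
1202‧‧‧Address array
1206‧‧‧Tag array
1207‧‧‧Data array
1211‧‧‧Fragment
1401, 1402, 1403, 1404‧‧‧Ports
1450‧‧‧Threshold limit registers
1460‧‧‧Counters
2300‧‧‧Microprocessor pipeline
2301‧‧‧Fetch module
2302‧‧‧Decode module
2303‧‧‧Allocation module
2304‧‧‧Dispatch module
2305‧‧‧Execution module
2306‧‧‧Retirement module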
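The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar elements.
FIG. 1A shows an overview of the manner in which the global front end generates code blocks and inheritance vectors to support the execution of code sequences on their respective partitionable engines.
FIG. 1B shows an overview diagram of partitionable engines and their components for a multicore processor in accordance with one embodiment of the present invention, including segmented schedulers and register files, global interconnects and a fragmented memory subsystem.
FIG. 2 shows a scheduler flow diagram in accordance with one embodiment of the present invention.
FIG. 3 shows a diagram of exemplary hardware circuits in accordance with one embodiment of the present invention, depicting a segmented register file that stores operands and results, together with an interconnect.
FIG. 4 shows a diagram of a global front end fetch and scheduler in accordance with one embodiment of the present invention.
FIG. 5 shows an alternative implementation of the distribution of instructions across many virtual cores in accordance with one embodiment of the present invention.
FIG. 6 shows a plurality of register segments with a corresponding plurality of register files and operand and result buffers in accordance with one embodiment of the present invention.
FIG. 7 shows a more detailed diagram of a fragmented memory subsystem for a multicore processor in accordance with one embodiment of the present invention.
FIG. 8 shows a diagram depicting how bits of an address can be used by address generation to enumerate fragments in accordance with one embodiment of the present invention.
FIG. 9 shows a diagram of how loads and stores are handled by embodiments of the present invention.
FIG. 10 shows the manner in which fragments can be split into two or more domains in accordance with one embodiment of the present invention.
FIG. 11 shows a mode of operation of the processor in accordance with one embodiment of the present invention, in which virtual cores are configured as physical cores that correspond to logical cores in executing applications.
FIG. 12 shows a mode of operation of the processor in accordance with one embodiment of the present invention, in which virtual cores are configured as soft cores that correspond to logical cores in executing applications.
FIG. 13 shows a mode of operation of the processor in accordance with one embodiment of the present invention, in which the virtual cores are configured as soft cores that correspond to a single logical core in executing applications.
FIG. 14 shows an exemplary implementation of fragment segmentation used to support logical core and virtual core functionality in accordance with one embodiment of the present invention.
FIG. 15 shows fragmented memory of an exemplary four-fragment processor implementing a many physical cores to many logical cores mode in accordance with one embodiment of the present invention.
FIG. 16 shows fragmented memory of an exemplary four-fragment processor implementing a many physical cores to many logical cores mode in accordance with an alternative embodiment of the present invention.
FIG. 17 shows fragmented memory of an exemplary four-fragment processor implementing a many soft cores to many logical cores mode in accordance with one embodiment of the present invention.
FIG. 18 shows fragmented memory of an exemplary four-fragment processor implementing a many soft cores to one logical core mode in accordance with one embodiment of the present invention.
FIG. 19 shows the address calculation and execution units, operand/result buffers, threaded register files and common partition schedulers of an exemplary four-fragment processor implementing a many physical cores to many logical cores mode in accordance with one embodiment of the present invention.
FIG. 20 shows an alternative implementation of the address calculation and execution units, operand/result buffers, threaded register files and common partition schedulers of an exemplary four-fragment processor implementing a many physical cores to many logical cores mode in accordance with one embodiment of the present invention.
FIG. 21 shows the address calculation and execution units, register files and common partition schedulers of an exemplary four-fragment processor implementing a many soft cores to many logical cores mode in accordance with one embodiment of the present invention.
FIG. 22 shows the address calculation and execution units, register files and common partition schedulers of an exemplary four-fragment processor implementing a many soft cores to one logical core mode in accordance with one embodiment of the present invention.
FIG. 23 shows a diagram of an exemplary microprocessor pipeline in accordance with one embodiment of the present invention.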

Claims:
Claims (24)
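[1] A system for executing instructions using a plurality of memory fragments for a processor, the system comprising: a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks; a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, and wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors; and a plurality of memory fragments coupled to the partitionable engines for providing data storage.
[2] The system of claim 1, wherein the plurality of memory fragments implement an execution mode in which a subset of the physical resources of each memory fragment is allocated to support execution of a single logical thread of a logical core.
[3] The system of claim 2, wherein each memory fragment implements a portion of a plurality of logical cores.
[4] The system of claim 1, wherein the plurality of memory fragments implement an execution mode in which physical resources of each memory fragment are dynamically allocated in accordance with an adjustable threshold to support execution of a single logical thread of a single logical core.
[5] The system of claim 4, wherein the plurality of memory fragments implement a portion of a plurality of logical cores.
[6] The system of claim 1, wherein the plurality of memory fragments implement an execution mode in which the set of physical resources of each memory fragment is allocated to support execution of a single logical thread.
[7] The system of claim 1, wherein each memory fragment further comprises an L1 cache fragment, an L2 cache fragment and a load store buffer.
[8] The system of claim 1, wherein an interconnect links each of the plurality of memory fragments to each of the plurality of partitionable engines.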
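[9] A processor for executing instructions using a plurality of memory fragments, comprising: a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks; a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, and wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors; and a plurality of memory fragments coupled to the partitionable engines for providing data storage.
[10] The processor of claim 9, wherein the plurality of memory fragments implement an execution mode in which a subset of the physical resources of each memory fragment is allocated to support execution of a single logical thread of a logical core.
[11] The processor of claim 10, wherein each memory fragment implements a portion of a plurality of logical cores.
[12] The processor of claim 9, wherein the plurality of memory fragments implement an execution mode in which physical resources of each memory fragment are dynamically allocated in accordance with an adjustable threshold to support execution of a single logical thread of a single logical core.
[13] The processor of claim 12, wherein the plurality of memory fragments implement a portion of a plurality of logical cores.
[14] The processor of claim 9, wherein the plurality of memory fragments implement an execution mode in which the set of physical resources of each memory fragment is allocated to support execution of a single logical thread.
[15] The processor of claim 9, wherein each memory fragment further comprises an L1 cache fragment, an L2 cache fragment and a load store buffer.
[16] The processor of claim 9, wherein an interconnect links each of the plurality of memory fragments to each of the plurality of execution units.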
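[17] A system for executing instructions using a plurality of memory fragments for a processor, the system comprising: a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions; a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, and wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode; and a plurality of memory fragments coupled to the partitionable engines for providing data storage.
[18] The system of claim 17, wherein the plurality of memory fragments implement an execution mode in which a subset of the physical resources of each memory fragment is allocated to support execution of a single logical thread of a logical core.
[19] The system of claim 18, wherein each memory fragment implements a portion of a plurality of logical cores.
[20] The system of claim 17, wherein the plurality of memory fragments implement an execution mode in which physical resources of each memory fragment are dynamically allocated in accordance with an adjustable threshold to support execution of a single logical thread of a single logical core.
[21] The system of claim 20, wherein the plurality of memory fragments implement a portion of a plurality of logical cores.
[22] The system of claim 17, wherein the plurality of memory fragments implement an execution mode in which the set of physical resources of each memory fragment is allocated to support execution of a single logical thread.
[23] The system of claim 17, wherein each memory fragment further comprises an L1 cache fragment, an L2 cache fragment and a load store buffer.
[24] The system of claim 17, wherein an interconnect links each of the plurality of memory fragments to each of the plurality of partitionable engines.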

Patent family:

Publication No. | Publication date
KR101826121B1 | 2018-02-06
CN103635875A | 2014-03-12
US9921845B2 | 2018-03-20
US20180157491A1 | 2018-06-07
EP2689326A4 | 2014-10-22
KR20160084471A | 2016-07-13
US11204769B2 | 2021-12-21
US20120246448A1 | 2012-09-27
US20200142701A1 | 2020-05-07
KR20180015754A | 2018-02-13
WO2012135050A3 | 2012-11-29
KR101636602B1 | 2016-07-05
KR20140018945A | 2014-02-13
US10564975B2 | 2020-02-18
US20160154653A1 | 2016-06-02
US9274793B2 | 2016-03-01
EP2689326A2 | 2014-01-29
CN108108188A | 2018-06-01
WO2012135050A2 | 2012-10-04
KR101966712B1 | 2019-04-09
CN103635875B | 2018-02-16
TWI520070B | 2016-02-01

引用文献:

公开号 | 申请日 | 公开日 | 申请人 | 专利标题

US727487A|1902-10-21|1903-05-05|Swan F Swanson|Dumping-car.|

US4075704A|1976-07-02|1978-02-21|Floating Point Systems, Inc.|Floating point data processor for high speech operation|

US4228496A|1976-09-07|1980-10-14|Tandem Computers Incorporated|Multiprocessor system|

US4245344A|1979-04-02|1981-01-13|Rockwell International Corporation|Processing system with dual buses|

US4527237A|1979-10-11|1985-07-02|Nanodata Computer Corporation|Data processing system|

US4414624A|1980-11-19|1983-11-08|The United States Of America As Represented By The Secretary Of The Navy|Multiple-microcomputer processing|

US4524415A|1982-12-07|1985-06-18|Motorola, Inc.|Virtual machine data processor|

US4597061B1|1983-01-03|1998-06-09|Texas Instruments Inc|Memory system using pipleline circuitry for improved system|

US4577273A|1983-06-06|1986-03-18|Sperry Corporation|Multiple microcomputer system for digital computers|

US4682281A|1983-08-30|1987-07-21|Amdahl Corporation|Data storage unit employing translation lookaside buffer pointer|

US4633434A|1984-04-02|1986-12-30|Sperry Corporation|High performance storage unit|

US4600986A|1984-04-02|1986-07-15|Sperry Corporation|Pipelined split stack with high performance interleaved decode|

US4835680A|1985-03-15|1989-05-30|Xerox Corporation|Adaptive processor array capable of learning variable associations useful in recognizing classes of inputs|

JPH042976B2|1985-10-15|1992-01-21|||

US4920477A|1987-04-20|1990-04-24|Multiflow Computer, Inc.|Virtual address table look aside buffer miss recovery method and apparatus|

US4943909A|1987-07-08|1990-07-24|At&T Bell Laboratories|Computational origami|

JP2930341B2|1988-10-07|1999-08-03|マーチン・マリエッタ・コーポレーション|データ並列処理装置|

US5339398A|1989-07-31|1994-08-16|North American Philips Corporation|Memory architecture and method of data organization optimized for hashing|

US5471593A|1989-12-11|1995-11-28|Branigin; Michael H.|Computer processor with an efficient means of executing many instructions simultaneously|

US5197130A|1989-12-29|1993-03-23|Supercomputer Systems Limited Partnership|Cluster architecture for a highly parallel scalar/vector multiprocessor system|

EP0463965B1|1990-06-29|1998-09-09|Digital Equipment Corporation|Branch prediction unitfor high-performance processor|

US5317754A|1990-10-23|1994-05-31|International Business Machines Corporation|Method and apparatus for enabling an interpretive execution subset|

US5317705A|1990-10-24|1994-05-31|International Business Machines Corporation|Apparatus and method for TLB purge reduction in a multi-level machine system|

US6282583B1|1991-06-04|2001-08-28|Silicon Graphics, Inc.|Method and apparatus for memory access in a matrix processor computer|

US5539911A|1991-07-08|1996-07-23|Seiko Epson Corporation|High-performance, superscalar-based computer system with out-of-order instruction execution|

JPH07502358A|1991-12-23|1995-03-09|||

JP2647327B2|1992-04-06|1997-08-27|インターナショナル・ビジネス・マシーンズ・コーポレイション|大規模並列コンピューティング・システム装置|

KR100309566B1|1992-04-29|2001-12-15|리패치|파이프라인프로세서에서다중명령어를무리짓고,그룹화된명령어를동시에발행하고,그룹화된명령어를실행시키는방법및장치|

WO1993022722A1|1992-05-01|1993-11-11|Seiko Epson Corporation|A system and method for retiring instructions in a superscalar microprocessor|

EP0576262B1|1992-06-25|2000-08-23|Canon Kabushiki Kaisha|Apparatus for multiplying integers of many figures|

US5493660A|1992-10-06|1996-02-20|Hewlett-Packard Company|Software assisted hardware TLB miss handler|

US5513335A|1992-11-02|1996-04-30|Sgs-Thomson Microelectronics, Inc.|Cache tag memory having first and second single-port arrays and a dual-port array|

US5819088A|1993-03-25|1998-10-06|Intel Corporation|Method and apparatus for scheduling instructions for execution on a multi-issue architecture computer|

US5548773A|1993-03-30|1996-08-20|The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration|Digital parallel processor array for optimum path planning|

US6948172B1|1993-09-21|2005-09-20|Microsoft Corporation|Preemptive multi-tasking with cooperative groups of tasks|

US5469376A|1993-10-14|1995-11-21|Abdallah; Mohammad A. F. F.|Digital circuit for the evaluation of mathematical expressions|

US5517651A|1993-12-29|1996-05-14|Intel Corporation|Method and apparatus for loading a segment register in a microprocessor capable of operating in multiple modes|

US5761476A|1993-12-30|1998-06-02|Intel Corporation|Non-clocked early read for back-to-back scheduling of instructions|

US5956753A|1993-12-30|1999-09-21|Intel Corporation|Method and apparatus for handling speculative memory access operations|

JP3048498B2|1994-04-13|2000-06-05|株式会社東芝|半導体記憶装置|

CN1084005C|1994-06-27|2002-05-01|国际商业机器公司|用于动态控制地址空间分配的方法和设备|

US5548742A|1994-08-11|1996-08-20|Intel Corporation|Method and apparatus for combining a direct-mapped cache and a multiple-way cache in a cache memory|

US5813031A|1994-09-21|1998-09-22|Industrial Technology Research Institute|Caching tag for a large scale cache computer memory system|

US5640534A|1994-10-05|1997-06-17|International Business Machines Corporation|Method and system for concurrent access in a data cache array utilizing multiple match line selection paths|

US5835951A|1994-10-18|1998-11-10|National Semiconductor|Branch processing unit with target cache read prioritization protocol for handling multiple hits|

JP3569014B2|1994-11-25|2004-09-22|富士通株式会社|マルチコンテキストをサポートするプロセッサおよび処理方法|

US5724565A|1995-02-03|1998-03-03|International Business Machines Corporation|Method and system for processing first and second sets of instructions by first and second types of processing systems|

US5673408A|1995-02-14|1997-09-30|Hal Computer Systems, Inc.|Processor structure and method for renamable trap-stack|

US5675759A|1995-03-03|1997-10-07|Shebanow; Michael C.|Method and apparatus for register management using issue sequence prior physical register and register association validity information|

US5634068A|1995-03-31|1997-05-27|Sun Microsystems, Inc.|Packet switched cache coherent multiprocessor system|

US5751982A|1995-03-31|1998-05-12|Apple Computer, Inc.|Software emulation system with dynamic translation of emulated instructions for increased processing speed|

US6209085B1|1995-05-05|2001-03-27|Intel Corporation|Method and apparatus for performing process switching in multiprocessor computer systems|

US6643765B1|1995-08-16|2003-11-04|Microunity Systems Engineering, Inc.|Programmable processor with group floating point operations|

US5710902A|1995-09-06|1998-01-20|Intel Corporation|Instruction dependency chain indentifier|

US6341324B1|1995-10-06|2002-01-22|Lsi Logic Corporation|Exception processing in superscalar microprocessor|

US5864657A|1995-11-29|1999-01-26|Texas Micro, Inc.|Main memory system and checkpointing protocol for fault-tolerant computer system|

US5983327A|1995-12-01|1999-11-09|Nortel Networks Corporation|Data path architecture and arbitration scheme for providing access to a shared system resource|

US5793941A|1995-12-04|1998-08-11|Advanced Micro Devices, Inc.|On-chip primary cache testing circuit and test method|

US5911057A|1995-12-19|1999-06-08|Texas Instruments Incorporated|Superscalar microprocessor having combined register and memory renaming circuits, systems, and methods|

US5699537A|1995-12-22|1997-12-16|Intel Corporation|Processor microarchitecture for efficient dynamic scheduling and execution of chains of dependent instructions|

US6882177B1|1996-01-10|2005-04-19|Altera Corporation|Tristate structures for programmable logic devices|

US5754818A|1996-03-22|1998-05-19|Sun Microsystems, Inc.|Architecture and method for sharing TLB entries through process IDS|

US5904892A|1996-04-01|1999-05-18|Saint-Gobain/Norton Industrial Ceramics Corp.|Tape cast silicon carbide dummy wafer|

US5752260A|1996-04-29|1998-05-12|International Business Machines Corporation|High-speed, multiple-port, interleaved cache with arbitration of multiple access addresses|

US5806085A|1996-05-01|1998-09-08|Sun Microsystems, Inc.|Method for non-volatile caching of network and CD-ROM file accesses using a cache directory, pointers, file name conversion, a local hard disk, and separate small database|

US5829028A|1996-05-06|1998-10-27|Advanced Micro Devices, Inc.|Data cache configured to store data in a use-once manner|

US6108769A|1996-05-17|2000-08-22|Advanced Micro Devices, Inc.|Dependency table for reducing dependency checking hardware|

US5958042A|1996-06-11|1999-09-28|Sun Microsystems, Inc.|Grouping logic circuit in a pipelined superscalar processor|

US5881277A|1996-06-13|1999-03-09|Texas Instruments Incorporated|Pipelined microprocessor with branch misprediction cache circuits, systems and methods|

US5860146A|1996-06-25|1999-01-12|Sun Microsystems, Inc.|Auxiliary translation lookaside buffer for assisting in accessing data in remote address spaces|

US5903760A|1996-06-27|1999-05-11|Intel Corporation|Method and apparatus for translating a conditional instruction compatible with a first instruction set architecture into a conditional instruction compatible with a second ISA|

US5974506A|1996-06-28|1999-10-26|Digital Equipment Corporation|Enabling mirror, nonmirror and partial mirror cache modes in a dual cache system|

US6167490A|1996-09-20|2000-12-26|University Of Washington|Using global memory information to manage memory in a computer network|

KR19980032776A|1996-10-16|1998-07-25|가나이 츠토무|Data processor and data processing system|

EP0877981B1|1996-11-04|2004-01-07|Koninklijke Philips Electronics N.V.|Processing device, reads instructions in memory|

US6385715B1|1996-11-13|2002-05-07|Intel Corporation|Multi-threading for a processor utilizing a replay queue|

US6253316B1|1996-11-19|2001-06-26|Advanced Micro Devices, Inc.|Three state branch history using one bit in a branch prediction mechanism|

US5978906A|1996-11-19|1999-11-02|Advanced Micro Devices, Inc.|Branch selectors associated with byte ranges within an instruction cache for rapidly identifying branch predictions|

US5903750A|1996-11-20|1999-05-11|Institute For The Development Of Emerging Architectures, L.L.P.|Dynamic branch prediction for branch instructions with multiple targets|

US6212542B1|1996-12-16|2001-04-03|International Business Machines Corporation|Method and system for executing a program within a multiscalar processor by processing linked thread descriptors|

US6134634A|1996-12-20|2000-10-17|Texas Instruments Incorporated|Method and apparatus for preemptive cache write-back|

US5918251A|1996-12-23|1999-06-29|Intel Corporation|Method and apparatus for preloading different default address translation attributes|

US6065105A|1997-01-08|2000-05-16|Intel Corporation|Dependency matrix|

US6016540A|1997-01-08|2000-01-18|Intel Corporation|Method and apparatus for scheduling instructions in waves|

US5802602A|1997-01-17|1998-09-01|Intel Corporation|Method and apparatus for performing reads of related data from a set-associative cache memory|

JP3739888B2|1997-03-27|2006-01-25|株式会社ソニー・コンピュータエンタテインメント|Information processing apparatus and method|

US6088780A|1997-03-31|2000-07-11|Institute For The Development Of Emerging Architecture, L.L.C.|Page table walker that uses at least one of a default page size and a page size selected for a virtual address space to position a sliding field in a virtual address|

US6314511B2|1997-04-03|2001-11-06|University Of Washington|Mechanism for freeing registers on processors that perform dynamic out-of-order execution of instructions using renaming registers|

US6035120A|1997-05-28|2000-03-07|Sun Microsystems, Inc.|Method and apparatus for converting executable computer programs in a heterogeneous computing environment|

US6075938A|1997-06-10|2000-06-13|The Board Of Trustees Of The Leland Stanford Junior University|Virtual machine monitors for scalable multiprocessors|

US6073230A|1997-06-11|2000-06-06|Advanced Micro Devices, Inc.|Instruction fetch unit configured to provide sequential way prediction for sequential instruction fetches|

US6658447B2|1997-07-08|2003-12-02|Intel Corporation|Priority based simultaneous multi-threading|

US6128728A|1997-08-01|2000-10-03|Micron Technology, Inc.|Virtual shadow registers and virtual register windows|

US6170051B1|1997-08-01|2001-01-02|Micron Technology, Inc.|Apparatus and method for program level parallelism in a VLIW processor|

US6085315A|1997-09-12|2000-07-04|Siemens Aktiengesellschaft|Data processing device with loop pipeline|

US6101577A|1997-09-15|2000-08-08|Advanced Micro Devices, Inc.|Pipelined instruction cache and branch prediction mechanism therefor|

US5901294A|1997-09-18|1999-05-04|International Business Machines Corporation|Method and system for bus arbitration in a multiprocessor system utilizing simultaneous variable-width bus access|

US6185660B1|1997-09-23|2001-02-06|Hewlett-Packard Company|Pending access queue for providing data to a target register during an intermediate pipeline phase after a computer cache miss|

US5905509A|1997-09-30|1999-05-18|Compaq Computer Corp.|Accelerated Graphics Port two level Gart cache having distributed first level caches|

US6226732B1|1997-10-02|2001-05-01|Hitachi Micro Systems, Inc.|Memory system architecture|

US5922065A|1997-10-13|1999-07-13|Institute For The Development Of Emerging Architectures, L.L.C.|Processor utilizing a template field for encoding instruction sequences in a wide-word format|

US6178482B1|1997-11-03|2001-01-23|Brecis Communications|Virtual register sets|

US6021484A|1997-11-14|2000-02-01|Samsung Electronics Co., Ltd.|Dual instruction set architecture|

US6256728B1|1997-11-17|2001-07-03|Advanced Micro Devices, Inc.|Processor configured to selectively cancel instructions from its pipeline responsive to a predicted-taken short forward branch instruction|

US6260131B1|1997-11-18|2001-07-10|Intrinsity, Inc.|Method and apparatus for TLB memory ordering|

US6016533A|1997-12-16|2000-01-18|Advanced Micro Devices, Inc.|Way prediction logic for cache array|

US6219776B1|1998-03-10|2001-04-17|Billions Of Operations Per Second|Merged array controller and processing element|

US6609189B1|1998-03-12|2003-08-19|Yale University|Cycle segmented prefix circuits|

JP3657424B2|1998-03-20|2005-06-08|松下電器産業株式会社|Center device and terminal device for broadcasting program information|

US6216215B1|1998-04-02|2001-04-10|Intel Corporation|Method and apparatus for senior loads|

US6157998A|1998-04-03|2000-12-05|Motorola Inc.|Method for performing branch prediction and resolution of two or more branch instructions within two or more branch prediction buffers|

US6205545B1|1998-04-30|2001-03-20|Hewlett-Packard Company|Method and apparatus for using static branch predictions hints with dynamically translated code traces to improve performance|

US6115809A|1998-04-30|2000-09-05|Hewlett-Packard Company|Compiling strong and weak branching behavior instruction blocks to separate caches for dynamic and static prediction|

US6256727B1|1998-05-12|2001-07-03|International Business Machines Corporation|Method and system for fetching noncontiguous instructions in a single clock cycle|

US8631066B2|1998-09-10|2014-01-14|Vmware, Inc.|Mechanism for providing virtual machines for use by multiple users|

US6272616B1|1998-06-17|2001-08-07|Agere Systems Guardian Corp.|Method and apparatus for executing multiple instruction streams in a digital processor with multiple data paths|

US6988183B1|1998-06-26|2006-01-17|Derek Chi-Lan Wong|Methods for increasing instruction-level parallelism in microprocessors and digital system|

US6260138B1|1998-07-17|2001-07-10|Sun Microsystems, Inc.|Method and apparatus for branch instruction processing in a processor|

US6122656A|1998-07-31|2000-09-19|Advanced Micro Devices, Inc.|Processor configured to map logical register numbers to physical register numbers using virtual register numbers|

US6272662B1|1998-08-04|2001-08-07|International Business Machines Corporation|Distributed storage system using front-end and back-end locking|

JP2000057054A|1998-08-12|2000-02-25|Fujitsu Ltd|High-speed address translation system|

US6742111B2|1998-08-31|2004-05-25|Stmicroelectronics, Inc.|Reservation stations to increase instruction level parallelism|

US6339822B1|1998-10-02|2002-01-15|Advanced Micro Devices, Inc.|Using padded instructions in a block-oriented cache|

US6332189B1|1998-10-16|2001-12-18|Intel Corporation|Branch prediction architecture|

GB9825102D0|1998-11-16|1999-01-13|Insignia Solutions Plc|Computer system|

JP3110404B2|1998-11-18|2000-11-20|甲府日本電気株式会社|Microprocessor device, method for accelerating its software instructions, and recording medium storing its control program|

US6490673B1|1998-11-27|2002-12-03|Matsushita Electric Industrial Co., Ltd|Processor, compiling apparatus, and compile program recorded on a recording medium|

US6519682B2|1998-12-04|2003-02-11|Stmicroelectronics, Inc.|Pipelined non-blocking level two cache system with inherent transaction collision-avoidance|

US6049501A|1998-12-14|2000-04-11|Motorola, Inc.|Memory data bus architecture and method of configuring multi-wide word memories|

US6477562B2|1998-12-16|2002-11-05|Clearwater Networks, Inc.|Prioritized instruction scheduling for multi-streaming processors|

US7020879B1|1998-12-16|2006-03-28|Mips Technologies, Inc.|Interrupt and exception handling for multi-streaming digital processors|

US6247097B1|1999-01-22|2001-06-12|International Business Machines Corporation|Aligned instruction cache handling of instruction fetches across multiple predicted branch instructions|

US6321298B1|1999-01-25|2001-11-20|International Business Machines Corporation|Full cache coherency across multiple raid controllers|

JP3842474B2|1999-02-02|2006-11-08|株式会社ルネサステクノロジ|Data processing device|

US6327650B1|1999-02-12|2001-12-04|Vlsi Technology, Inc.|Pipelined multiprocessing with upstream processor concurrently writing to local register and to register of downstream processor|

US6732220B2|1999-02-17|2004-05-04|Elbrus International|Method for emulating hardware features of a foreign architecture in a host operating system environment|

US6668316B1|1999-02-17|2003-12-23|Elbrus International Limited|Method and apparatus for conflict-free execution of integer and floating-point operations with a common register file|

US6418530B2|1999-02-18|2002-07-09|Hewlett-Packard Company|Hardware/software system for instruction profiling and trace selection using branch history information for branch predictions|

US6437789B1|1999-02-19|2002-08-20|Evans & Sutherland Computer Corporation|Multi-level cache controller|

US6850531B1|1999-02-23|2005-02-01|Alcatel|Multi-service network switch|

US6212613B1|1999-03-22|2001-04-03|Cisco Technology, Inc.|Methods and apparatus for reusing addresses in a computer|

US6529928B1|1999-03-23|2003-03-04|Silicon Graphics, Inc.|Floating-point adder performing floating-point and integer operations|

US6708268B1|1999-03-26|2004-03-16|Microchip Technology Incorporated|Microcontroller instruction set|

EP1050808B1|1999-05-03|2008-04-30|STMicroelectronics S.A.|Computer instruction scheduling|

US6449671B1|1999-06-09|2002-09-10|Ati International Srl|Method and apparatus for busing data elements|

US6473833B1|1999-07-30|2002-10-29|International Business Machines Corporation|Integrated cache and directory structure for multi-level caches|

US6643770B1|1999-09-16|2003-11-04|Intel Corporation|Branch misprediction recovery using a side memory|

US6772325B1|1999-10-01|2004-08-03|Hitachi, Ltd.|Processor architecture and operation for exploiting improved branch control instruction|

US6704822B1|1999-10-01|2004-03-09|Sun Microsystems, Inc.|Arbitration protocol for a shared data cache|

US6457120B1|1999-11-01|2002-09-24|International Business Machines Corporation|Processor and method including a cache having confirmation bits for improving address predictable branch instruction target predictions|

US7441110B1|1999-12-10|2008-10-21|International Business Machines Corporation|Prefetching using future branch path information derived from branch prediction|

US7107434B2|1999-12-20|2006-09-12|Board Of Regents, The University Of Texas|System, method and apparatus for allocating hardware resources using pseudorandom sequences|

US7925869B2|1999-12-22|2011-04-12|Ubicom, Inc.|Instruction-level multithreading according to a predetermined fixed schedule in an embedded processor using zero-time context switching|

US6557095B1|1999-12-27|2003-04-29|Intel Corporation|Scheduling operations using a dependency matrix|

US6542984B1|2000-01-03|2003-04-01|Advanced Micro Devices, Inc.|Scheduler capable of issuing and reissuing dependency chains|

KR100747128B1|2000-01-03|2007-08-09|어드밴스드 마이크로 디바이시즈, 인코포레이티드|Scheduler that discovers the non-speculative nature of an instruction after issue and reissues the instruction|

US6594755B1|2000-01-04|2003-07-15|National Semiconductor Corporation|System and method for interleaved execution of multiple independent threads|

US6728872B1|2000-02-04|2004-04-27|International Business Machines Corporation|Method and apparatus for verifying that instructions are pipelined in correct architectural sequence|

GB0002848D0|2000-02-08|2000-03-29|Siroyan Limited|Communicating instruction results in processors and compiling methods for processors|

GB2365661A|2000-03-10|2002-02-20|British Telecomm|Allocating switch requests within a packet switch|

US6615340B1|2000-03-22|2003-09-02|Wilmot, Ii Richard Byron|Extended operand management indicator structure and method|

US7140022B2|2000-06-02|2006-11-21|Honeywell International Inc.|Method and apparatus for slack stealing with dynamic threads|

US6604187B1|2000-06-19|2003-08-05|Advanced Micro Devices, Inc.|Providing global translations with address space numbers|

US6557083B1|2000-06-30|2003-04-29|Intel Corporation|Memory system for multiple data types|

US6704860B1|2000-07-26|2004-03-09|International Business Machines Corporation|Data processing system and method for fetching instruction blocks in response to a detected block sequence|

US7206925B1|2000-08-18|2007-04-17|Sun Microsystems, Inc.|Backing Register File for processors|

US6728866B1|2000-08-31|2004-04-27|International Business Machines Corporation|Partitioned issue queue and allocation strategy|

US6721874B1|2000-10-12|2004-04-13|International Business Machines Corporation|Method and system for dynamically shared completion table supporting multiple threads in a processing system|

US6639866B2|2000-11-03|2003-10-28|Broadcom Corporation|Very small swing high performance asynchronous CMOS static memory with power reducing column multiplexing scheme|

US7757065B1|2000-11-09|2010-07-13|Intel Corporation|Instruction segment recording scheme|

JP2002185513A|2000-12-18|2002-06-28|Hitachi Ltd|Packet communication network and packet transfer control method|

US6907600B2|2000-12-27|2005-06-14|Intel Corporation|Virtual translation lookaside buffer|

US6877089B2|2000-12-27|2005-04-05|International Business Machines Corporation|Branch prediction apparatus and process for restoring replaced branch history for use in future branch predictions for an executing program|

US6647466B2|2001-01-25|2003-11-11|Hewlett-Packard Development Company, L.P.|Method and apparatus for adaptively bypassing one or more levels of a cache hierarchy|

FR2820921A1|2001-02-14|2002-08-16|Canon Kk|Device and method for transmission in a switch|

US6985951B2|2001-03-08|2006-01-10|International Business Machines Corporation|Inter-partition message passing method, system and program product for managing workload in a partitioned processing environment|

US6950927B1|2001-04-13|2005-09-27|The United States Of America As Represented By The Secretary Of The Navy|System and method for instruction-level parallelism in a programmable multiple network processor environment|

US7707397B2|2001-05-04|2010-04-27|Via Technologies, Inc.|Variable group associativity branch target address cache delivering multiple target addresses per cache line|

US7200740B2|2001-05-04|2007-04-03|Ip-First, Llc|Apparatus and method for speculatively performing a return instruction in a microprocessor|

US6658549B2|2001-05-22|2003-12-02|Hewlett-Packard Development Company, L.P.|Method and system allowing a single entity to manage memory comprising compressed and uncompressed data|

US6985591B2|2001-06-29|2006-01-10|Intel Corporation|Method and apparatus for distributing keys for decrypting and re-encrypting publicly distributed media|

US7203824B2|2001-07-03|2007-04-10|Ip-First, Llc|Apparatus and method for handling BTAC branches that wrap across instruction cache lines|

US7024545B1|2001-07-24|2006-04-04|Advanced Micro Devices, Inc.|Hybrid branch prediction device with two levels of branch prediction cache|

US6954846B2|2001-08-07|2005-10-11|Sun Microsystems, Inc.|Microprocessor and method for giving each thread exclusive access to one register file in a multi-threading mode and for giving an active thread access to multiple register files in a single thread mode|

US6718440B2|2001-09-28|2004-04-06|Intel Corporation|Memory access latency hiding with hint buffer|

US7150021B1|2001-10-12|2006-12-12|Palau Acquisition Corporation|Method and system to allocate resources within an interconnect device according to a resource allocation table|

US7117347B2|2001-10-23|2006-10-03|Ip-First, Llc|Processor including fallback branch prediction mechanism for far jump and far call instructions|

US7272832B2|2001-10-25|2007-09-18|Hewlett-Packard Development Company, L.P.|Method of protecting user process data in a secure platform inaccessible to the operating system and other tasks on top of the secure platform|

US6964043B2|2001-10-30|2005-11-08|Intel Corporation|Method, apparatus, and system to optimize frequently executed code and to use compiler transformation and hardware support to handle infrequently executed code|

GB2381886B|2001-11-07|2004-06-23|Sun Microsystems Inc|Computer system with virtual memory and paging mechanism|

US7092869B2|2001-11-14|2006-08-15|Ronald Hilton|Memory address prediction under emulation|

US7080169B2|2001-12-11|2006-07-18|Emulex Design & Manufacturing Corporation|Receiving data from interleaved multiple concurrent transactions in a FIFO memory having programmable buffer zones|

US20030126416A1|2001-12-31|2003-07-03|Marr Deborah T.|Suspending execution of a thread in a multi-threaded processor|

US7363467B2|2002-01-03|2008-04-22|Intel Corporation|Dependence-chain processing using trace descriptors having dependency descriptors|

US6640333B2|2002-01-10|2003-10-28|Lsi Logic Corporation|Architecture for a sea of platforms|

US7055021B2|2002-02-05|2006-05-30|Sun Microsystems, Inc.|Out-of-order processor that reduces mis-speculation using a replay scoreboard|

US7331040B2|2002-02-06|2008-02-12|Transitive Limited|Condition code flag emulation for program code conversion|

US20030154363A1|2002-02-11|2003-08-14|Soltis Donald C.|Stacked register aliasing in data hazard detection to reduce circuit|

US6839816B2|2002-02-26|2005-01-04|International Business Machines Corporation|Shared cache line update mechanism|

US6731292B2|2002-03-06|2004-05-04|Sun Microsystems, Inc.|System and method for controlling a number of outstanding data transactions within an integrated circuit|

JP3719509B2|2002-04-01|2005-11-24|株式会社ソニー・コンピュータエンタテインメント|Serial arithmetic pipeline, arithmetic device, arithmetic logic circuit, and arithmetic method using a serial arithmetic pipeline|

US7565509B2|2002-04-17|2009-07-21|Microsoft Corporation|Using limits on address translation to control access to an addressable entity|

US6920530B2|2002-04-23|2005-07-19|Sun Microsystems, Inc.|Scheme for reordering instructions via an instruction caching mechanism|

US7113488B2|2002-04-24|2006-09-26|International Business Machines Corporation|Reconfigurable circular bus|

US6760818B2|2002-05-01|2004-07-06|Koninklijke Philips Electronics N.V.|Memory region based data pre-fetching|

US7281055B2|2002-05-28|2007-10-09|Newisys, Inc.|Routing mechanisms in systems having multiple multi-processor clusters|

US7117346B2|2002-05-31|2006-10-03|Freescale Semiconductor, Inc.|Data processing system having multiple register contexts and method therefor|

US6938151B2|2002-06-04|2005-08-30|International Business Machines Corporation|Hybrid branch prediction using a global selection counter and a prediction method comparison table|

US6735747B2|2002-06-10|2004-05-11|Lsi Logic Corporation|Pre-silicon verification path coverage|

US8024735B2|2002-06-14|2011-09-20|Intel Corporation|Method and apparatus for ensuring fairness and forward progress when executing multiple threads of execution|

JP3845043B2|2002-06-28|2006-11-15|富士通株式会社|Instruction fetch control device|

JP3982353B2|2002-07-12|2007-09-26|日本電気株式会社|Fault-tolerant computer device, resynchronization method therefor, and resynchronization program|

US6944744B2|2002-08-27|2005-09-13|Advanced Micro Devices, Inc.|Apparatus and method for independently schedulable functional units with issue lock mechanism in a processor|

US6950925B1|2002-08-28|2005-09-27|Advanced Micro Devices, Inc.|Scheduler for use in a microprocessor that supports data-speculative execution|

US7546422B2|2002-08-28|2009-06-09|Intel Corporation|Method and apparatus for the synchronization of distributed caches|

TW200408242A|2002-09-06|2004-05-16|Matsushita Electric Ind Co Ltd|Home terminal apparatus and communication system|

US6895491B2|2002-09-26|2005-05-17|Hewlett-Packard Development Company, L.P.|Memory addressing for a virtual machine implementation on a computer processor supporting virtual hash-page-table searching|

US7334086B2|2002-10-08|2008-02-19|Rmi Corporation|Advanced processor with system on a chip interconnect technology|

US6829698B2|2002-10-10|2004-12-07|International Business Machines Corporation|Method, apparatus and system for acquiring a global promotion facility utilizing a data-less transaction|

US7213248B2|2002-10-10|2007-05-01|International Business Machines Corporation|High speed promotion mechanism suitable for lock acquisition in a multiprocessor data processing system|

US7222218B2|2002-10-22|2007-05-22|Sun Microsystems, Inc.|System and method for goal-based scheduling of blocks of code for concurrent execution|

US20040103251A1|2002-11-26|2004-05-27|Mitchell Alsup|Microprocessor including a first level cache and a second level cache having different cache line sizes|

AU2003292451A1|2002-12-04|2004-06-23|Koninklijke Philips Electronics N.V.|Register file gating to reduce microprocessor power dissipation|

US6981083B2|2002-12-05|2005-12-27|International Business Machines Corporation|Processor virtualization mechanism via an enhanced restoration of hard architected states|

US7073042B2|2002-12-12|2006-07-04|Intel Corporation|Reclaiming existing fields in address translation data structures to extend control over memory accesses|

US20040117594A1|2002-12-13|2004-06-17|Vanderspek Julius|Memory management method|

US20040122887A1|2002-12-20|2004-06-24|Macy William W.|Efficient multiplication of small matrices using SIMD registers|

US7191349B2|2002-12-26|2007-03-13|Intel Corporation|Mechanism for processor power state aware distribution of lowest priority interrupt|

US20040139441A1|2003-01-09|2004-07-15|Kabushiki Kaisha Toshiba|Processor, arithmetic operation processing method, and priority determination method|

US6925421B2|2003-01-09|2005-08-02|International Business Machines Corporation|Method, system, and computer program product for estimating the number of consumers that place a load on an individual resource in a pool of physically distributed resources|

US7178010B2|2003-01-16|2007-02-13|Ip-First, Llc|Method and apparatus for correcting an internal call/return stack in a microprocessor that detects from multiple pipeline stages incorrect speculative update of the call/return stack|

US7089374B2|2003-02-13|2006-08-08|Sun Microsystems, Inc.|Selectively unmarking load-marked cache lines during transactional program execution|

US7278030B1|2003-03-03|2007-10-02|Vmware, Inc.|Virtualization system for computers having multiple protection mechanisms|

US6912644B1|2003-03-06|2005-06-28|Intel Corporation|Method and apparatus to steer memory access operations in a virtual memory system|

US7111145B1|2003-03-25|2006-09-19|Vmware, Inc.|TLB miss fault handler and method for accessing multiple page tables|

US7143273B2|2003-03-31|2006-11-28|Intel Corporation|Method and apparatus for dynamic branch prediction utilizing multiple stew algorithms for indexing a global history|

CN1214666C|2003-04-07|2005-08-10|华为技术有限公司|Method for limiting location information request traffic in a location service|

US7058764B2|2003-04-14|2006-06-06|Hewlett-Packard Development Company, L.P.|Method of adaptive cache partitioning to increase host I/O performance|

US7139855B2|2003-04-24|2006-11-21|International Business Machines Corporation|High performance synchronization of resource allocation in a logically-partitioned system|

EP1471421A1|2003-04-24|2004-10-27|STMicroelectronics Limited|Speculative load instruction control|

US7290261B2|2003-04-24|2007-10-30|International Business Machines Corporation|Method and logical apparatus for rename register reallocation in a simultaneous multi-threaded processor|

US7469407B2|2003-04-24|2008-12-23|International Business Machines Corporation|Method for resource balancing using dispatch flush in a simultaneous multithread processor|

US7055003B2|2003-04-25|2006-05-30|International Business Machines Corporation|Data cache scrub mechanism for large L2/L3 data cache structures|

US7007108B2|2003-04-30|2006-02-28|Lsi Logic Corporation|System method for use of hardware semaphores for resource release notification wherein messages comprises read-modify-write operation and address|

US7743238B2|2003-05-09|2010-06-22|Arm Limited|Accessing items of architectural state from a register cache in a data processing apparatus when performing branch prediction operations for an indirect branch instruction|

US7861062B2|2003-06-25|2010-12-28|Koninklijke Philips Electronics N.V.|Data processing device with instruction controlled clock speed|

JP2005032018A|2003-07-04|2005-02-03|Semiconductor Energy Lab Co Ltd|Microprocessor using a genetic algorithm|

US7149872B2|2003-07-10|2006-12-12|Transmeta Corporation|System and method for identifying TLB entries associated with a physical address of a specified range|

US7089398B2|2003-07-31|2006-08-08|Silicon Graphics, Inc.|Address translation using a page size tag|

US8296771B2|2003-08-18|2012-10-23|Cray Inc.|System and method for mapping between resource consumers and resource providers in a computing system|

US7133950B2|2003-08-19|2006-11-07|Sun Microsystems, Inc.|Request arbitration in multi-core processor|

US7849297B2|2003-08-28|2010-12-07|Mips Technologies, Inc.|Software emulation of directed exceptions in a multithreading processor|

EP1660993B1|2003-08-28|2008-11-19|MIPS Technologies, Inc.|Integrated mechanism for suspension and deallocation of computational threads of execution in a processor|

US9032404B2|2003-08-28|2015-05-12|Mips Technologies, Inc.|Preemptive multitasking employing software emulation of directed exceptions in a multithreading processor|

US7594089B2|2003-08-28|2009-09-22|Mips Technologies, Inc.|Smart memory based synchronization controller for a multi-threaded multiprocessor SoC|

US7111126B2|2003-09-24|2006-09-19|Arm Limited|Apparatus and method for loading data values|

JP4057989B2|2003-09-26|2008-03-05|株式会社東芝|Scheduling method and information processing system|

US7373637B2|2003-09-30|2008-05-13|International Business Machines Corporation|Method and apparatus for counting instruction and memory location ranges|

US7047322B1|2003-09-30|2006-05-16|Unisys Corporation|System and method for performing conflict resolution and flow control in a multiprocessor system|

FR2860313B1|2003-09-30|2005-11-04|Commissariat Energie Atomique|Component with dynamically reconfigurable architecture|

US7395372B2|2003-11-14|2008-07-01|International Business Machines Corporation|Method and system for providing cache set selection which is power optimized|

US7243170B2|2003-11-24|2007-07-10|International Business Machines Corporation|Method and circuit for reading and writing an instruction buffer|

US20050120191A1|2003-12-02|2005-06-02|Intel Corporation|Checkpoint-based register reclamation|

US20050132145A1|2003-12-15|2005-06-16|Finisar Corporation|Contingent processor time division multiple access of memory in a multi-processor system to allow supplemental memory consumer access|

US7310722B2|2003-12-18|2007-12-18|Nvidia Corporation|Across-thread out of order instruction dispatch in a multithreaded graphics processor|

US7293164B2|2004-01-14|2007-11-06|International Business Machines Corporation|Autonomic method and apparatus for counting branch instructions to generate branch statistics meant to improve branch predictions|

US7426749B2|2004-01-20|2008-09-16|International Business Machines Corporation|Distributed computation in untrusted computing environments using distractive computational units|

US20050204118A1|2004-02-27|2005-09-15|National Chiao Tung University|Method for inter-cluster communication that employs register permutation|

US8151103B2|2004-03-13|2012-04-03|Adaptive Computing Enterprises, Inc.|System and method for providing object triggers|

US7478374B2|2004-03-22|2009-01-13|Intel Corporation|Debug system having assembler correcting register allocation errors|

US20050216920A1|2004-03-24|2005-09-29|Vijay Tewari|Use of a virtual machine to emulate a hardware device|

WO2005093562A1|2004-03-29|2005-10-06|Kyoto University|Data processing device, data processing program, and recording medium storing the data processing program|

US7386679B2|2004-04-15|2008-06-10|International Business Machines Corporation|System, method and storage medium for memory management|

US7383427B2|2004-04-22|2008-06-03|Sony Computer Entertainment Inc.|Multi-scalar extension for SIMD instruction set processors|

US20050251649A1|2004-04-23|2005-11-10|Sony Computer Entertainment Inc.|Methods and apparatus for address map optimization on a multi-scalar extension|

US7418582B1|2004-05-13|2008-08-26|Sun Microsystems, Inc.|Versatile register file design for a multi-threaded processor utilizing different modes and register windows|

US7478198B2|2004-05-24|2009-01-13|Intel Corporation|Multithreaded clustered microarchitecture with dynamic back-end assignment|

US7594234B1|2004-06-04|2009-09-22|Sun Microsystems, Inc.|Adaptive spin-then-block mutual exclusion in multi-threaded processing|

US7284092B2|2004-06-24|2007-10-16|International Business Machines Corporation|Digital data processing apparatus having multi-level register file|

US20050289530A1|2004-06-29|2005-12-29|Robison Arch D|Scheduling of instructions in program compilation|

EP1628235A1|2004-07-01|2006-02-22|Texas Instruments Incorporated|Method and system of ensuring integrity of a secure mode entry sequence|

US8044951B1|2004-07-02|2011-10-25|Nvidia Corporation|Integer-based functionality in a graphics shading language|

US7339592B2|2004-07-13|2008-03-04|Nvidia Corporation|Simulating multiported memories using lower port count memories|

US7398347B1|2004-07-14|2008-07-08|Altera Corporation|Methods and apparatus for dynamic instruction controlled reconfigurable register file|

EP1619593A1|2004-07-22|2006-01-25|Sap Ag|Computer-Implemented method and system for performing a product availability check|

JP4064380B2|2004-07-29|2008-03-19|富士通株式会社|Arithmetic processing device and control method therefor|

US8443171B2|2004-07-30|2013-05-14|Hewlett-Packard Development Company, L.P.|Run-time updating of prediction hint instructions|

US7213106B1|2004-08-09|2007-05-01|Sun Microsystems, Inc.|Conservative shadow cache support in a point-to-point connected multiprocessing node|

US7318143B2|2004-10-20|2008-01-08|Arm Limited|Reuseable configuration data|

US7707578B1|2004-12-16|2010-04-27|Vmware, Inc.|Mechanism for scheduling execution of threads for fair resource allocation in a multi-threaded and/or multi-core processing system|

US7257695B2|2004-12-28|2007-08-14|Intel Corporation|Register file regions for a processing system|

US7996644B2|2004-12-29|2011-08-09|Intel Corporation|Fair sharing of a cache in a multi-core/multi-threaded processor by dynamically partitioning of the cache|

US7050922B1|2005-01-14|2006-05-23|Agilent Technologies, Inc.|Method for optimizing test order, and machine-readable media storing sequences of instructions to perform same|

US20060179277A1|2005-02-04|2006-08-10|Flachs Brian K|System and method for instruction line buffer holding a branch target buffer|

US7657891B2|2005-02-04|2010-02-02|Mips Technologies, Inc.|Multithreading microprocessor with optimized thread scheduler for increasing pipeline utilization efficiency|

US7681014B2|2005-02-04|2010-03-16|Mips Technologies, Inc.|Multithreading instruction scheduler employing thread group priorities|

EP1849095B1|2005-02-07|2013-01-02|Richter, Thomas|Low latency massive parallel data processing device|

US7400548B2|2005-02-09|2008-07-15|International Business Machines Corporation|Method for providing multiple reads/writes using a 2read/2write register file array|

US7343476B2|2005-02-10|2008-03-11|International Business Machines Corporation|Intelligent SMT thread hang detect taking into account shared resource contention/blocking|

US7152155B2|2005-02-18|2006-12-19|Qualcomm Incorporated|System and method of correcting a branch misprediction|

US20060200655A1|2005-03-04|2006-09-07|Smith Rodney W|Forward looking branch target address caching|

US8195922B2|2005-03-18|2012-06-05|Marvell World Trade, Ltd.|System for dynamically allocating processing time to multiple threads|

US20060212853A1|2005-03-18|2006-09-21|Marvell World Trade Ltd.|Real-time control apparatus having a multi-thread processor|

GB2424727B|2005-03-30|2007-08-01|Transitive Ltd|Preparing instruction groups for a processor having a multiple issue ports|

US8522253B1|2005-03-31|2013-08-27|Guillermo Rozas|Hardware support for virtual machine and operating system context switching in translation lookaside buffers and virtually tagged caches|

US20060230243A1|2005-04-06|2006-10-12|Robert Cochran|Cascaded snapshots|

US7313775B2|2005-04-06|2007-12-25|Lsi Corporation|Integrated circuit with relocatable processor hardmac|

US7447869B2|2005-04-07|2008-11-04|Ati Technologies, Inc.|Method and apparatus for fragment processing in a virtual memory system|

US20060230409A1|2005-04-07|2006-10-12|Matteo Frigo|Multithreaded processor architecture with implicit granularity adaptation|

US8230423B2|2005-04-07|2012-07-24|International Business Machines Corporation|Multithreaded processor architecture with operational latency hiding|

US20060230253A1|2005-04-11|2006-10-12|Lucian Codrescu|Unified non-partitioned register files for a digital signal processor operating in an interleaved multi-threaded environment|

US7600135B2|2005-04-14|2009-10-06|Mips Technologies, Inc.|Apparatus and method for software specified power management performance using low power virtual threads|

US20060236074A1|2005-04-14|2006-10-19|Arm Limited|Indicating storage locations within caches|

US7437543B2|2005-04-19|2008-10-14|International Business Machines Corporation|Reducing the fetch time of target instructions of a predicted taken branch instruction|

US7461237B2|2005-04-20|2008-12-02|Sun Microsystems, Inc.|Method and apparatus for suppressing duplicative prefetches for branch target cache lines|

US7500043B2|2005-04-22|2009-03-03|Altrix Logic, Inc.|Array of data processing elements with variable precision interconnect|

US8713286B2|2005-04-26|2014-04-29|Qualcomm Incorporated|Register files for a digital signal processor operating in an interleaved multi-threaded environment|

US7630388B2|2005-05-04|2009-12-08|Arm Limited|Software defined FIFO memory for storing a set of data from a stream of source data|

GB2426084A|2005-05-13|2006-11-15|Agilent Technologies Inc|Updating data in a dual port memory|

US7861055B2|2005-06-07|2010-12-28|Broadcom Corporation|Method and system for on-chip configurable data ram for fast memory and pseudo associative caches|

US8010969B2|2005-06-13|2011-08-30|Intel Corporation|Mechanism for monitoring instruction set based thread execution on a plurality of instruction sequencers|

US8719819B2|2005-06-30|2014-05-06|Intel Corporation|Mechanism for instruction set based thread execution on a plurality of instruction sequencers|

WO2007027671A2|2005-08-29|2007-03-08|Searete Llc|Scheduling mechanism of a hierarchical processor including multiple parallel clusters|

US7765350B2|2005-09-14|2010-07-27|Koninklijke Philips Electronics N.V.|Method and system for bus arbitration|

US7562271B2|2005-09-26|2009-07-14|Rambus Inc.|Memory system topologies including a buffer device and an integrated circuit memory device|

US7350056B2|2005-09-27|2008-03-25|International Business Machines Corporation|Method and apparatus for issuing instructions from an issue queue in an information handling system|

US7546420B1|2005-09-28|2009-06-09|Sun Microsystems, Inc.|Efficient trace cache management during self-modifying code processing|

US7231106B2|2005-09-30|2007-06-12|Lucent Technologies Inc.|Apparatus for directing an optical signal from an input fiber to an output fiber within a high index host|

US7627735B2|2005-10-21|2009-12-01|Intel Corporation|Implementing vector memory operations|

US7613131B2|2005-11-10|2009-11-03|Citrix Systems, Inc.|Overlay network infrastructure|

US7681019B1|2005-11-18|2010-03-16|Sun Microsystems, Inc.|Executing functions determined via a collection of operations from translated instructions|

US7861060B1|2005-12-15|2010-12-28|Nvidia Corporation|Parallel data processing systems and methods using cooperative thread arrays and thread identifier values to determine processing behavior|

US7634637B1|2005-12-16|2009-12-15|Nvidia Corporation|Execution of parallel groups of threads with per-instruction serialization|

US7673111B2|2005-12-23|2010-03-02|Intel Corporation|Memory system with both single and consolidated commands|

US7770161B2|2005-12-28|2010-08-03|International Business Machines Corporation|Post-register allocation profile directed instruction scheduling|

US8423682B2|2005-12-30|2013-04-16|Intel Corporation|Address space emulation|

US7512767B2|2006-01-04|2009-03-31|Sony Ericsson Mobile Communications Ab|Data compression method for supporting virtual memory management in a demand paging system|

US20070186050A1|2006-02-03|2007-08-09|International Business Machines Corporation|Self prefetching L2 cache mechanism for data lines|

GB2435362B|2006-02-20|2008-11-26|Cramer Systems Ltd|Method of configuring devices in a telecommunications network|

JP4332205B2|2006-02-27|2009-09-16|富士通株式会社|Cache control device and cache control method|

US7543282B2|2006-03-24|2009-06-02|Sun Microsystems, Inc.|Method and apparatus for selectively executing different executable code versions which are optimized in different ways|

US7500073B1|2006-03-29|2009-03-03|Sun Microsystems, Inc.|Relocation of virtual-to-physical mappings|

CN103646009B|2006-04-12|2016-08-17|索夫特机械公司|Apparatus and method for processing an instruction matrix specifying parallel and dependent operations|

US7610571B2|2006-04-14|2009-10-27|Cadence Design Systems, Inc.|Method and system for simulating state retention of an RTL design|

US7577820B1|2006-04-14|2009-08-18|Tilera Corporation|Managing data in a parallel processing environment|

CN100485636C|2006-04-24|2009-05-06|华为技术有限公司|Debugging method and apparatus for model-driven development of telecom-grade services|

US7804076B2|2006-05-10|2010-09-28|Taiwan Semiconductor Manufacturing Co., Ltd|Insulator for high current ion implanters|

US8145882B1|2006-05-25|2012-03-27|Mips Technologies, Inc.|Apparatus and method for processing template based user defined instructions|

US8108844B2|2006-06-20|2012-01-31|Google Inc.|Systems and methods for dynamically choosing a processing element for a compute kernel|

US20080126771A1|2006-07-25|2008-05-29|Lei Chen|Branch Target Extension for an Instruction Cache|

CN100495324C|2006-07-27|2009-06-03|中国科学院计算技术研究所|Depth-first exception handling method in a complex instruction set architecture|

US8046775B2|2006-08-14|2011-10-25|Marvell World Trade Ltd.|Event-based bandwidth allocation mode switching method and apparatus|

US7904704B2|2006-08-14|2011-03-08|Marvell World Trade Ltd.|Instruction dispatching method and apparatus|

US7539842B2|2006-08-15|2009-05-26|International Business Machines Corporation|Computer memory system for selecting memory buses according to physical memory organization information stored in virtual address translation tables|

US7594060B2|2006-08-23|2009-09-22|Sun Microsystems, Inc.|Data buffer allocation in a non-blocking data services platform using input/output switching fabric|

US7752474B2|2006-09-22|2010-07-06|Apple Inc.|L1 cache flush when processor is entering low power mode|

US7716460B2|2006-09-29|2010-05-11|Qualcomm Incorporated|Effective use of a BHT in processor having variable length instruction set execution modes|

US7774549B2|2006-10-11|2010-08-10|Mips Technologies, Inc.|Horizontally-shared cache victims in multiple core processors|

US7680988B1|2006-10-30|2010-03-16|Nvidia Corporation|Single interconnect providing read and write access to a memory shared by concurrent threads|

US8108625B1|2006-10-30|2012-01-31|Nvidia Corporation|Shared memory with parallel access and access conflict resolution mechanism|

US7617384B1|2006-11-06|2009-11-10|Nvidia Corporation|Structured programming control flow using a disable mask in a SIMD architecture|

EP2523101B1|2006-11-14|2014-06-04|Soft Machines, Inc.|Apparatus and method for processing complex instruction formats in a multi-threaded architecture supporting various context switch modes and virtualization schemes|

US7493475B2|2006-11-15|2009-02-17|Stmicroelectronics, Inc.|Instruction vector-mode processing in multi-lane processor by multiplex switch replicating instruction in one lane to select others along with updated operand address|

US7934179B2|2006-11-20|2011-04-26|Et International, Inc.|Systems and methods for logic verification|

US20080235500A1|2006-11-21|2008-09-25|Davis Gordon T|Structure for instruction cache trace formation|

JP2008130056A|2006-11-27|2008-06-05|Renesas Technology Corp|Semiconductor circuit|

US7945763B2|2006-12-13|2011-05-17|International Business Machines Corporation|Single shared instruction predecoder for supporting multiple processors|

WO2008077088A2|2006-12-19|2008-06-26|The Board Of Governors For Higher Education, State Of Rhode Island And Providence Plantations|System and method for branch misprediction prediction using complementary branch predictors|

US7783869B2|2006-12-19|2010-08-24|Arm Limited|Accessing branch predictions ahead of instruction fetching|

EP1940028B1|2006-12-29|2012-02-29|STMicroelectronics Srl|Asynchronous interconnection system for 3D inter-chip communication|

US8321849B2|2007-01-26|2012-11-27|Nvidia Corporation|Virtual architecture and instruction set for parallel thread computing|

TWI329437B|2007-01-31|2010-08-21|Univ Nat Yunlin Sci & Tech||

US20080189501A1|2007-02-05|2008-08-07|Irish John D|Methods and Apparatus for Issuing Commands on a Bus|

US7685410B2|2007-02-13|2010-03-23|Global Foundries Inc.|Redirect recovery cache that receives branch misprediction redirects and caches instructions to be dispatched in response to the redirects|

US7647483B2|2007-02-20|2010-01-12|Sony Computer Entertainment Inc.|Multi-threaded parallel processor methods and apparatus|

US20080209190A1|2007-02-28|2008-08-28|Advanced Micro Devices, Inc.|Parallel prediction of multiple branches|

JP4980751B2|2007-03-02|2012-07-18|富士通セミコンダクター株式会社|Data processing device and memory read-active control method|

US8452907B2|2007-03-27|2013-05-28|Arm Limited|Data processing apparatus and method for arbitrating access to a shared resource|

US20080250227A1|2007-04-04|2008-10-09|Linderman Michael D|General Purpose Multiprocessor Programming Apparatus And Method|

US7716183B2|2007-04-11|2010-05-11|Dot Hill Systems Corporation|Snapshot preserved data cloning|

US7941791B2|2007-04-13|2011-05-10|Perry Wang|Programming environment for heterogeneous processor resource integration|

US7769955B2|2007-04-27|2010-08-03|Arm Limited|Multiple thread instruction fetch from different cache levels|

US7711935B2|2007-04-30|2010-05-04|Netlogic Microsystems, Inc.|Universal branch identifier for invalidation of speculative instructions|

US8555039B2|2007-05-03|2013-10-08|Qualcomm Incorporated|System and method for using a local condition code register for accelerating conditional instruction execution in a pipeline processor|

US8219996B1|2007-05-09|2012-07-10|Hewlett-Packard Development Company, L.P.|Computer processor with fairness monitor|

US9292436B2|2007-06-25|2016-03-22|Sonics, Inc.|Various methods and apparatus to support transactions whose data address sequence within that transaction crosses an interleaved channel address boundary|

CN101344840B|2007-07-10|2011-08-31|苏州简约纳电子有限公司|Microprocessor and method for executing instructions in a microprocessor|

US7937568B2|2007-07-11|2011-05-03|International Business Machines Corporation|Adaptive execution cycle control method for enhanced instruction throughput|

US20090025004A1|2007-07-16|2009-01-22|Microsoft Corporation|Scheduling by Growing and Shrinking Resource Allocation|

US8433851B2|2007-08-16|2013-04-30|International Business Machines Corporation|Reducing wiring congestion in a cache subsystem utilizing sectored caches with discontiguous addressing|

US8108545B2|2007-08-27|2012-01-31|International Business Machines Corporation|Packet coalescing in virtual channels of a data processing system in a multi-tiered full-graph interconnect architecture|

US7711929B2|2007-08-30|2010-05-04|International Business Machines Corporation|Method and system for tracking instruction dependency in an out-of-order processor|

GB2452316B|2007-08-31|2009-08-19|Toshiba Res Europ Ltd|Method of Allocating Resources in a Computer.|

US8725991B2|2007-09-12|2014-05-13|Qualcomm Incorporated|Register file system and method for pipelined processing|

US8082420B2|2007-10-24|2011-12-20|International Business Machines Corporation|Method and apparatus for executing instructions|

US7856530B1|2007-10-31|2010-12-21|Network Appliance, Inc.|System and method for implementing a dynamic cache for a data storage system|

CN100478918C|2007-10-31|2009-04-15|中国人民解放军国防科学技术大学|Design method for a segmented cache in a microprocessor, and segmented cache|

WO2009060459A2|2007-11-09|2009-05-14|Plurality|Shared memory system for a tightly-coupled multiprocessor|

US7877559B2|2007-11-26|2011-01-25|Globalfoundries Inc.|Mechanism to accelerate removal of store operations from a queue|

US8245232B2|2007-11-27|2012-08-14|Microsoft Corporation|Software-configurable and stall-time fair memory access scheduling mechanism for shared memory systems|

US7809925B2|2007-12-07|2010-10-05|International Business Machines Corporation|Processing unit incorporating vectorizable execution unit|

US20090150890A1|2007-12-10|2009-06-11|Yourst Matt T|Strand-based computing hardware and dynamically optimizing strandware for a high performance microprocessor system|

US8145844B2|2007-12-13|2012-03-27|Arm Limited|Memory controller with write data cache and read data cache|

US7831813B2|2007-12-17|2010-11-09|Globalfoundries Inc.|Uses of known good code for implementing processor architectural modifications|

US7870371B2|2007-12-17|2011-01-11|Microsoft Corporation|Target-frequency based indirect jump prediction for high-performance processors|

US20090165007A1|2007-12-19|2009-06-25|Microsoft Corporation|Task-level thread scheduling and resource allocation|

US8782384B2|2007-12-20|2014-07-15|Advanced Micro Devices, Inc.|Branch history with polymorphic indirect branch information|

US7917699B2|2007-12-21|2011-03-29|Mips Technologies, Inc.|Apparatus and method for controlling the exclusivity mode of a level-two cache|

US8645965B2|2007-12-31|2014-02-04|Intel Corporation|Supporting metered clients with manycore through time-limited partitioning|

US9244855B2|2007-12-31|2016-01-26|Intel Corporation|Method, system, and apparatus for page sizing extension|

US7877582B2|2008-01-31|2011-01-25|International Business Machines Corporation|Multi-addressable register file|

WO2009101563A1|2008-02-11|2009-08-20|Nxp B.V.|Multiprocessing implementing a plurality of virtual processors|

US9021240B2|2008-02-22|2015-04-28|International Business Machines Corporation|System and method for Controlling restarting of instruction fetching using speculative address computations|

US7987343B2|2008-03-19|2011-07-26|International Business Machines Corporation|Processor and method for synchronous load multiple fetching sequence and pipeline stage result tracking to facilitate early address generation interlock bypass|

US7949972B2|2008-03-19|2011-05-24|International Business Machines Corporation|Method, system and computer program product for exploiting orthogonal control vectors in timing driven synthesis|

US9513905B2|2008-03-28|2016-12-06|Intel Corporation|Vector instructions to enable efficient synchronization and parallel reduction operations|

US8120608B2|2008-04-04|2012-02-21|Via Technologies, Inc.|Constant buffering for a computational core of a programmable graphics processing unit|

CR20170001A|2008-04-28|2017-08-10|Genentech Inc|Humanized anti-factor D antibodies|

US8131982B2|2008-06-13|2012-03-06|International Business Machines Corporation|Branch prediction instructions having mask values involving unloading and loading branch history data|

US8145880B1|2008-07-07|2012-03-27|Ovics|Matrix processor data switch routing systems and methods|

CN102089752B|2008-07-10|2014-05-07|洛克泰克科技有限公司|Efficient parallel computation of dependency problems|

JP2010039536A|2008-07-31|2010-02-18|Panasonic Corp|Program conversion device, program conversion method, and program conversion program|

US8316435B1|2008-08-14|2012-11-20|Juniper Networks, Inc.|Routing device having integrated MPLS-aware firewall with virtual security system support|

US8135942B2|2008-08-28|2012-03-13|International Business Machines Corporation|System and method for double-issue instructions using a dependency matrix and a side issue queue|

US7769984B2|2008-09-11|2010-08-03|International Business Machines Corporation|Dual-issuance of microprocessor instructions using dual dependency matrices|

US8225048B2|2008-10-01|2012-07-17|Hewlett-Packard Development Company, L.P.|Systems and methods for resource access|

US7941616B2|2008-10-21|2011-05-10|Microsoft Corporation|System to reduce interference in concurrent programs|

GB2464703A|2008-10-22|2010-04-28|Advanced Risc Mach Ltd|An array of interconnected processors executing a cycle-based program|

US8423749B2|2008-10-22|2013-04-16|International Business Machines Corporation|Sequential processing in network on chip nodes by threads generating message containing payload and pointer for nanokernel to access algorithm to be executed on payload in another node|

WO2010049585A1|2008-10-30|2010-05-06|Nokia Corporation|Method and apparatus for interleaving a data block|

US8032678B2|2008-11-05|2011-10-04|Mediatek Inc.|Shared resource arbitration|

US7848129B1|2008-11-20|2010-12-07|Netlogic Microsystems, Inc.|Dynamically partitioned CAM array|

US8868838B1|2008-11-21|2014-10-21|Nvidia Corporation|Multi-class data cache policies|

US8171223B2|2008-12-03|2012-05-01|Intel Corporation|Method and system to increase concurrency and control replication in a multi-core cache hierarchy|

US8200949B1|2008-12-09|2012-06-12|Nvidia Corporation|Policy based allocation of register file cache to threads in multi-threaded processor|

US8312268B2|2008-12-12|2012-11-13|International Business Machines Corporation|Virtual machine|

US7870308B2|2008-12-23|2011-01-11|International Business Machines Corporation|Programmable direct memory access engine|

US8099586B2|2008-12-30|2012-01-17|Oracle America, Inc.|Branch misprediction recovery mechanism for microprocessors|

US20100169578A1|2008-12-31|2010-07-01|Texas Instruments Incorporated|Cache tag memory|

US20100205603A1|2009-02-09|2010-08-12|Unisys Corporation|Scheduling and dispatching tasks in an emulated operating system|

JP5417879B2|2009-02-17|2014-02-19|富士通セミコンダクター株式会社|Cache device|

JP2010226275A|2009-03-23|2010-10-07|Nec Corp|Communication device and communication method|

US8505013B2|2010-03-12|2013-08-06|Lsi Corporation|Reducing data read latency in a network communications processor architecture|

US8805788B2|2009-05-04|2014-08-12|Moka5, Inc.|Transactional virtual disk with differential snapshots|

US8332854B2|2009-05-19|2012-12-11|Microsoft Corporation|Virtualized thread scheduling for hardware thread optimization based on hardware resource parameter summaries of instruction blocks in execution groups|

US8533437B2|2009-06-01|2013-09-10|Via Technologies, Inc.|Guaranteed prefetch instruction|

TW201044185A|2009-06-09|2010-12-16|Zillians Inc|Virtual world simulation systems and methods utilizing parallel coprocessors, and computer program products thereof|

GB2471067B|2009-06-12|2011-11-30|Graeme Roy Smith|Shared resource multi-thread array processor|

US9122487B2|2009-06-23|2015-09-01|Oracle America, Inc.|System and method for balancing instruction loads between multiple execution units using assignment history|

US8386754B2|2009-06-24|2013-02-26|Arm Limited|Renaming wide register source operand with plural short register source operands for select instructions to detect dependency fast with existing mechanism|

CN101582025B|2009-06-25|2011-05-25|浙江大学|Method for implementing a global register renaming table in a chip multiprocessor architecture|

US8397049B2|2009-07-13|2013-03-12|Apple Inc.|TLB prefetching|

US8539486B2|2009-07-17|2013-09-17|International Business Machines Corporation|Transactional block conflict resolution based on the determination of executing threads in parallel or in serial mode|

JP5423217B2|2009-08-04|2014-02-19|富士通株式会社|Arithmetic processing device, information processing device, and control method for arithmetic processing device|

US9244732B2|2009-08-28|2016-01-26|Vmware, Inc.|Compensating threads for microarchitectural resource contentions by prioritizing scheduling and execution|

US8127078B2|2009-10-02|2012-02-28|International Business Machines Corporation|High performance unaligned cache access|

US20110082983A1|2009-10-06|2011-04-07|Alcatel-Lucent Canada, Inc.|Cpu instruction and data cache corruption prevention system|

US8695002B2|2009-10-20|2014-04-08|Lantiq Deutschland Gmbh|Multi-threaded processors and multi-processor systems comprising shared resources|

US8364933B2|2009-12-18|2013-01-29|International Business Machines Corporation|Software assisted translation lookaside buffer search mechanism|

JP2011150397A|2010-01-19|2011-08-04|Panasonic Corp|Bus arbitration device|

KR101699910B1|2010-03-04|2017-01-26|삼성전자주식회사|Reconfigurable processor and control method thereof|

US20120005462A1|2010-07-01|2012-01-05|International Business Machines Corporation|Hardware Assist for Optimizing Code During Processing|

US8312258B2|2010-07-22|2012-11-13|Intel Corporation|Providing platform independent memory logic|

US8751745B2|2010-08-11|2014-06-10|Advanced Micro Devices, Inc.|Method for concurrent flush of L1 and L2 caches|

CN101916180B|2010-08-11|2013-05-29|中国科学院计算技术研究所|Method and system for executing register-type instructions in a RISC processor|

US9201801B2|2010-09-15|2015-12-01|International Business Machines Corporation|Computing device with asynchronous auxiliary execution unit|

US8856460B2|2010-09-15|2014-10-07|Oracle International Corporation|System and method for zero buffer copying in a middleware environment|

US10228949B2|2010-09-17|2019-03-12|Intel Corporation|Single cycle multi-branch prediction including shadow cache for early far branch prediction|

US20120079212A1|2010-09-23|2012-03-29|International Business Machines Corporation|Architecture for sharing caches among multiple processes|

US8370553B2|2010-10-18|2013-02-05|International Business Machines Corporation|Formal verification of random priority-based arbiters using property strengthening and underapproximations|

US9047178B2|2010-12-13|2015-06-02|SanDisk Technologies, Inc.|Auto-commit memory synchronization|

US8677355B2|2010-12-17|2014-03-18|Microsoft Corporation|Virtual machine branching and parallel execution|

WO2012103245A2|2011-01-27|2012-08-02|Soft Machines Inc.|Guest instruction block with near branching and far branching sequence construction to native instruction block|

US9766893B2|2011-03-25|2017-09-19|Intel Corporation|Executing instruction sequence code blocks by using virtual cores instantiated by partitionable engines|

WO2012135041A2|2011-03-25|2012-10-04|Soft Machines, Inc.|Register file segments for supporting code block execution by using virtual cores instantiated by partitionable engines|

KR101966712B1|2011-03-25|2019-04-09|인텔 코포레이션|Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines|

US20120254592A1|2011-04-01|2012-10-04|Jesus Corbal San Adrian|Systems, apparatuses, and methods for expanding a memory source into a destination register and compressing a source register into a destination memory location|

US9740494B2|2011-04-29|2017-08-22|Arizona Board Of Regents For And On Behalf Of Arizona State University|Low complexity out-of-order issue logic using static circuits|

US8843690B2|2011-07-11|2014-09-23|Avago Technologies General Ip Pte. Ltd.|Memory conflicts learning capability|

US8930432B2|2011-08-04|2015-01-06|International Business Machines Corporation|Floating point execution unit with fixed point functionality|

US20130046934A1|2011-08-15|2013-02-21|Robert Nychka|System caching using heterogenous memories|

US8839025B2|2011-09-30|2014-09-16|Oracle International Corporation|Systems and methods for retiring and unretiring cache lines|

KR101648278B1|2011-11-22|2016-08-12|소프트 머신즈, 인크.|Microprocessor accelerated code optimizer and dependency reordering method|

US10191746B2|2011-11-22|2019-01-29|Intel Corporation|Accelerated code optimizer for a multiengine microprocessor|

KR101703400B1|2011-11-22|2017-02-06|소프트 머신즈, 인크.|Microprocessor accelerated code optimizer|

US20130138888A1|2011-11-30|2013-05-30|Jama I. Barreh|Storing a target address of a control transfer instruction in an instruction field|

US8930674B2|2012-03-07|2015-01-06|Soft Machines, Inc.|Systems and methods for accessing a unified translation lookaside buffer|

KR20130119285A|2012-04-23|2013-10-31|한국전자통신연구원|Apparatus and method for resource allocation in a cluster computing environment|

US9684601B2|2012-05-10|2017-06-20|Arm Limited|Data processing apparatus having cache and translation lookaside buffer|

US9996348B2|2012-06-14|2018-06-12|Apple Inc.|Zero cycle load|

US9940247B2|2012-06-26|2018-04-10|Advanced Micro Devices, Inc.|Concurrent access to cache dirty bits|

US9430410B2|2012-07-30|2016-08-30|Soft Machines, Inc.|Systems and methods for supporting a plurality of load accesses of a cache in a single cycle|

US9229873B2|2012-07-30|2016-01-05|Soft Machines, Inc.|Systems and methods for supporting a plurality of load and store accesses of a cache|

US9740612B2|2012-07-30|2017-08-22|Intel Corporation|Systems and methods for maintaining the coherency of a store coalescing cache and a load cache|

US9916253B2|2012-07-30|2018-03-13|Intel Corporation|Method and apparatus for supporting a plurality of load accesses of a cache in a single cycle to maintain throughput|

US9710399B2|2012-07-30|2017-07-18|Intel Corporation|Systems and methods for flushing a cache with modified data|

US9678882B2|2012-10-11|2017-06-13|Intel Corporation|Systems and methods for non-blocking implementation of cache flush instructions|

US10037228B2|2012-10-25|2018-07-31|Nvidia Corporation|Efficient memory virtualization in multi-threaded processing units|

US9195506B2|2012-12-21|2015-11-24|International Business Machines Corporation|Processor provisioning by a middleware processing system for a plurality of logical processor partitions|

GB2509974B|2013-01-21|2015-04-01|Imagination Tech Ltd|Allocating resources to threads based on speculation metric|

US10140138B2|2013-03-15|2018-11-27|Intel Corporation|Methods, systems and apparatus for supporting wide and efficient front-end operation with guest-architecture emulation|

WO2014150806A1|2013-03-15|2014-09-25|Soft Machines, Inc.|A method for populating register view data structure by using register template snapshots|

WO2014150991A1|2013-03-15|2014-09-25|Soft Machines, Inc.|A method for implementing a reduced size register view data structure in a microprocessor|

WO2014150971A1|2013-03-15|2014-09-25|Soft Machines, Inc.|A method for dependency broadcasting through a block organized source view data structure|

US9811342B2|2013-03-15|2017-11-07|Intel Corporation|Method for performing dual dispatch of blocks and half blocks|

US9112767B2|2013-03-15|2015-08-18|Cavium, Inc.|Method and an accumulator scoreboard for out-of-order rule response handling|

WO2014151018A1|2013-03-15|2014-09-25|Soft Machines, Inc.|A method for executing multithreaded instructions grouped onto blocks|

US9632825B2|2013-03-15|2017-04-25|Intel Corporation|Method and apparatus for efficient scheduling for asymmetrical execution units|

US9904625B2|2013-03-15|2018-02-27|Intel Corporation|Methods, systems and apparatus for predicting the way of a set associative cache|

EP2972794A4|2013-03-15|2017-05-03|Soft Machines, Inc.|A method for executing blocks of instructions using a microprocessor architecture having a register view, source view, instruction view, and a plurality of register templates|

US9891924B2|2013-03-15|2018-02-13|Intel Corporation|Method for implementing a reduced size register view data structure in a microprocessor|

US10275255B2|2013-03-15|2019-04-30|Intel Corporation|Method for dependency broadcasting through a source organized source view data structure|

WO2014151043A1|2013-03-15|2014-09-25|Soft Machines, Inc.|A method for emulating a guest centralized flag architecture by using a native distributed flag architecture|

US9886279B2|2013-03-15|2018-02-06|Intel Corporation|Method for populating and instruction view data structure by using register template snapshots|

US9569216B2|2013-03-15|2017-02-14|Soft Machines, Inc.|Method for populating a source view data structure by using register template snapshots|

US9208066B1|2015-03-04|2015-12-08|Centipede Semi Ltd.|Run-time code parallelization with approximate monitoring of instruction sequences|

CN103646009B|2006-04-12|2016-08-17|索夫特机械公司|Apparatus and method for processing an instruction matrix specifying parallel and dependent operations|

KR101523020B1|2010-06-18|2015-05-26|더 보드 오브 리전츠 오브 더 유니버시티 오브 텍사스 시스템|Combined branch target and predicate prediction|

US10228949B2|2010-09-17|2019-03-12|Intel Corporation|Single cycle multi-branch prediction including shadow cache for early far branch prediction|

US8639884B2|2011-02-28|2014-01-28|Freescale Semiconductor, Inc.|Systems and methods for configuring load/store execution units|

US9547593B2|2011-02-28|2017-01-17|Nxp Usa, Inc.|Systems and methods for reconfiguring cache memory|

WO2012135041A2|2011-03-25|2012-10-04|Soft Machines, Inc.|Register file segments for supporting code block execution by using virtual cores instantiated by partitionable engines|

US9766893B2|2011-03-25|2017-09-19|Intel Corporation|Executing instruction sequence code blocks by using virtual cores instantiated by partitionable engines|

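CN107729267B|2011-05-20|2022-01-25|英特尔公司|Decentralized allocation of resources and interconnect structures to support the execution of instruction sequences by a plurality of engines|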

US9442772B2|2011-05-20|2016-09-13|Soft Machines Inc.|Global and local interconnect structure comprising routing matrix to support the execution of instruction sequences by a plurality of engines|

US10191746B2|2011-11-22|2019-01-29|Intel Corporation|Accelerated code optimizer for a multiengine microprocessor|

KR101703400B1|2011-11-22|2017-02-06|소프트 머신즈, 인크.|Microprocessor accelerated code optimizer|

US8826257B2|2012-03-30|2014-09-02|Intel Corporation|Memory disambiguation hardware to support software binary translation|

US9311148B2|2012-12-20|2016-04-12|Telefonaktiebolaget L M Ericsson |Pseudo-random hardware resource allocation through the plurality of resource controller based on non-repeating sequence of index list entries|