
Non-Uniform Memory Access in Windows 7 - Windows7之家

Published: 2019-11-22 16:58 | Editor: 澳门新葡4473网站 | Views: 180


    Appendix A


    PC hardware


     

    This appendix describes personal computer (PC) hardware, the platform on which xv6 runs.


     

    A PC is a computer that adheres to several industry standards, with the goal that a given piece of software can run on PCs sold by multiple vendors. These standards evolve over time, and a PC from the 1990s doesn’t look like a PC now.


     

    From the outside a PC is a box with a keyboard, a screen, and various devices (e.g., a CD-ROM drive). Inside the box is a circuit board (the "motherboard") with CPU chips, memory chips, graphics chips, I/O controller chips, and busses through which the chips communicate. The busses adhere to standard protocols (e.g., PCI and USB) so that devices will work with PCs from multiple vendors.


     

    From our point of view, we can abstract the PC into three components: CPU, memory, and input/output (I/O) devices. The CPU performs computation, the memory contains instructions and data for that computation, and devices allow the CPU to interact with hardware for storage, communication, and other functions.


     

    You can think of main memory as connected to the CPU with a set of wires, or lines, some for address bits, some for data bits, and some for control flags. To read a value from main memory, the CPU sends high or low voltages representing 1 or 0 bits on the address lines and a 1 on the "read" line for a prescribed amount of time and then reads back the value by interpreting the voltages on the data lines. To write a value to main memory, the CPU sends appropriate bits on the address and data lines and a 1 on the "write" line for a prescribed amount of time. Real memory interfaces are more complex than this, but the details are only important if you need to achieve high performance.
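    The read and write cycles described above can be modeled in a few lines of C. This is a toy sketch under simplifying assumptions, not a description of a real memory interface: `struct bus`, `memory_respond`, `cpu_read`, `cpu_write`, and the 16-cell memory are all hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_CELLS 16            /* hypothetical tiny memory */

/* Hypothetical model of the wires between the CPU and main memory. */
struct bus {
    uint8_t addr;               /* address lines */
    uint8_t data;               /* data lines */
    int     read;               /* 1 while the CPU asserts "read" */
    int     write;              /* 1 while the CPU asserts "write" */
};

static uint8_t mem[MEM_CELLS];

/* Memory side: sample the lines and react to them. */
static void memory_respond(struct bus *b)
{
    assert(b->addr < MEM_CELLS);
    assert(!(b->read && b->write));   /* never both at once */
    if (b->write)
        mem[b->addr] = b->data;       /* latch the data lines into the cell */
    else if (b->read)
        b->data = mem[b->addr];       /* drive the cell onto the data lines */
}

/* CPU side: set address and data lines, raise "write", then release it. */
static void cpu_write(struct bus *b, uint8_t addr, uint8_t value)
{
    b->addr = addr;
    b->data = value;
    b->write = 1;
    memory_respond(b);
    b->write = 0;
}

/* CPU side: set address lines, raise "read", then sample the data lines. */
static uint8_t cpu_read(struct bus *b, uint8_t addr)
{
    b->addr = addr;
    b->read = 1;
    memory_respond(b);
    b->read = 0;
    return b->data;
}
```

    Calling `cpu_write(&b, 3, 0xAB)` followed by `cpu_read(&b, 3)` on a zero-initialized `struct bus b` returns the stored byte. The "prescribed amount of time" is collapsed into a single function call here.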


    Processor and memory


     

    A computer’s CPU (central processing unit, or processor) runs a conceptually simple loop: it consults an address in a register called the program counter, reads a machine instruction from that address in memory, advances the program counter past the instruction, and executes the instruction. Repeat. If the execution of the instruction does not modify the program counter, this loop will interpret the memory pointed at by the program counter as a sequence of machine instructions to run one after the other. Instructions that do change the program counter include branches and function calls.
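    The loop can be sketched for a toy machine. The opcodes, the `struct cpu` layout, and the `run` function below are hypothetical and have nothing to do with real x86 encodings; the point is only the order of operations: fetch at the program counter, advance past the instruction, execute, repeat.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-byte opcodes for a toy machine (not x86). */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADDI = 2, OP_JMP = 3 };

struct cpu {
    size_t  pc;     /* program counter: address of the next instruction */
    uint8_t acc;    /* a single accumulator register */
};

/* Run until OP_HALT or max_steps instructions; return the steps executed. */
static int run(struct cpu *c, const uint8_t *mem, size_t len, int max_steps)
{
    int steps = 0;
    while (steps < max_steps) {
        assert(c->pc < len);
        uint8_t op = mem[c->pc];        /* fetch the opcode at pc */
        uint8_t arg = 0;
        size_t  size = 1;
        if (op != OP_HALT) {            /* other opcodes carry one operand byte */
            assert(c->pc + 1 < len);
            arg = mem[c->pc + 1];
            size = 2;
        }
        c->pc += size;                  /* advance past the instruction */
        steps++;
        switch (op) {                   /* execute */
        case OP_HALT:  return steps;
        case OP_LOADI: c->acc = arg; break;
        case OP_ADDI:  c->acc = (uint8_t)(c->acc + arg); break;
        case OP_JMP:   c->pc = arg; break;   /* a branch overwrites pc */
        default:       assert(!"unknown opcode"); return steps;
        }
    }
    return steps;
}
```

    Only `OP_JMP` touches `pc` during execution. Every other instruction leaves it alone, so the next iteration simply runs whatever follows in memory.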


     

    The execution engine is useless without the ability to store and modify program data. The fastest storage for data is provided by the processor’s register set. A register is a storage cell inside the processor itself, capable of holding a machine word-sized value (typically 16, 32, or 64 bits). Data stored in registers can typically be read or written quickly, in a single CPU cycle.


     

    PCs have a processor that implements the x86 instruction set, which was originally defined by Intel and has become a standard. Several manufacturers produce processors that implement the instruction set. Like all other PC standards, this standard is also evolving but newer standards are backwards compatible with past standards. The boot loader has to deal with some of this evolution because every PC processor starts simulating an Intel 8088, the CPU chip in the original IBM PC released in 1981. However, for most of xv6 you will be concerned with the modern x86 instruction set.


     

    The modern x86 provides eight general purpose 32-bit registers (%eax, %ebx, %ecx, %edx, %edi, %esi, %ebp, and %esp) and a program counter %eip (the instruction pointer). The common e prefix stands for extended, as these are 32-bit extensions of the 16-bit registers %ax, %bx, %cx, %dx, %di, %si, %bp, %sp, and %ip. The two register sets are aliased so that, for example, %ax is the bottom half of %eax: writing to %ax changes the value stored in %eax and vice versa. The first four registers also have names for the bottom two 8-bit bytes: %al and %ah denote the low and high 8 bits of %ax; %bl, %bh, %cl, %ch, %dl, and %dh continue the pattern. In addition to these registers, the x86 has eight 80-bit floating-point registers as well as a handful of special-purpose registers like the control registers %cr0, %cr2, %cr3, and %cr4; the debug registers %dr0, %dr1, %dr2, and %dr3; the segment registers %cs, %ds, %es, %fs, %gs, and %ss; and the global and local descriptor table pseudo-registers %gdtr and %ldtr. The control registers and segment registers are important to any operating system. The floating-point and debug registers are less interesting and not used by xv6.
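    The aliasing between %eax, %ax, %al, and %ah can be mimicked with masks and shifts on a 32-bit integer. This is a portable model rather than real register access, and the helper names (`get_ax`, `set_ah`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model %eax as a 32-bit value; %ax, %al and %ah are views of its low bits. */
static uint16_t get_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }
static uint8_t  get_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }
static uint8_t  get_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* Writing %ax replaces the low 16 bits and leaves the upper 16 untouched. */
static uint32_t set_ax(uint32_t eax, uint16_t ax)
{
    return (eax & 0xFFFF0000u) | ax;
}

/* Writing %al replaces bits 0-7 only. */
static uint32_t set_al(uint32_t eax, uint8_t al)
{
    return (eax & 0xFFFFFF00u) | al;
}

/* Writing %ah replaces bits 8-15 only. */
static uint32_t set_ah(uint32_t eax, uint8_t ah)
{
    return (eax & 0xFFFF00FFu) | ((uint32_t)ah << 8);
}

/* Small self-check of the aliasing rules. */
static void alias_demo(void)
{
    uint32_t eax = 0x12345678u;
    assert(get_ax(eax) == 0x5678u);      /* %ax is the bottom half of %eax */
    eax = set_ax(eax, 0xBEEFu);
    assert(eax == 0x1234BEEFu);          /* a write to %ax shows up in %eax */
    assert(get_ah(eax) == 0xBEu && get_al(eax) == 0xEFu);
}
```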


     

    Registers are fast but expensive. Most processors provide at most a few tens of general-purpose registers. The next conceptual level of storage is the main random-access memory (RAM). Main memory is 10-100x slower than a register, but it is much cheaper, so there can be more of it. One reason main memory is relatively slow is that it is physically separate from the processor chip. An x86 processor has a few dozen registers, but a typical PC today has gigabytes of main memory. Because of the enormous differences in both access speed and size between registers and main memory, most processors, including the x86, store copies of recently-accessed sections of main memory in on-chip cache memory. The cache memory serves as a middle ground between registers and memory both in access time and in size. Today’s x86 processors typically have two levels of cache, a small first-level cache with access times relatively close to the processor’s clock rate and a larger second-level cache with access times in between the first-level cache and main memory. This table shows actual numbers for an Intel Core 2 Duo system:


     

    Intel Core 2 Duo E7200 at 2.53 GHz

    TODO: Plug in non-made-up numbers!

    Storage        Access time    Size
    Register       0.6 ns         64 bytes
    L1 cache       0.5 ns         64 kilobytes
    L2 cache       10 ns          4 megabytes
    Main memory    100 ns         4 gigabytes
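    To see why a cache narrows the gap between registers and main memory, the access times in the table can be combined into an average cost per memory access. The sketch below uses the table's figures, which the source itself flags as placeholders. The miss rates are workload-dependent assumptions that do not come from the text, and `avg_access_ns` is a hypothetical helper name.

```c
#include <assert.h>

/* Access times from the table above, in nanoseconds. */
#define L1_NS    0.5
#define L2_NS   10.0
#define MEM_NS 100.0

/*
 * Average time per memory access with two cache levels.
 * l1_miss and l2_miss are miss rates in [0, 1]. They are assumptions
 * supplied by the caller, not measurements from the text.
 */
static double avg_access_ns(double l1_miss, double l2_miss)
{
    assert(l1_miss >= 0.0 && l1_miss <= 1.0);
    assert(l2_miss >= 0.0 && l2_miss <= 1.0);
    /* Every access pays for L1; misses also pay for L2, and so on down. */
    return L1_NS + l1_miss * (L2_NS + l2_miss * MEM_NS);
}
```

    With an assumed 5% L1 miss rate and 20% L2 miss rate, `avg_access_ns(0.05, 0.2)` comes to about 2 ns, far closer to the L1 figure than to the 100 ns of main memory.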

     

     

    For the most part, x86 processors hide the cache from the operating system, so we can think of the processor as having just two kinds of storage—registers and memory—and not worry about the distinctions between the different levels of the memory hierarchy.


    I/O


     

    Processors must communicate with devices as well as memory. The x86 processor provides special in and out instructions that read and write values from device addresses called I/O ports. The hardware implementation of these instructions is essentially the same as reading and writing memory. Early x86 processors had an extra address line: 0 meant read/write from an I/O port and 1 meant read/write from main memory. Each hardware device monitors these lines for reads and writes to its assigned range of I/O ports. A device’s ports let the software configure the device, examine its status, and cause the device to take actions; for example, software can use I/O port reads and writes to cause the disk interface hardware to read and write sectors on the disk.
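    The extra select line can be modeled as one more argument on every bus access: the same numeric address reaches either main memory or a device, depending on that line. The sketch below is a simulation with hypothetical names (`bus_read`, `bus_write`, `in_port`, `out_port`) and a made-up device. Its port range, 0x1F0 to 0x1F7, is borrowed from the legacy primary IDE controller purely as a familiar example.

```c
#include <assert.h>
#include <stdint.h>

#define SEL_IO  0   /* extra line low: the access targets an I/O port */
#define SEL_MEM 1   /* extra line high: the access targets main memory */

/* Hypothetical device that owns ports 0x1F0-0x1F7. */
#define DEV_BASE 0x1F0
#define DEV_SIZE 8

static uint8_t ram[1024];
static uint8_t dev_regs[DEV_SIZE];

static void bus_write(int sel, uint16_t addr, uint8_t val)
{
    if (sel == SEL_MEM) {
        assert(addr < sizeof ram);
        ram[addr] = val;
    } else if (addr >= DEV_BASE && addr < DEV_BASE + DEV_SIZE) {
        dev_regs[addr - DEV_BASE] = val;    /* the device claims its port range */
    }
    /* Writes to unclaimed ports are ignored in this model. */
}

static uint8_t bus_read(int sel, uint16_t addr)
{
    if (sel == SEL_MEM) {
        assert(addr < sizeof ram);
        return ram[addr];
    }
    if (addr >= DEV_BASE && addr < DEV_BASE + DEV_SIZE)
        return dev_regs[addr - DEV_BASE];
    return 0xFF;    /* model choice: unclaimed ports read as all ones */
}

/* Stand-ins for the x86 out and in instructions. */
static void    out_port(uint16_t port, uint8_t val) { bus_write(SEL_IO, port, val); }
static uint8_t in_port(uint16_t port)               { return bus_read(SEL_IO, port); }
```

    After `out_port(0x1F2, 1)`, reading port 0x1F2 returns 1 while memory address 0x1F2 is still 0: the two address spaces are separate even though the numbers coincide.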


     

    Many computer architectures have no separate device access instructions. Instead the devices have fixed memory addresses and the processor communicates with the device (at the operating system’s behest) by reading and writing values at those addresses. In fact, modern x86 architectures use this technique, called memory-mapped I/O, for most high-speed devices such as network, disk, and graphics controllers. For reasons of backwards compatibility, though, the old in and out instructions linger, as do legacy hardware devices that use them, such as the IDE disk controller, which xv6 uses.
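    With memory-mapped I/O a driver talks to a device using ordinary loads and stores through a `volatile` pointer. The sketch below shows the idea under loud assumptions: the register layout, the bit definitions, and the `fake_device` that stands in for hardware are all hypothetical, chosen so the example can run as a normal program instead of touching a real physical address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout of a memory-mapped device. */
struct dev_regs {
    uint32_t control;
    uint32_t status;
    uint32_t data;
};

#define CTRL_START  0x1u
#define STATUS_DONE 0x1u

/*
 * On real hardware the registers would sit at a fixed physical address
 * assigned to the device. Here an ordinary struct stands in for them.
 */
static struct dev_regs fake_device;
static volatile struct dev_regs *const dev = &fake_device;

/* Pretend to be the hardware: complete any command that was started. */
static void fake_device_step(void)
{
    if (dev->control & CTRL_START) {
        dev->data = 0x1234;
        dev->status |= STATUS_DONE;
        dev->control &= ~CTRL_START;
    }
}

/* Driver side: plain loads and stores, no in or out instructions. */
static uint32_t dev_read_word(void)
{
    assert(!(dev->status & STATUS_DONE));   /* no stale result pending */
    dev->control = CTRL_START;              /* store to the control register */
    while (!(dev->status & STATUS_DONE))
        fake_device_step();                 /* real hardware advances on its own */
    dev->status = 0;                        /* acknowledge completion */
    return dev->data;                       /* load from the data register */
}
```

    The `volatile` qualifier matters in real drivers because it stops the compiler from caching or eliminating accesses whose side effects happen in the device rather than in the program.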


    Non-uniform memory access (NUMA) support in Windows 7 is mainly meant to ease the limits imposed by the computer's bus. With it, a machine can speed up how its hardware runs.

    Windows 7 will be able to take advantage not only of faster CPUs, but of multiple processors on a single chip. The 64-bit edition of the operating system in particular will be able to support over 64 Logical Processors per machine.

    In this regard, Microsoft underlined the need for software developers to adapt their applications in accordance with the evolution of processor chips and that of the Windows operating system. Essentially, consistent gains in performance are synonymous with using parallel programming techniques in concordance with many-core processors. This is where non-uniform memory access comes in.

    “The 64-bit versions of Windows 7 and Windows Server 2008 R2 support more than 64 Logical Processors on a single computer,” Phil Pennington, Windows Server Technical Evangelism, revealed. “New processors are now appearing that leverage non-uniform memory access architectures. Within the near future, a system with 4 CPU sockets, 8 processor-cores per socket, and with Simultaneous Multi-Threading enabled per core, will achieve 64 Logical Processors. Many server-class solutions will need to be architected with NUMA awareness in order to achieve linear performance scaling on 64+ LP systems.”
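    The arithmetic behind the 64 Logical Processors in the quote is a simple product. The sketch below assumes that Simultaneous Multi-Threading provides two hardware threads per core; the quote does not state that figure, but it is the value that takes 4 sockets of 8 cores from 32 to 64. `logical_processors` is a hypothetical helper, not a Windows API.

```c
#include <assert.h>

/* Logical processors = sockets x cores per socket x hardware threads per core. */
static unsigned logical_processors(unsigned sockets,
                                   unsigned cores_per_socket,
                                   unsigned threads_per_core)
{
    assert(sockets > 0 && cores_per_socket > 0 && threads_per_core > 0);
    return sockets * cores_per_socket * threads_per_core;
}
```

    `logical_processors(4, 8, 2)` gives 64, while the same machine with SMT disabled, `logical_processors(4, 8, 1)`, gives 32.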
