
How the lat_mem_rd memory latency test tool works, and fixing the unresolved llseek link error when building lmbench


Command overview:

Manual page:

lat_mem_rd is one of the tools in lmbench. Its main purpose is to measure memory access latency.

Source code:

cd lmbench3

make

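This builds the tools; the generated binaries are placed under ./bin.

If the build stops with a link error because llseek cannot be resolved, as in the output below, it can be fixed by editing disk.c: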

gcc -O -DRUSAGE -DHAVE_uint=1 -DHAVE_int64_t=1 -DHAVE_DRAND48 -DHAVE_SCHED_SETAFFINITY=1 -o ../bin/x86_64-linux-gnu/disk disk.c ../bin
/usr/bin/ld: /tmp/cc7D60jo.o: in function `seekto':
disk.c:(.text+0x37): undefined reference to `llseek'
collect2: error: ld returned 1 exit status

Changing the two occurrences of llseek in disk.c to lseek64 is enough:

#ifdef __linux__

//extern loff_t llseek(int, loff_t, int);
extern loff_t lseek64(int, loff_t, int);

//if (llseek(fd, (loff_t)off, SEEK_SET) == (loff_t)-1) {
if (lseek64(fd, (loff_t)off, SEEK_SET) == (loff_t)-1) {
	return(-1);
}
return (0);

#else
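To show the replacement call in isolation, here is a minimal sketch, assuming glibc on Linux. The names seekto64 and offset_after_seek are hypothetical and exist only for this illustration; they are not part of lmbench. Defining _LARGEFILE64_SOURCE before the includes is what makes glibc declare lseek64 and off64_t, so no hand-written extern declaration is needed here.

```c
#define _LARGEFILE64_SOURCE /* must come before the includes; exposes lseek64() and off64_t */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for seekto() in disk.c: 0 on success, -1 on failure. */
static int seekto64(int fd, uint64_t off)
{
    if (lseek64(fd, (off64_t)off, SEEK_SET) == (off64_t)-1) {
        return -1;
    }
    return 0;
}

/* Illustrative helper: open path read-only, seek to off, and report the
 * resulting file offset, or -1 if the open or the seek fails. */
static off64_t offset_after_seek(const char *path, uint64_t off)
{
    off64_t pos = -1;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    if (seekto64(fd, off) == 0) {
        pos = lseek64(fd, 0, SEEK_CUR);
    }
    close(fd);
    return pos;
}
```

Calling offset_after_seek on a regular file with an offset above 2 GB is a quick way to confirm that the 64-bit seek links and behaves as expected before rebuilding lmbench.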

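Link:

On Intel platforms, Intel's official memory test tool can be used instead.

Command usage:

lat_mem_rd size_in_megabytes stride [ stride stride... ]

For example: lat_mem_rd 128 64 1024

Here the array size is 128 MB and the strides are 64 bytes and 1024 bytes. If no stride is given, the default is 512. Several strides can be given, so a single command runs several tests.

Command output:

Different parameters can be chosen to exercise either main memory or the caches.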

The output is best examined in a graph where you typically get a graph that has four plateaus. The graph should be plotted in log base 2 of the array size on the X axis and the latency on the Y axis. Each stride is then plotted as a curve. The plateaus that appear correspond to the onboard cache (if present), external cache (if present), main memory latency, and TLB miss latency.

As a rough guide, you may be able to extract the latencies of the various parts as follows, but you should really look at the graphs, since these rules of thumb do not always work (some systems do not have onboard cache, for example).

onboard cache

Try stride of 128 and array size of .00098.

external cache

Try stride of 128 and array size of .125.

main memory

Try stride of 128 and array size of 8.

TLB miss

Try the largest stride and the largest array.
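When no plot is at hand, the plateau boundaries can also be located numerically. The sketch below is only an illustration: find_jumps is a hypothetical helper, and the idea of flagging a point whose latency exceeds the previous one by a fixed ratio is an assumption of this example, not something lmbench provides.

```c
#include <assert.h>
#include <stddef.h>

/* Scan a latency column (one value per array size, in increasing size order)
 * and record the indices where latency rises by more than `ratio` relative to
 * the previous point. Returns the number of indices written to `out`. */
static size_t find_jumps(const double *lat, size_t n, double ratio,
                         size_t *out, size_t max_out)
{
    size_t found = 0;
    size_t i;

    assert(ratio > 1.0);
    for (i = 1; i < n && found < max_out; i++) {
        if (lat[i] > lat[i - 1] * ratio) {
            out[found++] = i; /* lat[i] is the first point past a boundary */
        }
    }
    return found;
}
```

A gradual transition, such as the climb from the last cache level to main memory, may be reported as several consecutive jumps, so the result is a starting point for reading the curve and not a replacement for it.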

Below is a test result quoted from another source.

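The first results, about 1.2 ns, are L1 cache accesses.

The next plateau, about 3.6 ns, is L2 cache access.

About 6 ns is L3 cache access.

Main memory access latency is about 21 ns.

Beyond that, up to the 1024 MB array, the latency stays at roughly 21 to 23 ns.

> numactl --membind=0 --cpunodebind=0 ./lat_mem_rd 2000 128

"stride=128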

0.00049 1.205

0.00098 1.198

0.00195 1.195

0.00293 1.209

0.00391 1.211

0.00586 1.201

0.00781 1.199

0.01172 1.201

0.01562 1.194

0.02344 1.200

0.03125 1.217

0.04688 3.523

0.06250 3.646

0.09375 3.616

0.12500 3.611

0.18750 3.658

0.25000 4.928

0.37500 5.837

0.50000 5.791

0.75000 5.843

1.00000 5.883

1.50000 5.959

2.00000 5.983

3.00000 6.174

4.00000 9.150

6.00000 15.852

8.00000 19.982

12.00000 21.567

16.00000 21.585

24.00000 21.735

32.00000 21.610

48.00000 22.535

64.00000 22.093

96.00000 22.033

128.00000 22.608

192.00000 21.498

256.00000 21.594

384.00000 21.492

512.00000 21.473

768.00000 22.752

1024.00000 22.462
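The first column is the array size in MB, and the sizes are not evenly spaced: they double up to 1 KB, advance by 1 KB up to 4 KB, and after that each power-of-two interval is covered in two steps. The sketch below reproduces that sequence. It was written to match the values printed above; the names next_size and bytes_to_mb are hypothetical, and lmbench's own stepping code may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Given the current array size in bytes, return the next size in the
 * sequence seen in the output (512, 1024, 2048, 3072, 4096, 6144, ...). */
static size_t next_size(size_t k)
{
    size_t s;

    assert(k > 0);
    if (k < 1024) {
        return k * 2;    /* below 1 KB: double */
    }
    if (k < 4 * 1024) {
        return k + 1024; /* 1 KB to 4 KB: add 1 KB */
    }
    /* Find the first power of two above k, then advance by a quarter of it. */
    for (s = 4 * 1024; s <= k; s *= 2) {
        /* empty */
    }
    return k + s / 4;
}

/* Convert bytes to the MB unit used in the first output column. */
static double bytes_to_mb(size_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}
```

For instance, 512 bytes is 512 / 1048576 MB, which is printed as 0.00049, and the step from 32768 to 49152 bytes corresponds to the 0.03125 and 0.04688 rows where the latency leaves the L1 plateau.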

2. Internal implementation

The latency measurement code in lat_mem_rd is written like this:

#define ONE p = (char **)*p;
#define FIVE ONE ONE ONE ONE ONE
#define TEN FIVE FIVE
#define FIFTY TEN TEN TEN TEN TEN
#define HUNDRED FIFTY FIFTY

while (iterations-- > 0) {
	for (i = 0; i < count; ++i) {
		HUNDRED;	/* 100 dependent loads per loop pass */
	}
}

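Each location holds a pointer to the next location to visit, so the loop keeps chasing pointers around the array. For example, a result line such as 0.00049 1.584 means that a 512-byte range was traversed over and over with a stride of 16; the total time divided by the number of accesses is the latency.

Once the range exceeds the 32 KB L1 cache, the latency shows a step change.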
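The pointer-chasing idea can be tried outside lmbench with a short sketch. Everything here is an assumption made for illustration: build_chain, chase and ns_per_load are hypothetical names, the chain is linked forward in a simple ring, and timing uses clock_gettime instead of lmbench's own harness.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

static char **volatile g_sink; /* keeps the compiler from discarding the loads */

/* Store in every stride-th slot of buf the address of the next slot;
 * the last slot points back to the first, forming a ring. */
static char **build_chain(char *buf, size_t range, size_t stride)
{
    size_t n, i;

    assert(stride >= sizeof(char *) && stride % sizeof(char *) == 0);
    assert(range >= stride && range % stride == 0);
    n = range / stride;
    for (i = 0; i < n; i++) {
        *(char **)(buf + i * stride) = buf + ((i + 1) % n) * stride;
    }
    return (char **)buf;
}

/* Follow the chain for `steps` dependent loads and return the final position. */
static char **chase(char **p, size_t steps)
{
    while (steps-- > 0) {
        p = (char **)*p; /* same statement as the ONE macro */
    }
    return p;
}

/* Time `steps` dependent loads and return the average cost in nanoseconds. */
static double ns_per_load(char **start, size_t steps)
{
    struct timespec t0, t1;

    assert(steps > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    g_sink = chase(start, steps);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
            (double)(t1.tv_nsec - t0.tv_nsec)) / (double)steps;
}
```

Because every load depends on the result of the previous one, the processor cannot overlap them, which is why the average time per step approximates the access latency of whichever level of the hierarchy the range fits into. Allocating a buffer with malloc, calling build_chain with a range and stride, and passing the result to ns_per_load with a few million steps gives a rough counterpart of one output row.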


Published: 2024-09-21 22:20:48

Article link: https://www.17tex.com/xueshu/231442.html
