0%

megaraid

简单介绍megaraid。

LSI公司及产品 (1)

简史:

  • 1981年成立,专攻ASIC;
  • 2000年6月,发布Fusion-MPT architecture;
  • 2001年3月,收购AMI公司的MegaRAID software IP + RAID division;
  • 2001年6月,发布LSI53C1030,第一个Fusion-MPT architecture的I/O Controller(简称IOC);基于ARM 9;
  • 2004年1月,发布SAS1064,这是第一代SAS/SATA controller (IOC);
  • 2005年4月,展示了SAS RoC SAS1078,这是第一代SAS/SATA Raid-on-Chip(RoC);处理器从ARM 9换成PowerPC 440;
  • 2006年2月,首次引入运行Integrated RAID(简称IR,firmware层实现的RAID)的HBA;
  • 2013年12月,LSI被Avago收购;
  • 2016年2月,Avago又收购Broadcom;并且Avago更名为”Broadcom Inc”;
  • 2017年4月,发布Tri-Mode(SAS/SATA/PCIe)的HBA和RAID adapter(基于SAS3408 IOC and SAS3508 RoC);第一次支持SAS/SATA/NVMe,并且芯片从PowerPC换成ARM A15;

简史中有一些名词:

  • IOC和RoC

IOC和RoC是集成电路(Integrated Circuit,简称IC),也就是芯片,是LSI/Broadcom自己的HBA和RAID控制卡的一个核心元件,也会卖给OEM。

IOC全称是I/O Controller,这里特指Storage I/O Controller,之前支持SAS/SATA两种接口(SAS和SATA磁盘),现在通常都还支持PCIe接口(NVMe磁盘)。IOC芯片不支持RAID,或只支持基础的Integrated RAID(简称IR,firmware层实现的RAID),且支持的RAID level也很有限:RAID-0, RAID-1, RAID-1E, RAID-10。IOC芯片主要用于HBA卡和entry-level的MegaRAID卡。Broadcom SAS3616W Tri-Mode I/O Controller长这个样子:

RoC全称是RAID-on-Chip,集成了更高级的基于硬件的RAID功能,如专门的硬件引擎用以加速RAID5,RAID6的parity计算。RoC支持多种level的RAID,RAID-0, RAID-1, RAID-1E, RAID-5, RAID-6, RAID10, RAID-50, RAID-60。RoC主要用于RAID控制卡。Broadcom SAS3516 Tri-Mode RoC长这样子:

像CPU一样,每个IOC和RoC包含多个处理器核心(processor cores),这些cores用于实现Fusion-MPT architecture,Integrated RAID(运行IR firmware),管理SCSI I/O而不用打扰CPU。

  • IR和IT

IR即Integrated RAID,是基于firmware实现的RAID,介于Software RAID和Hardware RAID之间。IR也叫做Hardware-Assisted Software RAIDfirmware RAID,或者fake RAID(对吗?fix me!)。 Hardware-assisted software RAID是使用芯片上运行的firmware实现RAID的功能。IT即Initiator-Target模式,禁用了IOC或RoC的高级功能,只使用Initiator Target功能,可以认为是直通的(对吗?fix me!)。IT模式下,由于是直通的,所以可以做Software RAID(对吗?fix me!)。

  • Fusion-MPT

Fusion-MPT (Message Passing Technology) 是一个体系结构,包括Fusion-MPT firmware,硬件(Ultra320 SCSI, FC, SAS)和OS层的驱动。LSI/Broadcom IOC和RoC都是基于这个体系结构。

  • HBA和RAID adapter

LSI/Broadcom生产两种存储卡:HBA卡和RAID控制卡。

HBA卡几乎全部使用IOC芯片(9400-8i8e 是例外,它使用RoC芯片)。有两种类型的firmware:IT和IR(如前所述)。Broadcom HBA 9405W-16i Tri-Mode是一个基于SAS3616W Tri-Mode IOC芯片的HBA卡:

RAID控制卡,即MegaRAID,可能使用IOC芯片也可能使用RoC芯片。Entry-level的MegaRAID卡使用IOC芯片,和HBA卡使用的IOC一样,只是装的是MegaRAID firmware。这些MegaRAID卡没有onboard cache memory,也没有备用电源,不支持RAID-6和RAID-60,在条带单元的size上也有限制。它们也被叫做 iMR or iMegaRAID (Integrated MegaRAID)。所以,这些MegaRAID卡也属于IR(对吗?fix me!)。针对value and higher市场的MegaRAID卡使用RoC芯片,带有onboard cache memory和备用电源,支持RAID-6,RAID-60等。Broadcom MegaRAID 9460-16i是基于SAS3516 Tri-Mode RoC芯片的MegaRAID卡:

存储卡 芯片 支持RAID的方式 是否有onboard cache memory 和 备用电源
HBA卡(IT firmware) IOC 不支持
HBA卡(IR firmware) IOC firmware(IR)
MegaRAID卡(针对entry level市场) IOC MegaRAID firmware(IR)
MegaRAID卡(针对value and higher市场) RoC Hardware
  • OEM

有两种风格的OEM厂商:1.简单的re-brand LSI/Broadcom的HBA卡和MegaRAID卡;这些卡还是LSI/Broadcom生产的,只是打上了OEM厂商的标。这类很好区别,因为PCB上会显示LSI/Avago/Broadcom的logo;例如IBM ServeRAID M1015是MegaRAID SAS 9240-8i的re-brand版;2. 从LSI/Broadcom购买IOC和RoC芯片,然后使用这些芯片生产自己品牌的HBA卡和RAID卡。例如Dell PERC,uperMicro和Fujitsu的。

MegaCli64 (2)

下载与安装 (2.1)

1
2
3
4
5
6
# wget https://docs.broadcom.com/docs-and-downloads/raid-controllers/raid-controllers-common-files/8-07-06_MegaCLI.zip
# unzip 8-07-06_MegaCLI.zip
# cd Linux/
# rpm -hiv MegaCli-8.07.06-1.noarch.rpm
# ls /opt/MegaRAID/MegaCli/
MegaCli64 install.log libstorelibir-2.so libstorelibir-2.so.13.05-0

MegaCli64涉及的对象 (2.2)

  • Adapter: 即controller,物理MegaRAID卡或物理MegaRAID控制器。一个server上可能有多个adapter,MegaCli64 -adpCount返回其数量。-a用于指定一个或所有adapter(-aN指定Id为N的adapter; -aALL指定所有adapter)。几乎所有常用命令都需要-a选项。一般情况下,server上只有一个adapter,其ID一般为0,这时-a0-aALL是等价的。Adapter相关的操作以-Adp开头,例如,列出所有adapter的信息:MegaCli64 -AdpAllInfo -aALL;只列出adapter-0的信息:MegaCli64 -AdpAllInfo -a0
  • Enclosure: 即Disk Enclosure,物理磁盘柜。一个adapter上可以有一个或多个物理磁盘柜。其ID一般为254, 252等。MegaCli64 -EncInfo -aALL列出所有adapter上的所有磁盘柜;MegaCli64 -EncInfo -a0只列出adapter-0上的磁盘柜。一个磁盘柜里有多个磁盘,这些磁盘就是Physical Drive;
  • Physical Drive: 即磁盘柜里的物理磁盘。-PhysDrv [E0:S0,E1:S1,...]用于指定一个或多个物理磁盘;E{M}:S{N}表示位于enclosure-M的slot-N的磁盘。物理磁盘相关的操作以-pd, -Pd, -PD开头,例如,列出adapter-0上所有的物理磁盘:MegaCli64 -PDList -a0
  • Logical Disk: 也叫virtual disk或virtual drive。由物理磁盘虚拟出的逻辑设备,ID通常是0, 1, 2等。例如由3个物理磁盘构建一个RAID0,physical drives是0, 1, 2;virtual drive是0。Virtual drive包含一个到几个physical drive以及一些RAID设置(例如,RAID level,strip size等)。-L用于指定一个(-LNN为logical disk的ID)或所有(-Lall)logical disk。例如,MegaCli64 -LDInfo -Lall -aALL列出所有adapter上的所有logical drive。Logical driver相关的操作以-LD开头。

Virtual Drive的Cache Policy (2.3)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
Cache Policy's are how the raid card uses on board RAM to collect data before writing out to disk or to read data before the system asks for it.

Write cache is used when we have a lot of data to write and it is faster to write data sequentially to disk instead of writing small chunks.

Read cache is used when the system has asked for some data and the raid card keeps the data in cache in case the system asks for the same data again.

It is always faster to read and write to cache then to access spinning disks. Understand that you should only use caching if you have good UPS power to the system.

If the system looses power and does not flush the cache it is possible to loose data. No one wants that. Lets look at each cache policy LSI raid card use.

WriteBack uses the card's cache to collect enough data to make a series of long sequential writes out to disk. This is the fastest write method.
WriteThrough tells the card to write all data directly to disk without cache. This method is quite slow by about 1/10 the speed of WriteBack, but is safer as no data can be lost that was in cache when the machine's power fails.
ReadAdaptive uses an algorithm to see if when the OS asks for a bunch of data blocks sequentially, if we should read a few more sequential blocks because the OS _might_ ask for those too. This method can lead to good speed increases.
ReadAheadNone tells the raid card to only read the data off the raid disk if it was actually asked for. No more, no less.
Cached allows the general use of the cards cache for any data which is read or written. Very efficient if the same data is accessed over and over again.
Direct is straight access to the disk without ever storing data in the cache. This can be slow as any I/O has to touch the disk platters.
Write Cache OK if Bad BBU tells the card to use write caching even if the Battery Backup Unit (BBU) is bad, disabled or missing. This is a good setting if your raid card's BBU charger is bad, if you do not want or can't to replace the BBU or if you do not want WriteThrough enabled during a BBU
relearn test.

No Write Cache if Bad BBU if the BBU is not available for any reason then disable WriteBack and turn on WriteThrough. This option is safer for your data, but the raid card will switch to WriteThrough during a battery relearn cycle.
Disk Cache Policy: Enabled Use the hard drive's own cache. For example if data is written out the drives this option lets the drives themselves cache data internally before writing data to its platters.
Disk Cache Policy: Disabled does not allow the drive to use any of its own internal cache.

MegaCli64的使用例子 (2.4)

现在手上刚好有一台server,带有RAID卡,玩一玩。

例1:确保有RAID卡 (2.4.1)

1
2
# lspci | grep -i raid
01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [Invader] (rev 02)

可见server上有一个MegaRAID卡;虽然不知道MegaRAID卡的型号,但我们知道它使用的芯片是SAS3108,在Broadcom官网上查得它是一个RoC芯片。

例2:RAID卡的PCI信息 (2.4.2)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# MegaCli64 -adpCount

Controller Count: 1.

Exit Code: 0x01


# MegaCli64 -AdpGetPciInfo -aAll

PCI information for Controller 0
--------------------------------
Bus Number : da80
Device Number : 0
Function Number : a0

Exit Code: 0x00

可见,我们只有一个RAID卡;其Bus Number是da80。

例3:RAID卡的属性 (2.4.3)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
# MegaCli64 -AdpAllInfo -aALL

Adapter #0

==============================================================================
Versions
================
Product Name : SAS3108
Serial No :
FW Package Build: 24.1.1-0035

Mfg. Data
================
Mfg. Date : 00/00/00
Rework Date : 00/00/00
Revision No :
Battery FRU : N/A

Image Versions in Flash:
================
BIOS Version : 6.13.00.2_4.14.05.00_0x06010602
FW Version : 4.210.90-3396
NVDATA Version : 3.1310.00-0067
Ctrl-R Version : 5.01-0008
Boot Block Version : 3.00.00.00-0009

Pending Images in Flash
================
None

PCI Info
================
Controller Id : 0000
Vendor Id : 1000
Device Id : 005d
SubVendorId : 19e5
SubDeviceId : d207

Host Interface : PCIE

ChipRevision : C0

Link Speed : 0
Number of Frontend Port: 0
Device Interface : PCIE

Number of Backend Port: 8
Port : Address
0 500e004aaaaaaa1f
1 0000000000000000
2 0000000000000000
3 0000000000000000
4 0000000000000000
5 0000000000000000
6 0000000000000000
7 0000000000000000

HW Configuration
================
SAS Address : 5101b5442bcc7000
BBU : Absent
Alarm : Present
NVRAM : Present
Serial Debugger : Present
Memory : Present
Flash : Present
Memory Size : 1024MB
TPM : Absent
On board Expander: Absent
Upgrade Key : Absent
Temperature sensor for ROC : Present
Temperature sensor for controller : Absent

ROC temperature : 66 degree Celsius

Settings
================
Current Time : 20:29:55 9/12, 2019
Predictive Fail Poll Interval : 300sec
Interrupt Throttle Active Count : 16
Interrupt Throttle Completion : 50us
Rebuild Rate : 30%
PR Rate : 30%
BGI Rate : 30%
Check Consistency Rate : 30%
Reconstruction Rate : 30%
Cache Flush Interval : 4s
Max Drives to Spinup at One Time : 4
Delay Among Spinup Groups : 2s
Physical Drive Coercion Mode : 1GB
Cluster Mode : Disabled
Alarm : Enabled
Auto Rebuild : Enabled
Battery Warning : Enabled
Ecc Bucket Size : 15
Ecc Bucket Leak Rate : 1440 Minutes
Restore HotSpare on Insertion : Enabled
Expose Enclosure Devices : Enabled
Maintain PD Fail History : Enabled
Host Request Reordering : Enabled
Auto Detect BackPlane Enabled : SGPIO/i2c SEP
Load Balance Mode : Auto
Use FDE Only : No
Security Key Assigned : No
Security Key Failed : No
Security Key Not Backedup : No
Default LD PowerSave Policy : Controller Defined
Maximum number of direct attached drives to spin up in 1 min : 120
Auto Enhanced Import : No
Any Offline VD Cache Preserved : No
Allow Boot with Preserved Cache : No
Disable Online Controller Reset : No
PFK in NVRAM : No
Use disk activity for locate : No
POST delay : 90 seconds
BIOS Error Handling : Stop On Errors
Current Boot Mode :Normal
Capabilities
================
RAID Level Supported : RAID0, RAID1, RAID5, RAID6, RAID00, RAID10, RAID50, RAID60, PRL 11, PRL 11 with spanning, SRL 3 supported, PRL11-RLQ0 DDF layout with no span, PRL11-RLQ0 DDF layout with span
Supported Drives : SAS, SATA

Allowed Mixing:

Mix in Enclosure Allowed
Mix of SAS/SATA of HDD type in VD Allowed

Status
================
ECC Bucket Count : 0

Limitations
================
Max Arms Per VD : 32
Max Spans Per VD : 8
Max Arrays : 128
Max Number of VDs : 64
Max Parallel Commands : 928
Max SGE Count : 60
Max Data Transfer Size : 8192 sectors
Max Strips PerIO : 42
Max LD per array : 16
Min Strip Size : 64 KB
Max Strip Size : 1.0 MB
Max Configurable CacheCade Size: 0 GB
Current Size of CacheCade : 0 GB
Current Size of FW Cache : 837 MB

Device Present
================
Virtual Drives : 13
Degraded : 0
Offline : 0
Physical Devices : 16
Disks : 14
Critical Disks : 0
Failed Disks : 0

Supported Adapter Operations
================
Rebuild Rate : Yes
CC Rate : Yes
BGI Rate : Yes
Reconstruct Rate : Yes
Patrol Read Rate : Yes
Alarm Control : Yes
Cluster Support : No
BBU : Yes
Spanning : Yes
Dedicated Hot Spare : Yes
Revertible Hot Spares : Yes
Foreign Config Import : Yes
Self Diagnostic : Yes
Allow Mixed Redundancy on Array : No
Global Hot Spares : Yes
Deny SCSI Passthrough : No
Deny SMP Passthrough : No
Deny STP Passthrough : No
Support Security : No
Snapshot Enabled : No
Support the OCE without adding drives : Yes
Support PFK : Yes
Support PI : Yes
Support Boot Time PFK Change : No
Disable Online PFK Change : No
Support LDPI Type1 : No
Support LDPI Type2 : No
Support LDPI Type3 : No
PFK TrailTime Remaining : 0 days 0 hours
Support Shield State : Yes
Block SSD Write Disk Cache Change: No
Support Online FW Update : Yes

Supported VD Operations
================
Read Policy : Yes
Write Policy : Yes
IO Policy : Yes
Access Policy : Yes
Disk Cache Policy : Yes
Reconstruction : Yes
Deny Locate : No
Deny CC : No
Allow Ctrl Encryption: No
Enable LDBBM : No
Support Breakmirror : No
Power Savings : No

Supported PD Operations
================
Force Online : Yes
Force Offline : Yes
Force Rebuild : Yes
Deny Force Failed : No
Deny Force Good/Bad : No
Deny Missing Replace : No
Deny Clear : No
Deny Locate : No
Support Temperature : Yes
Disable Copyback : No
Enable JBOD : No
Enable Copyback on SMART : Yes
Enable Copyback to SSD on SMART Error : Yes
Enable SSD Patrol Read : No
PR Correct Unconfigured Areas : Yes
Enable Spin Down of UnConfigured Drives : Yes
Disable Spin Down of hot spares : No
Spin Down time : 30
T10 Power State : No
Error Counters
================
Memory Correctable Errors : 0
Memory Uncorrectable Errors : 0

Cluster Information
================
Cluster Permitted : No
Cluster Active : No

Default Settings
================
Phy Polarity : 0
Phy PolaritySplit : 0
Background Rate : 30
Strip Size : 256kB
Flush Time : 4 seconds
Write Policy : WB
Read Policy : Adaptive
Cache When BBU Bad : Disabled
Cached IO : No
SMART Mode : Mode 6
Alarm Disable : Yes
Coercion Mode : 1GB
ZCR Config : Unknown
Dirty LED Shows Drive Activity : No
BIOS Continue on Error : 0
Spin Down Mode : None
Allowed Device Type : SAS/SATA Mix
Allow Mix in Enclosure : Yes
Allow HDD SAS/SATA Mix in VD : Yes
Allow SSD SAS/SATA Mix in VD : No
Allow HDD/SSD Mix in VD : No
Allow SATA in Cluster : No
Max Chained Enclosures : 16
Disable Ctrl-R : No
Enable Web BIOS : No
Direct PD Mapping : No
BIOS Enumerate VDs : Yes
Restore Hot Spare on Insertion : Yes
Expose Enclosure Devices : Yes
Maintain PD Fail History : Yes
Disable Puncturing : No
Zero Based Enclosure Enumeration : No
PreBoot CLI Enabled : No
LED Show Drive Activity : No
Cluster Disable : Yes
SAS Disable : No
Auto Detect BackPlane Enable : SGPIO/i2c SEP
Use FDE Only : No
Enable Led Header : Yes
Delay during POST : 0
EnableCrashDump : No
Disable Online Controller Reset : No
EnableLDBBM : No
Un-Certified Hard Disk Drives : Allow
Treat Single span R1E as R10 : No
Max LD per array : 16
Power Saving option : Don't Auto spin down Configured Drives
Max power savings option is not allowed for LDs. Only T10 power conditions are to be used.
Default spin down time in minutes: 30
Enable JBOD : No
TTY Log In Flash : No
Auto Enhanced Import : No
BreakMirror RAID Support : No
Disable Join Mirror : Yes
Enable Shield State : Yes
Time taken to detect CME : 60s

Exit Code: 0x00

这里信息很多,目前关注到:

  • Product Name是 SAS3108,如前所述,从官网查得它是一个RoC型号;
  • Number of Frontend Port是0;Number of Backend Port是8;(这是什么意思?前端通过PCIe连server,后端可以连至多8个Disk Enclosure?)
  • BBU不存在;前文所述,RoC一般带有备用电源的,不清楚为何这个不带。
  • Memory存在,size是1024MB;
  • RAID Level Supported比较丰富;
  • Supported Drives是SAS和SATA(不支持NVMe?)
  • 当前存在的Virtual Drives是13个;Physical Devices是16个,其中14个是Disks(那另外2个是什么呢?)。
  • 支持的Virtual Drives的操作(Supported VD Operations)有Read/Write Policy等;支持的Physical Drives(Supported PD Operations)有Force Online/Offline/Rebuild等;
  • 默认设置(Default Settings)里,Strip Size是256kB;

例4:此MegaRAID卡上的Enclosure (2.4.4)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
# MegaCli64 -EncInfo -a0

Number of enclosures on adapter 0 -- 2

Enclosure 0:
Device ID : 14
Number of Slots : 16
Number of Power Supplies : 2
Number of Fans : 4
Number of Temperature Sensors : 4
Number of Alarms : 1
Number of SIM Modules : 1
Number of Physical Drives : 14
Status : Normal
Position : 1
Connector Name : Port A
Enclosure type : SES
FRU Part Number : N/A
Enclosure Serial Number : N/A
ESM Serial Number : N/A
Enclosure Zoning Mode : N/A
Partner Device Id : 65535

Inquiry data :
Vendor Identification : 12G SAS
Product Identification : Expander
Product Revision Level : RevB
Vendor Specific :

Number of Voltage Sensors :0

Number of Power Supplies : 2

Power Supply : 0
Power Supply Status : Unsupported

Power Supply : 1
Power Supply Status : Unsupported

Number of Fans : 4

Fan : 0
Fan Status : Unsupported

Fan : 1
Fan Status : Unsupported

Fan : 2
Fan Status : Unsupported

Fan : 3
Fan Status : Unsupported

Number of Temperature Sensors : 4

Temp Sensor : 0
Temperature : 236
Temperature Sensor Status : Unsupported

Temp Sensor : 1
Temperature : 236
Temperature Sensor Status : Unsupported

Temp Sensor : 2
Temperature : 236
Temperature Sensor Status : Unsupported

Temp Sensor : 3
Temperature : 236
Temperature Sensor Status : Unsupported

Number of Chassis : 1

Chassis : 0
Chassis Status : OK

Enclosure 1:
Device ID : 252
Number of Slots : 8
Number of Power Supplies : 0
Number of Fans : 0
Number of Temperature Sensors : 0
Number of Alarms : 0
Number of SIM Modules : 1
Number of Physical Drives : 0
Status : Normal
Position : 1
Connector Name : Unavailable
Enclosure type : SGPIO
FRU Part Number : N/A
Enclosure Serial Number : N/A
ESM Serial Number : N/A
Enclosure Zoning Mode : N/A
Partner Device Id : Unavailable

Inquiry data :
Vendor Identification : LSI
Product Identification : SGPIO
Product Revision Level : N/A
Vendor Specific :


Exit Code: 0x00

我们知道只有一个MegaRAID控制卡,其ID为0,即adapter 0,所以这里通过-a0来指定只列出本MegaRAID卡上的Enclosure。Enclosure信息有:

  • 此MegaRAID卡上有2个Enclosure(前一节中我们看到,Physical Devices是16个,其中14个是Disks,那另外两个应该就是这俩Enclosure,fix me!)。其中一个Enclosure的ID是14,另一个的ID是252。
  • Enclosure-14上有14块磁盘(Physical Drives),2个Power Supplies,4个风扇,4个温度传感器;有16个slot,最多可以插16块磁盘;
  • Enclosure-252上没有任何磁盘(Physical Drives),没有Power Supplies,也没有风扇和温度传感器;有8个slot,最多可以查8块磁盘;

例5:此MegaRAID卡上的物理磁盘 (2.4.5)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
# MegaCli64 -pdList -a0

Adapter #0

Enclosure Device ID: 14
Slot Number: 0
Drive's position: DiskGroup: 1, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 0
WWN: 5000cca250c3e5be
Sequence Number: 2
Media Error Count: 0
Other Error Count: 1
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA

Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1a94800 Sectors]
Sector Size: 512
Logical Sector Size: 512
Physical Sector Size: 512
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: A8B0
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x500e004aaaaaaa00
Connected Port Number: 0(path0)
Inquiry Data: PN1334PEG8KTEXHGST HUS724040ALA640 MFAOA8B0
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :30C (86.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 14
Slot Number: 1
Drive's position: DiskGroup: 2, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 1
......


Enclosure Device ID: 14
Slot Number: 2
Drive's position: DiskGroup: 3, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 2
......



Enclosure Device ID: 14
Slot Number: 3
Drive's position: DiskGroup: 4, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 3
......


Enclosure Device ID: 14
Slot Number: 4
Drive's position: DiskGroup: 5, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 4
......


Enclosure Device ID: 14
Slot Number: 5
Drive's position: DiskGroup: 6, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 5
......


Enclosure Device ID: 14
Slot Number: 6
Drive's position: DiskGroup: 7, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 6
......


Enclosure Device ID: 14
Slot Number: 7
Drive's position: DiskGroup: 8, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 7
......


Enclosure Device ID: 14
Slot Number: 8
Drive's position: DiskGroup: 9, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 8
......


Enclosure Device ID: 14
Slot Number: 9
Drive's position: DiskGroup: 10, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 9
......


Enclosure Device ID: 14
Slot Number: 10
Drive's position: DiskGroup: 11, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 10
WWN: 5000cca24ce923cc
......


Enclosure Device ID: 14
Slot Number: 11
Drive's position: DiskGroup: 12, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 11
......


Enclosure Device ID: 14
Slot Number: 14
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 12
WWN: 5000CCA06E20BDDF
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.464 GB [0x22cee000 Sectors]
Sector Size: 512
Logical Sector Size: 512
Physical Sector Size: 512
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: A440
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000cca06e20bddd
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: HITACHI HUC109030CSS600 A440W5GL06HB
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :40C (104.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No


Enclosure Device ID: 14
Slot Number: 15
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: 1
Device Id: 13
......

Exit Code: 0x00

如前所述,此MegaRAID卡上共有14块磁盘,它们都插在Enclosure-14上(Enclosure-252上没有查任何磁盘)。

  • Device-0(Device Id: 0)插在Enclosure-14的Slot-0上,命令行标识是-PhysDrv [E14:S0];大小是3.638 TB;接口类型(PD Type)是SATA;介质类型(Media Type)是Hard Disk Device;
  • Device-1到Device-11插在Slot-1到Slot-11上;磁盘的大小、接口类型、介质类型和Device-0一样;
  • Slot-12和Slot-13空着没有插盘;
  • Device-12(Device Id: 12)插在Slot-14上,命令行标识是-PhysDrv [E14:S14];大小是279.396 GB;接口类型(PD Type)是SAS;介质类型(Media Type)也是Hard Disk Device(279G的磁盘不是SSD?会不会识别错了你?我在另外一个server上,同样的MegaCli64命令看到223G的磁盘的Media Type是: Solid State Device)。
  • Device-13(Device Id: 13)插在Slot-15上,命令行标识是-PhysDrv [E14:S15];磁盘的大小、接口类型、介质类型和Device-12一样;

例6:此MegaRAID卡上的Virtual Drives (2.4.6)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
# MegaCli64 -LDInfo -Lall -a0

Adapter 0 -- Virtual Drive Information:

Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 278.464 GB
Sector Size : 512
Is VD emulated : No
Mirror Data : 278.464 GB
State : Optimal
Strip Size : 256 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
PI type: No PI

Is VD Cached: No


Virtual Drive: 1 (Target Id: 1)
Name :
RAID Level : Primary-0, Secondary-0, RAID Level Qualifier-0
Size : 3.637 TB
Sector Size : 512
Is VD emulated : No
Parity Size : 0
State : Optimal
Strip Size : 256 KB
Number Of Drives : 1
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, Write Cache OK if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
PI type: No PI

Is VD Cached: No


Virtual Drive: 2 (Target Id: 2)
......

Virtual Drive: 3 (Target Id: 3)
......

Virtual Drive: 4 (Target Id: 4)
......

Virtual Drive: 5 (Target Id: 5)
......

Virtual Drive: 6 (Target Id: 6)
......

Virtual Drive: 7 (Target Id: 7)
......

Virtual Drive: 8 (Target Id: 8)
......

Virtual Drive: 9 (Target Id: 9)
......

Virtual Drive: 10 (Target Id: 10)
......

Virtual Drive: 11 (Target Id: 11)
......

Virtual Drive: 12 (Target Id: 12)
......

Exit Code: 0x00

如第2.4.3节所述,此MegaRAID卡上有13个Virtual Drives。

  • 第1个是Virtual Drive: 0 (Target Id: 0):RAID Level是Primary-1, Secondary-0(即RAID-10);使用2个Physical Drive(Number Of Drives为2)。Cache Policy是WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU;Disk Cache Policy是Disk's Default
  • 其他和Virtual Drive: 1 (Target Id: 1)一样:RAID Level是Primary-0, Secondary-0(即RAID-0);使用1个Physical Drive(Number Of Drives为1)。Cache Policy是WriteBack, ReadAdaptive, Direct, Write Cache OK if Bad BBU

注意Virtual Drive: 0 (Target Id: 0)是系统盘,Level是RAID-10,并且Cache Policy更安全(包括WriteThrough, No Write Cache if Bad BBU);其他Virtual Drive是数据盘,Level是RAID-0,Cache Policy也不如系统盘安全(包括WriteBack, Write Cache OK if Bad BBU);

例7:Virtual Drive和Physical Drive的对应关系 (2.4.7)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
# MegaCli64 -LdPdInfo -a0

Adapter #0

Number of Virtual Disks: 13

Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 278.464 GB
Sector Size : 512
Is VD emulated : No
Mirror Data : 278.464 GB
State : Optimal
Strip Size : 256 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
PI type: No PI

Is VD Cached: No
Number of Spans: 1
Span: 0 - Number of PDs: 2

PD: 0 Information
Enclosure Device ID: 14
Slot Number: 14
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 12
WWN: 5000CCA06E20BDDF
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.464 GB [0x22cee000 Sectors]
Sector Size: 512
Logical Sector Size: 512
Physical Sector Size: 512
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: A440
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000cca06e20bddd
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: HITACHI HUC109030CSS600 A440W5GL06HB
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :40C (104.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No




PD: 1 Information
Enclosure Device ID: 14
Slot Number: 15
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: 1
Device Id: 13
WWN: 5000CCA06E20C9B7
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.464 GB [0x22cee000 Sectors]
Sector Size: 512
Logical Sector Size: 512
Physical Sector Size: 512
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: A440
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000cca06e20c9b5
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: HITACHI HUC109030CSS600 A440W5GL0ZYB
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :40C (104.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No



Virtual Drive: 1 (Target Id: 1)
Name :
RAID Level : Primary-0, Secondary-0, RAID Level Qualifier-0
Size : 3.637 TB
Sector Size : 512
Is VD emulated : No
Parity Size : 0
State : Optimal
Strip Size : 256 KB
Number Of Drives : 1
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, Write Cache OK if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
PI type: No PI

Is VD Cached: No
Number of Spans: 1
Span: 0 - Number of PDs: 1

PD: 0 Information
Enclosure Device ID: 14
Slot Number: 0
Drive's position: DiskGroup: 1, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 0
WWN: 5000cca250c3e5be
Sequence Number: 2
Media Error Count: 0
Other Error Count: 1
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA

Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1a94800 Sectors]
Sector Size: 512
Logical Sector Size: 512
Physical Sector Size: 512
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: A8B0
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x500e004aaaaaaa00
Connected Port Number: 0(path0)
Inquiry Data: PN1334PEG8KTEXHGST HUS724040ALA640 MFAOA8B0
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :30C (86.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No

......

这个命令的输出信息和前两节类似,但更详细。我们可以看到Virtual Drive包含哪些Physical Drive:

Virtual Drive: 0 (Target Id: 0): 包含两个Physical Drive,分别是插在Slot-14的Device-12(Device Id: 12)和插在Slot-15的Device-13(Device Id: 13);
Virtual Drive: 1 (Target Id: 1): 包含一个Physical Drive,即插在Slot-0的Device-0(Device Id: 0);

例8:Virtual Drive和Linux device的对应关系 (2.4.8)

SCSI系统结构模型SAM-4,可以简单的表示为:

一个target有一到多个port(transport end point),包含一到多个LUN。和模型相对应,一个SAS/FC disk就是一个具体的target,有两个port,包含一个LUN;MegaRAID卡创建出的Virtual Drive也是一个具体的target;而HBA卡或MegaRAID卡是具体的initiator。LUN就是SCSI device;initiator也叫host(不是指主机)。

lsscsi命令带上--hosts选项就可以列出系统上的initiators/hosts;不带--hosts选项,lsscsi的默认行为是列出SCSI device(LUN)。若SAS/FC disk上的两个port都和server连通,那么在lsscsi的输出里,disk的LUN会被显示为两个entry。HBA卡上可能有多个port,在lsscsi --hosts的输出里,有些HBA卡是一个port显示为一个entry,有些是整个HBA卡显示为一个entry。

一个SCSI命令,从一个initiator发出,经过transport到达一个target,然后访问一个LUN。目前支持的transport有:IEEE 1394, ATA, FC, iSCSI, SAS, SATA, SPI and USB。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# lsscsi --hosts
[0] megaraid_sas
[1] ahci

# lsscsi -g
[0:0:14:0] enclosu 12G SAS Expander RevB - /dev/sg0
[0:2:0:0] disk LSI LSI 4.21 /dev/sda /dev/sg1
[0:2:1:0] disk LSI LSI 4.21 /dev/sdb /dev/sg2
[0:2:2:0] disk LSI LSI 4.21 /dev/sdc /dev/sg3
[0:2:3:0] disk LSI LSI 4.21 /dev/sdd /dev/sg4
[0:2:4:0] disk LSI LSI 4.21 /dev/sde /dev/sg5
[0:2:5:0] disk LSI LSI 4.21 /dev/sdf /dev/sg6
[0:2:6:0] disk LSI LSI 4.21 /dev/sdg /dev/sg7
[0:2:7:0] disk LSI LSI 4.21 /dev/sdh /dev/sg8
[0:2:8:0] disk LSI LSI 4.21 /dev/sdi /dev/sg9
[0:2:9:0] disk LSI LSI 4.21 /dev/sdj /dev/sg10
[0:2:10:0] disk LSI LSI 4.21 /dev/sdk /dev/sg11
[0:2:11:0] disk LSI LSI 4.21 /dev/sdl /dev/sg12
[0:2:12:0] disk LSI LSI 4.21 /dev/sdm /dev/sg13

第一列是一个四元组[H:C:T:L]。

  • H: SCSI host ID, 即SCSI initiator;
  • C: SCSI host上的channel(即port?);
  • T: Target Number;
  • L: Target上的LUN;

在我们当前的系统里,MegaRAID卡是initiator;它通过channel-2连接targets;每个Virtual Drive是一个target(Number为0,1,……),一个target里只有一个LUN(ID为0)。另外,Enclosure也是一个target(Number为14),MegaRAID通过channel-0和它连接。

参考:
https://forums.servethehome.com/index.php?threads/broadcom-lsi-avago-hba-and-raid-controller-technical-discussion.24119/
https://www.broadcom.com/products/storage
https://raid.wiki.kernel.org/index.php/Hardware_Raid_Setup_using_MegaCli

写的不错,有赏!