11490 SRS ring polling disabled for VLANs
11491 Want DLS bypass for VLAN traffic
11492 add VLVF bypass to ixgbe core
2869 duplicate packets with vnics over aggrs
11489 DLS stat delete and aggr kstat can deadlock
Portions contributed by: Theo Schlossnagle <jesus@omniti.com>
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Dan McDonald <danmcd@joyent.com>
--- old/usr/src/uts/common/io/mac/mac.c
+++ new/usr/src/uts/common/io/mac/mac.c
1 1 /*
2 2 * CDDL HEADER START
3 3 *
4 4 * The contents of this file are subject to the terms of the
5 5 * Common Development and Distribution License (the "License").
6 6 * You may not use this file except in compliance with the License.
7 7 *
8 8 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
9 9 * or http://www.opensolaris.org/os/licensing.
10 10 * See the License for the specific language governing permissions
11 11 * and limitations under the License.
12 12 *
13 13 * When distributing Covered Code, include this CDDL HEADER in each
14 14 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
15 15 * If applicable, add the following below this CDDL HEADER, with the
16 16 * fields enclosed by brackets "[]" replaced with your own identifying
17 17 * information: Portions Copyright [yyyy] [name of copyright owner]
18 18 *
19 19 * CDDL HEADER END
20 20 */
21 21
22 22 /*
23 23 * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
24 24 * Copyright 2019 Joyent, Inc.
25 25 * Copyright 2015 Garrett D'Amore <garrett@damore.org>
26 26 */
27 27
28 28 /*
29 29 * MAC Services Module
30 30 *
31 31 * The GLDv3 framework locking - The MAC layer
32 32 * --------------------------------------------
33 33 *
34 34 * The MAC layer is central to the GLD framework and can provide the locking
35 35 * framework needed for itself and for the use of MAC clients. MAC end points
36 36 * are fairly disjoint and don't share a lot of state. So a coarse-grained
37 37 * multi-threading scheme is to single-thread all create/modify/delete or set
38 38 * type control operations on a per mac end point basis, while allowing data
39 39 * threads to run concurrently.
40 40 *
41 41 * Control operations (set) that modify a mac end point are always serialized
42 42 * on a per mac end point basis; we have at most one such thread per mac end
43 43 * point at a time.
44 44 *
45 45 * All other operations that are not serialized are essentially multi-threaded:
46 46 * for example, a control operation (get) like reading statistics, which may
47 47 * not care about reading values atomically, or data threads sending or
48 48 * receiving data. Mostly these types of operations don't modify the control
49 49 * state. Any state these operations care about is protected by traditional locks.
50 50 *
51 51 * The perimeter only serializes serial operations. It does not imply there
52 52 * aren't any other concurrent operations. However a serialized operation may
53 53 * sometimes need to make sure it is the only thread. In this case it needs
54 54 * to use reference counting mechanisms to cv_wait until any current data
55 55 * threads are done.
56 56 *
57 57 * The mac layer itself does not hold any locks across a call to another layer.
58 58 * The perimeter is however held across a down call to the driver to make the
59 59 * whole control operation atomic with respect to other control operations.
60 60 * Also the data path and get type control operations may proceed concurrently.
61 61 * These operations synchronize with the single serial operation on a given mac
62 62 * end point using regular locks. The perimeter ensures that conflicting
63 63 * operations like say a mac_multicast_add and a mac_multicast_remove on the
64 64 * same mac end point don't interfere with each other and also ensures that the
65 65 * changes in the mac layer and the call to the underlying driver to say add a
66 66 * multicast address are done atomically without interference from a thread
67 67 * trying to delete the same address.
68 68 *
69 69 * For example, consider
70 70 * mac_multicst_add()
71 71 * {
72 72 * mac_perimeter_enter(); serialize all control operations
73 73 *
74 74 * grab list lock protect against access by data threads
75 75 * add to list
76 76 * drop list lock
77 77 *
78 78 * call driver's mi_multicst
79 79 *
80 80 * mac_perimeter_exit();
81 81 * }
82 82 *
83 83 * To lessen the number of serialization locks and simplify the lock hierarchy,
84 84 * we serialize all the control operations on a per mac end point by using a
85 85 * single serialization lock called the perimeter. We allow recursive entry into
86 86 * the perimeter to facilitate use of this mechanism by both the mac client and
87 87 * the MAC layer itself.
88 88 *
89 89 * MAC client means an entity that does an operation on a mac handle
90 90 * obtained from a mac_open/mac_client_open. Similarly MAC driver means
91 91 * an entity that does an operation on a mac handle obtained from a
92 92 * mac_register. An entity could be both client and driver but on different
93 93 * handles (e.g. aggr), and should only make the corresponding mac interface
94 94 * calls, i.e. mac driver interface or mac client interface, as appropriate
95 95 * for that mac handle.
96 96 *
97 97 * General rules.
98 98 * -------------
99 99 *
100 100 * R1. The lock order of upcall threads is naturally opposite to downcall
101 101 * threads. Hence upcalls must not hold any locks across layers for fear of
102 102 * recursive lock enter and lock order violation. This applies to all layers.
103 103 *
104 104 * R2. The perimeter is just another lock. Since it is held in the down
105 105 * direction, acquiring the perimeter in an upcall is prohibited as it would
106 106 * cause a deadlock. This applies to all layers.
107 107 *
108 108 * Note that upcalls that need to grab the mac perimeter (for example
109 109 * mac_notify upcalls) can still achieve that by posting the request to a
110 110 * thread, which can then grab all the required perimeters and locks in the
111 111 * right global order. Note that in the above example the mac layer itself
112 112 * won't grab the mac perimeter in the mac_notify upcall, instead the upcall
113 113 * to the client must do that. Please see the aggr code for an example.
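 *
 * A minimal sketch of that pattern (hypothetical names, not from this
 * change):
 *
 *	client_notify_cb(arg, type)		upcall; no perimeter here
 *	{
 *		post {arg, type} to a worker thread and return
 *	}
 *
 *	client_notify_worker(req)
 *	{
 *		mac_perim_enter_by_mh();	correct global lock order
 *		handle the notification
 *		mac_perim_exit();
 *	}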
114 114 *
115 115 * MAC client rules
116 116 * ----------------
117 117 *
118 118 * R3. A MAC client may use the MAC provided perimeter facility to serialize
119 119 * control operations on a per mac end point. It does this by acquiring
120 120 * and holding the perimeter across a sequence of calls to the mac layer.
121 121 * This ensures atomicity across the entire block of mac calls. In this
122 122 * model the MAC client must not hold any client locks across the calls to
123 123 * the mac layer. This model is the preferred solution.
124 124 *
125 125 * R4. However if a MAC client has a lot of global state across all mac end
126 126 * points the per mac end point serialization may not be sufficient. In this
127 127 * case the client may choose to use global locks or use its own serialization.
128 128 * To avoid deadlocks, these client layer locks held across the mac calls
129 129 * in the control path must never be acquired by the data path for the reason
130 130 * mentioned below.
131 131 *
132 132 * (Assume that a control operation that holds a client lock blocks in the
133 133 * mac layer waiting for upcall reference counts to drop to zero. If an upcall
134 134 * data thread that holds this reference count subsequently tries to acquire
135 135 * the same client lock, it will deadlock).
136 136 *
137 137 * A MAC client may follow either the R3 model or the R4 model, but can't
138 138 * mix both. In the former, the hierarchy is Perim -> client locks, but in
139 139 * the latter it is client locks -> Perim.
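 *
 * A hedged sketch of the R3 model (hypothetical client code, not part of
 * this change); the client holds none of its own locks across the block,
 * so the hierarchy stays Perim -> client locks:
 *
 *	mac_perim_handle_t mph;
 *
 *	mac_perim_enter_by_mh(mh, &mph);	serialize control operations
 *	err = mac_unicast_add(mch, ...);	whole block is atomic
 *	err = mac_multicast_add(mch, ...);
 *	mac_perim_exit(mph);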
140 140 *
141 141 * R5. MAC clients must make MAC calls (excluding data calls) in a cv_wait'able
142 142 * context since they may block while trying to acquire the perimeter.
143 143 * In addition some calls may block waiting for upcall refcnts to come down to
144 144 * zero.
145 145 *
146 146 * R6. MAC clients must make sure that they are single threaded and all threads
147 147 * from the top (in particular data threads) have finished before calling
148 148 * mac_client_close. The MAC framework does not track the number of client
149 149 * threads using the mac client handle. Also mac clients must make sure
150 150 * they have undone all the control operations before calling mac_client_close.
151 151 * For example mac_unicast_remove/mac_multicast_remove to undo the corresponding
152 152 * mac_unicast_add/mac_multicast_add.
153 153 *
154 154 * MAC framework rules
155 155 * -------------------
156 156 *
157 157 * R7. The mac layer itself must not hold any mac layer locks (except the mac
158 158 * perimeter) across a call to any other layer from the mac layer. The call to
159 159 * any other layer could be via mi_* entry points, classifier entry points into
160 160 * the driver or via upcall pointers into layers above. The mac perimeter may
161 161 * be acquired or held only in the down direction, e.g. when calling into
162 162 * an mi_* driver entry point to provide atomicity of the operation.
163 163 *
164 164 * R8. Since it is not guaranteed (see R14) that drivers won't hold locks across
165 165 * mac driver interfaces, the MAC layer must provide a cut out for control
166 166 * interfaces like upcall notifications and start them in a separate thread.
167 167 *
168 168 * R9. Note that locking order also implies a plumbing order. For example
169 169 * VNICs are allowed to be created over aggrs, but not vice-versa. An attempt
170 170 * to plumb in any other order must be failed at mac_open time, otherwise it
171 171 * could lead to deadlocks due to inverse locking order.
172 172 *
173 173 * R10. MAC driver interfaces must not block since the driver could call them
174 174 * in interrupt context.
175 175 *
176 176 * R11. Walkers must preferably not hold any locks while calling walker
177 177 * callbacks. Instead these can operate on reference counts. In simple
178 178 * callbacks it may be ok to hold a lock and call the callbacks, but this is
179 179 * harder to maintain in the general case of arbitrary callbacks.
180 180 *
181 181 * R12. The MAC layer must protect upcall notification callbacks using reference
182 182 * counts rather than holding locks across the callbacks.
183 183 *
184 184 * R13. Given the variety of drivers, it is preferable if the MAC layer can make
185 185 * sure that any pointers (such as mac ring pointers) it passes to the driver
186 186 * remain valid until mac unregister time. Currently the mac layer achieves
187 187 * this by using generation numbers for rings and freeing the mac rings only
188 188 * at unregister time. The MAC layer must provide a layer of indirection and
189 189 * must not expose underlying driver rings or driver data structures/pointers
190 190 * directly to MAC clients.
191 191 *
192 192 * MAC driver rules
193 193 * ----------------
194 194 *
195 195 * R14. It would be preferable if MAC drivers don't hold any locks across any
196 196 * mac call. However at a minimum they must not hold any locks across data
197 197 * upcalls. They must also make sure that all references to mac data structures
198 198 * are cleaned up and that it is single threaded at mac_unregister time.
199 199 *
200 200 * R15. MAC driver interfaces don't block and so the action may be done
201 201 * asynchronously in a separate thread as for example handling notifications.
202 202 * The driver must not assume that the action is complete when the call
203 203 * returns.
204 204 *
205 205 * R16. Drivers must maintain a generation number per Rx ring, and pass it
206 206 * back to mac_rx_ring(). They are expected to increment the generation
207 207 * number whenever the ring's stop routine is invoked.
208 208 * See comments in mac_rx_ring().
209 209 *
210 210 * R17. Similarly mi_stop is another synchronization point and the driver must
211 211 * ensure that all upcalls are done and there won't be any future upcall
212 212 * before returning from mi_stop.
213 213 *
214 214 * R18. The driver may assume that all set/modify control operations via
215 215 * the mi_* entry points are single threaded on a per mac end point.
216 216 *
217 217 * Lock and Perimeter hierarchy scenarios
218 218 * ---------------------------------------
219 219 *
220 220 * i_mac_impl_lock -> mi_rw_lock -> srs_lock -> s_ring_lock[i_mac_tx_srs_notify]
221 221 *
222 222 * ft_lock -> fe_lock [mac_flow_lookup]
223 223 *
224 224 * mi_rw_lock -> fe_lock [mac_bcast_send]
225 225 *
226 226 * srs_lock -> mac_bw_lock [mac_rx_srs_drain_bw]
227 227 *
228 228 * cpu_lock -> mac_srs_g_lock -> srs_lock -> s_ring_lock [mac_walk_srs_and_bind]
229 229 *
230 230 * i_dls_devnet_lock -> mac layer locks [dls_devnet_rename]
231 231 *
232 232 * Perimeters are ordered P1 -> P2 -> P3 from top to bottom in order of mac
233 233 * client to driver. In the case of clients that explicitly use the mac
234 234 * provided perimeter mechanism for their serialization, the hierarchy is
235 235 * Perimeter -> mac layer locks, since the client never holds any locks across
236 236 * the mac calls. In the case of clients that use their own locks the hierarchy
237 237 * is Client locks -> Mac Perim -> Mac layer locks. The client never explicitly
238 238 * calls mac_perim_enter/exit in this case.
239 239 *
240 240 * Subflow creation rules
241 241 * ---------------------------
242 242 * o If a user-specified cpulist is present on both the underlying link and
243 243 * a flow, the flow's cpulist must be a subset of the underlying link's.
244 244 * o If a user-specified fanout mode is present on both link and flow, the
245 245 * subflow fanout count has to be less than or equal to that of the
246 246 * underlying link. The cpu-bindings for the subflows will be a subset of
247 247 * those of the underlying link.
248 248 * o If no cpulist is specified on either the underlying link or the flow,
249 249 * the underlying link relies on a MAC tunable to provide out-of-the-box
250 250 * fanout. The subflow will have no cpulist (the subflow will be unbound).
251 251 * o If no cpulist is specified on the underlying link, a subflow can
252 252 * carry either a user-specified cpulist or fanout count. The cpu-bindings
253 253 * for the subflow need not adhere to the restriction that they be a
254 254 * subset of the underlying link's.
255 255 * o If the underlying link carries either a user-specified cpulist or
256 256 * fanout mode and the subflow is unspecified, the subflow will be
257 257 * created unbound.
258 258 * o While creating unbound subflows, bandwidth mode changes attempt to
259 259 * figure out the right fanout count. In such cases the fanout count will
260 260 * override the unbound cpu-binding behavior.
261 261 * o In addition, while cycling between flow and link properties, we
262 262 * impose the restriction that if a link has a subflow with
263 263 * user-specified attributes, we will not allow changing the link property.
264 264 * The administrator needs to reset all the user-specified properties on
265 265 * the subflows before attempting a link property change.
266 266 * Some of the above rules can be overridden by specifying additional command
267 267 * line options while creating or modifying link or subflow properties.
268 268 *
269 269 * Datapath
270 270 * --------
271 271 *
272 272 * For information on the datapath, the world of soft rings, hardware rings, how
273 273 * it is structured, and the path of an mblk_t between a driver and a mac
274 274 * client, see mac_sched.c.
275 275 */
276 276
277 277 #include <sys/types.h>
278 278 #include <sys/conf.h>
279 279 #include <sys/id_space.h>
280 280 #include <sys/esunddi.h>
281 281 #include <sys/stat.h>
282 282 #include <sys/mkdev.h>
283 283 #include <sys/stream.h>
284 284 #include <sys/strsun.h>
285 285 #include <sys/strsubr.h>
286 286 #include <sys/dlpi.h>
287 287 #include <sys/list.h>
288 288 #include <sys/modhash.h>
289 289 #include <sys/mac_provider.h>
290 290 #include <sys/mac_client_impl.h>
291 291 #include <sys/mac_soft_ring.h>
292 292 #include <sys/mac_stat.h>
293 293 #include <sys/mac_impl.h>
294 294 #include <sys/mac.h>
295 295 #include <sys/dls.h>
296 296 #include <sys/dld.h>
297 297 #include <sys/modctl.h>
298 298 #include <sys/fs/dv_node.h>
299 299 #include <sys/thread.h>
300 300 #include <sys/proc.h>
301 301 #include <sys/callb.h>
302 302 #include <sys/cpuvar.h>
303 303 #include <sys/atomic.h>
304 304 #include <sys/bitmap.h>
305 305 #include <sys/sdt.h>
306 306 #include <sys/mac_flow.h>
307 307 #include <sys/ddi_intr_impl.h>
308 308 #include <sys/disp.h>
309 309 #include <sys/sdt.h>
310 310 #include <sys/vnic.h>
311 311 #include <sys/vnic_impl.h>
312 312 #include <sys/vlan.h>
313 313 #include <inet/ip.h>
314 314 #include <inet/ip6.h>
315 315 #include <sys/exacct.h>
316 316 #include <sys/exacct_impl.h>
317 317 #include <inet/nd.h>
318 318 #include <sys/ethernet.h>
319 319 #include <sys/pool.h>
320 320 #include <sys/pool_pset.h>
321 321 #include <sys/cpupart.h>
322 322 #include <inet/wifi_ioctl.h>
323 323 #include <net/wpa.h>
324 324
325 325 #define IMPL_HASHSZ 67 /* prime */
326 326
327 327 kmem_cache_t *i_mac_impl_cachep;
328 328 mod_hash_t *i_mac_impl_hash;
329 329 krwlock_t i_mac_impl_lock;
330 330 uint_t i_mac_impl_count;
331 331 static kmem_cache_t *mac_ring_cache;
332 332 static id_space_t *minor_ids;
333 333 static uint32_t minor_count;
334 334 static pool_event_cb_t mac_pool_event_reg;
335 335
336 336 /*
337 337 * Logging stuff. Perhaps mac_logging_interval could be broken into
338 338 * mac_flow_log_interval and mac_link_log_interval if we want to be
339 339 * able to schedule them differently.
340 340 */
341 341 uint_t mac_logging_interval;
342 342 boolean_t mac_flow_log_enable;
343 343 boolean_t mac_link_log_enable;
344 344 timeout_id_t mac_logging_timer;
345 345
346 346 #define MACTYPE_KMODDIR "mac"
347 347 #define MACTYPE_HASHSZ 67
348 348 static mod_hash_t *i_mactype_hash;
349 349 /*
350 350 * i_mactype_lock synchronizes threads that obtain references to mactype_t
351 351 * structures through i_mactype_getplugin().
352 352 */
353 353 static kmutex_t i_mactype_lock;
354 354
355 355 /*
356 356 * mac_tx_percpu_cnt
357 357 *
358 358 * Number of per cpu locks per mac_client_impl_t. Used by the transmit side
359 359 * in mac_tx to reduce lock contention. This is sized at boot time in mac_init.
360 360 * mac_tx_percpu_cnt_max is settable in /etc/system and must be a power of 2.
361 361 * Per cpu locks may be disabled by setting mac_tx_percpu_cnt_max to 1.
362 362 */
363 363 int mac_tx_percpu_cnt;
364 364 int mac_tx_percpu_cnt_max = 128;
365 365
366 366 /*
367 367 * Call back functions for the bridge module. These are guaranteed to be valid
368 368 * when holding a reference on a link or when holding mip->mi_bridge_lock and
369 369 * mi_bridge_link is non-NULL.
370 370 */
371 371 mac_bridge_tx_t mac_bridge_tx_cb;
372 372 mac_bridge_rx_t mac_bridge_rx_cb;
373 373 mac_bridge_ref_t mac_bridge_ref_cb;
374 374 mac_bridge_ls_t mac_bridge_ls_cb;
375 375
376 376 static int i_mac_constructor(void *, void *, int);
377 377 static void i_mac_destructor(void *, void *);
378 378 static int i_mac_ring_ctor(void *, void *, int);
379 379 static void i_mac_ring_dtor(void *, void *);
380 380 static mblk_t *mac_rx_classify(mac_impl_t *, mac_resource_handle_t, mblk_t *);
381 381 void mac_tx_client_flush(mac_client_impl_t *);
382 382 void mac_tx_client_block(mac_client_impl_t *);
383 383 static void mac_rx_ring_quiesce(mac_ring_t *, uint_t);
384 384 static int mac_start_group_and_rings(mac_group_t *);
385 385 static void mac_stop_group_and_rings(mac_group_t *);
386 386 static void mac_pool_event_cb(pool_event_t, int, void *);
387 387
388 388 typedef struct netinfo_s {
389 389 list_node_t ni_link;
390 390 void *ni_record;
391 391 int ni_size;
392 392 int ni_type;
393 393 } netinfo_t;
394 394
395 395 /*
396 396 * Module initialization functions.
397 397 */
398 398
399 399 void
400 400 mac_init(void)
401 401 {
402 402 mac_tx_percpu_cnt = ((boot_max_ncpus == -1) ? max_ncpus :
403 403 boot_max_ncpus);
404 404
405 405 /* Upper bound is mac_tx_percpu_cnt_max */
406 406 if (mac_tx_percpu_cnt > mac_tx_percpu_cnt_max)
407 407 mac_tx_percpu_cnt = mac_tx_percpu_cnt_max;
408 408
409 409 if (mac_tx_percpu_cnt < 1) {
410 410 /* Someone set mac_tx_percpu_cnt_max to 0 or less */
411 411 mac_tx_percpu_cnt = 1;
412 412 }
413 413
414 414 ASSERT(mac_tx_percpu_cnt >= 1);
415 415 mac_tx_percpu_cnt = (1 << highbit(mac_tx_percpu_cnt - 1));
416 416 /*
417 417 * Make it of the form 2**N - 1 in the range
418 418 * [0 .. mac_tx_percpu_cnt_max - 1]
419 419 */
420 420 mac_tx_percpu_cnt--;
421 421
422 422 i_mac_impl_cachep = kmem_cache_create("mac_impl_cache",
423 423 sizeof (mac_impl_t), 0, i_mac_constructor, i_mac_destructor,
424 424 NULL, NULL, NULL, 0);
425 425 ASSERT(i_mac_impl_cachep != NULL);
426 426
427 427 mac_ring_cache = kmem_cache_create("mac_ring_cache",
428 428 sizeof (mac_ring_t), 0, i_mac_ring_ctor, i_mac_ring_dtor, NULL,
429 429 NULL, NULL, 0);
430 430 ASSERT(mac_ring_cache != NULL);
431 431
432 432 i_mac_impl_hash = mod_hash_create_extended("mac_impl_hash",
433 433 IMPL_HASHSZ, mod_hash_null_keydtor, mod_hash_null_valdtor,
434 434 mod_hash_bystr, NULL, mod_hash_strkey_cmp, KM_SLEEP);
435 435 rw_init(&i_mac_impl_lock, NULL, RW_DEFAULT, NULL);
436 436
437 437 mac_flow_init();
438 438 mac_soft_ring_init();
439 439 mac_bcast_init();
440 440 mac_client_init();
441 441
442 442 i_mac_impl_count = 0;
443 443
444 444 i_mactype_hash = mod_hash_create_extended("mactype_hash",
445 445 MACTYPE_HASHSZ,
446 446 mod_hash_null_keydtor, mod_hash_null_valdtor,
447 447 mod_hash_bystr, NULL, mod_hash_strkey_cmp, KM_SLEEP);
448 448
449 449 /*
450 450 * Allocate an id space to manage minor numbers. The range of the
451 451 * space will be from MAC_MAX_MINOR+1 to MAC_PRIVATE_MINOR-1. This
452 452 * leaves half of the 32-bit minors available for driver private use.
(452 lines elided)
453 453 */
454 454 minor_ids = id_space_create("mac_minor_ids", MAC_MAX_MINOR+1,
455 455 MAC_PRIVATE_MINOR-1);
456 456 ASSERT(minor_ids != NULL);
457 457 minor_count = 0;
458 458
459 459 /* Let's default to 20 seconds */
460 460 mac_logging_interval = 20;
461 461 mac_flow_log_enable = B_FALSE;
462 462 mac_link_log_enable = B_FALSE;
463 - mac_logging_timer = 0;
463 + mac_logging_timer = NULL;
464 464
465 465 /* Register to be notified of noteworthy pools events */
466 466 mac_pool_event_reg.pec_func = mac_pool_event_cb;
467 467 mac_pool_event_reg.pec_arg = NULL;
468 468 pool_event_cb_register(&mac_pool_event_reg);
469 469 }
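Reviewer's note: the mac_tx_percpu_cnt sizing above rounds the CPU count up to a power of two and then subtracts one, so the result can double as a mask. A standalone userland sketch of that arithmetic (this highbit() is a stand-in for the kernel's 1-indexed highbit(); all names here are hypothetical):

	#include <stdio.h>

	/* 1-indexed position of the highest set bit; 0 when v == 0 */
	static int
	highbit(unsigned long v)
	{
		int bit;

		for (bit = 0; v != 0; bit++)
			v >>= 1;
		return (bit);
	}

	int
	main(void)
	{
		int cpus[] = { 1, 2, 6, 8, 9, 128 };
		unsigned int i;

		for (i = 0; i < sizeof (cpus) / sizeof (cpus[0]); i++) {
			int cnt = cpus[i];

			cnt = (1 << highbit(cnt - 1));	/* power of 2 >= cnt */
			cnt--;			/* 2**N - 1, usable as a mask */
			printf("%3d cpus -> mask %3d\n", cpus[i], cnt);
		}
		return (0);
	}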
470 470
471 471 int
472 472 mac_fini(void)
473 473 {
474 474
475 475 if (i_mac_impl_count > 0 || minor_count > 0)
476 476 return (EBUSY);
477 477
478 478 pool_event_cb_unregister(&mac_pool_event_reg);
479 479
480 480 id_space_destroy(minor_ids);
481 481 mac_flow_fini();
482 482
483 483 mod_hash_destroy_hash(i_mac_impl_hash);
484 484 rw_destroy(&i_mac_impl_lock);
485 485
486 486 mac_client_fini();
487 487 kmem_cache_destroy(mac_ring_cache);
488 488
489 489 mod_hash_destroy_hash(i_mactype_hash);
490 490 mac_soft_ring_finish();
491 491
492 492
493 493 return (0);
494 494 }
495 495
496 496 /*
497 497 * Initialize a GLDv3 driver's device ops. A driver that manages its own ops
498 498 * (e.g. softmac) may pass in a NULL ops argument.
499 499 */
500 500 void
501 501 mac_init_ops(struct dev_ops *ops, const char *name)
502 502 {
503 503 major_t major = ddi_name_to_major((char *)name);
504 504
505 505 /*
506 506 * By returning on error below, we are not letting the driver continue
507 507 * in an undefined context. The mac_register() function will fail if
508 508 * DN_GLDV3_DRIVER isn't set.
509 509 */
510 510 if (major == DDI_MAJOR_T_NONE)
511 511 return;
512 512 LOCK_DEV_OPS(&devnamesp[major].dn_lock);
513 513 devnamesp[major].dn_flags |= (DN_GLDV3_DRIVER | DN_NETWORK_DRIVER);
514 514 UNLOCK_DEV_OPS(&devnamesp[major].dn_lock);
515 515 if (ops != NULL)
516 516 dld_init_ops(ops, name);
517 517 }
518 518
519 519 void
520 520 mac_fini_ops(struct dev_ops *ops)
521 521 {
522 522 dld_fini_ops(ops);
523 523 }
524 524
525 525 /*ARGSUSED*/
526 526 static int
527 527 i_mac_constructor(void *buf, void *arg, int kmflag)
528 528 {
529 529 mac_impl_t *mip = buf;
530 530
531 531 bzero(buf, sizeof (mac_impl_t));
532 532
533 533 mip->mi_linkstate = LINK_STATE_UNKNOWN;
534 534
535 535 rw_init(&mip->mi_rw_lock, NULL, RW_DRIVER, NULL);
536 536 mutex_init(&mip->mi_notify_lock, NULL, MUTEX_DRIVER, NULL);
537 537 mutex_init(&mip->mi_promisc_lock, NULL, MUTEX_DRIVER, NULL);
538 538 mutex_init(&mip->mi_ring_lock, NULL, MUTEX_DEFAULT, NULL);
539 539
540 540 mip->mi_notify_cb_info.mcbi_lockp = &mip->mi_notify_lock;
541 541 cv_init(&mip->mi_notify_cb_info.mcbi_cv, NULL, CV_DRIVER, NULL);
542 542 mip->mi_promisc_cb_info.mcbi_lockp = &mip->mi_promisc_lock;
543 543 cv_init(&mip->mi_promisc_cb_info.mcbi_cv, NULL, CV_DRIVER, NULL);
544 544
545 545 mutex_init(&mip->mi_bridge_lock, NULL, MUTEX_DEFAULT, NULL);
546 546
547 547 return (0);
548 548 }
549 549
550 550 /*ARGSUSED*/
551 551 static void
552 552 i_mac_destructor(void *buf, void *arg)
553 553 {
554 554 mac_impl_t *mip = buf;
555 555 mac_cb_info_t *mcbi;
556 556
557 557 ASSERT(mip->mi_ref == 0);
558 558 ASSERT(mip->mi_active == 0);
559 559 ASSERT(mip->mi_linkstate == LINK_STATE_UNKNOWN);
560 560 ASSERT(mip->mi_devpromisc == 0);
561 561 ASSERT(mip->mi_ksp == NULL);
562 562 ASSERT(mip->mi_kstat_count == 0);
563 563 ASSERT(mip->mi_nclients == 0);
564 564 ASSERT(mip->mi_nactiveclients == 0);
565 565 ASSERT(mip->mi_single_active_client == NULL);
566 566 ASSERT(mip->mi_state_flags == 0);
567 567 ASSERT(mip->mi_factory_addr == NULL);
568 568 ASSERT(mip->mi_factory_addr_num == 0);
569 569 ASSERT(mip->mi_default_tx_ring == NULL);
570 570
571 571 mcbi = &mip->mi_notify_cb_info;
572 572 ASSERT(mcbi->mcbi_del_cnt == 0 && mcbi->mcbi_walker_cnt == 0);
573 573 ASSERT(mip->mi_notify_bits == 0);
574 574 ASSERT(mip->mi_notify_thread == NULL);
575 575 ASSERT(mcbi->mcbi_lockp == &mip->mi_notify_lock);
576 576 mcbi->mcbi_lockp = NULL;
577 577
578 578 mcbi = &mip->mi_promisc_cb_info;
579 579 ASSERT(mcbi->mcbi_del_cnt == 0 && mip->mi_promisc_list == NULL);
580 580 ASSERT(mip->mi_promisc_list == NULL);
581 581 ASSERT(mcbi->mcbi_lockp == &mip->mi_promisc_lock);
582 582 mcbi->mcbi_lockp = NULL;
583 583
584 584 ASSERT(mip->mi_bcast_ngrps == 0 && mip->mi_bcast_grp == NULL);
585 585 ASSERT(mip->mi_perim_owner == NULL && mip->mi_perim_ocnt == 0);
586 586
587 587 rw_destroy(&mip->mi_rw_lock);
588 588
589 589 mutex_destroy(&mip->mi_promisc_lock);
590 590 cv_destroy(&mip->mi_promisc_cb_info.mcbi_cv);
591 591 mutex_destroy(&mip->mi_notify_lock);
592 592 cv_destroy(&mip->mi_notify_cb_info.mcbi_cv);
593 593 mutex_destroy(&mip->mi_ring_lock);
594 594
595 595 ASSERT(mip->mi_bridge_link == NULL);
596 596 }
597 597
598 598 /* ARGSUSED */
599 599 static int
600 600 i_mac_ring_ctor(void *buf, void *arg, int kmflag)
601 601 {
602 602 mac_ring_t *ring = (mac_ring_t *)buf;
603 603
604 604 bzero(ring, sizeof (mac_ring_t));
605 605 cv_init(&ring->mr_cv, NULL, CV_DEFAULT, NULL);
606 606 mutex_init(&ring->mr_lock, NULL, MUTEX_DEFAULT, NULL);
607 607 ring->mr_state = MR_FREE;
608 608 return (0);
609 609 }
610 610
611 611 /* ARGSUSED */
612 612 static void
613 613 i_mac_ring_dtor(void *buf, void *arg)
614 614 {
615 615 mac_ring_t *ring = (mac_ring_t *)buf;
616 616
617 617 cv_destroy(&ring->mr_cv);
618 618 mutex_destroy(&ring->mr_lock);
619 619 }
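For context, a hedged sketch of a call site using this cache (hypothetical; the constructor above guarantees the cv, mutex, and MR_FREE state are already initialized when kmem hands out an object):

	mac_ring_t *ring;

	ring = kmem_cache_alloc(mac_ring_cache, KM_SLEEP);
	ASSERT(ring->mr_state == MR_FREE);

	/* ... use the ring ... */

	/* objects go back in constructed state; the dtor runs at reclaim */
	kmem_cache_free(mac_ring_cache, ring);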
620 620
621 621 /*
622 622 * Common functions to do mac callback addition and deletion. Currently this is
623 623 * used by promisc callbacks and notify callbacks. List addition and deletion
624 624 * need to take care of list walkers. List walkers, in general, can't hold list
625 625 * locks and make upcall callbacks due to potential lock order and recursive
626 626 * reentry issues. Instead list walkers increment the list walker count to mark
627 627 * the presence of a walker thread. Addition can be carefully done to ensure
628 628 * that the list walker always sees either the old list or the new list.
629 629 * However the deletion can't be done while the walker is active, instead the
630 630 * deleting thread simply marks the entry as logically deleted. The last walker
631 631 * physically deletes and frees up the logically deleted entries when the walk
632 632 * is complete.
633 633 */
634 634 void
635 635 mac_callback_add(mac_cb_info_t *mcbi, mac_cb_t **mcb_head,
636 636 mac_cb_t *mcb_elem)
637 637 {
638 638 mac_cb_t *p;
639 639 mac_cb_t **pp;
640 640
641 641 /* Verify it is not already in the list */
642 642 for (pp = mcb_head; (p = *pp) != NULL; pp = &p->mcb_nextp) {
643 643 if (p == mcb_elem)
644 644 break;
645 645 }
646 646 VERIFY(p == NULL);
647 647
648 648 /*
649 649 * Add it to the head of the callback list. The membar ensures that
650 650 * the following list pointer manipulations reach global visibility
651 651 * in exactly the program order below.
652 652 */
653 653 ASSERT(MUTEX_HELD(mcbi->mcbi_lockp));
654 654
655 655 mcb_elem->mcb_nextp = *mcb_head;
656 656 membar_producer();
657 657 *mcb_head = mcb_elem;
658 658 }
659 659
660 660 /*
661 661 * Mark the entry as logically deleted. If there aren't any walkers, unlink
662 662 * it from the list. In either case return the corresponding status.
663 663 */
664 664 boolean_t
665 665 mac_callback_remove(mac_cb_info_t *mcbi, mac_cb_t **mcb_head,
666 666 mac_cb_t *mcb_elem)
667 667 {
668 668 mac_cb_t *p;
669 669 mac_cb_t **pp;
670 670
671 671 ASSERT(MUTEX_HELD(mcbi->mcbi_lockp));
672 672 /*
673 673 * Search the callback list for the entry to be removed
674 674 */
675 675 for (pp = mcb_head; (p = *pp) != NULL; pp = &p->mcb_nextp) {
676 676 if (p == mcb_elem)
677 677 break;
678 678 }
679 679 VERIFY(p != NULL);
680 680
681 681 /*
682 682 * If there are walkers, just mark it as deleted and the last walker
683 683 * will remove it from the list and free it.
684 684 */
685 685 if (mcbi->mcbi_walker_cnt != 0) {
686 686 p->mcb_flags |= MCB_CONDEMNED;
687 687 mcbi->mcbi_del_cnt++;
688 688 return (B_FALSE);
689 689 }
690 690
691 691 ASSERT(mcbi->mcbi_del_cnt == 0);
692 692 *pp = p->mcb_nextp;
693 693 p->mcb_nextp = NULL;
694 694 return (B_TRUE);
695 695 }
696 696
697 697 /*
698 698 * Wait for all pending callback removals to be completed
699 699 */
700 700 void
701 701 mac_callback_remove_wait(mac_cb_info_t *mcbi)
702 702 {
703 703 ASSERT(MUTEX_HELD(mcbi->mcbi_lockp));
704 704 while (mcbi->mcbi_del_cnt != 0) {
705 705 DTRACE_PROBE1(need_wait, mac_cb_info_t *, mcbi);
706 706 cv_wait(&mcbi->mcbi_cv, mcbi->mcbi_lockp);
707 707 }
708 708 }
709 709
710 710 /*
711 711 * The last mac callback walker does the cleanup. Walk the list and unlink
712 712 * all the logically deleted entries and construct a temporary list of
713 713 * removed entries. Return the list of removed entries to the caller.
714 714 */
715 715 mac_cb_t *
716 716 mac_callback_walker_cleanup(mac_cb_info_t *mcbi, mac_cb_t **mcb_head)
717 717 {
718 718 mac_cb_t *p;
719 719 mac_cb_t **pp;
720 720 mac_cb_t *rmlist = NULL; /* List of removed elements */
721 721 int cnt = 0;
722 722
723 723 ASSERT(MUTEX_HELD(mcbi->mcbi_lockp));
724 724 ASSERT(mcbi->mcbi_del_cnt != 0 && mcbi->mcbi_walker_cnt == 0);
725 725
726 726 pp = mcb_head;
727 727 while (*pp != NULL) {
728 728 if ((*pp)->mcb_flags & MCB_CONDEMNED) {
729 729 p = *pp;
730 730 *pp = p->mcb_nextp;
731 731 p->mcb_nextp = rmlist;
732 732 rmlist = p;
733 733 cnt++;
734 734 continue;
735 735 }
736 736 pp = &(*pp)->mcb_nextp;
737 737 }
738 738
739 739 ASSERT(mcbi->mcbi_del_cnt == cnt);
740 740 mcbi->mcbi_del_cnt = 0;
741 741 return (rmlist);
742 742 }
743 743
744 744 boolean_t
745 745 mac_callback_lookup(mac_cb_t **mcb_headp, mac_cb_t *mcb_elem)
746 746 {
747 747 mac_cb_t *mcb;
748 748
749 749 /* Check whether the entry is in the list */
750 750 for (mcb = *mcb_headp; mcb != NULL; mcb = mcb->mcb_nextp) {
751 751 if (mcb == mcb_elem)
752 752 return (B_TRUE);
753 753 }
754 754
755 755 return (B_FALSE);
756 756 }
757 757
758 758 boolean_t
759 759 mac_callback_find(mac_cb_info_t *mcbi, mac_cb_t **mcb_headp, mac_cb_t *mcb_elem)
760 760 {
761 761 boolean_t found;
762 762
763 763 mutex_enter(mcbi->mcbi_lockp);
764 764 found = mac_callback_lookup(mcb_headp, mcb_elem);
765 765 mutex_exit(mcbi->mcbi_lockp);
766 766
767 767 return (found);
768 768 }
769 769
770 770 /* Free the list of removed callbacks */
771 771 void
772 772 mac_callback_free(mac_cb_t *rmlist)
773 773 {
774 774 mac_cb_t *mcb;
775 775 mac_cb_t *mcb_next;
776 776
777 777 for (mcb = rmlist; mcb != NULL; mcb = mcb_next) {
778 778 mcb_next = mcb->mcb_nextp;
779 779 kmem_free(mcb->mcb_objp, mcb->mcb_objsize);
780 780 }
781 781 }
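Taken together, a hedged sketch of the walker pattern these primitives support (hypothetical function; the real walkers live in the promisc and notify paths):

	static void
	example_callback_walk(mac_cb_info_t *mcbi, mac_cb_t **mcb_headp)
	{
		mac_cb_t *mcb;
		mac_cb_t *rmlist = NULL;

		/* Mark a walker present so removals are deferred */
		mutex_enter(mcbi->mcbi_lockp);
		mcbi->mcbi_walker_cnt++;
		mutex_exit(mcbi->mcbi_lockp);

		/* Walk without the lock; additions are safe via the membar */
		for (mcb = *mcb_headp; mcb != NULL; mcb = mcb->mcb_nextp) {
			if (mcb->mcb_flags & MCB_CONDEMNED)
				continue;	/* logically deleted */
			/* ... invoke the callback in mcb->mcb_objp ... */
		}

		/* Last walker reaps deleted entries and wakes removers */
		mutex_enter(mcbi->mcbi_lockp);
		if (--mcbi->mcbi_walker_cnt == 0 && mcbi->mcbi_del_cnt != 0) {
			rmlist = mac_callback_walker_cleanup(mcbi, mcb_headp);
			cv_broadcast(&mcbi->mcbi_cv);
		}
		mutex_exit(mcbi->mcbi_lockp);

		mac_callback_free(rmlist);	/* no-op when rmlist is NULL */
	}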
782 782
783 783 /*
784 784 * The promisc callbacks are in 2 lists, one off the 'mip' and another off the
785 785 * 'mcip' threaded by mpi_mi_link and mpi_mci_link respectively. However there
786 786 * is only a single shared total walker count, and an entry can't be physically
787 787 * unlinked if a walker is active on either list. The last walker does this
788 788 * cleanup of logically deleted entries.
789 789 */
790 790 void
791 791 i_mac_promisc_walker_cleanup(mac_impl_t *mip)
792 792 {
793 793 mac_cb_t *rmlist;
794 794 mac_cb_t *mcb;
795 795 mac_cb_t *mcb_next;
796 796 mac_promisc_impl_t *mpip;
797 797
798 798 /*
799 799 * Construct a temporary list of deleted callbacks by walking the
800 800 * mi_promisc_list. Then for each entry in the temporary list,
801 801 * remove it from the mci_promisc_list and free the entry.
802 802 */
803 803 rmlist = mac_callback_walker_cleanup(&mip->mi_promisc_cb_info,
804 804 &mip->mi_promisc_list);
805 805
806 806 for (mcb = rmlist; mcb != NULL; mcb = mcb_next) {
807 807 mcb_next = mcb->mcb_nextp;
808 808 mpip = (mac_promisc_impl_t *)mcb->mcb_objp;
809 809 VERIFY(mac_callback_remove(&mip->mi_promisc_cb_info,
810 810 &mpip->mpi_mcip->mci_promisc_list, &mpip->mpi_mci_link));
811 811 mcb->mcb_flags = 0;
812 812 mcb->mcb_nextp = NULL;
813 813 kmem_cache_free(mac_promisc_impl_cache, mpip);
814 814 }
815 815 }
816 816
817 817 void
818 818 i_mac_notify(mac_impl_t *mip, mac_notify_type_t type)
819 819 {
820 820 mac_cb_info_t *mcbi;
821 821
822 822 /*
823 823 * Signal the notify thread even after mi_ref has become zero and
824 824 * mi_disabled is set. The synchronization with the notify thread
825 825 * happens in mac_unregister and that implies the driver must make
826 826 * sure it is single-threaded (with respect to mac calls) and that
827 827 * all pending mac calls have returned before it calls mac_unregister
828 828 */
829 829 rw_enter(&i_mac_impl_lock, RW_READER);
830 830 if (mip->mi_state_flags & MIS_DISABLED)
831 831 goto exit;
832 832
833 833 /*
834 834 * Guard against incorrect notifications. (Running a newer
835 835 * mac client against an older implementation?)
836 836 */
837 837 if (type >= MAC_NNOTE)
838 838 goto exit;
839 839
840 840 mcbi = &mip->mi_notify_cb_info;
841 841 mutex_enter(mcbi->mcbi_lockp);
842 842 mip->mi_notify_bits |= (1 << type);
843 843 cv_broadcast(&mcbi->mcbi_cv);
844 844 mutex_exit(mcbi->mcbi_lockp);
845 845
846 846 exit:
847 847 rw_exit(&i_mac_impl_lock);
848 848 }
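A hedged sketch of the consuming side (hypothetical; the real notify thread lives elsewhere in the framework), draining the bits posted under mcbi_lockp above:

	static void
	example_notify_drain(mac_impl_t *mip)
	{
		mac_cb_info_t *mcbi = &mip->mi_notify_cb_info;
		mac_notify_type_t type;
		uint32_t bits;

		mutex_enter(mcbi->mcbi_lockp);
		while (mip->mi_notify_bits == 0)
			cv_wait(&mcbi->mcbi_cv, mcbi->mcbi_lockp);
		bits = mip->mi_notify_bits;	/* snapshot, then clear */
		mip->mi_notify_bits = 0;
		mutex_exit(mcbi->mcbi_lockp);

		for (type = 0; type < MAC_NNOTE; type++) {
			if (bits & (1 << type)) {
				/* deliver the MAC_NOTE_* callbacks for type */
			}
		}
	}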
849 849
850 850 /*
851 851 * Mac serialization primitives. Please see the block comment at the
852 852 * top of the file.
853 853 */
854 854 void
855 855 i_mac_perim_enter(mac_impl_t *mip)
856 856 {
857 857 mac_client_impl_t *mcip;
858 858
859 859 if (mip->mi_state_flags & MIS_IS_VNIC) {
860 860 /*
861 861 * This is a VNIC. Use the lower mac, since that is what
862 862 * we want to serialize on.
863 863 */
864 864 mcip = mac_vnic_lower(mip);
865 865 mip = mcip->mci_mip;
866 866 }
867 867
868 868 mutex_enter(&mip->mi_perim_lock);
869 869 if (mip->mi_perim_owner == curthread) {
870 870 mip->mi_perim_ocnt++;
871 871 mutex_exit(&mip->mi_perim_lock);
872 872 return;
873 873 }
874 874
875 875 while (mip->mi_perim_owner != NULL)
876 876 cv_wait(&mip->mi_perim_cv, &mip->mi_perim_lock);
877 877
878 878 mip->mi_perim_owner = curthread;
879 879 ASSERT(mip->mi_perim_ocnt == 0);
880 880 mip->mi_perim_ocnt++;
881 881 #ifdef DEBUG
882 882 mip->mi_perim_stack_depth = getpcstack(mip->mi_perim_stack,
883 883 MAC_PERIM_STACK_DEPTH);
884 884 #endif
885 885 mutex_exit(&mip->mi_perim_lock);
886 886 }
887 887
888 888 int
889 889 i_mac_perim_enter_nowait(mac_impl_t *mip)
890 890 {
891 891 /*
892 892 * The vnic is a special case, since the serialization is done based
893 893 * on the lower mac. If the lower mac is busy, it does not imply the
894 894 * vnic can't be unregistered. But in the case of other drivers,
895 895 * a busy perimeter or open mac handles implies that the mac is busy
896 896 * and can't be unregistered.
897 897 */
898 898 if (mip->mi_state_flags & MIS_IS_VNIC) {
899 899 i_mac_perim_enter(mip);
900 900 return (0);
901 901 }
902 902
903 903 mutex_enter(&mip->mi_perim_lock);
904 904 if (mip->mi_perim_owner != NULL) {
905 905 mutex_exit(&mip->mi_perim_lock);
906 906 return (EBUSY);
907 907 }
908 908 ASSERT(mip->mi_perim_ocnt == 0);
909 909 mip->mi_perim_owner = curthread;
910 910 mip->mi_perim_ocnt++;
911 911 mutex_exit(&mip->mi_perim_lock);
912 912
913 913 return (0);
914 914 }
915 915
916 916 void
917 917 i_mac_perim_exit(mac_impl_t *mip)
918 918 {
919 919 mac_client_impl_t *mcip;
920 920
921 921 if (mip->mi_state_flags & MIS_IS_VNIC) {
922 922 /*
923 923 * This is a VNIC. Use the lower mac, since that is what
924 924 * we want to serialize on.
925 925 */
926 926 mcip = mac_vnic_lower(mip);
927 927 mip = mcip->mci_mip;
928 928 }
929 929
930 930 ASSERT(mip->mi_perim_owner == curthread && mip->mi_perim_ocnt != 0);
931 931
932 932 mutex_enter(&mip->mi_perim_lock);
933 933 if (--mip->mi_perim_ocnt == 0) {
934 934 mip->mi_perim_owner = NULL;
935 935 cv_signal(&mip->mi_perim_cv);
936 936 }
937 937 mutex_exit(&mip->mi_perim_lock);
938 938 }
939 939
940 940 /*
941 941 * Returns whether the current thread holds the mac perimeter. Used in making
942 942 * assertions.
943 943 */
944 944 boolean_t
945 945 mac_perim_held(mac_handle_t mh)
946 946 {
947 947 mac_impl_t *mip = (mac_impl_t *)mh;
948 948 mac_client_impl_t *mcip;
949 949
950 950 if (mip->mi_state_flags & MIS_IS_VNIC) {
951 951 /*
952 952 * This is a VNIC. Use the lower mac, since that is what
953 953 * we want to serialize on.
954 954 */
955 955 mcip = mac_vnic_lower(mip);
956 956 mip = mcip->mci_mip;
957 957 }
958 958 return (mip->mi_perim_owner == curthread);
959 959 }
960 960
961 961 /*
962 962 * mac client interfaces to enter the mac perimeter of a mac end point, given
963 963 * its mac handle, macname, or linkid.
964 964 */
965 965 void
966 966 mac_perim_enter_by_mh(mac_handle_t mh, mac_perim_handle_t *mphp)
967 967 {
968 968 mac_impl_t *mip = (mac_impl_t *)mh;
969 969
970 970 i_mac_perim_enter(mip);
971 971 /*
972 972 * The mac_perim_handle_t returned encodes the 'mip' and whether a
973 973 * mac_open has been done internally while entering the perimeter.
974 974 * This information is used in mac_perim_exit
975 975 */
976 976 MAC_ENCODE_MPH(*mphp, mip, 0);
977 977 }
978 978
979 979 int
980 980 mac_perim_enter_by_macname(const char *name, mac_perim_handle_t *mphp)
981 981 {
982 982 int err;
983 983 mac_handle_t mh;
984 984
985 985 if ((err = mac_open(name, &mh)) != 0)
986 986 return (err);
987 987
988 988 mac_perim_enter_by_mh(mh, mphp);
989 989 MAC_ENCODE_MPH(*mphp, mh, 1);
990 990 return (0);
991 991 }
992 992
993 993 int
994 994 mac_perim_enter_by_linkid(datalink_id_t linkid, mac_perim_handle_t *mphp)
995 995 {
996 996 int err;
997 997 mac_handle_t mh;
998 998
999 999 if ((err = mac_open_by_linkid(linkid, &mh)) != 0)
1000 1000 return (err);
1001 1001
1002 1002 mac_perim_enter_by_mh(mh, mphp);
1003 1003 MAC_ENCODE_MPH(*mphp, mh, 1);
1004 1004 return (0);
1005 1005 }
1006 1006
1007 1007 void
1008 1008 mac_perim_exit(mac_perim_handle_t mph)
1009 1009 {
1010 1010 mac_impl_t *mip;
1011 1011 boolean_t need_close;
1012 1012
1013 1013 MAC_DECODE_MPH(mph, mip, need_close);
1014 1014 i_mac_perim_exit(mip);
1015 1015 if (need_close)
1016 1016 mac_close((mac_handle_t)mip);
1017 1017 }
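A hedged usage sketch of the by-name variant (hypothetical caller): the handle encodes the internal mac_open(), so mac_perim_exit() drops that hold too:

	static int
	example_perim_by_name(const char *name)
	{
		mac_perim_handle_t mph;
		int err;

		if ((err = mac_perim_enter_by_macname(name, &mph)) != 0)
			return (err);

		/* ... control operations, atomic as a whole ... */

		mac_perim_exit(mph);	/* also mac_close()s the hold */
		return (0);
	}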
1018 1018
1019 1019 int
1020 1020 mac_hold(const char *macname, mac_impl_t **pmip)
1021 1021 {
1022 1022 mac_impl_t *mip;
1023 1023 int err;
1024 1024
1025 1025 /*
1026 1026 * Check the device name length to make sure it won't overflow our
1027 1027 * buffer.
1028 1028 */
1029 1029 if (strlen(macname) >= MAXNAMELEN)
1030 1030 return (EINVAL);
1031 1031
1032 1032 /*
1033 1033 * Look up its entry in the global hash table.
1034 1034 */
1035 1035 rw_enter(&i_mac_impl_lock, RW_WRITER);
1036 1036 err = mod_hash_find(i_mac_impl_hash, (mod_hash_key_t)macname,
1037 1037 (mod_hash_val_t *)&mip);
1038 1038
1039 1039 if (err != 0) {
1040 1040 rw_exit(&i_mac_impl_lock);
1041 1041 return (ENOENT);
1042 1042 }
1043 1043
1044 1044 if (mip->mi_state_flags & MIS_DISABLED) {
1045 1045 rw_exit(&i_mac_impl_lock);
1046 1046 return (ENOENT);
1047 1047 }
1048 1048
1049 1049 if (mip->mi_state_flags & MIS_EXCLUSIVE_HELD) {
1050 1050 rw_exit(&i_mac_impl_lock);
1051 1051 return (EBUSY);
1052 1052 }
1053 1053
1054 1054 mip->mi_ref++;
1055 1055 rw_exit(&i_mac_impl_lock);
1056 1056
1057 1057 *pmip = mip;
1058 1058 return (0);
1059 1059 }
1060 1060
1061 1061 void
1062 1062 mac_rele(mac_impl_t *mip)
1063 1063 {
1064 1064 rw_enter(&i_mac_impl_lock, RW_WRITER);
1065 1065 ASSERT(mip->mi_ref != 0);
1066 1066 if (--mip->mi_ref == 0) {
1067 1067 ASSERT(mip->mi_nactiveclients == 0 &&
1068 1068 !(mip->mi_state_flags & MIS_EXCLUSIVE));
1069 1069 }
1070 1070 rw_exit(&i_mac_impl_lock);
1071 1071 }
1072 1072
1073 1073 /*
1074 1074 * Private GLDv3 function to start a MAC instance.
1075 1075 */
1076 1076 int
1077 1077 mac_start(mac_handle_t mh)
1078 1078 {
1079 1079 mac_impl_t *mip = (mac_impl_t *)mh;
1080 1080 int err = 0;
1081 1081 mac_group_t *defgrp;
1082 1082
1083 1083 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
1084 1084 ASSERT(mip->mi_start != NULL);
1085 1085
1086 1086 /*
1087 1087 * Check whether the device is already started.
1088 1088 */
1089 1089 if (mip->mi_active++ == 0) {
1090 1090 mac_ring_t *ring = NULL;
1091 1091
1092 1092 /*
1093 1093 * Start the device.
1094 1094 */
1095 1095 err = mip->mi_start(mip->mi_driver);
1096 1096 if (err != 0) {
1097 1097 mip->mi_active--;
1098 1098 return (err);
1099 1099 }
1100 1100
1101 1101 /*
1102 1102 * Start the default tx ring.
1103 1103 */
1104 1104 if (mip->mi_default_tx_ring != NULL) {
1105 1105
1106 1106 ring = (mac_ring_t *)mip->mi_default_tx_ring;
1107 1107 if (ring->mr_state != MR_INUSE) {
(634 lines elided)
1108 1108 err = mac_start_ring(ring);
1109 1109 if (err != 0) {
1110 1110 mip->mi_active--;
1111 1111 return (err);
1112 1112 }
1113 1113 }
1114 1114 }
1115 1115
1116 1116 if ((defgrp = MAC_DEFAULT_RX_GROUP(mip)) != NULL) {
1117 1117 /*
1118 - * Start the default ring, since it will be needed
1119 - * to receive broadcast and multicast traffic for
1120 - * both primary and non-primary MAC clients.
1118 + * Start the default group which is responsible
1119 + * for receiving broadcast and multicast
1120 + * traffic for both primary and non-primary
1121 + * MAC clients.
1121 1122 */
1122 1123 ASSERT(defgrp->mrg_state == MAC_GROUP_STATE_REGISTERED);
1123 1124 err = mac_start_group_and_rings(defgrp);
1124 1125 if (err != 0) {
1125 1126 mip->mi_active--;
1126 1127 if ((ring != NULL) &&
1127 1128 (ring->mr_state == MR_INUSE))
1128 1129 mac_stop_ring(ring);
1129 1130 return (err);
1130 1131 }
1131 1132 mac_set_group_state(defgrp, MAC_GROUP_STATE_SHARED);
1132 1133 }
1133 1134 }
1134 1135
1135 1136 return (err);
1136 1137 }
1137 1138
1138 1139 /*
1139 1140 * Private GLDv3 function to stop a MAC instance.
1140 1141 */
1141 1142 void
1142 1143 mac_stop(mac_handle_t mh)
1143 1144 {
1144 1145 mac_impl_t *mip = (mac_impl_t *)mh;
1145 1146 mac_group_t *grp;
1146 1147
1147 1148 ASSERT(mip->mi_stop != NULL);
1148 1149 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
1149 1150
1150 1151 /*
1151 1152 * Check whether the device is still needed.
1152 1153 */
1153 1154 ASSERT(mip->mi_active != 0);
1154 1155 if (--mip->mi_active == 0) {
1155 1156 if ((grp = MAC_DEFAULT_RX_GROUP(mip)) != NULL) {
1156 1157 /*
1157 1158 * There should be no more active clients since the
1158 1159 * MAC is being stopped. Stop the default RX group
1159 1160 * and transition it back to registered state.
1160 1161 *
1161 1162 * When clients are torn down, the groups
1162 1163 * are released via mac_release_rx_group, which
1163 1164 * knows that the default group is always in
1164 1165 * started mode since broadcast uses it. So
1165 1166 * we can assert that there are no clients
1166 1167 * (since mac_bcast_add doesn't register itself
1167 1168 * as a client) and that the group is in SHARED state.
1168 1169 */
1169 1170 ASSERT(grp->mrg_state == MAC_GROUP_STATE_SHARED);
1170 1171 ASSERT(MAC_GROUP_NO_CLIENT(grp) &&
1171 1172 mip->mi_nactiveclients == 0);
1172 1173 mac_stop_group_and_rings(grp);
1173 1174 mac_set_group_state(grp, MAC_GROUP_STATE_REGISTERED);
1174 1175 }
1175 1176
1176 1177 if (mip->mi_default_tx_ring != NULL) {
1177 1178 mac_ring_t *ring;
1178 1179
1179 1180 ring = (mac_ring_t *)mip->mi_default_tx_ring;
1180 1181 if (ring->mr_state == MR_INUSE) {
1181 1182 mac_stop_ring(ring);
1182 1183 ring->mr_flag = 0;
1183 1184 }
1184 1185 }
1185 1186
1186 1187 /*
1187 1188 * Stop the device.
1188 1189 */
1189 1190 mip->mi_stop(mip->mi_driver);
1190 1191 }
1191 1192 }
1192 1193
1193 1194 int
1194 1195 i_mac_promisc_set(mac_impl_t *mip, boolean_t on)
1195 1196 {
1196 1197 int err = 0;
1197 1198
1198 1199 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
1199 1200 ASSERT(mip->mi_setpromisc != NULL);
1200 1201
1201 1202 if (on) {
1202 1203 /*
1203 1204 * Enable promiscuous mode on the device if not yet enabled.
1204 1205 */
1205 1206 if (mip->mi_devpromisc++ == 0) {
1206 1207 err = mip->mi_setpromisc(mip->mi_driver, B_TRUE);
1207 1208 if (err != 0) {
1208 1209 mip->mi_devpromisc--;
1209 1210 return (err);
1210 1211 }
1211 1212 i_mac_notify(mip, MAC_NOTE_DEVPROMISC);
1212 1213 }
1213 1214 } else {
1214 1215 if (mip->mi_devpromisc == 0)
1215 1216 return (EPROTO);
1216 1217
1217 1218 /*
1218 1219 * Disable promiscuous mode on the device if this is the last
1219 1220 * enabling.
1220 1221 */
1221 1222 if (--mip->mi_devpromisc == 0) {
1222 1223 err = mip->mi_setpromisc(mip->mi_driver, B_FALSE);
1223 1224 if (err != 0) {
1224 1225 mip->mi_devpromisc++;
1225 1226 return (err);
1226 1227 }
1227 1228 i_mac_notify(mip, MAC_NOTE_DEVPROMISC);
1228 1229 }
1229 1230 }
1230 1231
1231 1232 return (0);
1232 1233 }
1233 1234
1234 1235 /*
1235 1236 * The promiscuity state can change any time. If the caller needs to take
1236 1237 * actions that are atomic with the promiscuity state, then the caller needs
1237 1238 * to bracket the entire sequence with mac_perim_enter/exit
1238 1239 */
1239 1240 boolean_t
1240 1241 mac_promisc_get(mac_handle_t mh)
1241 1242 {
1242 1243 mac_impl_t *mip = (mac_impl_t *)mh;
1243 1244
1244 1245 /*
1245 1246 * Return the current promiscuity.
1246 1247 */
1247 1248 return (mip->mi_devpromisc != 0);
1248 1249 }
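Per the comment above, a hedged sketch (hypothetical helper) of bracketing a read-then-set sequence with the perimeter so it is atomic with respect to other control operations:

	static int
	example_promisc_ensure_on(mac_handle_t mh)
	{
		mac_perim_handle_t mph;
		int err = 0;

		mac_perim_enter_by_mh(mh, &mph);
		if (!mac_promisc_get(mh))
			err = i_mac_promisc_set((mac_impl_t *)mh, B_TRUE);
		mac_perim_exit(mph);

		return (err);
	}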
1249 1250
1250 1251 /*
1251 1252 * Invoked at MAC instance attach time to initialize the list
1252 1253 * of factory MAC addresses supported by a MAC instance. This function
1253 1254 * builds a local cache in the mac_impl_t for the MAC addresses
1254 1255 * supported by the underlying hardware. The MAC clients themselves
1255 1256 * use the mac_addr_factory*() functions to query and reserve
1256 1257 * factory MAC addresses.
1257 1258 */
1258 1259 void
1259 1260 mac_addr_factory_init(mac_impl_t *mip)
1260 1261 {
1261 1262 mac_capab_multifactaddr_t capab;
1262 1263 uint8_t *addr;
1263 1264 int i;
1264 1265
1265 1266 /*
1266 1267 * First round to see how many factory MAC addresses are available.
1267 1268 */
1268 1269 bzero(&capab, sizeof (capab));
1269 1270 if (!i_mac_capab_get((mac_handle_t)mip, MAC_CAPAB_MULTIFACTADDR,
1270 1271 &capab) || (capab.mcm_naddr == 0)) {
1271 1272 /*
1272 1273 * The MAC instance doesn't support multiple factory
1273 1274 * MAC addresses, we're done here.
1274 1275 */
1275 1276 return;
1276 1277 }
1277 1278
1278 1279 /*
1279 1280 * Allocate the space and get all the factory addresses.
1280 1281 */
1281 1282 addr = kmem_alloc(capab.mcm_naddr * MAXMACADDRLEN, KM_SLEEP);
1282 1283 capab.mcm_getaddr(mip->mi_driver, capab.mcm_naddr, addr);
1283 1284
1284 1285 mip->mi_factory_addr_num = capab.mcm_naddr;
1285 1286 mip->mi_factory_addr = kmem_zalloc(mip->mi_factory_addr_num *
1286 1287 sizeof (mac_factory_addr_t), KM_SLEEP);
1287 1288
1288 1289 for (i = 0; i < capab.mcm_naddr; i++) {
1289 1290 bcopy(addr + i * MAXMACADDRLEN,
1290 1291 mip->mi_factory_addr[i].mfa_addr,
1291 1292 mip->mi_type->mt_addr_length);
1292 1293 mip->mi_factory_addr[i].mfa_in_use = B_FALSE;
1293 1294 }
1294 1295
1295 1296 kmem_free(addr, capab.mcm_naddr * MAXMACADDRLEN);
1296 1297 }
1297 1298
1298 1299 void
1299 1300 mac_addr_factory_fini(mac_impl_t *mip)
1300 1301 {
1301 1302 if (mip->mi_factory_addr == NULL) {
1302 1303 ASSERT(mip->mi_factory_addr_num == 0);
1303 1304 return;
1304 1305 }
1305 1306
1306 1307 kmem_free(mip->mi_factory_addr, mip->mi_factory_addr_num *
1307 1308 sizeof (mac_factory_addr_t));
1308 1309
1309 1310 mip->mi_factory_addr = NULL;
1310 1311 mip->mi_factory_addr_num = 0;
1311 1312 }
1312 1313
1313 1314 /*
1314 1315 * Reserve a factory MAC address. If *slot is set to -1, the function
1315 1316 * attempts to reserve any of the available factory MAC addresses and
1316 1317 * returns the reserved slot id. If no slots are available, the function
1317 1318 * returns ENOSPC. If *slot is not set to -1, the function reserves
1318 1319 * the specified slot if it is available, or returns EBUSY if the slot
1319 1320 * is already used. Returns ENOTSUP if the underlying MAC does not
1320 1321 * support multiple factory addresses. If the slot number is not -1 but
1321 1322 * is invalid, returns EINVAL.
1322 1323 */
1323 1324 int
1324 1325 mac_addr_factory_reserve(mac_client_handle_t mch, int *slot)
1325 1326 {
1326 1327 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
1327 1328 mac_impl_t *mip = mcip->mci_mip;
1328 1329 int i, ret = 0;
1329 1330
1330 1331 i_mac_perim_enter(mip);
1331 1332 /*
1332 1333 * Protect against concurrent readers that may need a self-consistent
1333 1334 * view of the factory addresses
1334 1335 */
1335 1336 rw_enter(&mip->mi_rw_lock, RW_WRITER);
1336 1337
1337 1338 if (mip->mi_factory_addr_num == 0) {
1338 1339 ret = ENOTSUP;
1339 1340 goto bail;
1340 1341 }
1341 1342
1342 1343 if (*slot != -1) {
1343 1344 /* check the specified slot */
1344 1345 if (*slot < 1 || *slot > mip->mi_factory_addr_num) {
1345 1346 ret = EINVAL;
1346 1347 goto bail;
1347 1348 }
1348 1349 if (mip->mi_factory_addr[*slot-1].mfa_in_use) {
1349 1350 ret = EBUSY;
1350 1351 goto bail;
1351 1352 }
1352 1353 } else {
1353 1354 /* pick the next available slot */
1354 1355 for (i = 0; i < mip->mi_factory_addr_num; i++) {
1355 1356 if (!mip->mi_factory_addr[i].mfa_in_use)
1356 1357 break;
1357 1358 }
1358 1359
1359 1360 if (i == mip->mi_factory_addr_num) {
1360 1361 ret = ENOSPC;
1361 1362 goto bail;
1362 1363 }
1363 1364 *slot = i+1;
1364 1365 }
1365 1366
1366 1367 mip->mi_factory_addr[*slot-1].mfa_in_use = B_TRUE;
1367 1368 mip->mi_factory_addr[*slot-1].mfa_client = mcip;
1368 1369
1369 1370 bail:
1370 1371 rw_exit(&mip->mi_rw_lock);
1371 1372 i_mac_perim_exit(mip);
1372 1373 return (ret);
1373 1374 }
1374 1375
1375 1376 /*
1376 1377 * Release the specified factory MAC address slot.
1377 1378 */
1378 1379 void
1379 1380 mac_addr_factory_release(mac_client_handle_t mch, uint_t slot)
1380 1381 {
1381 1382 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
1382 1383 mac_impl_t *mip = mcip->mci_mip;
1383 1384
1384 1385 i_mac_perim_enter(mip);
1385 1386 /*
1386 1387 * Protect against concurrent readers that may need a self-consistent
1387 1388 * view of the factory addresses
1388 1389 */
1389 1390 rw_enter(&mip->mi_rw_lock, RW_WRITER);
1390 1391
1391 1392 ASSERT(slot > 0 && slot <= mip->mi_factory_addr_num);
1392 1393 ASSERT(mip->mi_factory_addr[slot-1].mfa_in_use);
1393 1394
1394 1395 mip->mi_factory_addr[slot-1].mfa_in_use = B_FALSE;
1395 1396
1396 1397 rw_exit(&mip->mi_rw_lock);
1397 1398 i_mac_perim_exit(mip);
1398 1399 }
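A hedged sketch of the reserve/release pairing (hypothetical caller): passing -1 asks for any free slot, and the returned slot id is 1-based:

	static int
	example_factory_addr_use(mac_client_handle_t mch)
	{
		int slot = -1;	/* -1: reserve any available slot */
		int err;

		if ((err = mac_addr_factory_reserve(mch, &slot)) != 0)
			return (err);	/* ENOTSUP, ENOSPC, EBUSY or EINVAL */

		/* ... use the factory address in the reserved slot ... */

		mac_addr_factory_release(mch, slot);
		return (0);
	}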
1399 1400
1400 1401 /*
1401 1402 * Stores in mac_addr the value of the specified MAC address slot, and in
1402 1403 * addr_len its length. If the slot is in use, the reserving client's name
1403 1404 * is copied to client_name, which must be at least MAXNAMELEN bytes.
1404 1405 */
1405 1406 void
1406 1407 mac_addr_factory_value(mac_handle_t mh, int slot, uchar_t *mac_addr,
1407 1408 uint_t *addr_len, char *client_name, boolean_t *in_use_arg)
1408 1409 {
1409 1410 mac_impl_t *mip = (mac_impl_t *)mh;
1410 1411 boolean_t in_use;
1411 1412
1412 1413 ASSERT(slot > 0 && slot <= mip->mi_factory_addr_num);
1413 1414
1414 1415 /*
1415 1416 * Readers need to hold mi_rw_lock. Writers need to hold mac perimeter
1416 1417 * and mi_rw_lock
1417 1418 */
1418 1419 rw_enter(&mip->mi_rw_lock, RW_READER);
1419 1420 bcopy(mip->mi_factory_addr[slot-1].mfa_addr, mac_addr, MAXMACADDRLEN);
1420 1421 *addr_len = mip->mi_type->mt_addr_length;
1421 1422 in_use = mip->mi_factory_addr[slot-1].mfa_in_use;
1422 1423 if (in_use && client_name != NULL) {
1423 1424 bcopy(mip->mi_factory_addr[slot-1].mfa_client->mci_name,
1424 1425 client_name, MAXNAMELEN);
1425 1426 }
1426 1427 if (in_use_arg != NULL)
1427 1428 *in_use_arg = in_use;
1428 1429 rw_exit(&mip->mi_rw_lock);
1429 1430 }
1430 1431
1431 1432 /*
1432 1433 * Returns the number of factory MAC addresses (in addition to the
1433 1434 * primary MAC address), or 0 if the underlying MAC doesn't support
1434 1435 * that feature.
1435 1436 */
1436 1437 uint_t
1437 1438 mac_addr_factory_num(mac_handle_t mh)
1438 1439 {
1439 1440 mac_impl_t *mip = (mac_impl_t *)mh;
1440 1441
1441 1442 return (mip->mi_factory_addr_num);
1442 1443 }
1443 1444
1444 1445
1445 1446 void
1446 1447 mac_rx_group_unmark(mac_group_t *grp, uint_t flag)
1447 1448 {
1448 1449 mac_ring_t *ring;
1449 1450
1450 1451 for (ring = grp->mrg_rings; ring != NULL; ring = ring->mr_next)
1451 1452 ring->mr_flag &= ~flag;
1452 1453 }
1453 1454
1454 1455 /*
1455 1456 * The following mac_hwrings_xxx() functions are private mac client functions
1456 1457 * used by the aggr driver to access and control the underlying HW Rx group
1457 1458 * and rings. In this case, the aggr driver has exclusive control of the
1458 1459 * underlying HW Rx group/rings; it calls the following functions to
1459 1460 * start/stop the HW Rx rings, disable/enable polling, add/remove MAC
1460 1461 * addresses, or set up the Rx callback.
1461 1462 */
1462 1463 /* ARGSUSED */
1463 1464 static void
1464 1465 mac_hwrings_rx_process(void *arg, mac_resource_handle_t srs,
1465 1466 mblk_t *mp_chain, boolean_t loopback)
1466 1467 {
1467 1468 mac_soft_ring_set_t *mac_srs = (mac_soft_ring_set_t *)srs;
1468 1469 mac_srs_rx_t *srs_rx = &mac_srs->srs_rx;
1469 1470 mac_direct_rx_t proc;
1470 1471 void *arg1;
1471 1472 mac_resource_handle_t arg2;
1472 1473
1473 1474 proc = srs_rx->sr_func;
1474 1475 arg1 = srs_rx->sr_arg1;
1475 1476 arg2 = mac_srs->srs_mrh;
1476 1477
1477 1478 proc(arg1, arg2, mp_chain, NULL);
1478 1479 }
1479 1480
1480 1481 /*
1481 1482 * This function is called to get the list of HW rings that are reserved by
1482 1483 * an exclusive mac client.
1483 1484 *
1484 1485 * Return value: the number of HW rings.
1485 1486 */
1486 1487 int
1487 1488 mac_hwrings_get(mac_client_handle_t mch, mac_group_handle_t *hwgh,
1488 1489 mac_ring_handle_t *hwrh, mac_ring_type_t rtype)
1489 1490 {
1490 1491 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
1491 1492 flow_entry_t *flent = mcip->mci_flent;
1492 1493 mac_group_t *grp;
1493 1494 mac_ring_t *ring;
1494 1495 int cnt = 0;
1495 1496
1496 1497 if (rtype == MAC_RING_TYPE_RX) {
1497 1498 grp = flent->fe_rx_ring_group;
1498 1499 } else if (rtype == MAC_RING_TYPE_TX) {
1499 1500 grp = flent->fe_tx_ring_group;
1500 1501 } else {
1501 1502 ASSERT(B_FALSE);
1502 1503 return (-1);
1503 1504 }
1504 1505 /*
1505 1506 * If the mac client did not reserve any RX group, return directly.
1506 1507 * This is probably because the underlying MAC does not support
1507 1508 * any groups.
1508 1509 */
1509 1510 if (hwgh != NULL)
1510 1511 *hwgh = NULL;
1511 1512 if (grp == NULL)
1512 1513 return (0);
1513 1514 /*
1514 1515 * This group must be reserved by this mac client.
1515 1516 */
1516 1517 ASSERT((grp->mrg_state == MAC_GROUP_STATE_RESERVED) &&
1517 1518 (mcip == MAC_GROUP_ONLY_CLIENT(grp)));
1518 1519
1519 1520 for (ring = grp->mrg_rings; ring != NULL; ring = ring->mr_next, cnt++) {
1520 1521 ASSERT(cnt < MAX_RINGS_PER_GROUP);
1521 1522 hwrh[cnt] = (mac_ring_handle_t)ring;
1522 1523 }
1523 1524 if (hwgh != NULL)
1524 1525 *hwgh = (mac_group_handle_t)grp;
1525 1526
1526 1527 return (cnt);
1527 1528 }
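
/*
 * Illustrative sketch (annotation, not part of the source): how an
 * exclusive client such as aggr might use mac_hwrings_get() to
 * collect and start the Rx rings it owns. The function name and
 * error handling below are hypothetical.
 */
static void
example_collect_rx_rings(mac_client_handle_t mch)
{
	mac_ring_handle_t	hwrh[MAX_RINGS_PER_GROUP];
	mac_group_handle_t	hwgh;
	int			i, cnt;

	cnt = mac_hwrings_get(mch, &hwgh, hwrh, MAC_RING_TYPE_RX);
	if (cnt <= 0 || hwgh == NULL)
		return;		/* no exclusive HW group to manage */

	/* Each ring handle can now be started, polled, or torn down. */
	for (i = 0; i < cnt; i++)
		(void) mac_hwring_start(hwrh[i]);
}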
1528 1529
1529 1530 /*
1530 1531 * This function is called to get info about Tx/Rx rings.
1531 1532 *
1532 1533  * Return value: returns a uint_t with various bits set
1533 1534  * that indicate different properties of the ring.
1534 1535 */
1535 1536 uint_t
1536 1537 mac_hwring_getinfo(mac_ring_handle_t rh)
1537 1538 {
1538 1539 mac_ring_t *ring = (mac_ring_t *)rh;
1539 1540 mac_ring_info_t *info = &ring->mr_info;
1540 1541
1541 1542 return (info->mri_flags);
1542 1543 }
1543 1544
1544 1545 /*
1545 1546 * Export ddi interrupt handles from the HW ring to the pseudo ring and
1546 1547  * set up the Rx callback of the mac client that exclusively controls
1547 1548  * the HW ring.
1548 1549 */
1549 1550 void
1550 1551 mac_hwring_setup(mac_ring_handle_t hwrh, mac_resource_handle_t prh,
1551 1552 mac_ring_handle_t pseudo_rh)
1552 1553 {
1553 1554 mac_ring_t *hw_ring = (mac_ring_t *)hwrh;
1554 1555 mac_ring_t *pseudo_ring;
1555 1556 mac_soft_ring_set_t *mac_srs = hw_ring->mr_srs;
1556 1557
1557 1558 if (pseudo_rh != NULL) {
1558 1559 pseudo_ring = (mac_ring_t *)pseudo_rh;
1559 1560 /* Export the ddi handles to pseudo ring */
1560 1561 pseudo_ring->mr_info.mri_intr.mi_ddi_handle =
1561 1562 hw_ring->mr_info.mri_intr.mi_ddi_handle;
1562 1563 pseudo_ring->mr_info.mri_intr.mi_ddi_shared =
1563 1564 hw_ring->mr_info.mri_intr.mi_ddi_shared;
1564 1565 /*
1565 1566 * Save a pointer to pseudo ring in the hw ring. If
1566 1567 * interrupt handle changes, the hw ring will be
1567 1568 * notified of the change (see mac_ring_intr_set())
1568 1569 * and the appropriate change has to be made to
1569 1570 * the pseudo ring that has exported the ddi handle.
1570 1571 */
1571 1572 hw_ring->mr_prh = pseudo_rh;
1572 1573 }
1573 1574
1574 1575 if (hw_ring->mr_type == MAC_RING_TYPE_RX) {
1575 1576 ASSERT(!(mac_srs->srs_type & SRST_TX));
1576 1577 mac_srs->srs_mrh = prh;
1577 1578 mac_srs->srs_rx.sr_lower_proc = mac_hwrings_rx_process;
1578 1579 }
1579 1580 }
1580 1581
1581 1582 void
1582 1583 mac_hwring_teardown(mac_ring_handle_t hwrh)
1583 1584 {
1584 1585 mac_ring_t *hw_ring = (mac_ring_t *)hwrh;
1585 1586 mac_soft_ring_set_t *mac_srs;
1586 1587
1587 1588 if (hw_ring == NULL)
1588 1589 return;
1589 1590 hw_ring->mr_prh = NULL;
1590 1591 if (hw_ring->mr_type == MAC_RING_TYPE_RX) {
1591 1592 mac_srs = hw_ring->mr_srs;
1592 1593 ASSERT(!(mac_srs->srs_type & SRST_TX));
1593 1594 mac_srs->srs_rx.sr_lower_proc = mac_rx_srs_process;
1594 1595 mac_srs->srs_mrh = NULL;
1595 1596 }
1596 1597 }
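
/*
 * Illustrative sketch (annotation, not part of the source): the
 * setup/teardown pairing for layering a pseudo ring over a HW ring.
 * The function name is hypothetical; the handles would come from
 * mac_hwrings_get() and the client's own pseudo ring state.
 */
static void
example_bind_pseudo_ring(mac_ring_handle_t hwrh, mac_resource_handle_t prh,
    mac_ring_handle_t pseudo_rh)
{
	/* Export the DDI interrupt handles and install the Rx callback. */
	mac_hwring_setup(hwrh, prh, pseudo_rh);

	/* ... the pseudo ring now carries this client's traffic ... */

	/* Undo on unbind; this restores mac_rx_srs_process as sr_lower_proc. */
	mac_hwring_teardown(hwrh);
}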
1597 1598
1598 1599 int
1599 1600 mac_hwring_disable_intr(mac_ring_handle_t rh)
1600 1601 {
1601 1602 mac_ring_t *rr_ring = (mac_ring_t *)rh;
1602 1603 mac_intr_t *intr = &rr_ring->mr_info.mri_intr;
1603 1604
1604 1605 return (intr->mi_disable(intr->mi_handle));
1605 1606 }
1606 1607
1607 1608 int
1608 1609 mac_hwring_enable_intr(mac_ring_handle_t rh)
1609 1610 {
1610 1611 mac_ring_t *rr_ring = (mac_ring_t *)rh;
1611 1612 mac_intr_t *intr = &rr_ring->mr_info.mri_intr;
1612 1613
1613 1614 return (intr->mi_enable(intr->mi_handle));
1614 1615 }
1615 1616
1616 1617 int
1617 1618 mac_hwring_start(mac_ring_handle_t rh)
1618 1619 {
1619 1620 mac_ring_t *rr_ring = (mac_ring_t *)rh;
1620 1621
1621 1622 MAC_RING_UNMARK(rr_ring, MR_QUIESCE);
1622 1623 return (0);
1623 1624 }
1624 1625
1625 1626 void
1626 1627 mac_hwring_stop(mac_ring_handle_t rh)
1627 1628 {
1628 1629 mac_ring_t *rr_ring = (mac_ring_t *)rh;
1629 1630
1630 1631 mac_rx_ring_quiesce(rr_ring, MR_QUIESCE);
1631 1632 }
1632 1633
1633 1634 mblk_t *
1634 1635 mac_hwring_poll(mac_ring_handle_t rh, int bytes_to_pickup)
1635 1636 {
1636 1637 mac_ring_t *rr_ring = (mac_ring_t *)rh;
1637 1638 mac_ring_info_t *info = &rr_ring->mr_info;
1638 1639
1639 1640 return (info->mri_poll(info->mri_driver, bytes_to_pickup));
1640 1641 }
1641 1642
1642 1643 /*
1643 1644 * Send packets through a selected tx ring.
1644 1645 */
1645 1646 mblk_t *
1646 1647 mac_hwring_tx(mac_ring_handle_t rh, mblk_t *mp)
1647 1648 {
1648 1649 mac_ring_t *ring = (mac_ring_t *)rh;
1649 1650 mac_ring_info_t *info = &ring->mr_info;
1650 1651
1651 1652 ASSERT(ring->mr_type == MAC_RING_TYPE_TX &&
1652 1653 ring->mr_state >= MR_INUSE);
1653 1654 return (info->mri_tx(info->mri_driver, mp));
1654 1655 }
1655 1656
1656 1657 /*
1657 1658 * Query stats for a particular rx/tx ring
1658 1659 */
1659 1660 int
1660 1661 mac_hwring_getstat(mac_ring_handle_t rh, uint_t stat, uint64_t *val)
1661 1662 {
1662 1663 mac_ring_t *ring = (mac_ring_t *)rh;
1663 1664 mac_ring_info_t *info = &ring->mr_info;
1664 1665
1665 1666 return (info->mri_stat(info->mri_driver, stat, val));
1666 1667 }
1667 1668
1668 1669 /*
1669 1670 * Private function that is only used by aggr to send packets through
1670 1671 * a port/Tx ring. Since aggr exposes a pseudo Tx ring even for ports
1671 1672  * that do not expose Tx rings, the aggr_ring_tx() entry point needs
1672 1673 * access to mac_impl_t to send packets through m_tx() entry point.
1673 1674 * It accomplishes this by calling mac_hwring_send_priv() function.
1674 1675 */
1675 1676 mblk_t *
1676 1677 mac_hwring_send_priv(mac_client_handle_t mch, mac_ring_handle_t rh, mblk_t *mp)
1677 1678 {
1678 1679 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
1679 1680 mac_impl_t *mip = mcip->mci_mip;
1680 1681
1681 1682 MAC_TX(mip, rh, mp, mcip);
1682 1683 return (mp);
1683 1684 }
1684 1685
1685 1686 /*
1686 1687 * Private function that is only used by aggr to update the default transmission
1687 1688 * ring. Because aggr exposes a pseudo Tx ring even for ports that may
1688 1689 * temporarily be down, it may need to update the default ring that is used by
1689 1690 * MAC such that it refers to a link that can actively be used to send traffic.
1690 1691 * Note that this is different from the case where the port has been removed
1691 1692  * from the group. In those cases, all of the rings will be torn down
1692 1693  * because they no longer exist. It's important to give aggr a case where the
1693 1694 * rings can still exist such that it may be able to continue to send LACP PDUs
1694 1695 * to potentially restore the link.
1695 1696 *
1696 1697 * Finally, we explicitly don't do anything if the ring hasn't been enabled yet.
1697 1698  * This is to help out aggr, which doesn't know the internal state that
1698 1699  * MAC keeps about the rings and can't tell that a ring isn't quite ready
1699 1700  * for use yet.
1700 1701 */
1701 1702 void
1702 1703 mac_hwring_set_default(mac_handle_t mh, mac_ring_handle_t rh)
1703 1704 {
1704 1705 mac_impl_t *mip = (mac_impl_t *)mh;
1705 1706 mac_ring_t *ring = (mac_ring_t *)rh;
1706 1707
1707 1708 ASSERT(MAC_PERIM_HELD(mh));
1708 1709 VERIFY(mip->mi_state_flags & MIS_IS_AGGR);
1709 1710
1710 1711 if (ring->mr_state != MR_INUSE)
1711 1712 return;
1712 1713
1713 1714 mip->mi_default_tx_ring = rh;
1714 1715 }
1715 1716
1716 1717 int
1717 1718 mac_hwgroup_addmac(mac_group_handle_t gh, const uint8_t *addr)
1718 1719 {
1719 1720 mac_group_t *group = (mac_group_t *)gh;
1720 1721
1721 1722 return (mac_group_addmac(group, addr));
1722 1723 }
1723 1724
1724 1725 int
1725 1726 mac_hwgroup_remmac(mac_group_handle_t gh, const uint8_t *addr)
1726 1727 {
1727 1728 mac_group_t *group = (mac_group_t *)gh;
1728 1729
1729 1730 return (mac_group_remmac(group, addr));
1730 1731 }
1731 1732
1732 1733 /*
1734 + * Program the group's HW VLAN filter if it has such support.
1735 + * Otherwise, the group will implicitly accept tagged traffic and
1736 + * there is nothing to do.
1737 + */
1738 +int
1739 +mac_hwgroup_addvlan(mac_group_handle_t gh, uint16_t vid)
1740 +{
1741 + mac_group_t *group = (mac_group_t *)gh;
1742 +
1743 + if (!MAC_GROUP_HW_VLAN(group))
1744 + return (0);
1745 +
1746 + return (mac_group_addvlan(group, vid));
1747 +}
1748 +
1749 +int
1750 +mac_hwgroup_remvlan(mac_group_handle_t gh, uint16_t vid)
1751 +{
1752 + mac_group_t *group = (mac_group_t *)gh;
1753 +
1754 + if (!MAC_GROUP_HW_VLAN(group))
1755 + return (0);
1756 +
1757 + return (mac_group_remvlan(group, vid));
1758 +}
1759 +
1760 +/*
1761 + * Determine if a MAC has HW VLAN support. This is a private API
1762 + * consumed by aggr. In the future it might be nice to have a bitfield
1763 + * in mac_capab_rings_t to track which forms of HW filtering are
1764 + * supported by the MAC.
1765 + */
1766 +boolean_t
1767 +mac_has_hw_vlan(mac_handle_t mh)
1768 +{
1769 + mac_impl_t *mip = (mac_impl_t *)mh;
1770 +
1771 + return (MAC_GROUP_HW_VLAN(mip->mi_rx_groups));
1772 +}
1773 +
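
/*
 * Illustrative sketch (annotation, not part of the source): the
 * expected calling pattern for the VLAN filter helpers above. A
 * consumer such as aggr would check for HW VLAN support on the MAC,
 * then program each exclusive group it owns. The function name and
 * arguments are hypothetical.
 */
static int
example_program_vlan(mac_handle_t mh, mac_group_handle_t gh, uint16_t vid)
{
	/*
	 * Without HW VLAN filtering the group implicitly accepts
	 * tagged traffic; mac_hwgroup_addvlan() is also a no-op
	 * returning 0 in that case.
	 */
	if (!mac_has_hw_vlan(mh))
		return (0);

	return (mac_hwgroup_addvlan(gh, vid));
}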
1774 +/*
1733 1775 * Set the RX group to be shared/reserved. Note that the group must be
1734 1776 * started/stopped outside of this function.
1735 1777 */
1736 1778 void
1737 1779 mac_set_group_state(mac_group_t *grp, mac_group_state_t state)
1738 1780 {
1739 1781 /*
1740 1782 * If there is no change in the group state, just return.
1741 1783 */
1742 1784 if (grp->mrg_state == state)
1743 1785 return;
1744 1786
1745 1787 switch (state) {
1746 1788 case MAC_GROUP_STATE_RESERVED:
1747 1789 /*
1748 1790 * Successfully reserved the group.
1749 1791 *
1750 1792 * Given that there is an exclusive client controlling this
1751 1793 * group, we enable the group level polling when available,
1752 1794		 * so that SRSs get to turn on/off individual rings they're
1753 1795 * assigned to.
1754 1796 */
1755 1797 ASSERT(MAC_PERIM_HELD(grp->mrg_mh));
1756 1798
1757 1799 if (grp->mrg_type == MAC_RING_TYPE_RX &&
1758 1800 GROUP_INTR_DISABLE_FUNC(grp) != NULL) {
1759 1801 GROUP_INTR_DISABLE_FUNC(grp)(GROUP_INTR_HANDLE(grp));
1760 1802 }
1761 1803 break;
1762 1804
1763 1805 case MAC_GROUP_STATE_SHARED:
1764 1806 /*
1765 1807 * Set all rings of this group to software classified.
1766 1808 * If the group has an overriding interrupt, then re-enable it.
1767 1809 */
1768 1810 ASSERT(MAC_PERIM_HELD(grp->mrg_mh));
1769 1811
1770 1812 if (grp->mrg_type == MAC_RING_TYPE_RX &&
1771 1813 GROUP_INTR_ENABLE_FUNC(grp) != NULL) {
1772 1814 GROUP_INTR_ENABLE_FUNC(grp)(GROUP_INTR_HANDLE(grp));
1773 1815 }
1774 1816 /* The ring is not available for reservations any more */
1775 1817 break;
1776 1818
1777 1819 case MAC_GROUP_STATE_REGISTERED:
1778 1820 /* Also callable from mac_register, perim is not held */
1779 1821 break;
1780 1822
1781 1823 default:
1782 1824 ASSERT(B_FALSE);
1783 1825 break;
1784 1826 }
1785 1827
1786 1828 grp->mrg_state = state;
1787 1829 }
1788 1830
1789 1831 /*
1790 1832 * Quiesce future hardware classified packets for the specified Rx ring
1791 1833 */
1792 1834 static void
1793 1835 mac_rx_ring_quiesce(mac_ring_t *rx_ring, uint_t ring_flag)
1794 1836 {
1795 1837 ASSERT(rx_ring->mr_classify_type == MAC_HW_CLASSIFIER);
1796 1838 ASSERT(ring_flag == MR_CONDEMNED || ring_flag == MR_QUIESCE);
1797 1839
1798 1840 mutex_enter(&rx_ring->mr_lock);
1799 1841 rx_ring->mr_flag |= ring_flag;
1800 1842 while (rx_ring->mr_refcnt != 0)
1801 1843 cv_wait(&rx_ring->mr_cv, &rx_ring->mr_lock);
1802 1844 mutex_exit(&rx_ring->mr_lock);
1803 1845 }
1804 1846
1805 1847 /*
1806 1848 * Please see mac_tx for details about the per cpu locking scheme
1807 1849 */
1808 1850 static void
1809 1851 mac_tx_lock_all(mac_client_impl_t *mcip)
1810 1852 {
1811 1853 int i;
1812 1854
1813 1855 for (i = 0; i <= mac_tx_percpu_cnt; i++)
1814 1856 mutex_enter(&mcip->mci_tx_pcpu[i].pcpu_tx_lock);
1815 1857 }
1816 1858
1817 1859 static void
1818 1860 mac_tx_unlock_all(mac_client_impl_t *mcip)
1819 1861 {
1820 1862 int i;
1821 1863
1822 1864 for (i = mac_tx_percpu_cnt; i >= 0; i--)
1823 1865 mutex_exit(&mcip->mci_tx_pcpu[i].pcpu_tx_lock);
1824 1866 }
1825 1867
1826 1868 static void
1827 1869 mac_tx_unlock_allbutzero(mac_client_impl_t *mcip)
1828 1870 {
1829 1871 int i;
1830 1872
1831 1873 for (i = mac_tx_percpu_cnt; i > 0; i--)
1832 1874 mutex_exit(&mcip->mci_tx_pcpu[i].pcpu_tx_lock);
1833 1875 }
1834 1876
1835 1877 static int
1836 1878 mac_tx_sum_refcnt(mac_client_impl_t *mcip)
1837 1879 {
1838 1880 int i;
1839 1881 int refcnt = 0;
1840 1882
1841 1883 for (i = 0; i <= mac_tx_percpu_cnt; i++)
1842 1884 refcnt += mcip->mci_tx_pcpu[i].pcpu_tx_refcnt;
1843 1885
1844 1886 return (refcnt);
1845 1887 }
1846 1888
1847 1889 /*
1848 1890 * Stop future Tx packets coming down from the client in preparation for
1849 1891 * quiescing the Tx side. This is needed for dynamic reclaim and reassignment
1850 1892 * of rings between clients
1851 1893 */
1852 1894 void
1853 1895 mac_tx_client_block(mac_client_impl_t *mcip)
1854 1896 {
1855 1897 mac_tx_lock_all(mcip);
1856 1898 mcip->mci_tx_flag |= MCI_TX_QUIESCE;
1857 1899 while (mac_tx_sum_refcnt(mcip) != 0) {
1858 1900 mac_tx_unlock_allbutzero(mcip);
1859 1901 cv_wait(&mcip->mci_tx_cv, &mcip->mci_tx_pcpu[0].pcpu_tx_lock);
1860 1902 mutex_exit(&mcip->mci_tx_pcpu[0].pcpu_tx_lock);
1861 1903 mac_tx_lock_all(mcip);
1862 1904 }
1863 1905 mac_tx_unlock_all(mcip);
1864 1906 }
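
/*
 * Illustrative sketch (annotation, not part of the source) of the
 * data-path side of the protocol mac_tx_client_block() waits on: a
 * transmit thread takes one per-CPU lock, bails if MCI_TX_QUIESCE is
 * set, bumps that CPU's refcnt, and signals the blocker when the
 * last reference drops. The index mask and the send step are
 * assumptions sketched from the fields used above.
 */
static void
example_tx_enter_exit(mac_client_impl_t *mcip)
{
	int idx = CPU->cpu_seqid & mac_tx_percpu_cnt;

	mutex_enter(&mcip->mci_tx_pcpu[idx].pcpu_tx_lock);
	if (mcip->mci_tx_flag & MCI_TX_QUIESCE) {
		mutex_exit(&mcip->mci_tx_pcpu[idx].pcpu_tx_lock);
		return;		/* client is being quiesced */
	}
	mcip->mci_tx_pcpu[idx].pcpu_tx_refcnt++;
	mutex_exit(&mcip->mci_tx_pcpu[idx].pcpu_tx_lock);

	/* ... transmit the packet chain ... */

	mutex_enter(&mcip->mci_tx_pcpu[idx].pcpu_tx_lock);
	if (--mcip->mci_tx_pcpu[idx].pcpu_tx_refcnt == 0 &&
	    (mcip->mci_tx_flag & MCI_TX_QUIESCE))
		cv_signal(&mcip->mci_tx_cv);
	mutex_exit(&mcip->mci_tx_pcpu[idx].pcpu_tx_lock);
}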
1865 1907
1866 1908 void
1867 1909 mac_tx_client_unblock(mac_client_impl_t *mcip)
1868 1910 {
1869 1911 mac_tx_lock_all(mcip);
1870 1912 mcip->mci_tx_flag &= ~MCI_TX_QUIESCE;
1871 1913 mac_tx_unlock_all(mcip);
1872 1914 /*
1873 1915 * We may fail to disable flow control for the last MAC_NOTE_TX
1874 1916 * notification because the MAC client is quiesced. Send the
1875 1917 * notification again.
1876 1918 */
1877 1919 i_mac_notify(mcip->mci_mip, MAC_NOTE_TX);
1878 1920 }
1879 1921
1880 1922 /*
1881 1923 * Wait for an SRS to quiesce. The SRS worker will signal us when the
1882 1924 * quiesce is done.
1883 1925 */
1884 1926 static void
1885 1927 mac_srs_quiesce_wait(mac_soft_ring_set_t *srs, uint_t srs_flag)
1886 1928 {
1887 1929 mutex_enter(&srs->srs_lock);
1888 1930 while (!(srs->srs_state & srs_flag))
1889 1931 cv_wait(&srs->srs_quiesce_done_cv, &srs->srs_lock);
1890 1932 mutex_exit(&srs->srs_lock);
1891 1933 }
1892 1934
1893 1935 /*
1894 1936 * Quiescing an Rx SRS is achieved by the following sequence. The protocol
1895 1937 * works bottom up by cutting off packet flow from the bottommost point in the
1896 1938 * mac, then the SRS, and then the soft rings. There are 2 use cases of this
1897 1939 * mechanism. One is a temporary quiesce of the SRS, such as say while changing
1898 1940 * the Rx callbacks. Another use case is Rx SRS teardown. In the former case
1899 1941 * the QUIESCE prefix/suffix is used and in the latter the CONDEMNED is used
1900 1942 * for the SRS and MR flags. In the former case the threads pause waiting for
1901 1943 * a restart, while in the latter case the threads exit. The Tx SRS teardown
1902 1944 * is also mostly similar to the above.
1903 1945 *
1904 1946 * 1. Stop future hardware classified packets at the lowest level in the mac.
1905 1947 * Remove any hardware classification rule (CONDEMNED case) and mark the
1906 1948 * rings as CONDEMNED or QUIESCE as appropriate. This prevents the mr_refcnt
1907 1949 * from increasing. Upcalls from the driver that come through hardware
1908 1950 * classification will be dropped in mac_rx from now on. Then we wait for
1909 1951 * the mr_refcnt to drop to zero. When the mr_refcnt reaches zero we are
1910 1952 * sure there aren't any upcall threads from the driver through hardware
1911 1953 * classification. In the case of SRS teardown we also remove the
1912 1954 * classification rule in the driver.
1913 1955 *
1914 1956 * 2. Stop future software classified packets by marking the flow entry with
1915 1957 * FE_QUIESCE or FE_CONDEMNED as appropriate which prevents the refcnt from
1916 1958 * increasing. We also remove the flow entry from the table in the latter
1917 1959 * case. Then wait for the fe_refcnt to reach an appropriate quiescent value
1918 1960 * that indicates there aren't any active threads using that flow entry.
1919 1961 *
1920 1962 * 3. Quiesce the SRS and softrings by signaling the SRS. The SRS poll thread,
1921 1963 * SRS worker thread, and the soft ring threads are quiesced in sequence
1922 1964 * with the SRS worker thread serving as a master controller. This
1923 1965  *    mechanism is explained in mac_srs_worker_quiesce().
1924 1966 *
1925 1967 * The restart mechanism to reactivate the SRS and softrings is explained
1926 1968 * in mac_srs_worker_restart(). Here we just signal the SRS worker to start the
1927 1969 * restart sequence.
1928 1970 */
1929 1971 void
1930 1972 mac_rx_srs_quiesce(mac_soft_ring_set_t *srs, uint_t srs_quiesce_flag)
1931 1973 {
1932 1974 flow_entry_t *flent = srs->srs_flent;
1933 1975 uint_t mr_flag, srs_done_flag;
1934 1976
1935 1977 ASSERT(MAC_PERIM_HELD((mac_handle_t)FLENT_TO_MIP(flent)));
1936 1978 ASSERT(!(srs->srs_type & SRST_TX));
1937 1979
1938 1980 if (srs_quiesce_flag == SRS_CONDEMNED) {
1939 1981 mr_flag = MR_CONDEMNED;
1940 1982 srs_done_flag = SRS_CONDEMNED_DONE;
1941 1983 if (srs->srs_type & SRST_CLIENT_POLL_ENABLED)
1942 1984 mac_srs_client_poll_disable(srs->srs_mcip, srs);
1943 1985 } else {
1944 1986 ASSERT(srs_quiesce_flag == SRS_QUIESCE);
1945 1987 mr_flag = MR_QUIESCE;
1946 1988 srs_done_flag = SRS_QUIESCE_DONE;
1947 1989 if (srs->srs_type & SRST_CLIENT_POLL_ENABLED)
1948 1990 mac_srs_client_poll_quiesce(srs->srs_mcip, srs);
1949 1991 }
1950 1992
1951 1993 if (srs->srs_ring != NULL) {
1952 1994 mac_rx_ring_quiesce(srs->srs_ring, mr_flag);
1953 1995 } else {
1954 1996 /*
1955 1997 * SRS is driven by software classification. In case
1956 1998 * of CONDEMNED, the top level teardown functions will
1957 1999 * deal with flow removal.
1958 2000 */
1959 2001 if (srs_quiesce_flag != SRS_CONDEMNED) {
1960 2002 FLOW_MARK(flent, FE_QUIESCE);
1961 2003 mac_flow_wait(flent, FLOW_DRIVER_UPCALL);
1962 2004 }
1963 2005 }
1964 2006
1965 2007 /*
1966 2008 * Signal the SRS to quiesce itself, and then cv_wait for the
1967 2009 * SRS quiesce to complete. The SRS worker thread will wake us
1968 2010 * up when the quiesce is complete
1969 2011 */
1970 2012 mac_srs_signal(srs, srs_quiesce_flag);
1971 2013 mac_srs_quiesce_wait(srs, srs_done_flag);
1972 2014 }
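
/*
 * Illustrative sketch (annotation, not part of the source): the
 * temporary quiesce/restart pairing described above, e.g. around
 * changing an Rx callback. The caller holds the mac perimeter; the
 * function name and the update step are hypothetical.
 */
static void
example_swap_rx_callback(mac_soft_ring_set_t *srs)
{
	mac_rx_srs_quiesce(srs, SRS_QUIESCE);	/* cut off packet flow */

	/* ... safely update the SRS Rx callback state here ... */

	mac_rx_srs_restart(srs);		/* resume packet flow */
}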
1973 2015
1974 2016 /*
1975 2017 * Remove an SRS.
1976 2018 */
1977 2019 void
1978 2020 mac_rx_srs_remove(mac_soft_ring_set_t *srs)
1979 2021 {
1980 2022 flow_entry_t *flent = srs->srs_flent;
1981 2023 int i;
1982 2024
1983 2025 mac_rx_srs_quiesce(srs, SRS_CONDEMNED);
1984 2026 /*
1985 2027 * Locate and remove our entry in the fe_rx_srs[] array, and
1986 2028 * adjust the fe_rx_srs array entries and array count by
1987 2029 * moving the last entry into the vacated spot.
1988 2030 */
1989 2031 mutex_enter(&flent->fe_lock);
1990 2032 for (i = 0; i < flent->fe_rx_srs_cnt; i++) {
1991 2033 if (flent->fe_rx_srs[i] == srs)
1992 2034 break;
1993 2035 }
1994 2036
1995 2037 ASSERT(i != 0 && i < flent->fe_rx_srs_cnt);
1996 2038 if (i != flent->fe_rx_srs_cnt - 1) {
1997 2039 flent->fe_rx_srs[i] =
1998 2040 flent->fe_rx_srs[flent->fe_rx_srs_cnt - 1];
1999 2041 i = flent->fe_rx_srs_cnt - 1;
2000 2042 }
2001 2043
2002 2044 flent->fe_rx_srs[i] = NULL;
2003 2045 flent->fe_rx_srs_cnt--;
2004 2046 mutex_exit(&flent->fe_lock);
2005 2047
2006 2048 mac_srs_free(srs);
2007 2049 }
2008 2050
2009 2051 static void
2010 2052 mac_srs_clear_flag(mac_soft_ring_set_t *srs, uint_t flag)
2011 2053 {
2012 2054 mutex_enter(&srs->srs_lock);
2013 2055 srs->srs_state &= ~flag;
2014 2056 mutex_exit(&srs->srs_lock);
2015 2057 }
2016 2058
2017 2059 void
2018 2060 mac_rx_srs_restart(mac_soft_ring_set_t *srs)
2019 2061 {
2020 2062 flow_entry_t *flent = srs->srs_flent;
2021 2063 mac_ring_t *mr;
2022 2064
2023 2065 ASSERT(MAC_PERIM_HELD((mac_handle_t)FLENT_TO_MIP(flent)));
2024 2066 ASSERT((srs->srs_type & SRST_TX) == 0);
2025 2067
2026 2068 /*
2027 2069	 * This handles a change in the number of SRSs between the quiesce
2028 2070	 * and restart operation of a flow.
2029 2071 */
2030 2072 if (!SRS_QUIESCED(srs))
2031 2073 return;
2032 2074
2033 2075 /*
2034 2076 * Signal the SRS to restart itself. Wait for the restart to complete
2035 2077	 * Signal the SRS to restart itself. Wait for the restart to complete.
2036 2078 * permanently quiesced.
2037 2079 */
2038 2080 if (!SRS_QUIESCED_PERMANENT(srs)) {
2039 2081 mac_srs_signal(srs, SRS_RESTART);
2040 2082 mac_srs_quiesce_wait(srs, SRS_RESTART_DONE);
2041 2083 mac_srs_clear_flag(srs, SRS_RESTART_DONE);
2042 2084
2043 2085 mac_srs_client_poll_restart(srs->srs_mcip, srs);
2044 2086 }
2045 2087
2046 2088 /* Finally clear the flags to let the packets in */
2047 2089 mr = srs->srs_ring;
2048 2090 if (mr != NULL) {
2049 2091 MAC_RING_UNMARK(mr, MR_QUIESCE);
2050 2092 /* In case the ring was stopped, safely restart it */
2051 2093 if (mr->mr_state != MR_INUSE)
2052 2094 (void) mac_start_ring(mr);
2053 2095 } else {
2054 2096 FLOW_UNMARK(flent, FE_QUIESCE);
2055 2097 }
2056 2098 }
2057 2099
2058 2100 /*
2059 2101 * Temporary quiesce of a flow and associated Rx SRS.
2060 2102 * Please see block comment above mac_rx_classify_flow_rem.
2061 2103 */
2062 2104 /* ARGSUSED */
2063 2105 int
2064 2106 mac_rx_classify_flow_quiesce(flow_entry_t *flent, void *arg)
2065 2107 {
2066 2108 int i;
2067 2109
2068 2110 for (i = 0; i < flent->fe_rx_srs_cnt; i++) {
2069 2111 mac_rx_srs_quiesce((mac_soft_ring_set_t *)flent->fe_rx_srs[i],
2070 2112 SRS_QUIESCE);
2071 2113 }
2072 2114 return (0);
2073 2115 }
2074 2116
2075 2117 /*
2076 2118 * Restart a flow and associated Rx SRS that has been quiesced temporarily
2077 2119 * Please see block comment above mac_rx_classify_flow_rem
2078 2120 */
2079 2121 /* ARGSUSED */
2080 2122 int
2081 2123 mac_rx_classify_flow_restart(flow_entry_t *flent, void *arg)
2082 2124 {
2083 2125 int i;
2084 2126
2085 2127 for (i = 0; i < flent->fe_rx_srs_cnt; i++)
2086 2128 mac_rx_srs_restart((mac_soft_ring_set_t *)flent->fe_rx_srs[i]);
2087 2129
2088 2130 return (0);
2089 2131 }
2090 2132
2091 2133 void
2092 2134 mac_srs_perm_quiesce(mac_client_handle_t mch, boolean_t on)
2093 2135 {
2094 2136 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
2095 2137 flow_entry_t *flent = mcip->mci_flent;
2096 2138 mac_impl_t *mip = mcip->mci_mip;
2097 2139 mac_soft_ring_set_t *mac_srs;
2098 2140 int i;
2099 2141
2100 2142 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
2101 2143
2102 2144 if (flent == NULL)
2103 2145 return;
2104 2146
2105 2147 for (i = 0; i < flent->fe_rx_srs_cnt; i++) {
2106 2148 mac_srs = flent->fe_rx_srs[i];
2107 2149 mutex_enter(&mac_srs->srs_lock);
2108 2150 if (on)
2109 2151 mac_srs->srs_state |= SRS_QUIESCE_PERM;
2110 2152 else
2111 2153 mac_srs->srs_state &= ~SRS_QUIESCE_PERM;
2112 2154 mutex_exit(&mac_srs->srs_lock);
2113 2155 }
2114 2156 }
2115 2157
2116 2158 void
2117 2159 mac_rx_client_quiesce(mac_client_handle_t mch)
2118 2160 {
2119 2161 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
2120 2162 mac_impl_t *mip = mcip->mci_mip;
2121 2163
2122 2164 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
2123 2165
2124 2166 if (MCIP_DATAPATH_SETUP(mcip)) {
2125 2167 (void) mac_rx_classify_flow_quiesce(mcip->mci_flent,
2126 2168 NULL);
2127 2169 (void) mac_flow_walk_nolock(mcip->mci_subflow_tab,
2128 2170 mac_rx_classify_flow_quiesce, NULL);
2129 2171 }
2130 2172 }
2131 2173
2132 2174 void
2133 2175 mac_rx_client_restart(mac_client_handle_t mch)
2134 2176 {
2135 2177 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
2136 2178 mac_impl_t *mip = mcip->mci_mip;
2137 2179
2138 2180 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
2139 2181
2140 2182 if (MCIP_DATAPATH_SETUP(mcip)) {
2141 2183 (void) mac_rx_classify_flow_restart(mcip->mci_flent, NULL);
2142 2184 (void) mac_flow_walk_nolock(mcip->mci_subflow_tab,
2143 2185 mac_rx_classify_flow_restart, NULL);
2144 2186 }
2145 2187 }
2146 2188
2147 2189 /*
2148 2190 * This function only quiesces the Tx SRS and softring worker threads. Callers
2149 2191 * need to make sure that there aren't any mac client threads doing current or
2150 2192 * future transmits in the mac before calling this function.
2151 2193 */
2152 2194 void
2153 2195 mac_tx_srs_quiesce(mac_soft_ring_set_t *srs, uint_t srs_quiesce_flag)
2154 2196 {
2155 2197 mac_client_impl_t *mcip = srs->srs_mcip;
2156 2198
2157 2199 ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));
2158 2200
2159 2201 ASSERT(srs->srs_type & SRST_TX);
2160 2202 ASSERT(srs_quiesce_flag == SRS_CONDEMNED ||
2161 2203 srs_quiesce_flag == SRS_QUIESCE);
2162 2204
2163 2205 /*
2164 2206 * Signal the SRS to quiesce itself, and then cv_wait for the
2165 2207 * SRS quiesce to complete. The SRS worker thread will wake us
2166 2208 * up when the quiesce is complete
2167 2209 */
2168 2210 mac_srs_signal(srs, srs_quiesce_flag);
2169 2211 mac_srs_quiesce_wait(srs, srs_quiesce_flag == SRS_QUIESCE ?
2170 2212 SRS_QUIESCE_DONE : SRS_CONDEMNED_DONE);
2171 2213 }
2172 2214
2173 2215 void
2174 2216 mac_tx_srs_restart(mac_soft_ring_set_t *srs)
2175 2217 {
2176 2218 /*
2177 2219 * Resizing the fanout could result in creation of new SRSs.
2178 2220	 * They may not necessarily be in the quiesced state, in which
2179 2221	 * case they need not be restarted.
2180 2222 */
2181 2223 if (!SRS_QUIESCED(srs))
2182 2224 return;
2183 2225
2184 2226 mac_srs_signal(srs, SRS_RESTART);
2185 2227 mac_srs_quiesce_wait(srs, SRS_RESTART_DONE);
2186 2228 mac_srs_clear_flag(srs, SRS_RESTART_DONE);
2187 2229 }
2188 2230
2189 2231 /*
2190 2232  * Temporary quiesce of a flow and associated Tx SRS.
2191 2233 * Please see block comment above mac_rx_srs_quiesce
2192 2234 */
2193 2235 /* ARGSUSED */
2194 2236 int
2195 2237 mac_tx_flow_quiesce(flow_entry_t *flent, void *arg)
2196 2238 {
2197 2239 /*
2198 2240 * The fe_tx_srs is null for a subflow on an interface that is
2199 2241 * not plumbed
2200 2242 */
2201 2243 if (flent->fe_tx_srs != NULL)
2202 2244 mac_tx_srs_quiesce(flent->fe_tx_srs, SRS_QUIESCE);
2203 2245 return (0);
2204 2246 }
2205 2247
2206 2248 /* ARGSUSED */
2207 2249 int
2208 2250 mac_tx_flow_restart(flow_entry_t *flent, void *arg)
2209 2251 {
2210 2252 /*
2211 2253 * The fe_tx_srs is null for a subflow on an interface that is
2212 2254 * not plumbed
2213 2255 */
2214 2256 if (flent->fe_tx_srs != NULL)
2215 2257 mac_tx_srs_restart(flent->fe_tx_srs);
2216 2258 return (0);
2217 2259 }
2218 2260
2219 2261 static void
2220 2262 i_mac_tx_client_quiesce(mac_client_handle_t mch, uint_t srs_quiesce_flag)
2221 2263 {
2222 2264 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
2223 2265
2224 2266 ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));
2225 2267
2226 2268 mac_tx_client_block(mcip);
2227 2269 if (MCIP_TX_SRS(mcip) != NULL) {
2228 2270 mac_tx_srs_quiesce(MCIP_TX_SRS(mcip), srs_quiesce_flag);
2229 2271 (void) mac_flow_walk_nolock(mcip->mci_subflow_tab,
2230 2272 mac_tx_flow_quiesce, NULL);
2231 2273 }
2232 2274 }
2233 2275
2234 2276 void
2235 2277 mac_tx_client_quiesce(mac_client_handle_t mch)
2236 2278 {
2237 2279 i_mac_tx_client_quiesce(mch, SRS_QUIESCE);
2238 2280 }
2239 2281
2240 2282 void
2241 2283 mac_tx_client_condemn(mac_client_handle_t mch)
2242 2284 {
2243 2285 i_mac_tx_client_quiesce(mch, SRS_CONDEMNED);
2244 2286 }
2245 2287
2246 2288 void
2247 2289 mac_tx_client_restart(mac_client_handle_t mch)
2248 2290 {
2249 2291 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
2250 2292
2251 2293 ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));
2252 2294
2253 2295 mac_tx_client_unblock(mcip);
2254 2296 if (MCIP_TX_SRS(mcip) != NULL) {
2255 2297 mac_tx_srs_restart(MCIP_TX_SRS(mcip));
2256 2298 (void) mac_flow_walk_nolock(mcip->mci_subflow_tab,
2257 2299 mac_tx_flow_restart, NULL);
2258 2300 }
2259 2301 }
2260 2302
2261 2303 void
2262 2304 mac_tx_client_flush(mac_client_impl_t *mcip)
2263 2305 {
2264 2306 ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));
2265 2307
2266 2308 mac_tx_client_quiesce((mac_client_handle_t)mcip);
2267 2309 mac_tx_client_restart((mac_client_handle_t)mcip);
2268 2310 }
2269 2311
2270 2312 void
2271 2313 mac_client_quiesce(mac_client_impl_t *mcip)
2272 2314 {
2273 2315 mac_rx_client_quiesce((mac_client_handle_t)mcip);
2274 2316 mac_tx_client_quiesce((mac_client_handle_t)mcip);
2275 2317 }
2276 2318
2277 2319 void
2278 2320 mac_client_restart(mac_client_impl_t *mcip)
2279 2321 {
2280 2322 mac_rx_client_restart((mac_client_handle_t)mcip);
2281 2323 mac_tx_client_restart((mac_client_handle_t)mcip);
2282 2324 }
2283 2325
2284 2326 /*
2285 2327 * Allocate a minor number.
2286 2328 */
2287 2329 minor_t
2288 2330 mac_minor_hold(boolean_t sleep)
2289 2331 {
2290 2332 id_t id;
2291 2333
2292 2334 /*
2293 2335 * Grab a value from the arena.
2294 2336 */
2295 2337 atomic_inc_32(&minor_count);
2296 2338
2297 2339 if (sleep)
2298 2340 return ((uint_t)id_alloc(minor_ids));
2299 2341
2300 2342 if ((id = id_alloc_nosleep(minor_ids)) == -1) {
2301 2343 atomic_dec_32(&minor_count);
2302 2344 return (0);
2303 2345 }
2304 2346
2305 2347 return ((uint_t)id);
2306 2348 }
2307 2349
2308 2350 /*
2309 2351 * Release a previously allocated minor number.
2310 2352 */
2311 2353 void
2312 2354 mac_minor_rele(minor_t minor)
2313 2355 {
2314 2356 /*
2315 2357 * Return the value to the arena.
2316 2358 */
2317 2359 id_free(minor_ids, minor);
2318 2360 atomic_dec_32(&minor_count);
2319 2361 }
2320 2362
2321 2363 uint32_t
2322 2364 mac_no_notification(mac_handle_t mh)
2323 2365 {
2324 2366 mac_impl_t *mip = (mac_impl_t *)mh;
2325 2367
2326 2368 return (((mip->mi_state_flags & MIS_LEGACY) != 0) ?
2327 2369 mip->mi_capab_legacy.ml_unsup_note : 0);
2328 2370 }
2329 2371
2330 2372 /*
2331 2373 * Prevent any new opens of this mac in preparation for unregister
2332 2374 */
2333 2375 int
2334 2376 i_mac_disable(mac_impl_t *mip)
2335 2377 {
2336 2378 mac_client_impl_t *mcip;
2337 2379
2338 2380 rw_enter(&i_mac_impl_lock, RW_WRITER);
2339 2381 if (mip->mi_state_flags & MIS_DISABLED) {
2340 2382 /* Already disabled, return success */
2341 2383 rw_exit(&i_mac_impl_lock);
2342 2384 return (0);
2343 2385 }
2344 2386 /*
2345 2387	 * See if there are any other references to this mac_t (e.g., VLANs).
2346 2388	 * If so return failure. If all the other checks below pass, then
2347 2389	 * set mi_disabled atomically under the i_mac_impl_lock to prevent
2348 2390	 * any new VLANs from being created or new mac client opens of this
2349 2391 * mac end point.
2350 2392 */
2351 2393 if (mip->mi_ref > 0) {
2352 2394 rw_exit(&i_mac_impl_lock);
2353 2395 return (EBUSY);
2354 2396 }
2355 2397
2356 2398 /*
2357 2399 * mac clients must delete all multicast groups they join before
2358 2400	 * closing. bcast groups are reference counted; the last client
2359 2401	 * to delete the group will wait till the group is physically
2360 2402	 * deleted. Since all clients have closed this mac end point,
2361 2403	 * mi_bcast_ngrps must be zero at this point.
2362 2404 */
2363 2405 ASSERT(mip->mi_bcast_ngrps == 0);
2364 2406
2365 2407 /*
2366 2408 * Don't let go of this if it has some flows.
2367 2409 * All other code guarantees no flows are added to a disabled
2368 2410 * mac, therefore it is sufficient to check for the flow table
2369 2411 * only here.
2370 2412 */
2371 2413 mcip = mac_primary_client_handle(mip);
2372 2414 if ((mcip != NULL) && mac_link_has_flows((mac_client_handle_t)mcip)) {
2373 2415 rw_exit(&i_mac_impl_lock);
2374 2416 return (ENOTEMPTY);
2375 2417 }
2376 2418
2377 2419 mip->mi_state_flags |= MIS_DISABLED;
2378 2420 rw_exit(&i_mac_impl_lock);
2379 2421 return (0);
2380 2422 }
2381 2423
2382 2424 int
2383 2425 mac_disable_nowait(mac_handle_t mh)
2384 2426 {
2385 2427 mac_impl_t *mip = (mac_impl_t *)mh;
2386 2428 int err;
2387 2429
2388 2430 if ((err = i_mac_perim_enter_nowait(mip)) != 0)
2389 2431 return (err);
2390 2432 err = i_mac_disable(mip);
2391 2433 i_mac_perim_exit(mip);
2392 2434 return (err);
2393 2435 }
2394 2436
2395 2437 int
2396 2438 mac_disable(mac_handle_t mh)
2397 2439 {
2398 2440 mac_impl_t *mip = (mac_impl_t *)mh;
2399 2441 int err;
2400 2442
2401 2443 i_mac_perim_enter(mip);
2402 2444 err = i_mac_disable(mip);
2403 2445 i_mac_perim_exit(mip);
2404 2446
2405 2447 /*
2406 2448 * Clean up notification thread and wait for it to exit.
2407 2449 */
2408 2450 if (err == 0)
2409 2451 i_mac_notify_exit(mip);
2410 2452
2411 2453 return (err);
2412 2454 }
2413 2455
2414 2456 /*
2415 2457  * Called when the MAC instance has a non-empty flow table, to de-multiplex
2416 2458 * incoming packets to the right flow.
2417 - * The MAC's rw lock is assumed held as a READER.
2418 2459 */
2419 2460 /* ARGSUSED */
2420 2461 static mblk_t *
2421 2462 mac_rx_classify(mac_impl_t *mip, mac_resource_handle_t mrh, mblk_t *mp)
2422 2463 {
2423 2464 flow_entry_t *flent = NULL;
2424 2465 uint_t flags = FLOW_INBOUND;
2425 2466 int err;
2426 2467
2427 2468 /*
2428 - * If the mac is a port of an aggregation, pass FLOW_IGNORE_VLAN
2469 + * If the MAC is a port of an aggregation, pass FLOW_IGNORE_VLAN
2429 2470 * to mac_flow_lookup() so that the VLAN packets can be successfully
2430 2471 * passed to the non-VLAN aggregation flows.
2431 2472 *
2432 2473 * Note that there is possibly a race between this and
2433 2474 * mac_unicast_remove/add() and VLAN packets could be incorrectly
2434 - * classified to non-VLAN flows of non-aggregation mac clients. These
2435 - * VLAN packets will be then filtered out by the mac module.
2475 + * classified to non-VLAN flows of non-aggregation MAC clients. These
2476 + * VLAN packets will be then filtered out by the MAC module.
2436 2477 */
2437 2478 if ((mip->mi_state_flags & MIS_EXCLUSIVE) != 0)
2438 2479 flags |= FLOW_IGNORE_VLAN;
2439 2480
2440 2481 err = mac_flow_lookup(mip->mi_flow_tab, mp, flags, &flent);
2441 2482 if (err != 0) {
2442 2483 /* no registered receive function */
2443 2484 return (mp);
2444 2485 } else {
2445 2486 mac_client_impl_t *mcip;
2446 2487
2447 2488 /*
2448 2489 * This flent might just be an additional one on the MAC client,
2449 2490		 * the resources, SRS et al., are in the mci_flent, so if
2450 2491 * the resources, SRS et. al., are in the mci_flent, so if
2451 2492 * this isn't the mci_flent, we need to get it.
2452 2493 */
2453 2494 if ((mcip = flent->fe_mcip) != NULL &&
2454 2495 mcip->mci_flent != flent) {
2455 2496 FLOW_REFRELE(flent);
2456 2497 flent = mcip->mci_flent;
2457 2498 FLOW_TRY_REFHOLD(flent, err);
2458 2499 if (err != 0)
2459 2500 return (mp);
2460 2501 }
2461 2502 (flent->fe_cb_fn)(flent->fe_cb_arg1, flent->fe_cb_arg2, mp,
2462 2503 B_FALSE);
2463 2504 FLOW_REFRELE(flent);
2464 2505 }
2465 2506 return (NULL);
2466 2507 }
2467 2508
2468 2509 mblk_t *
2469 2510 mac_rx_flow(mac_handle_t mh, mac_resource_handle_t mrh, mblk_t *mp_chain)
2470 2511 {
2471 2512 mac_impl_t *mip = (mac_impl_t *)mh;
2472 2513 mblk_t *bp, *bp1, **bpp, *list = NULL;
2473 2514
2474 2515 /*
2475 2516 * We walk the chain and attempt to classify each packet.
2476 2517	 * The packets that couldn't be classified will be returned
2477 2518	 * to the caller.
2478 2519 */
2479 2520 bp = mp_chain;
2480 2521 bpp = &list;
2481 2522 while (bp != NULL) {
2482 2523 bp1 = bp;
2483 2524 bp = bp->b_next;
2484 2525 bp1->b_next = NULL;
2485 2526
2486 2527 if (mac_rx_classify(mip, mrh, bp1) != NULL) {
2487 2528 *bpp = bp1;
2488 2529 bpp = &bp1->b_next;
2489 2530 }
2490 2531 }
2491 2532 return (list);
2492 2533 }
2493 2534
2494 2535 static int
2495 2536 mac_tx_flow_srs_wakeup(flow_entry_t *flent, void *arg)
2496 2537 {
2497 2538 mac_ring_handle_t ring = arg;
2498 2539
2499 2540 if (flent->fe_tx_srs)
2500 2541 mac_tx_srs_wakeup(flent->fe_tx_srs, ring);
2501 2542 return (0);
2502 2543 }
2503 2544
2504 2545 void
2505 2546 i_mac_tx_srs_notify(mac_impl_t *mip, mac_ring_handle_t ring)
2506 2547 {
2507 2548 mac_client_impl_t *cclient;
2508 2549 mac_soft_ring_set_t *mac_srs;
2509 2550
2510 2551 /*
2511 2552 * After grabbing the mi_rw_lock, the list of clients can't change.
2512 2553 * If there are any clients mi_disabled must be B_FALSE and can't
2513 2554 * get set since there are clients. If there aren't any clients we
2514 2555 * don't do anything. In any case the mip has to be valid. The driver
2515 2556 * must make sure that it goes single threaded (with respect to mac
2516 2557	 * calls) and waits for all pending mac calls to finish before calling
2517 2558 * mac_unregister.
2518 2559 */
2519 2560 rw_enter(&i_mac_impl_lock, RW_READER);
2520 2561 if (mip->mi_state_flags & MIS_DISABLED) {
2521 2562 rw_exit(&i_mac_impl_lock);
2522 2563 return;
2523 2564 }
2524 2565
2525 2566 /*
2526 2567	 * Get the MAC Tx SRS by walking the mac client list.
2527 2568 */
2528 2569 rw_enter(&mip->mi_rw_lock, RW_READER);
2529 2570 for (cclient = mip->mi_clients_list; cclient != NULL;
2530 2571 cclient = cclient->mci_client_next) {
2531 2572 if ((mac_srs = MCIP_TX_SRS(cclient)) != NULL) {
2532 2573 mac_tx_srs_wakeup(mac_srs, ring);
2533 2574 } else {
2534 2575 /*
2535 2576 * Aggr opens underlying ports in exclusive mode
2536 2577 * and registers flow control callbacks using
2537 2578 * mac_tx_client_notify(). When opened in
2538 2579 * exclusive mode, Tx SRS won't be created
2539 2580 * during mac_unicast_add().
2540 2581 */
2541 2582 if (cclient->mci_state_flags & MCIS_EXCLUSIVE) {
2542 2583 mac_tx_invoke_callbacks(cclient,
2543 2584 (mac_tx_cookie_t)ring);
2544 2585 }
2545 2586 }
2546 2587 (void) mac_flow_walk(cclient->mci_subflow_tab,
2547 2588 mac_tx_flow_srs_wakeup, ring);
2548 2589 }
2549 2590 rw_exit(&mip->mi_rw_lock);
2550 2591 rw_exit(&i_mac_impl_lock);
2551 2592 }
2552 2593
2553 2594 /* ARGSUSED */
2554 2595 void
2555 2596 mac_multicast_refresh(mac_handle_t mh, mac_multicst_t refresh, void *arg,
2556 2597 boolean_t add)
2557 2598 {
2558 2599 mac_impl_t *mip = (mac_impl_t *)mh;
2559 2600
2560 2601 i_mac_perim_enter((mac_impl_t *)mh);
2561 2602 /*
2562 2603 * If no specific refresh function was given then default to the
2563 2604 * driver's m_multicst entry point.
2564 2605 */
2565 2606 if (refresh == NULL) {
2566 2607 refresh = mip->mi_multicst;
2567 2608 arg = mip->mi_driver;
2568 2609 }
2569 2610
2570 2611 mac_bcast_refresh(mip, refresh, arg, add);
2571 2612 i_mac_perim_exit((mac_impl_t *)mh);
2572 2613 }
2573 2614
2574 2615 void
2575 2616 mac_promisc_refresh(mac_handle_t mh, mac_setpromisc_t refresh, void *arg)
2576 2617 {
2577 2618 mac_impl_t *mip = (mac_impl_t *)mh;
2578 2619
2579 2620 /*
2580 2621 * If no specific refresh function was given then default to the
2581 2622 * driver's m_promisc entry point.
2582 2623 */
2583 2624 if (refresh == NULL) {
2584 2625 refresh = mip->mi_setpromisc;
2585 2626 arg = mip->mi_driver;
2586 2627 }
2587 2628 ASSERT(refresh != NULL);
2588 2629
2589 2630 /*
2590 2631 * Call the refresh function with the current promiscuity.
2591 2632 */
2592 2633 refresh(arg, (mip->mi_devpromisc != 0));
2593 2634 }
2594 2635
2595 2636 /*
2596 2637  * The mac client requests that the mac not change its margin size to
2597 2638  * be less than the specified value. If "current" is B_TRUE, then the client
2598 2639  * requests that the mac not change its margin size to be smaller than the
2599 2640  * current size. Further, return the current margin size value in this case.
2600 2641 *
2601 2642 * We keep every requested size in an ordered list from largest to smallest.
2602 2643 */
2603 2644 int
2604 2645 mac_margin_add(mac_handle_t mh, uint32_t *marginp, boolean_t current)
2605 2646 {
2606 2647 mac_impl_t *mip = (mac_impl_t *)mh;
2607 2648 mac_margin_req_t **pp, *p;
2608 2649 int err = 0;
2609 2650
2610 2651 rw_enter(&(mip->mi_rw_lock), RW_WRITER);
2611 2652 if (current)
2612 2653 *marginp = mip->mi_margin;
2613 2654
2614 2655 /*
2615 2656 * If the current margin value cannot satisfy the margin requested,
2616 2657 * return ENOTSUP directly.
2617 2658 */
2618 2659 if (*marginp > mip->mi_margin) {
2619 2660 err = ENOTSUP;
2620 2661 goto done;
2621 2662 }
2622 2663
2623 2664 /*
2624 2665 * Check whether the given margin is already in the list. If so,
2625 2666 * bump the reference count.
2626 2667 */
2627 2668 for (pp = &mip->mi_mmrp; (p = *pp) != NULL; pp = &p->mmr_nextp) {
2628 2669 if (p->mmr_margin == *marginp) {
2629 2670 /*
2630 2671 * The margin requested is already in the list,
2631 2672 * so just bump the reference count.
2632 2673 */
2633 2674 p->mmr_ref++;
2634 2675 goto done;
2635 2676 }
2636 2677 if (p->mmr_margin < *marginp)
2637 2678 break;
2638 2679 }
2639 2680
2640 2681
2641 2682 p = kmem_zalloc(sizeof (mac_margin_req_t), KM_SLEEP);
2642 2683 p->mmr_margin = *marginp;
2643 2684 p->mmr_ref++;
2644 2685 p->mmr_nextp = *pp;
2645 2686 *pp = p;
2646 2687
2647 2688 done:
2648 2689 rw_exit(&(mip->mi_rw_lock));
2649 2690 return (err);
2650 2691 }
2651 2692
2652 2693 /*
2653 2694 * The mac client requests to cancel its previous mac_margin_add() request.
2654 2695 * We remove the requested margin size from the list.
2655 2696 */
2656 2697 int
2657 2698 mac_margin_remove(mac_handle_t mh, uint32_t margin)
2658 2699 {
2659 2700 mac_impl_t *mip = (mac_impl_t *)mh;
2660 2701 mac_margin_req_t **pp, *p;
2661 2702 int err = 0;
2662 2703
2663 2704 rw_enter(&(mip->mi_rw_lock), RW_WRITER);
2664 2705 /*
2665 2706 * Find the entry in the list for the given margin.
2666 2707 */
2667 2708 for (pp = &(mip->mi_mmrp); (p = *pp) != NULL; pp = &(p->mmr_nextp)) {
2668 2709 if (p->mmr_margin == margin) {
2669 2710 if (--p->mmr_ref == 0)
2670 2711 break;
2671 2712
2672 2713 /*
2673 2714			 * There is still a reference to this margin so
2674 2715 * there's nothing more to do.
2675 2716 */
2676 2717 goto done;
2677 2718 }
2678 2719 }
2679 2720
2680 2721 /*
2681 2722 * We did not find an entry for the given margin.
2682 2723 */
2683 2724 if (p == NULL) {
2684 2725 err = ENOENT;
2685 2726 goto done;
2686 2727 }
2687 2728
2688 2729 ASSERT(p->mmr_ref == 0);
2689 2730
2690 2731 /*
2691 2732 * Remove it from the list.
2692 2733 */
2693 2734 *pp = p->mmr_nextp;
2694 2735 kmem_free(p, sizeof (mac_margin_req_t));
2695 2736 done:
2696 2737 rw_exit(&(mip->mi_rw_lock));
2697 2738 return (err);
2698 2739 }
2699 2740
2700 2741 boolean_t
2701 2742 mac_margin_update(mac_handle_t mh, uint32_t margin)
2702 2743 {
2703 2744 mac_impl_t *mip = (mac_impl_t *)mh;
2704 2745 uint32_t margin_needed = 0;
2705 2746
2706 2747 rw_enter(&(mip->mi_rw_lock), RW_WRITER);
2707 2748
2708 2749 if (mip->mi_mmrp != NULL)
2709 2750 margin_needed = mip->mi_mmrp->mmr_margin;
2710 2751
2711 2752 if (margin_needed <= margin)
2712 2753 mip->mi_margin = margin;
2713 2754
2714 2755 rw_exit(&(mip->mi_rw_lock));
2715 2756
2716 2757 if (margin_needed <= margin)
2717 2758 i_mac_notify(mip, MAC_NOTE_MARGIN);
2718 2759
2719 2760 return (margin_needed <= margin);
2720 2761 }
2721 2762
2722 2763 /*
2723 2764 * MAC clients use this interface to request that a MAC device not change its
2724 2765 * MTU below the specified amount. At this time, that amount must be within the
2725 2766  * range of the device's current minimum and the device's current maximum; e.g. a
2726 2767 * client cannot request a 3000 byte MTU when the device's MTU is currently
2727 2768 * 2000.
2728 2769 *
2729 2770  * If "current" is set to B_TRUE, then the request is simply to reserve the
2730 2771 * current underlying mac's maximum for this mac client and return it in mtup.
2731 2772 */
2732 2773 int
2733 2774 mac_mtu_add(mac_handle_t mh, uint32_t *mtup, boolean_t current)
2734 2775 {
2735 2776 mac_impl_t *mip = (mac_impl_t *)mh;
2736 2777 mac_mtu_req_t *prev, *cur;
2737 2778 mac_propval_range_t mpr;
2738 2779 int err;
2739 2780
2740 2781 i_mac_perim_enter(mip);
2741 2782 rw_enter(&mip->mi_rw_lock, RW_WRITER);
2742 2783
2743 2784 if (current == B_TRUE)
2744 2785 *mtup = mip->mi_sdu_max;
2745 2786 mpr.mpr_count = 1;
2746 2787 err = mac_prop_info(mh, MAC_PROP_MTU, "mtu", NULL, 0, &mpr, NULL);
2747 2788 if (err != 0) {
2748 2789 rw_exit(&mip->mi_rw_lock);
2749 2790 i_mac_perim_exit(mip);
2750 2791 return (err);
2751 2792 }
2752 2793
2753 2794 if (*mtup > mip->mi_sdu_max ||
2754 2795 *mtup < mpr.mpr_range_uint32[0].mpur_min) {
2755 2796 rw_exit(&mip->mi_rw_lock);
2756 2797 i_mac_perim_exit(mip);
2757 2798 return (ENOTSUP);
2758 2799 }
2759 2800
2760 2801 prev = NULL;
2761 2802 for (cur = mip->mi_mtrp; cur != NULL; cur = cur->mtr_nextp) {
2762 2803 if (*mtup == cur->mtr_mtu) {
2763 2804 cur->mtr_ref++;
2764 2805 rw_exit(&mip->mi_rw_lock);
2765 2806 i_mac_perim_exit(mip);
2766 2807 return (0);
2767 2808 }
2768 2809
2769 2810 if (*mtup > cur->mtr_mtu)
2770 2811 break;
2771 2812
2772 2813 prev = cur;
2773 2814 }
2774 2815
2775 2816 cur = kmem_alloc(sizeof (mac_mtu_req_t), KM_SLEEP);
2776 2817 cur->mtr_mtu = *mtup;
2777 2818 cur->mtr_ref = 1;
2778 2819 if (prev != NULL) {
2779 2820 cur->mtr_nextp = prev->mtr_nextp;
2780 2821 prev->mtr_nextp = cur;
2781 2822 } else {
2782 2823 cur->mtr_nextp = mip->mi_mtrp;
2783 2824 mip->mi_mtrp = cur;
2784 2825 }
2785 2826
2786 2827 rw_exit(&mip->mi_rw_lock);
2787 2828 i_mac_perim_exit(mip);
2788 2829 return (0);
2789 2830 }
2790 2831
2791 2832 int
2792 2833 mac_mtu_remove(mac_handle_t mh, uint32_t mtu)
2793 2834 {
2794 2835 mac_impl_t *mip = (mac_impl_t *)mh;
2795 2836 mac_mtu_req_t *cur, *prev;
2796 2837
2797 2838 i_mac_perim_enter(mip);
2798 2839 rw_enter(&mip->mi_rw_lock, RW_WRITER);
2799 2840
2800 2841 prev = NULL;
2801 2842 for (cur = mip->mi_mtrp; cur != NULL; cur = cur->mtr_nextp) {
2802 2843 if (cur->mtr_mtu == mtu) {
2803 2844 ASSERT(cur->mtr_ref > 0);
2804 2845 cur->mtr_ref--;
2805 2846 if (cur->mtr_ref == 0) {
2806 2847 if (prev == NULL) {
2807 2848 mip->mi_mtrp = cur->mtr_nextp;
2808 2849 } else {
2809 2850 prev->mtr_nextp = cur->mtr_nextp;
2810 2851 }
2811 2852 kmem_free(cur, sizeof (mac_mtu_req_t));
2812 2853 }
2813 2854 rw_exit(&mip->mi_rw_lock);
2814 2855 i_mac_perim_exit(mip);
2815 2856 return (0);
2816 2857 }
2817 2858
2818 2859 prev = cur;
2819 2860 }
2820 2861
2821 2862 rw_exit(&mip->mi_rw_lock);
2822 2863 i_mac_perim_exit(mip);
2823 2864 return (ENOENT);
2824 2865 }
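
/*
 * Illustrative sketch (annotation, not part of the source): a client
 * pinning the device's current maximum MTU. With "current" set the
 * reserved value is returned in *mtup, and the same value must later
 * be passed to mac_mtu_remove(). The function name is hypothetical.
 */
static int
example_pin_mtu(mac_handle_t mh, uint32_t *mtup)
{
	int err;

	if ((err = mac_mtu_add(mh, mtup, B_TRUE)) != 0)
		return (err);

	/* ... the driver won't lower its MTU below *mtup ... */

	return (mac_mtu_remove(mh, *mtup));
}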
2825 2866
2826 2867 /*
2827 2868 * MAC Type Plugin functions.
2828 2869 */
2829 2870
2830 2871 mactype_t *
2831 2872 mactype_getplugin(const char *pname)
2832 2873 {
2833 2874 mactype_t *mtype = NULL;
2834 2875 boolean_t tried_modload = B_FALSE;
2835 2876
2836 2877 mutex_enter(&i_mactype_lock);
2837 2878
2838 2879 find_registered_mactype:
2839 2880 if (mod_hash_find(i_mactype_hash, (mod_hash_key_t)pname,
2840 2881 (mod_hash_val_t *)&mtype) != 0) {
2841 2882 if (!tried_modload) {
2842 2883 /*
2843 2884 * If the plugin has not yet been loaded, then
2844 2885 * attempt to load it now. If modload() succeeds,
2845 2886 * the plugin should have registered using
2846 2887 * mactype_register(), in which case we can go back
2847 2888 * and attempt to find it again.
2848 2889 */
2849 2890 if (modload(MACTYPE_KMODDIR, (char *)pname) != -1) {
2850 2891 tried_modload = B_TRUE;
2851 2892 goto find_registered_mactype;
2852 2893 }
2853 2894 }
2854 2895 } else {
2855 2896 /*
2856 2897 * Note that there's no danger that the plugin we've loaded
2857 2898 * could be unloaded between the modload() step and the
2858 2899 * reference count bump here, as we're holding
2859 2900 * i_mactype_lock, which mactype_unregister() also holds.
2860 2901 */
2861 2902 atomic_inc_32(&mtype->mt_ref);
2862 2903 }
2863 2904
2864 2905 mutex_exit(&i_mactype_lock);
2865 2906 return (mtype);
2866 2907 }
2867 2908
2868 2909 mactype_register_t *
2869 2910 mactype_alloc(uint_t mactype_version)
2870 2911 {
2871 2912 mactype_register_t *mtrp;
2872 2913
2873 2914 /*
2874 2915 * Make sure there isn't a version mismatch between the plugin and
2875 2916 * the framework. In the future, if multiple versions are
2876 2917 * supported, this check could become more sophisticated.
2877 2918 */
2878 2919 if (mactype_version != MACTYPE_VERSION)
2879 2920 return (NULL);
2880 2921
2881 2922 mtrp = kmem_zalloc(sizeof (mactype_register_t), KM_SLEEP);
2882 2923 mtrp->mtr_version = mactype_version;
2883 2924 return (mtrp);
2884 2925 }
2885 2926
2886 2927 void
2887 2928 mactype_free(mactype_register_t *mtrp)
2888 2929 {
2889 2930 kmem_free(mtrp, sizeof (mactype_register_t));
2890 2931 }
2891 2932
2892 2933 int
2893 2934 mactype_register(mactype_register_t *mtrp)
2894 2935 {
2895 2936 mactype_t *mtp;
2896 2937 mactype_ops_t *ops = mtrp->mtr_ops;
2897 2938
2898 2939 /* Do some sanity checking before we register this MAC type. */
2899 2940 if (mtrp->mtr_ident == NULL || ops == NULL)
2900 2941 return (EINVAL);
2901 2942
2902 2943 /*
2903 2944 * Verify that all mandatory callbacks are set in the ops
2904 2945 * vector.
2905 2946 */
2906 2947 if (ops->mtops_unicst_verify == NULL ||
2907 2948 ops->mtops_multicst_verify == NULL ||
2908 2949 ops->mtops_sap_verify == NULL ||
2909 2950 ops->mtops_header == NULL ||
2910 2951 ops->mtops_header_info == NULL) {
2911 2952 return (EINVAL);
2912 2953 }
2913 2954
2914 2955 mtp = kmem_zalloc(sizeof (*mtp), KM_SLEEP);
2915 2956 mtp->mt_ident = mtrp->mtr_ident;
2916 2957 mtp->mt_ops = *ops;
2917 2958 mtp->mt_type = mtrp->mtr_mactype;
2918 2959 mtp->mt_nativetype = mtrp->mtr_nativetype;
2919 2960 mtp->mt_addr_length = mtrp->mtr_addrlen;
2920 2961 if (mtrp->mtr_brdcst_addr != NULL) {
2921 2962 mtp->mt_brdcst_addr = kmem_alloc(mtrp->mtr_addrlen, KM_SLEEP);
2922 2963 bcopy(mtrp->mtr_brdcst_addr, mtp->mt_brdcst_addr,
2923 2964 mtrp->mtr_addrlen);
2924 2965 }
2925 2966
2926 2967 mtp->mt_stats = mtrp->mtr_stats;
2927 2968 mtp->mt_statcount = mtrp->mtr_statcount;
2928 2969
2929 2970 mtp->mt_mapping = mtrp->mtr_mapping;
2930 2971 mtp->mt_mappingcount = mtrp->mtr_mappingcount;
2931 2972
2932 2973 if (mod_hash_insert(i_mactype_hash,
2933 2974 (mod_hash_key_t)mtp->mt_ident, (mod_hash_val_t)mtp) != 0) {
2934 2975 kmem_free(mtp->mt_brdcst_addr, mtp->mt_addr_length);
2935 2976 kmem_free(mtp, sizeof (*mtp));
2936 2977 return (EEXIST);
2937 2978 }
2938 2979 return (0);
2939 2980 }
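
/*
 * Illustrative sketch (annotation, not part of the source): the
 * registration sequence a MAC-type plugin's _init(9E) would follow.
 * mactype_register() copies the fields it needs, so the register
 * struct is freed regardless of the outcome. The identifier, ops
 * vector, and Ethernet values below are hypothetical.
 */
static int
example_register_plugin(mactype_ops_t *my_ops)
{
	mactype_register_t *mtrp;
	int err;

	if ((mtrp = mactype_alloc(MACTYPE_VERSION)) == NULL)
		return (ENOTSUP);	/* framework/plugin version mismatch */

	mtrp->mtr_ident = "example";
	mtrp->mtr_ops = my_ops;
	mtrp->mtr_mactype = DL_ETHER;
	mtrp->mtr_nativetype = DL_ETHER;
	mtrp->mtr_addrlen = ETHERADDRL;

	err = mactype_register(mtrp);
	mactype_free(mtrp);
	return (err);
}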
2940 2981
2941 2982 int
2942 2983 mactype_unregister(const char *ident)
2943 2984 {
2944 2985 mactype_t *mtp;
2945 2986 mod_hash_val_t val;
2946 2987 int err;
2947 2988
2948 2989 /*
2949 2990 * Let's not allow MAC drivers to use this plugin while we're
2950 2991 * trying to unregister it. Holding i_mactype_lock also prevents a
2951 2992 * plugin from unregistering while a MAC driver is attempting to
2952 2993 * hold a reference to it in i_mactype_getplugin().
2953 2994	 * hold a reference to it in mactype_getplugin().
2954 2995 mutex_enter(&i_mactype_lock);
2955 2996
2956 2997 if ((err = mod_hash_find(i_mactype_hash, (mod_hash_key_t)ident,
2957 2998 (mod_hash_val_t *)&mtp)) != 0) {
2958 2999 /* A plugin is trying to unregister, but it never registered. */
2959 3000 err = ENXIO;
2960 3001 goto done;
2961 3002 }
2962 3003
2963 3004 if (mtp->mt_ref != 0) {
2964 3005 err = EBUSY;
2965 3006 goto done;
2966 3007 }
2967 3008
2968 3009 err = mod_hash_remove(i_mactype_hash, (mod_hash_key_t)ident, &val);
2969 3010 ASSERT(err == 0);
2970 3011 if (err != 0) {
2971 3012 /* This should never happen, thus the ASSERT() above. */
2972 3013 err = EINVAL;
2973 3014 goto done;
2974 3015 }
2975 3016 ASSERT(mtp == (mactype_t *)val);
2976 3017
2977 3018 if (mtp->mt_brdcst_addr != NULL)
2978 3019 kmem_free(mtp->mt_brdcst_addr, mtp->mt_addr_length);
2979 3020 kmem_free(mtp, sizeof (mactype_t));
2980 3021 done:
2981 3022 mutex_exit(&i_mactype_lock);
2982 3023 return (err);
2983 3024 }
2984 3025
2985 3026 /*
2986 3027  * Checks the size of the value specified for a property as
2987 3028 * part of a property operation. Returns B_TRUE if the size is
2988 3029 * correct, B_FALSE otherwise.
2989 3030 */
2990 3031 boolean_t
2991 3032 mac_prop_check_size(mac_prop_id_t id, uint_t valsize, boolean_t is_range)
2992 3033 {
2993 3034 uint_t minsize = 0;
2994 3035
2995 3036 if (is_range)
2996 3037 return (valsize >= sizeof (mac_propval_range_t));
2997 3038
2998 3039 switch (id) {
2999 3040 case MAC_PROP_ZONE:
3000 3041 minsize = sizeof (dld_ioc_zid_t);
3001 3042 break;
3002 3043 case MAC_PROP_AUTOPUSH:
3003 3044 if (valsize != 0)
3004 3045 minsize = sizeof (struct dlautopush);
3005 3046 break;
3006 3047 case MAC_PROP_TAGMODE:
3007 3048 minsize = sizeof (link_tagmode_t);
3008 3049 break;
3009 3050 case MAC_PROP_RESOURCE:
3010 3051 case MAC_PROP_RESOURCE_EFF:
3011 3052 minsize = sizeof (mac_resource_props_t);
3012 3053 break;
3013 3054 case MAC_PROP_DUPLEX:
3014 3055 minsize = sizeof (link_duplex_t);
3015 3056 break;
3016 3057 case MAC_PROP_SPEED:
3017 3058 minsize = sizeof (uint64_t);
3018 3059 break;
3019 3060 case MAC_PROP_STATUS:
3020 3061 minsize = sizeof (link_state_t);
3021 3062 break;
3022 3063 case MAC_PROP_AUTONEG:
3023 3064 case MAC_PROP_EN_AUTONEG:
3024 3065 minsize = sizeof (uint8_t);
3025 3066 break;
3026 3067 case MAC_PROP_MTU:
3027 3068 case MAC_PROP_LLIMIT:
3028 3069 case MAC_PROP_LDECAY:
3029 3070 minsize = sizeof (uint32_t);
3030 3071 break;
3031 3072 case MAC_PROP_FLOWCTRL:
3032 3073 minsize = sizeof (link_flowctrl_t);
3033 3074 break;
3034 3075 case MAC_PROP_ADV_5000FDX_CAP:
3035 3076 case MAC_PROP_EN_5000FDX_CAP:
3036 3077 case MAC_PROP_ADV_2500FDX_CAP:
3037 3078 case MAC_PROP_EN_2500FDX_CAP:
3038 3079 case MAC_PROP_ADV_100GFDX_CAP:
3039 3080 case MAC_PROP_EN_100GFDX_CAP:
3040 3081 case MAC_PROP_ADV_50GFDX_CAP:
3041 3082 case MAC_PROP_EN_50GFDX_CAP:
3042 3083 case MAC_PROP_ADV_40GFDX_CAP:
3043 3084 case MAC_PROP_EN_40GFDX_CAP:
3044 3085 case MAC_PROP_ADV_25GFDX_CAP:
3045 3086 case MAC_PROP_EN_25GFDX_CAP:
3046 3087 case MAC_PROP_ADV_10GFDX_CAP:
3047 3088 case MAC_PROP_EN_10GFDX_CAP:
3048 3089 case MAC_PROP_ADV_1000HDX_CAP:
3049 3090 case MAC_PROP_EN_1000HDX_CAP:
3050 3091 case MAC_PROP_ADV_100FDX_CAP:
3051 3092 case MAC_PROP_EN_100FDX_CAP:
3052 3093 case MAC_PROP_ADV_100HDX_CAP:
3053 3094 case MAC_PROP_EN_100HDX_CAP:
3054 3095 case MAC_PROP_ADV_10FDX_CAP:
3055 3096 case MAC_PROP_EN_10FDX_CAP:
3056 3097 case MAC_PROP_ADV_10HDX_CAP:
3057 3098 case MAC_PROP_EN_10HDX_CAP:
3058 3099 case MAC_PROP_ADV_100T4_CAP:
3059 3100 case MAC_PROP_EN_100T4_CAP:
3060 3101 minsize = sizeof (uint8_t);
3061 3102 break;
3062 3103 case MAC_PROP_PVID:
3063 3104 minsize = sizeof (uint16_t);
3064 3105 break;
3065 3106 case MAC_PROP_IPTUN_HOPLIMIT:
3066 3107 minsize = sizeof (uint32_t);
3067 3108 break;
3068 3109 case MAC_PROP_IPTUN_ENCAPLIMIT:
3069 3110 minsize = sizeof (uint32_t);
3070 3111 break;
3071 3112 case MAC_PROP_MAX_TX_RINGS_AVAIL:
3072 3113 case MAC_PROP_MAX_RX_RINGS_AVAIL:
3073 3114 case MAC_PROP_MAX_RXHWCLNT_AVAIL:
3074 3115 case MAC_PROP_MAX_TXHWCLNT_AVAIL:
3075 3116 minsize = sizeof (uint_t);
3076 3117 break;
3077 3118 case MAC_PROP_WL_ESSID:
3078 3119 minsize = sizeof (wl_linkstatus_t);
3079 3120 break;
3080 3121 case MAC_PROP_WL_BSSID:
3081 3122 minsize = sizeof (wl_bssid_t);
3082 3123 break;
3083 3124 case MAC_PROP_WL_BSSTYPE:
3084 3125 minsize = sizeof (wl_bss_type_t);
3085 3126 break;
3086 3127 case MAC_PROP_WL_LINKSTATUS:
3087 3128 minsize = sizeof (wl_linkstatus_t);
3088 3129 break;
3089 3130 case MAC_PROP_WL_DESIRED_RATES:
3090 3131 minsize = sizeof (wl_rates_t);
3091 3132 break;
3092 3133 case MAC_PROP_WL_SUPPORTED_RATES:
3093 3134 minsize = sizeof (wl_rates_t);
3094 3135 break;
3095 3136 case MAC_PROP_WL_AUTH_MODE:
3096 3137 minsize = sizeof (wl_authmode_t);
3097 3138 break;
3098 3139 case MAC_PROP_WL_ENCRYPTION:
3099 3140 minsize = sizeof (wl_encryption_t);
3100 3141 break;
3101 3142 case MAC_PROP_WL_RSSI:
3102 3143 minsize = sizeof (wl_rssi_t);
3103 3144 break;
3104 3145 case MAC_PROP_WL_PHY_CONFIG:
3105 3146 minsize = sizeof (wl_phy_conf_t);
3106 3147 break;
3107 3148 case MAC_PROP_WL_CAPABILITY:
3108 3149 minsize = sizeof (wl_capability_t);
3109 3150 break;
3110 3151 case MAC_PROP_WL_WPA:
3111 3152 minsize = sizeof (wl_wpa_t);
3112 3153 break;
3113 3154 case MAC_PROP_WL_SCANRESULTS:
3114 3155 minsize = sizeof (wl_wpa_ess_t);
3115 3156 break;
3116 3157 case MAC_PROP_WL_POWER_MODE:
3117 3158 minsize = sizeof (wl_ps_mode_t);
3118 3159 break;
3119 3160 case MAC_PROP_WL_RADIO:
3120 3161 minsize = sizeof (wl_radio_t);
3121 3162 break;
3122 3163 case MAC_PROP_WL_ESS_LIST:
3123 3164 minsize = sizeof (wl_ess_list_t);
3124 3165 break;
3125 3166 case MAC_PROP_WL_KEY_TAB:
3126 3167 minsize = sizeof (wl_wep_key_tab_t);
3127 3168 break;
3128 3169 case MAC_PROP_WL_CREATE_IBSS:
3129 3170 minsize = sizeof (wl_create_ibss_t);
3130 3171 break;
3131 3172 case MAC_PROP_WL_SETOPTIE:
3132 3173 minsize = sizeof (wl_wpa_ie_t);
3133 3174 break;
3134 3175 case MAC_PROP_WL_DELKEY:
3135 3176 minsize = sizeof (wl_del_key_t);
3136 3177 break;
3137 3178 case MAC_PROP_WL_KEY:
3138 3179 minsize = sizeof (wl_key_t);
3139 3180 break;
3140 3181 case MAC_PROP_WL_MLME:
3141 3182 minsize = sizeof (wl_mlme_t);
3142 3183 break;
3143 3184 case MAC_PROP_VN_PROMISC_FILTERED:
3144 3185 minsize = sizeof (boolean_t);
3145 3186 break;
3146 3187 }
3147 3188
3148 3189 return (valsize >= minsize);
3149 3190 }
3150 3191
3151 3192 /*
3152 3193 * mac_set_prop() sets MAC or hardware driver properties:
3153 3194 *
3154 3195  * - MAC-managed properties: resource properties such as maxbw,
3155 3196  *   priority, and the cpu binding list, as well as the default port VID
3156 3197 * used by bridging. These properties are consumed by the MAC layer
3157 3198 * itself and not passed down to the driver. For resource control
3158 3199 * properties, this function invokes mac_set_resources() which will
3159 3200 * cache the property value in mac_impl_t and may call
3160 3201 * mac_client_set_resource() to update property value of the primary
3161 3202 * mac client, if it exists.
3162 3203 *
3163 3204 * - Properties which act on the hardware and must be passed to the
3164 3205 * driver, such as MTU, through the driver's mc_setprop() entry point.
3165 3206 */
3166 3207 int
3167 3208 mac_set_prop(mac_handle_t mh, mac_prop_id_t id, char *name, void *val,
3168 3209 uint_t valsize)
3169 3210 {
3170 3211 int err = ENOTSUP;
3171 3212 mac_impl_t *mip = (mac_impl_t *)mh;
3172 3213
3173 3214 ASSERT(MAC_PERIM_HELD(mh));
3174 3215
3175 3216 switch (id) {
3176 3217 case MAC_PROP_RESOURCE: {
3177 3218 mac_resource_props_t *mrp;
3178 3219
3179 3220 /* call mac_set_resources() for MAC properties */
3180 3221 ASSERT(valsize >= sizeof (mac_resource_props_t));
3181 3222 mrp = kmem_zalloc(sizeof (*mrp), KM_SLEEP);
3182 3223 bcopy(val, mrp, sizeof (*mrp));
3183 3224 err = mac_set_resources(mh, mrp);
3184 3225 kmem_free(mrp, sizeof (*mrp));
3185 3226 break;
3186 3227 }
3187 3228
3188 3229 case MAC_PROP_PVID:
3189 3230 ASSERT(valsize >= sizeof (uint16_t));
3190 3231 if (mip->mi_state_flags & MIS_IS_VNIC)
3191 3232 return (EINVAL);
3192 3233 err = mac_set_pvid(mh, *(uint16_t *)val);
3193 3234 break;
3194 3235
3195 3236 case MAC_PROP_MTU: {
3196 3237 uint32_t mtu;
3197 3238
3198 3239 ASSERT(valsize >= sizeof (uint32_t));
3199 3240 bcopy(val, &mtu, sizeof (mtu));
3200 3241 err = mac_set_mtu(mh, mtu, NULL);
3201 3242 break;
3202 3243 }
3203 3244
3204 3245 case MAC_PROP_LLIMIT:
3205 3246 case MAC_PROP_LDECAY: {
3206 3247 uint32_t learnval;
3207 3248
3208 3249 if (valsize < sizeof (learnval) ||
3209 3250 (mip->mi_state_flags & MIS_IS_VNIC))
3210 3251 return (EINVAL);
3211 3252 bcopy(val, &learnval, sizeof (learnval));
3212 3253 if (learnval == 0 && id == MAC_PROP_LDECAY)
3213 3254 return (EINVAL);
3214 3255 if (id == MAC_PROP_LLIMIT)
3215 3256 mip->mi_llimit = learnval;
3216 3257 else
3217 3258 mip->mi_ldecay = learnval;
3218 3259 err = 0;
3219 3260 break;
3220 3261 }
3221 3262
3222 3263 default:
3223 3264 /* For other driver properties, call driver's callback */
3224 3265 if (mip->mi_callbacks->mc_callbacks & MC_SETPROP) {
3225 3266 err = mip->mi_callbacks->mc_setprop(mip->mi_driver,
3226 3267 name, id, valsize, val);
3227 3268 }
3228 3269 }
3229 3270 return (err);
3230 3271 }
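
Because mac_set_prop() asserts MAC_PERIM_HELD(), callers outside this file must bracket the call with the per-MAC perimeter. A minimal sketch using only the public perimeter interfaces; example_set_mtu() is hypothetical:

	static int
	example_set_mtu(mac_handle_t mh, uint32_t mtu)
	{
		mac_perim_handle_t mph;
		int err;

		mac_perim_enter_by_mh(mh, &mph);	/* satisfies MAC_PERIM_HELD() */
		err = mac_set_prop(mh, MAC_PROP_MTU, "mtu", &mtu, sizeof (mtu));
		mac_perim_exit(mph);

		return (err);
	}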
3231 3272
3232 3273 /*
3233 3274 * mac_get_prop() gets MAC or device driver properties.
3234 3275 *
3235 3276  * If the property is a driver property, mac_get_prop() calls the driver's
3236 3277  * callback entry point to get it.
3237 3278 * If the property is a MAC property, mac_get_prop() invokes mac_get_resources()
3238 3279 * which returns the cached value in mac_impl_t.
3239 3280 */
3240 3281 int
3241 3282 mac_get_prop(mac_handle_t mh, mac_prop_id_t id, char *name, void *val,
3242 3283 uint_t valsize)
3243 3284 {
3244 3285 int err = ENOTSUP;
3245 3286 mac_impl_t *mip = (mac_impl_t *)mh;
3246 3287 uint_t rings;
3247 3288 uint_t vlinks;
3248 3289
3249 3290 bzero(val, valsize);
3250 3291
3251 3292 switch (id) {
3252 3293 case MAC_PROP_RESOURCE: {
3253 3294 mac_resource_props_t *mrp;
3254 3295
3255 3296 /* If mac property, read from cache */
3256 3297 ASSERT(valsize >= sizeof (mac_resource_props_t));
3257 3298 mrp = kmem_zalloc(sizeof (*mrp), KM_SLEEP);
3258 3299 mac_get_resources(mh, mrp);
3259 3300 bcopy(mrp, val, sizeof (*mrp));
3260 3301 kmem_free(mrp, sizeof (*mrp));
3261 3302 return (0);
3262 3303 }
3263 3304 case MAC_PROP_RESOURCE_EFF: {
3264 3305 mac_resource_props_t *mrp;
3265 3306
3266 3307 /* If mac effective property, read from client */
3267 3308 ASSERT(valsize >= sizeof (mac_resource_props_t));
3268 3309 mrp = kmem_zalloc(sizeof (*mrp), KM_SLEEP);
3269 3310 mac_get_effective_resources(mh, mrp);
3270 3311 bcopy(mrp, val, sizeof (*mrp));
3271 3312 kmem_free(mrp, sizeof (*mrp));
3272 3313 return (0);
3273 3314 }
3274 3315
3275 3316 case MAC_PROP_PVID:
3276 3317 ASSERT(valsize >= sizeof (uint16_t));
3277 3318 if (mip->mi_state_flags & MIS_IS_VNIC)
3278 3319 return (EINVAL);
3279 3320 *(uint16_t *)val = mac_get_pvid(mh);
3280 3321 return (0);
3281 3322
3282 3323 case MAC_PROP_LLIMIT:
3283 3324 case MAC_PROP_LDECAY:
3284 3325 ASSERT(valsize >= sizeof (uint32_t));
3285 3326 if (mip->mi_state_flags & MIS_IS_VNIC)
3286 3327 return (EINVAL);
3287 3328 if (id == MAC_PROP_LLIMIT)
3288 3329 bcopy(&mip->mi_llimit, val, sizeof (mip->mi_llimit));
3289 3330 else
3290 3331 bcopy(&mip->mi_ldecay, val, sizeof (mip->mi_ldecay));
3291 3332 return (0);
3292 3333
3293 3334 case MAC_PROP_MTU: {
3294 3335 uint32_t sdu;
3295 3336
3296 3337 ASSERT(valsize >= sizeof (uint32_t));
3297 3338 mac_sdu_get2(mh, NULL, &sdu, NULL);
3298 3339 bcopy(&sdu, val, sizeof (sdu));
3299 3340
3300 3341 return (0);
3301 3342 }
3302 3343 case MAC_PROP_STATUS: {
3303 3344 link_state_t link_state;
3304 3345
3305 3346 if (valsize < sizeof (link_state))
3306 3347 return (EINVAL);
3307 3348 link_state = mac_link_get(mh);
3308 3349 bcopy(&link_state, val, sizeof (link_state));
3309 3350
3310 3351 return (0);
3311 3352 }
3312 3353
3313 3354 case MAC_PROP_MAX_RX_RINGS_AVAIL:
3314 3355 case MAC_PROP_MAX_TX_RINGS_AVAIL:
3315 3356 ASSERT(valsize >= sizeof (uint_t));
3316 3357 rings = id == MAC_PROP_MAX_RX_RINGS_AVAIL ?
3317 3358 mac_rxavail_get(mh) : mac_txavail_get(mh);
3318 3359 bcopy(&rings, val, sizeof (uint_t));
3319 3360 return (0);
3320 3361
3321 3362 case MAC_PROP_MAX_RXHWCLNT_AVAIL:
3322 3363 case MAC_PROP_MAX_TXHWCLNT_AVAIL:
3323 3364 ASSERT(valsize >= sizeof (uint_t));
3324 3365 vlinks = id == MAC_PROP_MAX_RXHWCLNT_AVAIL ?
3325 3366 mac_rxhwlnksavail_get(mh) : mac_txhwlnksavail_get(mh);
3326 3367 bcopy(&vlinks, val, sizeof (uint_t));
3327 3368 return (0);
3328 3369
3329 3370 case MAC_PROP_RXRINGSRANGE:
3330 3371 case MAC_PROP_TXRINGSRANGE:
3331 3372 /*
3332 3373 		 * The values for these properties are returned through
3333 3374 * the MAC_PROP_RESOURCE property.
3334 3375 */
3335 3376 return (0);
3336 3377
3337 3378 default:
3338 3379 break;
3339 3380
3340 3381 }
3341 3382
3342 3383 /* If driver property, request from driver */
3343 3384 if (mip->mi_callbacks->mc_callbacks & MC_GETPROP) {
3344 3385 err = mip->mi_callbacks->mc_getprop(mip->mi_driver, name, id,
3345 3386 valsize, val);
3346 3387 }
3347 3388
3348 3389 return (err);
3349 3390 }
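
A matching sketch of the get side: a framework property such as MAC_PROP_STATUS is satisfied from MAC state and never reaches mc_getprop(), while an id that falls through the switch is forwarded to the driver. example_link_state() is hypothetical:

	static link_state_t
	example_link_state(mac_handle_t mh)
	{
		link_state_t ls = LINK_STATE_UNKNOWN;

		/* Handled entirely by the framework; no driver callback. */
		(void) mac_get_prop(mh, MAC_PROP_STATUS, "status", &ls,
		    sizeof (ls));
		return (ls);
	}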
3350 3391
3351 3392 /*
3352 3393 * Helper function to initialize the range structure for use in
3353 3394  * mac_get_prop. If a type other than uint32 is ever needed, it
3354 3395  * can be passed in as an argument.
3355 3396 */
3356 3397 static void
3357 3398 _mac_set_range(mac_propval_range_t *range, uint32_t min, uint32_t max)
3358 3399 {
3359 3400 range->mpr_count = 1;
3360 3401 range->mpr_type = MAC_PROPVAL_UINT32;
3361 3402 range->mpr_range_uint32[0].mpur_min = min;
3362 3403 range->mpr_range_uint32[0].mpur_max = max;
3363 3404 }
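
A worked example of what this helper leaves in the caller's structure (values arbitrary):

	mac_propval_range_t r;

	_mac_set_range(&r, 1, 7);
	ASSERT3U(r.mpr_count, ==, 1);
	ASSERT3S(r.mpr_type, ==, MAC_PROPVAL_UINT32);
	ASSERT3U(r.mpr_range_uint32[0].mpur_min, ==, 1);
	ASSERT3U(r.mpr_range_uint32[0].mpur_max, ==, 7);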
3364 3405
3365 3406 /*
3366 3407 * Returns information about the specified property, such as default
3367 3408 * values or permissions.
3368 3409 */
3369 3410 int
3370 3411 mac_prop_info(mac_handle_t mh, mac_prop_id_t id, char *name,
3371 3412 void *default_val, uint_t default_size, mac_propval_range_t *range,
3372 3413 uint_t *perm)
3373 3414 {
3374 3415 mac_prop_info_state_t state;
3375 3416 mac_impl_t *mip = (mac_impl_t *)mh;
3376 3417 uint_t max;
3377 3418
3378 3419 /*
3379 3420 * A property is read/write by default unless the driver says
3380 3421 * otherwise.
3381 3422 */
3382 3423 if (perm != NULL)
3383 3424 *perm = MAC_PROP_PERM_RW;
3384 3425
3385 3426 if (default_val != NULL)
3386 3427 bzero(default_val, default_size);
3387 3428
3388 3429 /*
3389 3430 * First, handle framework properties for which we don't need to
3390 3431 * involve the driver.
3391 3432 */
3392 3433 switch (id) {
3393 3434 case MAC_PROP_RESOURCE:
3394 3435 case MAC_PROP_PVID:
3395 3436 case MAC_PROP_LLIMIT:
3396 3437 case MAC_PROP_LDECAY:
3397 3438 return (0);
3398 3439
3399 3440 case MAC_PROP_MAX_RX_RINGS_AVAIL:
3400 3441 case MAC_PROP_MAX_TX_RINGS_AVAIL:
3401 3442 case MAC_PROP_MAX_RXHWCLNT_AVAIL:
3402 3443 case MAC_PROP_MAX_TXHWCLNT_AVAIL:
3403 3444 if (perm != NULL)
3404 3445 *perm = MAC_PROP_PERM_READ;
3405 3446 return (0);
3406 3447
3407 3448 case MAC_PROP_RXRINGSRANGE:
3408 3449 case MAC_PROP_TXRINGSRANGE:
3409 3450 /*
3410 3451 		 * Currently, we only support ranges for the RX and TX rings
3411 3452 		 * properties. When we extend this support to maxbw, cpus,
3412 3453 		 * and priority, we should move this to mac_get_resources.
3413 3454 * There is no default value for RX or TX rings.
3414 3455 */
3415 3456 if ((mip->mi_state_flags & MIS_IS_VNIC) &&
3416 3457 mac_is_vnic_primary(mh)) {
3417 3458 /*
3418 3459 * We don't support setting rings for a VLAN
3419 3460 * data link because it shares its ring with the
3420 3461 * primary MAC client.
3421 3462 */
3422 3463 if (perm != NULL)
3423 3464 *perm = MAC_PROP_PERM_READ;
3424 3465 if (range != NULL)
3425 3466 range->mpr_count = 0;
3426 3467 } else if (range != NULL) {
3427 3468 if (mip->mi_state_flags & MIS_IS_VNIC)
3428 3469 mh = mac_get_lower_mac_handle(mh);
3429 3470 mip = (mac_impl_t *)mh;
3430 3471 if ((id == MAC_PROP_RXRINGSRANGE &&
3431 3472 mip->mi_rx_group_type == MAC_GROUP_TYPE_STATIC) ||
3432 3473 (id == MAC_PROP_TXRINGSRANGE &&
3433 3474 mip->mi_tx_group_type == MAC_GROUP_TYPE_STATIC)) {
3434 3475 if (id == MAC_PROP_RXRINGSRANGE) {
3435 3476 if ((mac_rxhwlnksavail_get(mh) +
3436 3477 mac_rxhwlnksrsvd_get(mh)) <= 1) {
3437 3478 /*
3438 3479 * doesn't support groups or
3439 3480 * rings
3440 3481 */
3441 3482 range->mpr_count = 0;
3442 3483 } else {
3443 3484 /*
3444 3485 * supports specifying groups,
3445 3486 * but not rings
3446 3487 */
3447 3488 _mac_set_range(range, 0, 0);
3448 3489 }
3449 3490 } else {
3450 3491 if ((mac_txhwlnksavail_get(mh) +
3451 3492 mac_txhwlnksrsvd_get(mh)) <= 1) {
3452 3493 /*
3453 3494 * doesn't support groups or
3454 3495 * rings
3455 3496 */
3456 3497 range->mpr_count = 0;
3457 3498 } else {
3458 3499 /*
3459 3500 * supports specifying groups,
3460 3501 * but not rings
3461 3502 */
3462 3503 _mac_set_range(range, 0, 0);
3463 3504 }
3464 3505 }
3465 3506 } else {
3466 3507 max = id == MAC_PROP_RXRINGSRANGE ?
3467 3508 mac_rxavail_get(mh) + mac_rxrsvd_get(mh) :
3468 3509 mac_txavail_get(mh) + mac_txrsvd_get(mh);
3469 3510 if (max <= 1) {
3470 3511 /*
3471 3512 * doesn't support groups or
3472 3513 * rings
3473 3514 */
3474 3515 range->mpr_count = 0;
3475 3516 } else {
3476 3517 /*
3477 3518 * -1 because we have to leave out the
3478 3519 * default ring.
3479 3520 */
3480 3521 _mac_set_range(range, 1, max - 1);
3481 3522 }
3482 3523 }
3483 3524 }
3484 3525 return (0);
3485 3526
3486 3527 case MAC_PROP_STATUS:
3487 3528 if (perm != NULL)
3488 3529 *perm = MAC_PROP_PERM_READ;
3489 3530 return (0);
3490 3531 }
3491 3532
3492 3533 /*
3493 3534 * Get the property info from the driver if it implements the
3494 3535 * property info entry point.
3495 3536 */
3496 3537 bzero(&state, sizeof (state));
3497 3538
3498 3539 if (mip->mi_callbacks->mc_callbacks & MC_PROPINFO) {
3499 3540 state.pr_default = default_val;
3500 3541 state.pr_default_size = default_size;
3501 3542
3502 3543 /*
3503 3544 * The caller specifies the maximum number of ranges
3504 3545 		 * it can accommodate using mpr_count. We don't touch
3505 3546 		 * this value until the driver returns from its
3506 3547 		 * mc_propinfo() callback, and ensure we don't exceed
3507 3548 		 * this number of ranges as the driver defines the
3508 3549 		 * supported ranges from its mc_propinfo().
3509 3550 *
3510 3551 * pr_range_cur_count keeps track of how many ranges
3511 3552 * were defined by the driver from its mc_propinfo()
3512 3553 * entry point.
3513 3554 *
3514 3555 * On exit, the user-specified range mpr_count returns
3515 3556 * the number of ranges specified by the driver on
3516 3557 * success, or the number of ranges it wanted to
3517 3558 * define if that number of ranges could not be
3518 3559 		 * accommodated by the specified range structure. In
3519 3560 * the latter case, the caller will be able to
3520 3561 * allocate a larger range structure, and query the
3521 3562 * property again.
3522 3563 */
3523 3564 state.pr_range_cur_count = 0;
3524 3565 state.pr_range = range;
3525 3566
3526 3567 mip->mi_callbacks->mc_propinfo(mip->mi_driver, name, id,
3527 3568 (mac_prop_info_handle_t)&state);
3528 3569
3529 3570 if (state.pr_flags & MAC_PROP_INFO_RANGE)
3530 3571 range->mpr_count = state.pr_range_cur_count;
3531 3572
3532 3573 /*
3533 3574 * The operation could fail if the buffer supplied by
3534 3575 * the user was too small for the range or default
3535 3576 * value of the property.
3536 3577 */
3537 3578 if (state.pr_errno != 0)
3538 3579 return (state.pr_errno);
3539 3580
3540 3581 if (perm != NULL && state.pr_flags & MAC_PROP_INFO_PERM)
3541 3582 *perm = state.pr_perm;
3542 3583 }
3543 3584
3544 3585 /*
3545 3586 * The MAC layer may want to provide default values or allowed
3546 3587 * ranges for properties if the driver does not provide a
3547 3588 	 * property info entry point, or if that entry point exists but
3548 3589 	 * did not provide a default value or allowed ranges for
3549 3590 * that property.
3550 3591 */
3551 3592 switch (id) {
3552 3593 case MAC_PROP_MTU: {
3553 3594 uint32_t sdu;
3554 3595
3555 3596 mac_sdu_get2(mh, NULL, &sdu, NULL);
3556 3597
3557 3598 if (range != NULL && !(state.pr_flags &
3558 3599 MAC_PROP_INFO_RANGE)) {
3559 3600 /* MTU range */
3560 3601 _mac_set_range(range, sdu, sdu);
3561 3602 }
3562 3603
3563 3604 if (default_val != NULL && !(state.pr_flags &
3564 3605 MAC_PROP_INFO_DEFAULT)) {
3565 3606 if (mip->mi_info.mi_media == DL_ETHER)
3566 3607 sdu = ETHERMTU;
3567 3608 /* default MTU value */
3568 3609 bcopy(&sdu, default_val, sizeof (sdu));
3569 3610 }
3570 3611 }
3571 3612 }
3572 3613
3573 3614 return (0);
3574 3615 }
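
The mpr_count handshake described above implies a retry loop on the caller's side. A minimal sketch, assuming uint32 ranges and eliding all error handling other than the too-small-buffer case; example_get_ranges() is hypothetical:

	static mac_propval_range_t *
	example_get_ranges(mac_handle_t mh, mac_prop_id_t id, char *name,
	    size_t *szp)
	{
		uint_t count = 1;

		for (;;) {
			size_t sz = sizeof (mac_propval_range_t) +
			    (count - 1) * sizeof (mac_propval_uint32_range_t);
			mac_propval_range_t *range = kmem_zalloc(sz, KM_SLEEP);

			range->mpr_count = count;
			if (mac_prop_info(mh, id, name, NULL, 0, range,
			    NULL) != 0 && range->mpr_count > count) {
				/* Driver wanted more ranges; grow and retry. */
				count = range->mpr_count;
				kmem_free(range, sz);
				continue;
			}
			*szp = sz;
			return (range);
		}
	}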
3575 3616
3576 3617 int
3577 3618 mac_fastpath_disable(mac_handle_t mh)
3578 3619 {
3579 3620 mac_impl_t *mip = (mac_impl_t *)mh;
3580 3621
3581 3622 if ((mip->mi_state_flags & MIS_LEGACY) == 0)
3582 3623 return (0);
3583 3624
3584 3625 return (mip->mi_capab_legacy.ml_fastpath_disable(mip->mi_driver));
3585 3626 }
3586 3627
3587 3628 void
3588 3629 mac_fastpath_enable(mac_handle_t mh)
3589 3630 {
3590 3631 mac_impl_t *mip = (mac_impl_t *)mh;
3591 3632
3592 3633 if ((mip->mi_state_flags & MIS_LEGACY) == 0)
3593 3634 return;
3594 3635
3595 3636 mip->mi_capab_legacy.ml_fastpath_enable(mip->mi_driver);
3596 3637 }
3597 3638
3598 3639 void
3599 3640 mac_register_priv_prop(mac_impl_t *mip, char **priv_props)
3600 3641 {
3601 3642 uint_t nprops, i;
3602 3643
3603 3644 if (priv_props == NULL)
3604 3645 return;
3605 3646
3606 3647 nprops = 0;
3607 3648 while (priv_props[nprops] != NULL)
3608 3649 nprops++;
3609 3650 if (nprops == 0)
3610 3651 return;
3611 3652
3612 3653
3613 3654 mip->mi_priv_prop = kmem_zalloc(nprops * sizeof (char *), KM_SLEEP);
3614 3655
3615 3656 for (i = 0; i < nprops; i++) {
3616 3657 mip->mi_priv_prop[i] = kmem_zalloc(MAXLINKPROPNAME, KM_SLEEP);
3617 3658 (void) strlcpy(mip->mi_priv_prop[i], priv_props[i],
3618 3659 MAXLINKPROPNAME);
3619 3660 }
3620 3661
3621 3662 mip->mi_priv_prop_count = nprops;
3622 3663 }
3623 3664
3624 3665 void
3625 3666 mac_unregister_priv_prop(mac_impl_t *mip)
3626 3667 {
3627 3668 uint_t i;
3628 3669
3629 3670 if (mip->mi_priv_prop_count == 0) {
3630 3671 ASSERT(mip->mi_priv_prop == NULL);
3631 3672 return;
3632 3673 }
3633 3674
3634 3675 for (i = 0; i < mip->mi_priv_prop_count; i++)
3635 3676 kmem_free(mip->mi_priv_prop[i], MAXLINKPROPNAME);
3636 3677 kmem_free(mip->mi_priv_prop, mip->mi_priv_prop_count *
3637 3678 sizeof (char *));
3638 3679
3639 3680 mip->mi_priv_prop = NULL;
3640 3681 mip->mi_priv_prop_count = 0;
3641 3682 }
3642 3683
3643 3684 /*
3644 3685  * mac_ring_t 'mr' macros. Some rogue drivers may access the ring structure
3645 3686  * (by invoking mac_rx()) even after processing mac_stop_ring(). In such
3646 3687  * cases, if MAC frees the ring structure after mac_stop_ring(), any
3647 3688  * illegal access to the ring structure coming from the driver will panic
3648 3689  * the system. In order to protect the system from such inadvertent access,
3649 3690  * we maintain a cache of rings in the mac_impl_t after they get freed.
3650 3691  * When packets are received on freed rings, MAC (through the generation
3651 3692  * count mechanism) will drop such packets.
3652 3693 */
3653 3694 static mac_ring_t *
3654 3695 mac_ring_alloc(mac_impl_t *mip)
3655 3696 {
3656 3697 mac_ring_t *ring;
3657 3698
3658 3699 mutex_enter(&mip->mi_ring_lock);
3659 3700 if (mip->mi_ring_freelist != NULL) {
3660 3701 ring = mip->mi_ring_freelist;
3661 3702 mip->mi_ring_freelist = ring->mr_next;
3662 3703 bzero(ring, sizeof (mac_ring_t));
3663 3704 mutex_exit(&mip->mi_ring_lock);
3664 3705 } else {
3665 3706 mutex_exit(&mip->mi_ring_lock);
3666 3707 ring = kmem_cache_alloc(mac_ring_cache, KM_SLEEP);
3667 3708 }
3668 3709 ASSERT((ring != NULL) && (ring->mr_state == MR_FREE));
3669 3710 return (ring);
3670 3711 }
3671 3712
3672 3713 static void
3673 3714 mac_ring_free(mac_impl_t *mip, mac_ring_t *ring)
3674 3715 {
3675 3716 ASSERT(ring->mr_state == MR_FREE);
3676 3717
3677 3718 mutex_enter(&mip->mi_ring_lock);
3678 3719 ring->mr_state = MR_FREE;
3679 3720 ring->mr_flag = 0;
3680 3721 ring->mr_next = mip->mi_ring_freelist;
3681 3722 ring->mr_mip = NULL;
3682 3723 mip->mi_ring_freelist = ring;
3683 3724 mac_ring_stat_delete(ring);
3684 3725 mutex_exit(&mip->mi_ring_lock);
3685 3726 }
3686 3727
3687 3728 static void
3688 3729 mac_ring_freeall(mac_impl_t *mip)
3689 3730 {
3690 3731 mac_ring_t *ring_next;
3691 3732 mutex_enter(&mip->mi_ring_lock);
3692 3733 mac_ring_t *ring = mip->mi_ring_freelist;
3693 3734 while (ring != NULL) {
3694 3735 ring_next = ring->mr_next;
3695 3736 kmem_cache_free(mac_ring_cache, ring);
3696 3737 ring = ring_next;
3697 3738 }
3698 3739 mip->mi_ring_freelist = NULL;
3699 3740 mutex_exit(&mip->mi_ring_lock);
3700 3741 }
3701 3742
3702 3743 int
3703 3744 mac_start_ring(mac_ring_t *ring)
3704 3745 {
3705 3746 int rv = 0;
3706 3747
3707 3748 ASSERT(ring->mr_state == MR_FREE);
3708 3749
3709 3750 if (ring->mr_start != NULL) {
3710 3751 rv = ring->mr_start(ring->mr_driver, ring->mr_gen_num);
3711 3752 if (rv != 0)
3712 3753 return (rv);
3713 3754 }
3714 3755
3715 3756 ring->mr_state = MR_INUSE;
3716 3757 return (rv);
3717 3758 }
3718 3759
3719 3760 void
3720 3761 mac_stop_ring(mac_ring_t *ring)
3721 3762 {
3722 3763 ASSERT(ring->mr_state == MR_INUSE);
3723 3764
3724 3765 if (ring->mr_stop != NULL)
3725 3766 ring->mr_stop(ring->mr_driver);
3726 3767
3727 3768 ring->mr_state = MR_FREE;
3728 3769
3729 3770 /*
3730 3771 * Increment the ring generation number for this ring.
3731 3772 */
3732 3773 ring->mr_gen_num++;
3733 3774 }
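
The generation number bumped here backs the stale-packet protection described above mac_ring_alloc(): a driver must remember the gen_num handed to its ring-start entry point and echo it on every mac_rx_ring() call, so MAC can drop packets delivered against a ring that has since been stopped. A sketch with hypothetical my_* names:

	typedef struct my_rx_ring {
		mac_handle_t		r_mac_hdl;
		mac_ring_handle_t	r_ring_hdl;
		uint64_t		r_gen_num;
	} my_rx_ring_t;

	static int
	my_ring_start(mac_ring_driver_t arg, uint64_t gen_num)
	{
		my_rx_ring_t *rxr = (my_rx_ring_t *)arg;

		rxr->r_gen_num = gen_num;	/* current generation */
		return (0);
	}

	static void
	my_rx_deliver(my_rx_ring_t *rxr, mblk_t *mp)
	{
		/* A stale r_gen_num makes MAC drop mp instead of processing it. */
		mac_rx_ring(rxr->r_mac_hdl, rxr->r_ring_hdl, mp, rxr->r_gen_num);
	}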
3734 3775
3735 3776 int
3736 3777 mac_start_group(mac_group_t *group)
3737 3778 {
3738 3779 int rv = 0;
3739 3780
3740 3781 if (group->mrg_start != NULL)
3741 3782 rv = group->mrg_start(group->mrg_driver);
3742 3783
3743 3784 return (rv);
3744 3785 }
3745 3786
3746 3787 void
3747 3788 mac_stop_group(mac_group_t *group)
3748 3789 {
3749 3790 if (group->mrg_stop != NULL)
3750 3791 group->mrg_stop(group->mrg_driver);
3751 3792 }
3752 3793
3753 3794 /*
3754 3795 * Called from mac_start() on the default Rx group. Broadcast and multicast
3755 3796 * packets are received only on the default group. Hence the default group
3756 3797 * needs to be up even if the primary client is not up, for the other groups
3757 3798 * to be functional. We do this by calling this function at mac_start time
3758 3799 * itself. However the broadcast packets that are received can't make their
3759 3800  * itself. However, the broadcast packets that are received can't make their
3760 3801 */
3761 3802 static int
3762 3803 mac_start_group_and_rings(mac_group_t *group)
3763 3804 {
3764 3805 mac_ring_t *ring;
3765 3806 int rv = 0;
3766 3807
3767 3808 ASSERT(group->mrg_state == MAC_GROUP_STATE_REGISTERED);
3768 3809 if ((rv = mac_start_group(group)) != 0)
3769 3810 return (rv);
3770 3811
3771 3812 for (ring = group->mrg_rings; ring != NULL; ring = ring->mr_next) {
3772 3813 ASSERT(ring->mr_state == MR_FREE);
3773 3814 if ((rv = mac_start_ring(ring)) != 0)
3774 3815 goto error;
3775 3816 ring->mr_classify_type = MAC_SW_CLASSIFIER;
3776 3817 }
3777 3818 return (0);
3778 3819
3779 3820 error:
3780 3821 mac_stop_group_and_rings(group);
3781 3822 return (rv);
3782 3823 }
3783 3824
3784 3825 /* Called from mac_stop on the default Rx group */
3785 3826 static void
3786 3827 mac_stop_group_and_rings(mac_group_t *group)
3787 3828 {
3788 3829 mac_ring_t *ring;
3789 3830
3790 3831 for (ring = group->mrg_rings; ring != NULL; ring = ring->mr_next) {
3791 3832 if (ring->mr_state != MR_FREE) {
3792 3833 mac_stop_ring(ring);
3793 3834 ring->mr_flag = 0;
3794 3835 ring->mr_classify_type = MAC_NO_CLASSIFIER;
3795 3836 }
3796 3837 }
3797 3838 mac_stop_group(group);
3798 3839 }
3799 3840
3800 3841
3801 3842 static mac_ring_t *
3802 3843 mac_init_ring(mac_impl_t *mip, mac_group_t *group, int index,
3803 3844 mac_capab_rings_t *cap_rings)
3804 3845 {
3805 3846 mac_ring_t *ring, *rnext;
3806 3847 mac_ring_info_t ring_info;
3807 3848 ddi_intr_handle_t ddi_handle;
3808 3849
3809 3850 ring = mac_ring_alloc(mip);
3810 3851
3811 3852 /* Prepare basic information of ring */
3812 3853
3813 3854 /*
3814 3855 * Ring index is numbered to be unique across a particular device.
3815 3856 	 * Ring index computation makes the following assumptions:
3816 3857 * - For drivers with static grouping (e.g. ixgbe, bge),
3817 3858 * ring index exchanged with the driver (e.g. during mr_rget)
3818 3859 * is unique only across the group the ring belongs to.
3819 3860 * - Drivers with dynamic grouping (e.g. nxge), start
3820 3861 * with single group (mrg_index = 0).
3821 3862 */
3822 3863 ring->mr_index = group->mrg_index * group->mrg_info.mgi_count + index;
3823 3864 ring->mr_type = group->mrg_type;
3824 3865 ring->mr_gh = (mac_group_handle_t)group;
3825 3866
3826 3867 /* Insert the new ring to the list. */
3827 3868 ring->mr_next = group->mrg_rings;
3828 3869 group->mrg_rings = ring;
3829 3870
3830 3871 /* Zero to reuse the info data structure */
3831 3872 bzero(&ring_info, sizeof (ring_info));
3832 3873
3833 3874 /* Query ring information from driver */
3834 3875 cap_rings->mr_rget(mip->mi_driver, group->mrg_type, group->mrg_index,
3835 3876 index, &ring_info, (mac_ring_handle_t)ring);
3836 3877
3837 3878 ring->mr_info = ring_info;
3838 3879
3839 3880 /*
3840 3881 * The interrupt handle could be shared among multiple rings.
3841 3882 * Thus if there is a bunch of rings that are sharing an
3842 3883 * interrupt, then only one ring among the bunch will be made
3843 3884 * available for interrupt re-targeting; the rest will have
3844 3885 	 * the ddi_shared flag set to TRUE and will not be available
3845 3886 	 * for interrupt re-targeting.
3846 3887 */
3847 3888 if ((ddi_handle = ring_info.mri_intr.mi_ddi_handle) != NULL) {
3848 3889 rnext = ring->mr_next;
3849 3890 while (rnext != NULL) {
3850 3891 if (rnext->mr_info.mri_intr.mi_ddi_handle ==
3851 3892 ddi_handle) {
3852 3893 /*
3853 3894 * If default ring (mr_index == 0) is part
3854 3895 * of a group of rings sharing an
3855 3896 * interrupt, then set ddi_shared flag for
3856 3897 * the default ring and give another ring
3857 3898 * the chance to be re-targeted.
3858 3899 */
3859 3900 if (rnext->mr_index == 0 &&
3860 3901 !rnext->mr_info.mri_intr.mi_ddi_shared) {
3861 3902 rnext->mr_info.mri_intr.mi_ddi_shared =
3862 3903 B_TRUE;
3863 3904 } else {
3864 3905 ring->mr_info.mri_intr.mi_ddi_shared =
3865 3906 B_TRUE;
3866 3907 }
3867 3908 break;
3868 3909 }
3869 3910 rnext = rnext->mr_next;
3870 3911 }
3871 3912 /*
3872 3913 * If rnext is NULL, then no matching ddi_handle was found.
3873 3914 * Rx rings get registered first. So if this is a Tx ring,
3874 3915 * then go through all the Rx rings and see if there is a
3875 3916 * matching ddi handle.
3876 3917 */
3877 3918 if (rnext == NULL && ring->mr_type == MAC_RING_TYPE_TX) {
3878 3919 mac_compare_ddi_handle(mip->mi_rx_groups,
3879 3920 mip->mi_rx_group_count, ring);
3880 3921 }
3881 3922 }
3882 3923
3883 3924 /* Update ring's status */
3884 3925 ring->mr_state = MR_FREE;
3885 3926 ring->mr_flag = 0;
3886 3927
3887 3928 /* Update the ring count of the group */
3888 3929 group->mrg_cur_count++;
3889 3930
3890 3931 /* Create per ring kstats */
3891 3932 if (ring->mr_stat != NULL) {
3892 3933 ring->mr_mip = mip;
3893 3934 mac_ring_stat_create(ring);
3894 3935 }
3895 3936
3896 3937 return (ring);
3897 3938 }
3898 3939
3899 3940 /*
3900 3941 * Rings are chained together for easy regrouping.
3901 3942 */
3902 3943 static void
3903 3944 mac_init_group(mac_impl_t *mip, mac_group_t *group, int size,
3904 3945 mac_capab_rings_t *cap_rings)
3905 3946 {
3906 3947 int index;
3907 3948
3908 3949 /*
3909 3950 	 * Initialize all ring members of this group. A size of zero will not
3910 3951 	 * enter the loop, so it's safe to initialize an empty group.
3911 3952 */
3912 3953 for (index = size - 1; index >= 0; index--)
3913 3954 (void) mac_init_ring(mip, group, index, cap_rings);
3914 3955 }
3915 3956
3916 3957 int
3917 3958 mac_init_rings(mac_impl_t *mip, mac_ring_type_t rtype)
3918 3959 {
3919 3960 mac_capab_rings_t *cap_rings;
3920 3961 mac_group_t *group;
3921 3962 mac_group_t *groups;
3922 3963 mac_group_info_t group_info;
3923 3964 uint_t group_free = 0;
3924 3965 uint_t ring_left;
3925 3966 mac_ring_t *ring;
3926 3967 int g;
3927 3968 int err = 0;
3928 3969 uint_t grpcnt;
3929 3970 boolean_t pseudo_txgrp = B_FALSE;
3930 3971
3931 3972 switch (rtype) {
3932 3973 case MAC_RING_TYPE_RX:
3933 3974 ASSERT(mip->mi_rx_groups == NULL);
3934 3975
3935 3976 cap_rings = &mip->mi_rx_rings_cap;
3936 3977 cap_rings->mr_type = MAC_RING_TYPE_RX;
3937 3978 break;
3938 3979 case MAC_RING_TYPE_TX:
3939 3980 ASSERT(mip->mi_tx_groups == NULL);
3940 3981
3941 3982 cap_rings = &mip->mi_tx_rings_cap;
3942 3983 cap_rings->mr_type = MAC_RING_TYPE_TX;
3943 3984 break;
3944 3985 default:
3945 3986 ASSERT(B_FALSE);
3946 3987 }
3947 3988
3948 3989 if (!i_mac_capab_get((mac_handle_t)mip, MAC_CAPAB_RINGS, cap_rings))
3949 3990 return (0);
3950 3991 grpcnt = cap_rings->mr_gnum;
3951 3992
3952 3993 /*
3953 3994 * If we have multiple TX rings, but only one TX group, we can
3954 3995 * create pseudo TX groups (one per TX ring) in the MAC layer,
3955 3996 	 * except for an aggr. For an aggr we currently maintain only
3956 3997 	 * one group with all the rings (for all its ports); going
3957 3998 	 * forward we might change this.
3958 3999 */
3959 4000 if (rtype == MAC_RING_TYPE_TX &&
3960 4001 cap_rings->mr_gnum == 0 && cap_rings->mr_rnum > 0 &&
3961 4002 (mip->mi_state_flags & MIS_IS_AGGR) == 0) {
3962 4003 /*
3963 4004 * The -1 here is because we create a default TX group
3964 4005 * with all the rings in it.
3965 4006 */
3966 4007 grpcnt = cap_rings->mr_rnum - 1;
3967 4008 pseudo_txgrp = B_TRUE;
3968 4009 }
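
	/*
	 * Worked example: a non-aggr driver registering mr_rnum = 8 TX
	 * rings and mr_gnum = 0 groups yields grpcnt = 7 pseudo groups
	 * here. The pseudo groups start out empty; all eight rings are
	 * initially placed in the pool group built at the end of this
	 * function and are only moved into pseudo groups later, as
	 * clients claim them.
	 */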
3969 4010
3970 4011 /*
3971 4012 * Allocate a contiguous buffer for all groups.
3972 4013 */
3973 4014 	groups = kmem_zalloc(sizeof (mac_group_t) * (grpcnt + 1), KM_SLEEP);
3974 4015
3975 4016 ring_left = cap_rings->mr_rnum;
3976 4017
3977 4018 /*
3978 4019 * Get all ring groups if any, and get their ring members
3979 4020 * if any.
3980 4021 */
3981 4022 for (g = 0; g < grpcnt; g++) {
3982 4023 group = groups + g;
3983 4024
3984 4025 /* Prepare basic information of the group */
3985 4026 group->mrg_index = g;
3986 4027 group->mrg_type = rtype;
3987 4028 group->mrg_state = MAC_GROUP_STATE_UNINIT;
3988 4029 group->mrg_mh = (mac_handle_t)mip;
3989 4030 group->mrg_next = group + 1;
3990 4031
3991 4032 /* Zero to reuse the info data structure */
3992 4033 bzero(&group_info, sizeof (group_info));
3993 4034
3994 4035 if (pseudo_txgrp) {
3995 4036 /*
3996 4037 * This is a pseudo group that we created, apart
3997 4038 * from setting the state there is nothing to be
3998 4039 * done.
3999 4040 */
4000 4041 group->mrg_state = MAC_GROUP_STATE_REGISTERED;
4001 4042 group_free++;
4002 4043 continue;
4003 4044 }
4004 4045 /* Query group information from driver */
4005 4046 cap_rings->mr_gget(mip->mi_driver, rtype, g, &group_info,
4006 4047 (mac_group_handle_t)group);
4007 4048
4008 4049 switch (cap_rings->mr_group_type) {
4009 4050 case MAC_GROUP_TYPE_DYNAMIC:
4010 4051 if (cap_rings->mr_gaddring == NULL ||
4011 4052 cap_rings->mr_gremring == NULL) {
4012 4053 DTRACE_PROBE3(
4013 4054 mac__init__rings_no_addremring,
4014 4055 char *, mip->mi_name,
4015 4056 mac_group_add_ring_t,
4016 4057 cap_rings->mr_gaddring,
4017 4058 mac_group_add_ring_t,
4018 4059 cap_rings->mr_gremring);
4019 4060 err = EINVAL;
4020 4061 goto bail;
4021 4062 }
4022 4063
4023 4064 switch (rtype) {
4024 4065 case MAC_RING_TYPE_RX:
4025 4066 /*
4026 4067 * The first RX group must have non-zero
4027 4068 * rings, and the following groups must
4028 4069 * have zero rings.
4029 4070 */
4030 4071 if (g == 0 && group_info.mgi_count == 0) {
4031 4072 DTRACE_PROBE1(
4032 4073 mac__init__rings__rx__def__zero,
4033 4074 char *, mip->mi_name);
4034 4075 err = EINVAL;
4035 4076 goto bail;
4036 4077 }
4037 4078 if (g > 0 && group_info.mgi_count != 0) {
4038 4079 DTRACE_PROBE3(
4039 4080 mac__init__rings__rx__nonzero,
4040 4081 char *, mip->mi_name,
4041 4082 int, g, int, group_info.mgi_count);
4042 4083 err = EINVAL;
4043 4084 goto bail;
4044 4085 }
4045 4086 break;
4046 4087 case MAC_RING_TYPE_TX:
4047 4088 /*
4048 4089 * All TX ring groups must have zero rings.
4049 4090 */
4050 4091 if (group_info.mgi_count != 0) {
4051 4092 DTRACE_PROBE3(
4052 4093 mac__init__rings__tx__nonzero,
4053 4094 char *, mip->mi_name,
4054 4095 int, g, int, group_info.mgi_count);
4055 4096 err = EINVAL;
4056 4097 goto bail;
4057 4098 }
4058 4099 break;
4059 4100 }
4060 4101 break;
4061 4102 case MAC_GROUP_TYPE_STATIC:
4062 4103 /*
4063 4104 * Note that an empty group is allowed, e.g., an aggr
4064 4105 * would start with an empty group.
4065 4106 */
4066 4107 break;
4067 4108 default:
4068 4109 /* unknown group type */
4069 4110 DTRACE_PROBE2(mac__init__rings__unknown__type,
4070 4111 char *, mip->mi_name,
4071 4112 int, cap_rings->mr_group_type);
4072 4113 err = EINVAL;
4073 4114 goto bail;
4074 4115 }
4075 4116
4076 4117
4077 4118 /*
4078 - * Driver must register group->mgi_addmac/remmac() for rx groups
4079 - * to support multiple MAC addresses.
4119 + * The driver must register some form of hardware MAC
4120 + * filter in order for Rx groups to support multiple
4121 + * MAC addresses.
4080 4122 */
4081 4123 if (rtype == MAC_RING_TYPE_RX &&
4082 - ((group_info.mgi_addmac == NULL) ||
4083 - (group_info.mgi_remmac == NULL))) {
4124 + (group_info.mgi_addmac == NULL ||
4125 + group_info.mgi_remmac == NULL)) {
4126 + DTRACE_PROBE1(mac__init__rings__no__mac__filter,
4127 + char *, mip->mi_name);
4084 4128 err = EINVAL;
4085 4129 goto bail;
4086 4130 }
4087 4131
4088 4132 /* Cache driver-supplied information */
4089 4133 group->mrg_info = group_info;
4090 4134
4091 4135 /* Update the group's status and group count. */
4092 4136 mac_set_group_state(group, MAC_GROUP_STATE_REGISTERED);
4093 4137 group_free++;
4094 4138
4095 4139 group->mrg_rings = NULL;
4096 4140 group->mrg_cur_count = 0;
4097 4141 mac_init_group(mip, group, group_info.mgi_count, cap_rings);
4098 4142 ring_left -= group_info.mgi_count;
4099 4143
4100 4144 /* The current group size should be equal to default value */
4101 4145 ASSERT(group->mrg_cur_count == group_info.mgi_count);
4102 4146 }
4103 4147
4104 4148 /* Build up a dummy group for free resources as a pool */
4105 4149 group = groups + grpcnt;
4106 4150
4107 4151 /* Prepare basic information of the group */
4108 4152 group->mrg_index = -1;
4109 4153 group->mrg_type = rtype;
4110 4154 group->mrg_state = MAC_GROUP_STATE_UNINIT;
4111 4155 group->mrg_mh = (mac_handle_t)mip;
4112 4156 group->mrg_next = NULL;
4113 4157
4114 4158 /*
4115 4159 	 * If there are ungrouped rings, allocate a contiguous buffer for
4116 4160 * remaining resources.
4117 4161 */
4118 4162 if (ring_left != 0) {
4119 4163 group->mrg_rings = NULL;
4120 4164 group->mrg_cur_count = 0;
4121 4165 mac_init_group(mip, group, ring_left, cap_rings);
4122 4166
4123 4167 /* The current group size should be equal to ring_left */
4124 4168 ASSERT(group->mrg_cur_count == ring_left);
4125 4169
4126 4170 ring_left = 0;
4127 4171
4128 4172 /* Update this group's status */
4129 4173 mac_set_group_state(group, MAC_GROUP_STATE_REGISTERED);
4130 - } else
4174 + } else {
4131 4175 group->mrg_rings = NULL;
4176 + }
4132 4177
4133 4178 ASSERT(ring_left == 0);
4134 4179
4135 4180 bail:
4136 4181
4137 4182 /* Cache other important information to finalize the initialization */
4138 4183 switch (rtype) {
4139 4184 case MAC_RING_TYPE_RX:
4140 4185 mip->mi_rx_group_type = cap_rings->mr_group_type;
4141 4186 mip->mi_rx_group_count = cap_rings->mr_gnum;
4142 4187 mip->mi_rx_groups = groups;
4143 4188 mip->mi_rx_donor_grp = groups;
4144 4189 if (mip->mi_rx_group_type == MAC_GROUP_TYPE_DYNAMIC) {
4145 4190 /*
4146 4191 * The default ring is reserved since it is
4147 4192 * used for sending the broadcast etc. packets.
4148 4193 			 * used for receiving broadcast and multicast packets.
4149 4194 mip->mi_rxrings_avail =
4150 4195 mip->mi_rx_groups->mrg_cur_count - 1;
4151 4196 mip->mi_rxrings_rsvd = 1;
4152 4197 }
4153 4198 /*
4154 4199 * The default group cannot be reserved. It is used by
4155 4200 * all the clients that do not have an exclusive group.
4156 4201 */
4157 4202 mip->mi_rxhwclnt_avail = mip->mi_rx_group_count - 1;
4158 4203 mip->mi_rxhwclnt_used = 1;
4159 4204 break;
4160 4205 case MAC_RING_TYPE_TX:
4161 4206 mip->mi_tx_group_type = pseudo_txgrp ? MAC_GROUP_TYPE_DYNAMIC :
4162 4207 cap_rings->mr_group_type;
4163 4208 mip->mi_tx_group_count = grpcnt;
4164 4209 mip->mi_tx_group_free = group_free;
4165 4210 mip->mi_tx_groups = groups;
4166 4211
4167 4212 group = groups + grpcnt;
4168 4213 ring = group->mrg_rings;
4169 4214 /*
4170 4215 * The ring can be NULL in the case of aggr. Aggr will
4171 4216 * have an empty Tx group which will get populated
4172 4217 * later when pseudo Tx rings are added after
4173 4218 * mac_register() is done.
4174 4219 */
4175 4220 if (ring == NULL) {
4176 4221 ASSERT(mip->mi_state_flags & MIS_IS_AGGR);
4177 4222 /*
4178 4223 * pass the group to aggr so it can add Tx
4179 4224 * rings to the group later.
4180 4225 */
4181 4226 cap_rings->mr_gget(mip->mi_driver, rtype, 0, NULL,
4182 4227 (mac_group_handle_t)group);
4183 4228 /*
4184 4229 * Even though there are no rings at this time
4185 4230 * (rings will come later), set the group
4186 4231 * state to registered.
4187 4232 */
4188 4233 group->mrg_state = MAC_GROUP_STATE_REGISTERED;
4189 4234 } else {
4190 4235 /*
4191 4236 * Ring 0 is used as the default one and it could be
4192 4237 * assigned to a client as well.
4193 4238 */
4194 4239 while ((ring->mr_index != 0) && (ring->mr_next != NULL))
4195 4240 ring = ring->mr_next;
4196 4241 ASSERT(ring->mr_index == 0);
4197 4242 mip->mi_default_tx_ring = (mac_ring_handle_t)ring;
4198 4243 }
4199 4244 if (mip->mi_tx_group_type == MAC_GROUP_TYPE_DYNAMIC) {
4200 4245 mip->mi_txrings_avail = group->mrg_cur_count - 1;
4201 4246 /*
4202 4247 * The default ring cannot be reserved.
4203 4248 */
4204 4249 mip->mi_txrings_rsvd = 1;
4205 4250 }
4206 4251 /*
4207 4252 * The default group cannot be reserved. It will be shared
4208 4253 * by clients that do not have an exclusive group.
4209 4254 */
4210 4255 mip->mi_txhwclnt_avail = mip->mi_tx_group_count;
4211 4256 mip->mi_txhwclnt_used = 1;
4212 4257 break;
4213 4258 default:
4214 4259 ASSERT(B_FALSE);
4215 4260 }
4216 4261
4217 4262 if (err != 0)
4218 4263 mac_free_rings(mip, rtype);
4219 4264
4220 4265 return (err);
4221 4266 }
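
For driver authors, a sketch of the mr_gget callback that satisfies the Rx filter requirement enforced above, including the new optional VLAN filter entry points. All my_* names are hypothetical and the driver-private types are elided:

	static void
	my_fill_group(void *arg, mac_ring_type_t rtype, const int index,
	    mac_group_info_t *infop, mac_group_handle_t gh)
	{
		my_softc_t *sc = arg;
		my_group_t *grp = &sc->sc_rx_groups[index];

		ASSERT3S(rtype, ==, MAC_RING_TYPE_RX);
		grp->g_hdl = gh;

		infop->mgi_driver = (mac_group_driver_t)grp;
		infop->mgi_count = sc->sc_rings_per_group;
		/* Required for Rx groups by the check in mac_init_rings(). */
		infop->mgi_addmac = my_group_addmac;
		infop->mgi_remmac = my_group_remmac;
		/* Optional HW VLAN filters, used by mac_group_addvlan() below. */
		infop->mgi_addvlan = my_group_addvlan;
		infop->mgi_remvlan = my_group_remvlan;
	}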
4222 4267
4223 4268 /*
4224 4269  * The ddi interrupt handle could be shared among rings. If so, compare
4225 4270 * the new ring's ddi handle with the existing ones and set ddi_shared
4226 4271 * flag.
4227 4272 */
4228 4273 void
4229 4274 mac_compare_ddi_handle(mac_group_t *groups, uint_t grpcnt, mac_ring_t *cring)
4230 4275 {
4231 4276 mac_group_t *group;
4232 4277 mac_ring_t *ring;
4233 4278 ddi_intr_handle_t ddi_handle;
4234 4279 int g;
4235 4280
4236 4281 ddi_handle = cring->mr_info.mri_intr.mi_ddi_handle;
4237 4282 for (g = 0; g < grpcnt; g++) {
4238 4283 group = groups + g;
4239 4284 for (ring = group->mrg_rings; ring != NULL;
4240 4285 ring = ring->mr_next) {
4241 4286 if (ring == cring)
4242 4287 continue;
4243 4288 if (ring->mr_info.mri_intr.mi_ddi_handle ==
4244 4289 ddi_handle) {
4245 4290 if (cring->mr_type == MAC_RING_TYPE_RX &&
4246 4291 ring->mr_index == 0 &&
4247 4292 !ring->mr_info.mri_intr.mi_ddi_shared) {
4248 4293 ring->mr_info.mri_intr.mi_ddi_shared =
4249 4294 B_TRUE;
4250 4295 } else {
4251 4296 cring->mr_info.mri_intr.mi_ddi_shared =
4252 4297 B_TRUE;
4253 4298 }
4254 4299 return;
4255 4300 }
4256 4301 }
4257 4302 }
4258 4303 }
4259 4304
4260 4305 /*
4261 4306 * Called to free all groups of particular type (RX or TX). It's assumed that
4262 4307 * no clients are using these groups.
4263 4308 */
4264 4309 void
4265 4310 mac_free_rings(mac_impl_t *mip, mac_ring_type_t rtype)
4266 4311 {
4267 4312 mac_group_t *group, *groups;
4268 4313 uint_t group_count;
4269 4314
4270 4315 switch (rtype) {
4271 4316 case MAC_RING_TYPE_RX:
4272 4317 if (mip->mi_rx_groups == NULL)
4273 4318 return;
4274 4319
4275 4320 groups = mip->mi_rx_groups;
4276 4321 group_count = mip->mi_rx_group_count;
4277 4322
4278 4323 mip->mi_rx_groups = NULL;
4279 4324 mip->mi_rx_donor_grp = NULL;
4280 4325 mip->mi_rx_group_count = 0;
4281 4326 break;
4282 4327 case MAC_RING_TYPE_TX:
4283 4328 ASSERT(mip->mi_tx_group_count == mip->mi_tx_group_free);
4284 4329
4285 4330 if (mip->mi_tx_groups == NULL)
4286 4331 return;
4287 4332
4288 4333 groups = mip->mi_tx_groups;
4289 4334 group_count = mip->mi_tx_group_count;
4290 4335
4291 4336 mip->mi_tx_groups = NULL;
4292 4337 mip->mi_tx_group_count = 0;
4293 4338 mip->mi_tx_group_free = 0;
4294 4339 mip->mi_default_tx_ring = NULL;
4295 4340 break;
4296 4341 default:
4297 4342 ASSERT(B_FALSE);
4298 4343 }
4299 4344
4300 4345 for (group = groups; group != NULL; group = group->mrg_next) {
4301 4346 mac_ring_t *ring;
4302 4347
4303 4348 if (group->mrg_cur_count == 0)
4304 4349 continue;
4305 4350
4306 4351 ASSERT(group->mrg_rings != NULL);
4307 4352
4308 4353 while ((ring = group->mrg_rings) != NULL) {
4309 4354 group->mrg_rings = ring->mr_next;
4310 4355 mac_ring_free(mip, ring);
4311 4356 }
4312 4357 }
4313 4358
4314 4359 /* Free all the cached rings */
4315 4360 mac_ring_freeall(mip);
4316 4361 	/* Free the block of group data structures */
4317 4362 kmem_free(groups, sizeof (mac_group_t) * (group_count + 1));
4318 4363 }
4319 4364
4320 4365 /*
4366 + * Associate a VLAN filter with the receive group.
4367 + */
4368 +int
4369 +mac_group_addvlan(mac_group_t *group, uint16_t vlan)
4370 +{
4371 + VERIFY3S(group->mrg_type, ==, MAC_RING_TYPE_RX);
4372 + VERIFY3P(group->mrg_info.mgi_addvlan, !=, NULL);
4373 +
4374 + if (vlan > VLAN_ID_MAX)
4375 + return (EINVAL);
4376 +
4377 + vlan = MAC_VLAN_UNTAGGED_VID(vlan);
4378 + return (group->mrg_info.mgi_addvlan(group->mrg_info.mgi_driver, vlan));
4379 +}
4380 +
4381 +/*
4382 + * Dissociate the VLAN filter from the receive group.
4383 + */
4384 +int
4385 +mac_group_remvlan(mac_group_t *group, uint16_t vlan)
4386 +{
4387 + VERIFY3S(group->mrg_type, ==, MAC_RING_TYPE_RX);
4388 + VERIFY3P(group->mrg_info.mgi_remvlan, !=, NULL);
4389 +
4390 + if (vlan > VLAN_ID_MAX)
4391 + return (EINVAL);
4392 +
4393 + vlan = MAC_VLAN_UNTAGGED_VID(vlan);
4394 + return (group->mrg_info.mgi_remvlan(group->mrg_info.mgi_driver, vlan));
4395 +}
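
A sketch of the driver half of this new interface (my_* names hypothetical; the companion ixgbe change supplies a real implementation against the VLVF). The VID arriving here has already been range-checked and normalized by MAC_VLAN_UNTAGGED_VID() above:

	static int
	my_group_addvlan(mac_group_driver_t gdriver, uint16_t vid)
	{
		my_group_t *grp = (my_group_t *)gdriver;
		my_softc_t *sc = grp->g_softc;
		int err;

		mutex_enter(&sc->sc_lock);
		/* Program this group's HW VLAN filter entry (hypothetical). */
		err = my_hw_vlan_filter_set(sc, grp->g_index, vid, B_TRUE);
		mutex_exit(&sc->sc_lock);

		return (err);
	}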
4396 +
4397 +/*
4321 4398 * Associate a MAC address with a receive group.
4322 4399 *
4323 4400 * The return value of this function should always be checked properly, because
4324 4401  * any type of failure could cause unexpected results. A MAC address can be
4325 4402  * added to or removed from a group only after the group has been reserved.
4326 4403  * Ideally, a successful reservation always leads to calling mac_group_addmac()
4327 4404  * to steer desired traffic. Failure to add a unicast MAC address doesn't
4328 4405 * always imply that the group is functioning abnormally.
4329 4406 *
4330 4407 * Currently this function is called everywhere, and it reflects assumptions
4331 4408 * about MAC addresses in the implementation. CR 6735196.
4332 4409 */
4333 4410 int
4334 4411 mac_group_addmac(mac_group_t *group, const uint8_t *addr)
4335 4412 {
4336 - ASSERT(group->mrg_type == MAC_RING_TYPE_RX);
4337 - ASSERT(group->mrg_info.mgi_addmac != NULL);
4413 + VERIFY3S(group->mrg_type, ==, MAC_RING_TYPE_RX);
4414 + VERIFY3P(group->mrg_info.mgi_addmac, !=, NULL);
4338 4415
4339 4416 return (group->mrg_info.mgi_addmac(group->mrg_info.mgi_driver, addr));
4340 4417 }
4341 4418
4342 4419 /*
4343 4420 * Remove the association between MAC address and receive group.
4344 4421 */
4345 4422 int
4346 4423 mac_group_remmac(mac_group_t *group, const uint8_t *addr)
4347 4424 {
4348 - ASSERT(group->mrg_type == MAC_RING_TYPE_RX);
4349 - ASSERT(group->mrg_info.mgi_remmac != NULL);
4425 + VERIFY3S(group->mrg_type, ==, MAC_RING_TYPE_RX);
4426 + VERIFY3P(group->mrg_info.mgi_remmac, !=, NULL);
4350 4427
4351 4428 return (group->mrg_info.mgi_remmac(group->mrg_info.mgi_driver, addr));
4352 4429 }
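
As the block comment above insists, callers must treat a non-zero return as a real failure and unwind. A minimal sketch of that pattern; example_unwind_group() is hypothetical:

	if ((err = mac_group_addmac(group, mac_addr)) != 0) {
		/* The reservation may stand; only the filter add failed. */
		example_unwind_group(group);
		return (err);
	}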
4353 4430
4354 4431 /*
4355 4432 * This is the entry point for packets transmitted through the bridging code.
4356 4433  * If no bridge is in place, MAC_RING_TX transmits using the tx ring. The 'rh'
4357 4434 * pointer may be NULL to select the default ring.
4358 4435 */
4359 4436 mblk_t *
4360 4437 mac_bridge_tx(mac_impl_t *mip, mac_ring_handle_t rh, mblk_t *mp)
4361 4438 {
4362 4439 mac_handle_t mh;
4363 4440
4364 4441 /*
4365 4442 * Once we take a reference on the bridge link, the bridge
4366 4443 * module itself can't unload, so the callback pointers are
4367 4444 * stable.
4368 4445 */
4369 4446 mutex_enter(&mip->mi_bridge_lock);
4370 4447 if ((mh = mip->mi_bridge_link) != NULL)
4371 4448 mac_bridge_ref_cb(mh, B_TRUE);
4372 4449 mutex_exit(&mip->mi_bridge_lock);
4373 4450 if (mh == NULL) {
4374 4451 MAC_RING_TX(mip, rh, mp, mp);
4375 4452 } else {
4376 4453 mp = mac_bridge_tx_cb(mh, rh, mp);
4377 4454 mac_bridge_ref_cb(mh, B_FALSE);
4378 4455 }
4379 4456
4380 4457 return (mp);
4381 4458 }
4382 4459
4383 4460 /*
4384 4461 * Find a ring from its index.
4385 4462 */
4386 4463 mac_ring_handle_t
4387 4464 mac_find_ring(mac_group_handle_t gh, int index)
4388 4465 {
4389 4466 mac_group_t *group = (mac_group_t *)gh;
4390 4467 	mac_ring_t *ring;
4391 4468
4392 4469 for (ring = group->mrg_rings; ring != NULL; ring = ring->mr_next)
4393 4470 if (ring->mr_index == index)
4394 4471 break;
4395 4472
4396 4473 return ((mac_ring_handle_t)ring);
4397 4474 }
4398 4475 /*
4399 4476 * Add a ring to an existing group.
4400 4477 *
4401 4478 * The ring must be either passed directly (for example if the ring
4402 4479 * movement is initiated by the framework), or specified through a driver
4403 4480  * index (for example when the ring is added by the driver).
4404 4481 *
4405 4482 * The caller needs to call mac_perim_enter() before calling this function.
4406 4483 */
4407 4484 int
4408 4485 i_mac_group_add_ring(mac_group_t *group, mac_ring_t *ring, int index)
4409 4486 {
4410 4487 mac_impl_t *mip = (mac_impl_t *)group->mrg_mh;
4411 4488 mac_capab_rings_t *cap_rings;
4412 4489 boolean_t driver_call = (ring == NULL);
4413 4490 mac_group_type_t group_type;
4414 4491 int ret = 0;
4415 4492 flow_entry_t *flent;
4416 4493
4417 4494 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
4418 4495
4419 4496 switch (group->mrg_type) {
4420 4497 case MAC_RING_TYPE_RX:
4421 4498 cap_rings = &mip->mi_rx_rings_cap;
4422 4499 group_type = mip->mi_rx_group_type;
4423 4500 break;
4424 4501 case MAC_RING_TYPE_TX:
4425 4502 cap_rings = &mip->mi_tx_rings_cap;
4426 4503 group_type = mip->mi_tx_group_type;
4427 4504 break;
4428 4505 default:
4429 4506 ASSERT(B_FALSE);
4430 4507 }
4431 4508
4432 4509 /*
4433 4510 * There should be no ring with the same ring index in the target
4434 4511 * group.
4435 4512 */
4436 4513 ASSERT(mac_find_ring((mac_group_handle_t)group,
4437 4514 driver_call ? index : ring->mr_index) == NULL);
4438 4515
4439 4516 if (driver_call) {
4440 4517 /*
4441 4518 * The function is called as a result of a request from
4442 4519 * a driver to add a ring to an existing group, for example
4443 4520 * from the aggregation driver. Allocate a new mac_ring_t
4444 4521 * for that ring.
4445 4522 */
4446 4523 ring = mac_init_ring(mip, group, index, cap_rings);
4447 4524 ASSERT(group->mrg_state > MAC_GROUP_STATE_UNINIT);
4448 4525 } else {
4449 4526 /*
4450 4527 * The function is called as a result of a MAC layer request
4451 4528 * to add a ring to an existing group. In this case the
4452 4529 * ring is being moved between groups, which requires
4453 4530 * the underlying driver to support dynamic grouping,
4454 4531 * and the mac_ring_t already exists.
4455 4532 */
4456 4533 ASSERT(group_type == MAC_GROUP_TYPE_DYNAMIC);
4457 4534 ASSERT(group->mrg_driver == NULL ||
4458 4535 cap_rings->mr_gaddring != NULL);
4459 4536 ASSERT(ring->mr_gh == NULL);
4460 4537 }
4461 4538
4462 4539 /*
4463 4540 * At this point the ring should not be in use, and it should be
4464 4541 	 * of the right type for the target group.
4465 4542 */
4466 4543 ASSERT(ring->mr_state < MR_INUSE);
4467 4544 ASSERT(ring->mr_srs == NULL);
4468 4545 ASSERT(ring->mr_type == group->mrg_type);
4469 4546
4470 4547 if (!driver_call) {
4471 4548 /*
4472 4549 		 * Add the driver-level hardware ring if the process was not
4473 4550 		 * initiated by the driver, and the target group is backed
4474 4551 		 * by the driver (i.e. it is not the dummy group).
4475 4552 */
4476 4553 if (group->mrg_driver != NULL) {
4477 4554 cap_rings->mr_gaddring(group->mrg_driver,
4478 4555 ring->mr_driver, ring->mr_type);
4479 4556 }
4480 4557
4481 4558 /*
4482 4559 		 * Insert the ring ahead of the existing rings.
4483 4560 */
4484 4561 ring->mr_next = group->mrg_rings;
4485 4562 group->mrg_rings = ring;
4486 4563 ring->mr_gh = (mac_group_handle_t)group;
4487 4564 group->mrg_cur_count++;
4488 4565 }
4489 4566
4490 4567 /*
4491 4568 * If the group has not been actively used, we're done.
4492 4569 */
4493 4570 if (group->mrg_index != -1 &&
4494 4571 group->mrg_state < MAC_GROUP_STATE_RESERVED)
4495 4572 return (0);
4496 4573
4497 4574 /*
4498 4575 	 * Start the ring if needed. On failure, undo the grouping action.
4499 4576 */
4500 4577 if (ring->mr_state != MR_INUSE) {
4501 4578 if ((ret = mac_start_ring(ring)) != 0) {
4502 4579 if (!driver_call) {
4503 4580 cap_rings->mr_gremring(group->mrg_driver,
4504 4581 ring->mr_driver, ring->mr_type);
4505 4582 }
4506 4583 group->mrg_cur_count--;
4507 4584 group->mrg_rings = ring->mr_next;
4508 4585
4509 4586 ring->mr_gh = NULL;
4510 4587
4511 4588 if (driver_call)
4512 4589 mac_ring_free(mip, ring);
4513 4590
4514 4591 return (ret);
4515 4592 }
4516 4593 }
4517 4594
4518 4595 /*
4519 4596 * Set up SRS/SR according to the ring type.
4520 4597 */
4521 4598 switch (ring->mr_type) {
4522 4599 case MAC_RING_TYPE_RX:
4523 4600 /*
4524 - * Setup SRS on top of the new ring if the group is
4525 - * reserved for someones exclusive use.
4601 + * Setup an SRS on top of the new ring if the group is
4602 + * reserved for someone's exclusive use.
4526 4603 */
4527 4604 if (group->mrg_state == MAC_GROUP_STATE_RESERVED) {
4528 - mac_client_impl_t *mcip;
4605 + mac_client_impl_t *mcip = MAC_GROUP_ONLY_CLIENT(group);
4529 4606
4530 - mcip = MAC_GROUP_ONLY_CLIENT(group);
4531 - /*
4532 - * Even though this group is reserved we migth still
4533 - * have multiple clients, i.e a VLAN shares the
4534 - * group with the primary mac client.
4535 - */
4536 - if (mcip != NULL) {
4537 - flent = mcip->mci_flent;
4538 - ASSERT(flent->fe_rx_srs_cnt > 0);
4539 - mac_rx_srs_group_setup(mcip, flent, SRST_LINK);
4540 - mac_fanout_setup(mcip, flent,
4541 - MCIP_RESOURCE_PROPS(mcip), mac_rx_deliver,
4542 - mcip, NULL, NULL);
4543 - } else {
4544 - ring->mr_classify_type = MAC_SW_CLASSIFIER;
4545 - }
4607 + VERIFY3P(mcip, !=, NULL);
4608 + flent = mcip->mci_flent;
4609 + VERIFY3S(flent->fe_rx_srs_cnt, >, 0);
4610 + mac_rx_srs_group_setup(mcip, flent, SRST_LINK);
4611 + mac_fanout_setup(mcip, flent, MCIP_RESOURCE_PROPS(mcip),
4612 + mac_rx_deliver, mcip, NULL, NULL);
4613 + } else {
4614 + ring->mr_classify_type = MAC_SW_CLASSIFIER;
4546 4615 }
4547 4616 break;
4548 4617 case MAC_RING_TYPE_TX:
4549 4618 {
4550 4619 mac_grp_client_t *mgcp = group->mrg_clients;
4551 4620 mac_client_impl_t *mcip;
4552 4621 mac_soft_ring_set_t *mac_srs;
4553 4622 mac_srs_tx_t *tx;
4554 4623
4555 4624 if (MAC_GROUP_NO_CLIENT(group)) {
4556 4625 if (ring->mr_state == MR_INUSE)
4557 4626 mac_stop_ring(ring);
4558 4627 ring->mr_flag = 0;
4559 4628 break;
4560 4629 }
4561 4630 /*
4562 4631 * If the rings are being moved to a group that has
4563 4632 * clients using it, then add the new rings to the
4564 4633 		 * clients' SRS.
4565 4634 */
4566 4635 while (mgcp != NULL) {
4567 4636 boolean_t is_aggr;
4568 4637
4569 4638 mcip = mgcp->mgc_client;
4570 4639 flent = mcip->mci_flent;
4571 - is_aggr = (mcip->mci_state_flags & MCIS_IS_AGGR);
4640 + is_aggr = (mcip->mci_state_flags & MCIS_IS_AGGR_CLIENT);
4572 4641 mac_srs = MCIP_TX_SRS(mcip);
4573 4642 tx = &mac_srs->srs_tx;
4574 4643 mac_tx_client_quiesce((mac_client_handle_t)mcip);
4575 4644 /*
4576 4645 * If we are growing from 1 to multiple rings.
4577 4646 */
4578 4647 if (tx->st_mode == SRS_TX_BW ||
4579 4648 tx->st_mode == SRS_TX_SERIALIZE ||
4580 4649 tx->st_mode == SRS_TX_DEFAULT) {
4581 4650 mac_ring_t *tx_ring = tx->st_arg2;
4582 4651
4583 4652 tx->st_arg2 = NULL;
4584 4653 mac_tx_srs_stat_recreate(mac_srs, B_TRUE);
4585 4654 mac_tx_srs_add_ring(mac_srs, tx_ring);
4586 4655 if (mac_srs->srs_type & SRST_BW_CONTROL) {
4587 4656 tx->st_mode = is_aggr ? SRS_TX_BW_AGGR :
4588 4657 SRS_TX_BW_FANOUT;
4589 4658 } else {
4590 4659 tx->st_mode = is_aggr ? SRS_TX_AGGR :
4591 4660 SRS_TX_FANOUT;
4592 4661 }
4593 4662 tx->st_func = mac_tx_get_func(tx->st_mode);
4594 4663 }
4595 4664 mac_tx_srs_add_ring(mac_srs, ring);
4596 4665 mac_fanout_setup(mcip, flent, MCIP_RESOURCE_PROPS(mcip),
4597 4666 mac_rx_deliver, mcip, NULL, NULL);
4598 4667 mac_tx_client_restart((mac_client_handle_t)mcip);
4599 4668 mgcp = mgcp->mgc_next;
4600 4669 }
4601 4670 break;
4602 4671 }
4603 4672 default:
4604 4673 ASSERT(B_FALSE);
4605 4674 }
4606 4675 /*
4607 4676 * For aggr, the default ring will be NULL to begin with. If it
4608 4677 * is NULL, then pick the first ring that gets added as the
4609 4678 * default ring. Any ring in an aggregation can be removed at
4610 4679 * any time (by the user action of removing a link) and if the
4611 4680 * current default ring gets removed, then a new one gets
4612 4681 * picked (see i_mac_group_rem_ring()).
4613 4682 */
4614 4683 if (mip->mi_state_flags & MIS_IS_AGGR &&
4615 4684 mip->mi_default_tx_ring == NULL &&
4616 4685 ring->mr_type == MAC_RING_TYPE_TX) {
4617 4686 mip->mi_default_tx_ring = (mac_ring_handle_t)ring;
4618 4687 }
4619 4688
4620 4689 MAC_RING_UNMARK(ring, MR_INCIPIENT);
4621 4690 return (0);
4622 4691 }
4623 4692
4624 4693 /*
4625 4694  * Remove a ring from its current group. MAC internal function for dynamic
4626 4695 * grouping.
4627 4696 *
4628 4697 * The caller needs to call mac_perim_enter() before calling this function.
4629 4698 */
4630 4699 void
4631 4700 i_mac_group_rem_ring(mac_group_t *group, mac_ring_t *ring,
4632 4701 boolean_t driver_call)
4633 4702 {
4634 4703 mac_impl_t *mip = (mac_impl_t *)group->mrg_mh;
4635 4704 mac_capab_rings_t *cap_rings = NULL;
4636 4705 mac_group_type_t group_type;
4637 4706
4638 4707 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
4639 4708
4640 4709 ASSERT(mac_find_ring((mac_group_handle_t)group,
4641 4710 ring->mr_index) == (mac_ring_handle_t)ring);
4642 4711 ASSERT((mac_group_t *)ring->mr_gh == group);
4643 4712 ASSERT(ring->mr_type == group->mrg_type);
4644 4713
4645 4714 if (ring->mr_state == MR_INUSE)
4646 4715 mac_stop_ring(ring);
4647 4716 switch (ring->mr_type) {
4648 4717 case MAC_RING_TYPE_RX:
4649 4718 group_type = mip->mi_rx_group_type;
4650 4719 cap_rings = &mip->mi_rx_rings_cap;
4651 4720
4652 4721 /*
4653 4722 * Only hardware classified packets hold a reference to the
4654 4723 * ring all the way up the Rx path. mac_rx_srs_remove()
4655 4724 * will take care of quiescing the Rx path and removing the
4656 4725 * SRS. The software classified path neither holds a reference
4657 4726 * nor any association with the ring in mac_rx.
4658 4727 */
4659 4728 if (ring->mr_srs != NULL) {
4660 4729 mac_rx_srs_remove(ring->mr_srs);
4661 4730 ring->mr_srs = NULL;
4662 4731 }
4663 4732
4664 4733 break;
4665 4734 case MAC_RING_TYPE_TX:
4666 4735 {
4667 4736 mac_grp_client_t *mgcp;
4668 4737 mac_client_impl_t *mcip;
4669 4738 mac_soft_ring_set_t *mac_srs;
4670 4739 mac_srs_tx_t *tx;
4671 4740 mac_ring_t *rem_ring;
4672 4741 mac_group_t *defgrp;
4673 4742 uint_t ring_info = 0;
4674 4743
4675 4744 /*
4676 4745 * For TX this function is invoked in three
4677 4746 * cases:
4678 4747 *
4679 4748 * 1) In the case of a failure during the
4680 4749 * initial creation of a group when a share is
4681 4750 * associated with a MAC client. So the SRS is not
4682 4751 * yet setup, and will be setup later after the
4683 4752 * group has been reserved and populated.
4684 4753 *
4685 4754 * 2) From mac_release_tx_group() when freeing
4686 4755 * a TX SRS.
4687 4756 *
4688 4757 * 3) In the case of aggr, when a port gets removed,
4689 4758 * the pseudo Tx rings that it exposed gets removed.
4690 4759 *
4691 4760 * In the first two cases the SRS and its soft
4692 4761 * rings are already quiesced.
4693 4762 */
4694 4763 if (driver_call) {
4695 4764 mac_client_impl_t *mcip;
4696 4765 mac_soft_ring_set_t *mac_srs;
4697 4766 mac_soft_ring_t *sringp;
4698 4767 mac_srs_tx_t *srs_tx;
4699 4768
4700 4769 if (mip->mi_state_flags & MIS_IS_AGGR &&
4701 4770 mip->mi_default_tx_ring ==
4702 4771 (mac_ring_handle_t)ring) {
4703 4772 /* pick a new default Tx ring */
4704 4773 mip->mi_default_tx_ring =
4705 4774 (group->mrg_rings != ring) ?
4706 4775 (mac_ring_handle_t)group->mrg_rings :
4707 4776 (mac_ring_handle_t)(ring->mr_next);
4708 4777 }
4709 4778 /* Presently only aggr case comes here */
4710 4779 if (group->mrg_state != MAC_GROUP_STATE_RESERVED)
4711 4780 break;
4712 4781
4713 4782 mcip = MAC_GROUP_ONLY_CLIENT(group);
4714 4783 ASSERT(mcip != NULL);
4715 - ASSERT(mcip->mci_state_flags & MCIS_IS_AGGR);
4784 + ASSERT(mcip->mci_state_flags & MCIS_IS_AGGR_CLIENT);
4716 4785 mac_srs = MCIP_TX_SRS(mcip);
4717 4786 ASSERT(mac_srs->srs_tx.st_mode == SRS_TX_AGGR ||
4718 4787 mac_srs->srs_tx.st_mode == SRS_TX_BW_AGGR);
4719 4788 srs_tx = &mac_srs->srs_tx;
4720 4789 /*
4721 4790 			 * Wake up any callers blocked on this
4722 4791 * Tx ring due to flow control.
4723 4792 */
4724 4793 sringp = srs_tx->st_soft_rings[ring->mr_index];
4725 4794 ASSERT(sringp != NULL);
4726 4795 mac_tx_invoke_callbacks(mcip, (mac_tx_cookie_t)sringp);
4727 4796 mac_tx_client_quiesce((mac_client_handle_t)mcip);
4728 4797 mac_tx_srs_del_ring(mac_srs, ring);
4729 4798 mac_tx_client_restart((mac_client_handle_t)mcip);
4730 4799 break;
4731 4800 }
4732 4801 ASSERT(ring != (mac_ring_t *)mip->mi_default_tx_ring);
4733 4802 group_type = mip->mi_tx_group_type;
4734 4803 cap_rings = &mip->mi_tx_rings_cap;
4735 4804 /*
4736 4805 * See if we need to take it out of the MAC clients using
4737 4806 		 * this group.
4738 4807 */
4739 4808 if (MAC_GROUP_NO_CLIENT(group))
4740 4809 break;
4741 4810 mgcp = group->mrg_clients;
4742 4811 defgrp = MAC_DEFAULT_TX_GROUP(mip);
4743 4812 while (mgcp != NULL) {
4744 4813 mcip = mgcp->mgc_client;
4745 4814 mac_srs = MCIP_TX_SRS(mcip);
4746 4815 tx = &mac_srs->srs_tx;
4747 4816 mac_tx_client_quiesce((mac_client_handle_t)mcip);
4748 4817 /*
4749 4818 * If we are here when removing rings from the
4750 4819 * defgroup, mac_reserve_tx_ring would have
4751 4820 * already deleted the ring from the MAC
4752 4821 * clients in the group.
4753 4822 */
4754 4823 if (group != defgrp) {
4755 4824 mac_tx_invoke_callbacks(mcip,
4756 4825 (mac_tx_cookie_t)
4757 4826 mac_tx_srs_get_soft_ring(mac_srs, ring));
4758 4827 mac_tx_srs_del_ring(mac_srs, ring);
4759 4828 }
4760 4829 /*
4761 4830			 * Additionally, if we are left with only
4762 4831			 * one ring in the group after this, we need
4763 4832			 * to modify the mode, etc. (We haven't yet
4764 4833			 * taken the ring out, so we check with 2).
4765 4834 */
4766 4835 if (group->mrg_cur_count == 2) {
4767 4836 if (ring->mr_next == NULL)
4768 4837 rem_ring = group->mrg_rings;
4769 4838 else
4770 4839 rem_ring = ring->mr_next;
4771 4840 mac_tx_invoke_callbacks(mcip,
4772 4841 (mac_tx_cookie_t)
4773 4842 mac_tx_srs_get_soft_ring(mac_srs,
4774 4843 rem_ring));
4775 4844 mac_tx_srs_del_ring(mac_srs, rem_ring);
4776 4845 if (rem_ring->mr_state != MR_INUSE) {
4777 4846 (void) mac_start_ring(rem_ring);
4778 4847 }
4779 4848 tx->st_arg2 = (void *)rem_ring;
4780 4849 mac_tx_srs_stat_recreate(mac_srs, B_FALSE);
4781 4850 ring_info = mac_hwring_getinfo(
4782 4851 (mac_ring_handle_t)rem_ring);
4783 4852 /*
4784 4853 * We are shrinking from multiple
4785 4854 * to 1 ring.
4786 4855 */
4787 4856 if (mac_srs->srs_type & SRST_BW_CONTROL) {
4788 4857 tx->st_mode = SRS_TX_BW;
4789 4858 } else if (mac_tx_serialize ||
4790 4859 (ring_info & MAC_RING_TX_SERIALIZE)) {
4791 4860 tx->st_mode = SRS_TX_SERIALIZE;
4792 4861 } else {
4793 4862 tx->st_mode = SRS_TX_DEFAULT;
4794 4863 }
4795 4864 tx->st_func = mac_tx_get_func(tx->st_mode);
4796 4865 }
4797 4866 mac_tx_client_restart((mac_client_handle_t)mcip);
4798 4867 mgcp = mgcp->mgc_next;
4799 4868 }
4800 4869 break;
4801 4870 }
4802 4871 default:
4803 4872 ASSERT(B_FALSE);
4804 4873 }
4805 4874
4806 4875 /*
4807 4876 * Remove the ring from the group.
4808 4877 */
4809 4878 if (ring == group->mrg_rings)
4810 4879 group->mrg_rings = ring->mr_next;
4811 4880 else {
4812 4881 mac_ring_t *pre;
4813 4882
4814 4883 pre = group->mrg_rings;
4815 4884 while (pre->mr_next != ring)
4816 4885 pre = pre->mr_next;
4817 4886 pre->mr_next = ring->mr_next;
4818 4887 }
4819 4888 group->mrg_cur_count--;
4820 4889
4821 4890 if (!driver_call) {
4822 4891 ASSERT(group_type == MAC_GROUP_TYPE_DYNAMIC);
4823 4892 ASSERT(group->mrg_driver == NULL ||
4824 4893 cap_rings->mr_gremring != NULL);
4825 4894
4826 4895 /*
4827 4896 * Remove the driver level hardware ring.
4828 4897 */
4829 4898 if (group->mrg_driver != NULL) {
4830 4899 cap_rings->mr_gremring(group->mrg_driver,
4831 4900 ring->mr_driver, ring->mr_type);
4832 4901 }
4833 4902 }
4834 4903
4835 4904 ring->mr_gh = NULL;
4836 4905 if (driver_call)
4837 4906 mac_ring_free(mip, ring);
4838 4907 else
4839 4908 ring->mr_flag = 0;
4840 4909 }
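When a Tx group shrinks to a single ring (the mrg_cur_count == 2 branch above), the new st_mode is chosen by a short decision ladder: bandwidth control wins, then forced serialization, otherwise the default mode. A minimal user-space sketch of that ladder, with hypothetical types standing in for SRS_TX_* and the MAC_RING_TX_SERIALIZE ring-info flag:

#include <stdbool.h>

typedef enum { TX_DEFAULT, TX_SERIALIZE, TX_BW } tx_mode_t;

/*
 * bw_control stands in for SRST_BW_CONTROL on the SRS; serialize
 * stands in for mac_tx_serialize or MAC_RING_TX_SERIALIZE.
 */
static tx_mode_t
pick_tx_mode(bool bw_control, bool serialize)
{
	if (bw_control)
		return (TX_BW);
	if (serialize)
		return (TX_SERIALIZE);
	return (TX_DEFAULT);
}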
4841 4910
4842 4911 /*
4843 4912 * Move a ring to the target group. If needed, remove the ring from the group
4844 4913 * that it currently belongs to.
4845 4914 *
4846 4915	 * The caller needs to enter the MAC perimeter by calling mac_perim_enter().
4847 4916 */
4848 4917 static int
4849 4918 mac_group_mov_ring(mac_impl_t *mip, mac_group_t *d_group, mac_ring_t *ring)
4850 4919 {
4851 4920 mac_group_t *s_group = (mac_group_t *)ring->mr_gh;
4852 4921 int rv;
4853 4922
4854 4923 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
4855 4924 ASSERT(d_group != NULL);
4856 4925 ASSERT(s_group == NULL || s_group->mrg_mh == d_group->mrg_mh);
4857 4926
4858 4927 if (s_group == d_group)
4859 4928 return (0);
4860 4929
4861 4930 /*
4862 4931 * Remove it from current group first.
4863 4932 */
4864 4933 if (s_group != NULL)
4865 4934 i_mac_group_rem_ring(s_group, ring, B_FALSE);
4866 4935
4867 4936 /*
4868 4937 * Add it to the new group.
4869 4938 */
4870 4939 rv = i_mac_group_add_ring(d_group, ring, 0);
4871 4940 if (rv != 0) {
4872 4941 /*
4873 4942		 * Adding the ring to the new group failed; try to put it
4874 4943		 * back in the source group, and log if that also fails.
4875 4944 */
4876 4945 if (i_mac_group_add_ring(s_group, ring, 0)) {
4877 4946 cmn_err(CE_WARN, "%s: failed to move ring %p\n",
4878 4947 mip->mi_name, (void *)ring);
4879 4948 }
4880 4949 }
4881 4950
4882 4951 return (rv);
4883 4952 }
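mac_group_mov_ring() is the primitive behind the ring swaps later in this file (see mac_reclaim_ring_from_grp()): move a spare ring out of the way first, then move the wanted ring in, undoing the first move if the second fails. A sketch of that calling pattern, using the same error convention as the callers here:

static int
swap_rings(mac_impl_t *mip, mac_group_t *grp, mac_ring_t *ring,
    mac_group_t *defgrp, mac_ring_t *tring)
{
	/* Make room by pushing the spare ring into grp... */
	if (mac_group_mov_ring(mip, grp, tring) != 0)
		return (ENOSPC);
	/* ...then pull the wanted ring into defgrp, rolling back on failure. */
	if (mac_group_mov_ring(mip, defgrp, ring) != 0) {
		(void) mac_group_mov_ring(mip, defgrp, tring);
		return (ENOSPC);
	}
	return (0);
}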
4884 4953
4885 4954 /*
4886 4955 * Find a MAC address according to its value.
4887 4956 */
4888 4957 mac_address_t *
4889 4958 mac_find_macaddr(mac_impl_t *mip, uint8_t *mac_addr)
4890 4959 {
4891 4960 mac_address_t *map;
4892 4961
4893 4962 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
4894 4963
4895 4964 for (map = mip->mi_addresses; map != NULL; map = map->ma_next) {
4896 4965 if (bcmp(mac_addr, map->ma_addr, map->ma_len) == 0)
4897 4966 break;
4898 4967 }
4899 4968
4900 4969 return (map);
4901 4970 }
4902 4971
4903 4972 /*
4904 4973 * Check whether the MAC address is shared by multiple clients.
4905 4974 */
4906 4975 boolean_t
4907 4976 mac_check_macaddr_shared(mac_address_t *map)
4908 4977 {
4909 4978 ASSERT(MAC_PERIM_HELD((mac_handle_t)map->ma_mip));
4910 4979
4911 4980 return (map->ma_nusers > 1);
4912 4981 }
4913 4982
4914 4983 /*
4915 4984 * Remove the specified MAC address from the MAC address list and free it.
4916 4985 */
4917 4986 static void
4918 4987 mac_free_macaddr(mac_address_t *map)
4919 4988 {
4920 4989 mac_impl_t *mip = map->ma_mip;
4921 4990
4922 4991 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
4923 - ASSERT(mip->mi_addresses != NULL);
4992 + VERIFY3P(mip->mi_addresses, !=, NULL);
4924 4993
4925 - map = mac_find_macaddr(mip, map->ma_addr);
4994 + VERIFY3P(map, ==, mac_find_macaddr(mip, map->ma_addr));
4995 + VERIFY3P(map, !=, NULL);
4996 + VERIFY3S(map->ma_nusers, ==, 0);
4997 + VERIFY3P(map->ma_vlans, ==, NULL);
4926 4998
4927 - ASSERT(map != NULL);
4928 - ASSERT(map->ma_nusers == 0);
4929 -
4930 4999 if (map == mip->mi_addresses) {
4931 5000 mip->mi_addresses = map->ma_next;
4932 5001 } else {
4933 5002 mac_address_t *pre;
4934 5003
4935 5004 pre = mip->mi_addresses;
4936 5005 while (pre->ma_next != map)
4937 5006 pre = pre->ma_next;
4938 5007 pre->ma_next = map->ma_next;
4939 5008 }
4940 5009
4941 5010 kmem_free(map, sizeof (mac_address_t));
4942 5011 }
4943 5012
5013 +static mac_vlan_t *
5014 +mac_find_vlan(mac_address_t *map, uint16_t vid)
5015 +{
5016 + mac_vlan_t *mvp;
5017 +
5018 + for (mvp = map->ma_vlans; mvp != NULL; mvp = mvp->mv_next) {
5019 + if (mvp->mv_vid == vid)
5020 + return (mvp);
5021 + }
5022 +
5023 + return (NULL);
5024 +}
5025 +
5026 +static mac_vlan_t *
5027 +mac_add_vlan(mac_address_t *map, uint16_t vid)
5028 +{
5029 + mac_vlan_t *mvp;
5030 +
5031 + /*
5032 + * We should never add the same {addr, VID} tuple more
5033 + * than once, but let's be sure.
5034 + */
5035 + for (mvp = map->ma_vlans; mvp != NULL; mvp = mvp->mv_next)
5036 + VERIFY3U(mvp->mv_vid, !=, vid);
5037 +
5038 + /* Add the VLAN to the head of the VLAN list. */
5039 + mvp = kmem_zalloc(sizeof (mac_vlan_t), KM_SLEEP);
5040 + mvp->mv_vid = vid;
5041 + mvp->mv_next = map->ma_vlans;
5042 + map->ma_vlans = mvp;
5043 +
5044 + return (mvp);
5045 +}
5046 +
5047 +static void
5048 +mac_rem_vlan(mac_address_t *map, mac_vlan_t *mvp)
5049 +{
5050 + mac_vlan_t *pre;
5051 +
5052 + if (map->ma_vlans == mvp) {
5053 + map->ma_vlans = mvp->mv_next;
5054 + } else {
5055 + pre = map->ma_vlans;
5056 + while (pre->mv_next != mvp) {
5057 + pre = pre->mv_next;
5058 +
5059 + /*
5060 + * We've reached the end of the list without
5061 + * finding mvp.
5062 + */
5063 + VERIFY3P(pre, !=, NULL);
5064 + }
5065 + pre->mv_next = mvp->mv_next;
5066 + }
5067 +
5068 + kmem_free(mvp, sizeof (mac_vlan_t));
5069 +}
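The three helpers above give each mac_address_t a simple singly linked list of VIDs. A stand-alone sketch of the same list discipline (head insertion, linear lookup), with a stand-in struct rather than the real mac_vlan_t:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

typedef struct vlan {
	uint16_t	mv_vid;
	struct vlan	*mv_next;
} vlan_t;

/* Head insertion, as in mac_add_vlan(). */
static vlan_t *
vlan_add(vlan_t **head, uint16_t vid)
{
	vlan_t *mvp = calloc(1, sizeof (*mvp));

	if (mvp == NULL)
		abort();
	mvp->mv_vid = vid;
	mvp->mv_next = *head;
	*head = mvp;
	return (mvp);
}

/* Linear lookup, as in mac_find_vlan(). */
static vlan_t *
vlan_find(vlan_t *head, uint16_t vid)
{
	for (; head != NULL; head = head->mv_next) {
		if (head->mv_vid == vid)
			return (head);
	}
	return (NULL);
}

int
main(void)
{
	vlan_t *head = NULL;

	(void) vlan_add(&head, 100);
	(void) vlan_add(&head, 200);
	printf("VID 100 %s\n", vlan_find(head, 100) ? "found" : "missing");
	return (0);
}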
5070 +
4944 5071 /*
4945 - * Add a MAC address reference for a client. If the desired MAC address
4946 - * exists, add a reference to it. Otherwise, add the new address by adding
4947 - * it to a reserved group or setting promiscuous mode. Won't try different
4948 - * group is the group is non-NULL, so the caller must explictly share
4949 - * default group when needed.
4950 - *
4951 - * Note, the primary MAC address is initialized at registration time, so
4952 - * to add it to default group only need to activate it if its reference
4953 - * count is still zero. Also, some drivers may not have advertised RINGS
4954 - * capability.
5072 + * Create a new mac_address_t if this is the first use of the address
5073 + * or add a VID to an existing address. In either case, the
5074 + * mac_address_t acts as a list of {addr, VID} tuples where each tuple
5075 + * shares the same addr. If group is non-NULL then attempt to program
5076 + * the MAC's HW filters for this group. Otherwise, if group is NULL,
5077 + * then the MAC has no rings and there is nothing to program.
4955 5078 */
4956 5079 int
4957 -mac_add_macaddr(mac_impl_t *mip, mac_group_t *group, uint8_t *mac_addr,
4958 - boolean_t use_hw)
5080 +mac_add_macaddr_vlan(mac_impl_t *mip, mac_group_t *group, uint8_t *addr,
5081 + uint16_t vid, boolean_t use_hw)
4959 5082 {
4960 - mac_address_t *map;
4961 - int err = 0;
4962 - boolean_t allocated_map = B_FALSE;
5083 + mac_address_t *map;
5084 + mac_vlan_t *mvp;
5085 + int err = 0;
5086 + boolean_t allocated_map = B_FALSE;
5087 + boolean_t hw_mac = B_FALSE;
5088 + boolean_t hw_vlan = B_FALSE;
4963 5089
4964 5090 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
4965 5091
4966 - map = mac_find_macaddr(mip, mac_addr);
5092 + map = mac_find_macaddr(mip, addr);
4967 5093
4968 5094 /*
4969 - * If the new MAC address has not been added. Allocate a new one
4970 - * and set it up.
5095 + * If this is the first use of this MAC address then allocate
5096 + * and initialize a new structure.
4971 5097 */
4972 5098 if (map == NULL) {
4973 5099 map = kmem_zalloc(sizeof (mac_address_t), KM_SLEEP);
4974 5100 map->ma_len = mip->mi_type->mt_addr_length;
4975 - bcopy(mac_addr, map->ma_addr, map->ma_len);
5101 + bcopy(addr, map->ma_addr, map->ma_len);
4976 5102 map->ma_nusers = 0;
4977 5103 map->ma_group = group;
4978 5104 map->ma_mip = mip;
5105 + map->ma_untagged = B_FALSE;
4979 5106
4980 - /* add the new MAC address to the head of the address list */
5107 + /* Add the new MAC address to the head of the address list. */
4981 5108 map->ma_next = mip->mi_addresses;
4982 5109 mip->mi_addresses = map;
4983 5110
4984 5111 allocated_map = B_TRUE;
4985 5112 }
4986 5113
4987 - ASSERT(map->ma_group == NULL || map->ma_group == group);
5114 + VERIFY(map->ma_group == NULL || map->ma_group == group);
4988 5115 if (map->ma_group == NULL)
4989 5116 map->ma_group = group;
4990 5117
5118 + if (vid == VLAN_ID_NONE) {
5119 + map->ma_untagged = B_TRUE;
5120 + mvp = NULL;
5121 + } else {
5122 + mvp = mac_add_vlan(map, vid);
5123 + }
5124 +
4991 5125 /*
4992 - * If the MAC address is already in use, simply account for the
4993 - * new client.
5126 + * Set the VLAN HW filter if:
5127 + *
5128 + * o the MAC's VLAN HW filtering is enabled, and
5129 + * o the address does not currently rely on promisc mode.
5130 + *
5131 + * This is called even when the client specifies an untagged
5132 + * address (VLAN_ID_NONE) because some MAC providers require
5133 + * setting additional bits to accept untagged traffic when
5134 + * VLAN HW filtering is enabled.
4994 5135 */
4995 - if (map->ma_nusers++ > 0)
5136 + if (MAC_GROUP_HW_VLAN(group) &&
5137 + map->ma_type != MAC_ADDRESS_TYPE_UNICAST_PROMISC) {
5138 + if ((err = mac_group_addvlan(group, vid)) != 0)
5139 + goto bail;
5140 +
5141 + hw_vlan = B_TRUE;
5142 + }
5143 +
5144 + VERIFY3S(map->ma_nusers, >=, 0);
5145 + map->ma_nusers++;
5146 +
5147 + /*
5148 + * If this MAC address already has a HW filter then simply
5149 + * increment the counter.
5150 + */
5151 + if (map->ma_nusers > 1)
4996 5152 return (0);
4997 5153
4998 5154 /*
5155 + * All logic from here on out is executed during initial
5156 + * creation only.
5157 + */
5158 + VERIFY3S(map->ma_nusers, ==, 1);
5159 +
5160 + /*
4999 5161 * Activate this MAC address by adding it to the reserved group.
5000 5162 */
5001 5163 if (group != NULL) {
5002 - err = mac_group_addmac(group, (const uint8_t *)mac_addr);
5003 - if (err == 0) {
5004 - map->ma_type = MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED;
5005 - return (0);
5164 + err = mac_group_addmac(group, (const uint8_t *)addr);
5165 +
5166 + /*
5167 + * If the driver is out of filters then we can
5168 + * continue and use promisc mode. For any other error,
5169 + * assume the driver is in a state where we can't
5170 + * program the filters or use promisc mode; so we must
5171 + * bail.
5172 + */
5173 + if (err != 0 && err != ENOSPC) {
5174 + map->ma_nusers--;
5175 + goto bail;
5006 5176 }
5177 +
5178 + hw_mac = (err == 0);
5007 5179 }
5008 5180
5181 + if (hw_mac) {
5182 + map->ma_type = MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED;
5183 + return (0);
5184 + }
5185 +
5009 5186 /*
5010 5187 * The MAC address addition failed. If the client requires a
5011 - * hardware classified MAC address, fail the operation.
5188 + * hardware classified MAC address, fail the operation. This
5189 + * feature is only used by sun4v vsw.
5012 5190 */
5013 - if (use_hw) {
5191 + if (use_hw && !hw_mac) {
5014 5192 err = ENOSPC;
5193 + map->ma_nusers--;
5015 5194 goto bail;
5016 5195 }
5017 5196
5018 5197 /*
5019 - * Try promiscuous mode.
5020 - *
5021 - * For drivers that don't advertise RINGS capability, do
5022 - * nothing for the primary address.
5198 + * If we reach this point then either the MAC doesn't have
5199 + * RINGS capability or we are out of MAC address HW filters.
5200 + * In any case we must put the MAC into promiscuous mode.
5023 5201 */
5202 + VERIFY(group == NULL || !hw_mac);
5203 +
5204 + /*
5205 + * The one exception is the primary address. A non-RINGS
5206 + * driver filters the primary address by default; promisc mode
5207 + * is not needed.
5208 + */
5024 5209 if ((group == NULL) &&
5025 5210 (bcmp(map->ma_addr, mip->mi_addr, map->ma_len) == 0)) {
5026 5211 map->ma_type = MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED;
5027 5212 return (0);
5028 5213 }
5029 5214
5030 5215 /*
5031 - * Enable promiscuous mode in order to receive traffic
5032 - * to the new MAC address.
5216 + * Enable promiscuous mode in order to receive traffic to the
5217 + * new MAC address. All existing HW filters still send their
5218 + * traffic to their respective group/SRSes. But with promisc
5219 + * enabled all unknown traffic is delivered to the default
5220 + * group where it is SW classified via mac_rx_classify().
5033 5221 */
5034 5222 if ((err = i_mac_promisc_set(mip, B_TRUE)) == 0) {
5035 5223 map->ma_type = MAC_ADDRESS_TYPE_UNICAST_PROMISC;
5036 5224 return (0);
5037 5225 }
5038 5226
5039 - /*
5040 - * Free the MAC address that could not be added. Don't free
5041 - * a pre-existing address, it could have been the entry
5042 - * for the primary MAC address which was pre-allocated by
5043 - * mac_init_macaddr(), and which must remain on the list.
5044 - */
5045 5227 bail:
5046 - map->ma_nusers--;
5228 + if (hw_vlan) {
5229 + int err2 = mac_group_remvlan(group, vid);
5230 +
5231 + if (err2 != 0) {
5232 + cmn_err(CE_WARN, "Failed to remove VLAN %u from group"
5233 + " %d on MAC %s: %d.", vid, group->mrg_index,
5234 + mip->mi_name, err2);
5235 + }
5236 + }
5237 +
5238 + if (mvp != NULL)
5239 + mac_rem_vlan(map, mvp);
5240 +
5047 5241 if (allocated_map)
5048 5242 mac_free_macaddr(map);
5243 +
5049 5244 return (err);
5050 5245 }
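The filter/promisc decision in mac_add_macaddr_vlan() boils down to: success programs a HW filter; ENOSPC falls back to promiscuous mode unless the client demanded HW classification (use_hw, i.e. sun4v vsw); any other error is fatal. A hedged sketch of that ladder, not the real API:

#include <errno.h>
#include <stdbool.h>

typedef enum { ADDR_HW_CLASSIFIED, ADDR_PROMISC, ADDR_FAIL } addr_res_t;

static addr_res_t
classify_addr(int addmac_err, bool use_hw)
{
	if (addmac_err == 0)
		return (ADDR_HW_CLASSIFIED);	/* filter programmed */
	if (addmac_err != ENOSPC || use_hw)
		return (ADDR_FAIL);		/* hard error, or vsw */
	return (ADDR_PROMISC);			/* SW classify in defgrp */
}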
5051 5246
5052 -/*
5053 - * Remove a reference to a MAC address. This may cause to remove the MAC
5054 - * address from an associated group or to turn off promiscuous mode.
5055 - * The caller needs to handle the failure properly.
5056 - */
5057 5247 int
5058 -mac_remove_macaddr(mac_address_t *map)
5248 +mac_remove_macaddr_vlan(mac_address_t *map, uint16_t vid)
5059 5249 {
5060 - mac_impl_t *mip = map->ma_mip;
5061 - int err = 0;
5250 + mac_vlan_t *mvp;
5251 + mac_impl_t *mip = map->ma_mip;
5252 + mac_group_t *group = map->ma_group;
5253 + int err = 0;
5062 5254
5063 5255 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
5256 + VERIFY3P(map, ==, mac_find_macaddr(mip, map->ma_addr));
5064 5257
5065 - ASSERT(map == mac_find_macaddr(mip, map->ma_addr));
5258 + if (vid == VLAN_ID_NONE) {
5259 + map->ma_untagged = B_FALSE;
5260 + mvp = NULL;
5261 + } else {
5262 + mvp = mac_find_vlan(map, vid);
5263 + VERIFY3P(mvp, !=, NULL);
5264 + }
5066 5265
5266 + if (MAC_GROUP_HW_VLAN(group) &&
5267 + map->ma_type == MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED &&
5268 + ((err = mac_group_remvlan(group, vid)) != 0))
5269 + return (err);
5270 +
5271 + if (mvp != NULL)
5272 + mac_rem_vlan(map, mvp);
5273 +
5067 5274 /*
5068 5275 * If it's not the last client using this MAC address, only update
5069 5276 * the MAC clients count.
5070 5277 */
5071 - if (--map->ma_nusers > 0)
5278 + map->ma_nusers--;
5279 + if (map->ma_nusers > 0)
5072 5280 return (0);
5073 5281
5074 5282 /*
5075 - * The MAC address is no longer used by any MAC client, so remove
5076 - * it from its associated group, or turn off promiscuous mode
5077 - * if it was enabled for the MAC address.
5283 + * The MAC address is no longer used by any MAC client, so
5284 + * remove it from its associated group. Turn off promiscuous
5285 + * mode if this is the last address relying on it.
5078 5286 */
5079 5287 switch (map->ma_type) {
5080 5288 case MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED:
5081 5289 /*
5082 5290 * Don't free the preset primary address for drivers that
5083 5291 * don't advertise RINGS capability.
5084 5292 */
5085 - if (map->ma_group == NULL)
5293 + if (group == NULL)
5086 5294 return (0);
5087 5295
5088 - err = mac_group_remmac(map->ma_group, map->ma_addr);
5089 - if (err == 0)
5090 - map->ma_group = NULL;
5296 + if ((err = mac_group_remmac(group, map->ma_addr)) != 0) {
5297 + if (vid == VLAN_ID_NONE)
5298 + map->ma_untagged = B_TRUE;
5299 + else
5300 + (void) mac_add_vlan(map, vid);
5301 +
5302 + /*
5303 + * If we fail to remove the MAC address HW
5304 + * filter but then also fail to re-add the
5305 + * VLAN HW filter then we are in a busted
5306 + * state and should just crash.
5307 + */
5308 + if (MAC_GROUP_HW_VLAN(group)) {
5309 + int err2;
5310 +
5311 + err2 = mac_group_addvlan(group, vid);
5312 + if (err2 != 0) {
5313 + cmn_err(CE_WARN, "Failed to readd VLAN"
5314 + " %u to group %d on MAC %s: %d.",
5315 + vid, group->mrg_index, mip->mi_name,
5316 + err2);
5317 + }
5318 + }
5319 +
5320 + return (err);
5321 + }
5322 +
5323 + map->ma_group = NULL;
5091 5324 break;
5092 5325 case MAC_ADDRESS_TYPE_UNICAST_PROMISC:
5093 5326 err = i_mac_promisc_set(mip, B_FALSE);
5094 5327 break;
5095 5328 default:
5096 - ASSERT(B_FALSE);
5329 + panic("Unexpected ma_type 0x%x, file: %s, line %d",
5330 + map->ma_type, __FILE__, __LINE__);
5097 5331 }
5098 5332
5099 5333 if (err != 0)
5100 5334 return (err);
5101 5335
5102 5336 /*
5103 5337	 * We created the MAC address for the primary one at registration, so we
5104 5338 * won't free it here. mac_fini_macaddr() will take care of it.
5105 5339 */
5106 5340 if (bcmp(map->ma_addr, mip->mi_addr, map->ma_len) != 0)
5107 5341 mac_free_macaddr(map);
5108 5342
5109 5343 return (0);
5110 5344 }
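On the remove side the ordering is the mirror image: the VLAN filter comes out before the MAC filter, and if the MAC filter removal fails the function re-adds the VLAN state so the HW filters and the in-kernel list stay consistent. The shape of that unwind, with hypothetical callbacks:

static int
remove_with_rollback(int (*rem_vlan)(void), int (*rem_mac)(void),
    int (*readd_vlan)(void))
{
	int err;

	if ((err = rem_vlan()) != 0)
		return (err);		/* nothing torn down yet */
	if ((err = rem_mac()) != 0)
		(void) readd_vlan();	/* best-effort restore */
	return (err);
}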
5111 5345
5112 5346 /*
5113 5347	 * Update an existing MAC address. The caller needs to make sure that the new
5114 5348 * value has not been used.
5115 5349 */
5116 5350 int
5117 5351 mac_update_macaddr(mac_address_t *map, uint8_t *mac_addr)
5118 5352 {
5119 5353 mac_impl_t *mip = map->ma_mip;
5120 5354 int err = 0;
5121 5355
5122 5356 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
5123 5357 ASSERT(mac_find_macaddr(mip, mac_addr) == NULL);
5124 5358
5125 5359 switch (map->ma_type) {
5126 5360 case MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED:
5127 5361 /*
5128 5362 * Update the primary address for drivers that are not
5129 5363 * RINGS capable.
5130 5364 */
5131 5365 if (mip->mi_rx_groups == NULL) {
5132 5366 err = mip->mi_unicst(mip->mi_driver, (const uint8_t *)
5133 5367 mac_addr);
5134 5368 if (err != 0)
5135 5369 return (err);
5136 5370 break;
5137 5371 }
5138 5372
5139 5373 /*
5140 5374 * If this MAC address is not currently in use,
5141 5375 * simply break out and update the value.
5142 5376 */
5143 5377 if (map->ma_nusers == 0)
5144 5378 break;
5145 5379
5146 5380 /*
5147 5381 * Need to replace the MAC address associated with a group.
5148 5382 */
5149 5383 err = mac_group_remmac(map->ma_group, map->ma_addr);
5150 5384 if (err != 0)
5151 5385 return (err);
5152 5386
5153 5387 err = mac_group_addmac(map->ma_group, mac_addr);
5154 5388
5155 5389 /*
5156 5390		 * Failure hints at a hardware error. The MAC layer needs
5157 5391		 * an error notification facility to handle this.
5158 5392		 * For now, simply try to restore the old value.
5159 5393 */
5160 5394 if (err != 0)
5161 5395 (void) mac_group_addmac(map->ma_group, map->ma_addr);
5162 5396
5163 5397 break;
5164 5398 case MAC_ADDRESS_TYPE_UNICAST_PROMISC:
5165 5399 /*
5166 5400		 * Nothing more to do if in promiscuous mode.
5167 5401 */
5168 5402 break;
5169 5403 default:
5170 5404 ASSERT(B_FALSE);
5171 5405 }
5172 5406
5173 5407 /*
5174 5408 * Successfully replaced the MAC address.
5175 5409 */
5176 5410 if (err == 0)
5177 5411 bcopy(mac_addr, map->ma_addr, map->ma_len);
5178 5412
5179 5413 return (err);
5180 5414 }
5181 5415
5182 5416 /*
5183 5417	 * Freshen the MAC address with a new value. The caller must have updated the
5184 5418	 * hardware MAC address before calling this function.
5185 5419	 * This function is supposed to be used to handle MAC address change
5186 5420	 * notifications from underlying drivers.
5187 5421 */
5188 5422 void
5189 5423 mac_freshen_macaddr(mac_address_t *map, uint8_t *mac_addr)
5190 5424 {
5191 5425 mac_impl_t *mip = map->ma_mip;
5192 5426
5193 5427 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
5194 5428 ASSERT(mac_find_macaddr(mip, mac_addr) == NULL);
5195 5429
5196 5430 /*
5197 5431 * Freshen the MAC address with new value.
5198 5432 */
5199 5433 bcopy(mac_addr, map->ma_addr, map->ma_len);
5200 5434 bcopy(mac_addr, mip->mi_addr, map->ma_len);
5201 5435
5202 5436 /*
5203 5437 * Update all MAC clients that share this MAC address.
5204 5438 */
5205 5439 mac_unicast_update_clients(mip, map);
5206 5440 }
5207 5441
5208 5442 /*
5209 5443 * Set up the primary MAC address.
5210 5444 */
5211 5445 void
5212 5446 mac_init_macaddr(mac_impl_t *mip)
5213 5447 {
5214 5448 mac_address_t *map;
5215 5449
5216 5450 /*
5217 5451	 * The reference count is initialized to zero and stays there until
5218 5452	 * the address is actually activated.
5219 5453 */
5220 5454 map = kmem_zalloc(sizeof (mac_address_t), KM_SLEEP);
5221 5455 map->ma_len = mip->mi_type->mt_addr_length;
5222 5456 bcopy(mip->mi_addr, map->ma_addr, map->ma_len);
5223 5457
5224 5458 /*
5225 5459 * If driver advertises RINGS capability, it shouldn't have initialized
5226 5460 * its primary MAC address. For other drivers, including VNIC, the
5227 5461 * primary address must work after registration.
5228 5462 */
5229 5463 if (mip->mi_rx_groups == NULL)
5230 5464 map->ma_type = MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED;
5231 5465
5232 5466 map->ma_mip = mip;
5233 5467
5234 5468 mip->mi_addresses = map;
5235 5469 }
5236 5470
5237 5471 /*
5238 5472 * Clean up the primary MAC address. Note, only one primary MAC address
5239 5473 * is allowed. All other MAC addresses must have been freed appropriately.
5240 5474 */
5241 5475 void
5242 5476 mac_fini_macaddr(mac_impl_t *mip)
5243 5477 {
5244 5478 mac_address_t *map = mip->mi_addresses;
5245 5479
5246 5480 if (map == NULL)
5247 5481 return;
5248 5482
5249 5483 /*
5250 5484 * If mi_addresses is initialized, there should be exactly one
5251 5485 * entry left on the list with no users.
5252 5486 */
5253 - ASSERT(map->ma_nusers == 0);
5254 - ASSERT(map->ma_next == NULL);
5487 + VERIFY3S(map->ma_nusers, ==, 0);
5488 + VERIFY3P(map->ma_next, ==, NULL);
5489 + VERIFY3P(map->ma_vlans, ==, NULL);
5255 5490
5256 5491 kmem_free(map, sizeof (mac_address_t));
5257 5492 mip->mi_addresses = NULL;
5258 5493 }
5259 5494
5260 5495 /*
5261 5496 * Logging related functions.
5262 5497 *
5263 5498	 * Note that kernel statistics have been extended to maintain fine-
5264 5499	 * grained statistics (hardware lane, software lane, fanout stats,
5265 5500	 * etc.). However, extended accounting continues to support only
5266 5501 * aggregate statistics like before.
5267 5502 */
5268 5503
5269 5504 /* Write the flow description to a netinfo_t record */
5270 5505 static netinfo_t *
5271 5506 mac_write_flow_desc(flow_entry_t *flent, mac_client_impl_t *mcip)
5272 5507 {
5273 5508 netinfo_t *ninfo;
5274 5509 net_desc_t *ndesc;
5275 5510 flow_desc_t *fdesc;
5276 5511 mac_resource_props_t *mrp;
5277 5512
5278 5513 ninfo = kmem_zalloc(sizeof (netinfo_t), KM_NOSLEEP);
5279 5514 if (ninfo == NULL)
5280 5515 return (NULL);
5281 5516 ndesc = kmem_zalloc(sizeof (net_desc_t), KM_NOSLEEP);
5282 5517 if (ndesc == NULL) {
5283 5518 kmem_free(ninfo, sizeof (netinfo_t));
5284 5519 return (NULL);
5285 5520 }
5286 5521
5287 5522 /*
5288 5523 * Grab the fe_lock to see a self-consistent fe_flow_desc.
5289 5524 * Updates to the fe_flow_desc are done under the fe_lock
5290 5525 */
5291 5526 mutex_enter(&flent->fe_lock);
5292 5527 fdesc = &flent->fe_flow_desc;
5293 5528 mrp = &flent->fe_resource_props;
5294 5529
5295 5530 ndesc->nd_name = flent->fe_flow_name;
5296 5531 ndesc->nd_devname = mcip->mci_name;
5297 5532 bcopy(fdesc->fd_src_mac, ndesc->nd_ehost, ETHERADDRL);
5298 5533 bcopy(fdesc->fd_dst_mac, ndesc->nd_edest, ETHERADDRL);
5299 5534 ndesc->nd_sap = htonl(fdesc->fd_sap);
5300 5535 ndesc->nd_isv4 = (uint8_t)fdesc->fd_ipversion == IPV4_VERSION;
5301 5536 ndesc->nd_bw_limit = mrp->mrp_maxbw;
5302 5537 if (ndesc->nd_isv4) {
5303 5538 ndesc->nd_saddr[3] = htonl(fdesc->fd_local_addr.s6_addr32[3]);
5304 5539 ndesc->nd_daddr[3] = htonl(fdesc->fd_remote_addr.s6_addr32[3]);
5305 5540 } else {
5306 5541 bcopy(&fdesc->fd_local_addr, ndesc->nd_saddr, IPV6_ADDR_LEN);
5307 5542 bcopy(&fdesc->fd_remote_addr, ndesc->nd_daddr, IPV6_ADDR_LEN);
5308 5543 }
5309 5544 ndesc->nd_sport = htons(fdesc->fd_local_port);
5310 5545 ndesc->nd_dport = htons(fdesc->fd_remote_port);
5311 5546 ndesc->nd_protocol = (uint8_t)fdesc->fd_protocol;
5312 5547 mutex_exit(&flent->fe_lock);
5313 5548
5314 5549 ninfo->ni_record = ndesc;
5315 5550 ninfo->ni_size = sizeof (net_desc_t);
5316 5551 ninfo->ni_type = EX_NET_FLDESC_REC;
5317 5552
5318 5553 return (ninfo);
5319 5554 }
5320 5555
5321 5556 /* Write the flow statistics to a netinfo_t record */
5322 5557 static netinfo_t *
5323 5558 mac_write_flow_stats(flow_entry_t *flent)
5324 5559 {
5325 5560 netinfo_t *ninfo;
5326 5561 net_stat_t *nstat;
5327 5562 mac_soft_ring_set_t *mac_srs;
5328 5563 mac_rx_stats_t *mac_rx_stat;
5329 5564 mac_tx_stats_t *mac_tx_stat;
5330 5565 int i;
5331 5566
5332 5567 ninfo = kmem_zalloc(sizeof (netinfo_t), KM_NOSLEEP);
5333 5568 if (ninfo == NULL)
5334 5569 return (NULL);
5335 5570 nstat = kmem_zalloc(sizeof (net_stat_t), KM_NOSLEEP);
5336 5571 if (nstat == NULL) {
5337 5572 kmem_free(ninfo, sizeof (netinfo_t));
5338 5573 return (NULL);
5339 5574 }
5340 5575
5341 5576 nstat->ns_name = flent->fe_flow_name;
5342 5577 for (i = 0; i < flent->fe_rx_srs_cnt; i++) {
5343 5578 mac_srs = (mac_soft_ring_set_t *)flent->fe_rx_srs[i];
5344 5579 mac_rx_stat = &mac_srs->srs_rx.sr_stat;
5345 5580
5346 5581 nstat->ns_ibytes += mac_rx_stat->mrs_intrbytes +
5347 5582 mac_rx_stat->mrs_pollbytes + mac_rx_stat->mrs_lclbytes;
5348 5583 nstat->ns_ipackets += mac_rx_stat->mrs_intrcnt +
5349 5584 mac_rx_stat->mrs_pollcnt + mac_rx_stat->mrs_lclcnt;
5350 5585 nstat->ns_oerrors += mac_rx_stat->mrs_ierrors;
5351 5586 }
5352 5587
5353 5588 mac_srs = (mac_soft_ring_set_t *)(flent->fe_tx_srs);
5354 5589 if (mac_srs != NULL) {
5355 5590 mac_tx_stat = &mac_srs->srs_tx.st_stat;
5356 5591
5357 5592 nstat->ns_obytes = mac_tx_stat->mts_obytes;
5358 5593 nstat->ns_opackets = mac_tx_stat->mts_opackets;
5359 5594 nstat->ns_oerrors = mac_tx_stat->mts_oerrors;
5360 5595 }
5361 5596
5362 5597 ninfo->ni_record = nstat;
5363 5598 ninfo->ni_size = sizeof (net_stat_t);
5364 5599 ninfo->ni_type = EX_NET_FLSTAT_REC;
5365 5600
5366 5601 return (ninfo);
5367 5602 }
5368 5603
5369 5604 /* Write the link description to a netinfo_t record */
5370 5605 static netinfo_t *
5371 5606 mac_write_link_desc(mac_client_impl_t *mcip)
5372 5607 {
5373 5608 netinfo_t *ninfo;
5374 5609 net_desc_t *ndesc;
5375 5610 flow_entry_t *flent = mcip->mci_flent;
5376 5611
5377 5612 ninfo = kmem_zalloc(sizeof (netinfo_t), KM_NOSLEEP);
5378 5613 if (ninfo == NULL)
5379 5614 return (NULL);
5380 5615 ndesc = kmem_zalloc(sizeof (net_desc_t), KM_NOSLEEP);
5381 5616 if (ndesc == NULL) {
5382 5617 kmem_free(ninfo, sizeof (netinfo_t));
5383 5618 return (NULL);
5384 5619 }
5385 5620
5386 5621 ndesc->nd_name = mcip->mci_name;
5387 5622 ndesc->nd_devname = mcip->mci_name;
5388 5623 ndesc->nd_isv4 = B_TRUE;
5389 5624 /*
5390 5625 * Grab the fe_lock to see a self-consistent fe_flow_desc.
5391 5626 * Updates to the fe_flow_desc are done under the fe_lock
5392 5627 * after removing the flent from the flow table.
5393 5628 */
5394 5629 mutex_enter(&flent->fe_lock);
5395 5630 bcopy(flent->fe_flow_desc.fd_src_mac, ndesc->nd_ehost, ETHERADDRL);
5396 5631 mutex_exit(&flent->fe_lock);
5397 5632
5398 5633 ninfo->ni_record = ndesc;
5399 5634 ninfo->ni_size = sizeof (net_desc_t);
5400 5635 ninfo->ni_type = EX_NET_LNDESC_REC;
5401 5636
5402 5637 return (ninfo);
5403 5638 }
5404 5639
5405 5640 /* Write the link statistics to a netinfo_t record */
5406 5641 static netinfo_t *
5407 5642 mac_write_link_stats(mac_client_impl_t *mcip)
5408 5643 {
5409 5644 netinfo_t *ninfo;
5410 5645 net_stat_t *nstat;
5411 5646 flow_entry_t *flent;
5412 5647 mac_soft_ring_set_t *mac_srs;
5413 5648 mac_rx_stats_t *mac_rx_stat;
5414 5649 mac_tx_stats_t *mac_tx_stat;
5415 5650 int i;
5416 5651
5417 5652 ninfo = kmem_zalloc(sizeof (netinfo_t), KM_NOSLEEP);
5418 5653 if (ninfo == NULL)
5419 5654 return (NULL);
5420 5655 nstat = kmem_zalloc(sizeof (net_stat_t), KM_NOSLEEP);
5421 5656 if (nstat == NULL) {
5422 5657 kmem_free(ninfo, sizeof (netinfo_t));
5423 5658 return (NULL);
5424 5659 }
5425 5660
5426 5661 nstat->ns_name = mcip->mci_name;
5427 5662 flent = mcip->mci_flent;
5428 5663 if (flent != NULL) {
5429 5664 for (i = 0; i < flent->fe_rx_srs_cnt; i++) {
5430 5665 mac_srs = (mac_soft_ring_set_t *)flent->fe_rx_srs[i];
5431 5666 mac_rx_stat = &mac_srs->srs_rx.sr_stat;
5432 5667
5433 5668 nstat->ns_ibytes += mac_rx_stat->mrs_intrbytes +
5434 5669 mac_rx_stat->mrs_pollbytes +
5435 5670 mac_rx_stat->mrs_lclbytes;
5436 5671 nstat->ns_ipackets += mac_rx_stat->mrs_intrcnt +
5437 5672 mac_rx_stat->mrs_pollcnt + mac_rx_stat->mrs_lclcnt;
5438 5673 nstat->ns_oerrors += mac_rx_stat->mrs_ierrors;
5439 5674 }
5440 5675 }
5441 5676
5442 5677 mac_srs = (mac_soft_ring_set_t *)(mcip->mci_flent->fe_tx_srs);
5443 5678 if (mac_srs != NULL) {
5444 5679 mac_tx_stat = &mac_srs->srs_tx.st_stat;
5445 5680
5446 5681 nstat->ns_obytes = mac_tx_stat->mts_obytes;
5447 5682 nstat->ns_opackets = mac_tx_stat->mts_opackets;
5448 5683 nstat->ns_oerrors = mac_tx_stat->mts_oerrors;
5449 5684 }
5450 5685
5451 5686 ninfo->ni_record = nstat;
5452 5687 ninfo->ni_size = sizeof (net_stat_t);
5453 5688 ninfo->ni_type = EX_NET_LNSTAT_REC;
5454 5689
5455 5690 return (ninfo);
5456 5691 }
5457 5692
5458 5693 typedef struct i_mac_log_state_s {
5459 5694 boolean_t mi_last;
5460 5695 int mi_fenable;
5461 5696 int mi_lenable;
5462 5697 list_t *mi_list;
5463 5698 } i_mac_log_state_t;
5464 5699
5465 5700 /*
5466 5701 * For a given flow, if the description has not been logged before, do it now.
5467 5702 * If it is a VNIC, then we have collected information about it from the MAC
5468 5703 * table, so skip it.
5469 5704 *
5470 5705 * Called through mac_flow_walk_nolock()
5471 5706 *
5472 5707 * Return 0 if successful.
5473 5708 */
5474 5709 static int
5475 5710 mac_log_flowinfo(flow_entry_t *flent, void *arg)
5476 5711 {
5477 5712 mac_client_impl_t *mcip = flent->fe_mcip;
5478 5713 i_mac_log_state_t *lstate = arg;
5479 5714 netinfo_t *ninfo;
5480 5715
5481 5716 if (mcip == NULL)
5482 5717 return (0);
5483 5718
5484 5719 /*
5485 5720	 * If the name starts with "vnic", and fe_user_generated is true (to
5486 5721	 * exclude the mcast and active flow entries created implicitly for
5487 5722	 * a vnic), it is a VNIC flow; i.e. vnic1 is a VNIC flow, while
5488 5723	 * vnic/bge1/mcast1 is not and neither is vnic/bge1/active.
5489 5724 */
5490 5725 if (strncasecmp(flent->fe_flow_name, "vnic", 4) == 0 &&
5491 5726 (flent->fe_type & FLOW_USER) != 0) {
5492 5727 return (0);
5493 5728 }
5494 5729
5495 5730 if (!flent->fe_desc_logged) {
5496 5731 /*
5497 5732 * We don't return error because we want to continue the
5498 5733 * walk in case this is the last walk which means we
5499 5734 * need to reset fe_desc_logged in all the flows.
5500 5735 */
5501 5736 if ((ninfo = mac_write_flow_desc(flent, mcip)) == NULL)
5502 5737 return (0);
5503 5738 list_insert_tail(lstate->mi_list, ninfo);
5504 5739 flent->fe_desc_logged = B_TRUE;
5505 5740 }
5506 5741
5507 5742 /*
5508 5743 * Regardless of the error, we want to proceed in case we have to
5509 5744 * reset fe_desc_logged.
5510 5745 */
5511 5746 ninfo = mac_write_flow_stats(flent);
5512 5747 if (ninfo == NULL)
5513 5748 return (-1);
5514 5749
5515 5750 list_insert_tail(lstate->mi_list, ninfo);
5516 5751
5517 5752 if (mcip != NULL && !(mcip->mci_state_flags & MCIS_DESC_LOGGED))
5518 5753 flent->fe_desc_logged = B_FALSE;
5519 5754
5520 5755 return (0);
5521 5756 }
5522 5757
5523 5758 /*
5524 5759 * Log the description for each mac client of this mac_impl_t, if it
5525 5760 * hasn't already been done. Additionally, log statistics for the link as
5526 5761 * well. Walk the flow table and log information for each flow as well.
5527 5762	 * If it is the last walk (mi_last), then we turn off MCIS_DESC_LOGGED (and
5528 5763 * also fe_desc_logged, if flow logging is on) since we want to log the
5529 5764 * description if and when logging is restarted.
5530 5765 *
5531 5766 * Return 0 upon success or -1 upon failure
5532 5767 */
5533 5768 static int
5534 5769 i_mac_impl_log(mac_impl_t *mip, i_mac_log_state_t *lstate)
5535 5770 {
5536 5771 mac_client_impl_t *mcip;
5537 5772 netinfo_t *ninfo;
5538 5773
5539 5774 i_mac_perim_enter(mip);
5540 5775 /*
5541 5776 * Only walk the client list for NIC and etherstub
5542 5777 */
5543 5778 if ((mip->mi_state_flags & MIS_DISABLED) ||
5544 5779 ((mip->mi_state_flags & MIS_IS_VNIC) &&
5545 5780 (mac_get_lower_mac_handle((mac_handle_t)mip) != NULL))) {
5546 5781 i_mac_perim_exit(mip);
5547 5782 return (0);
5548 5783 }
5549 5784
5550 5785 for (mcip = mip->mi_clients_list; mcip != NULL;
5551 5786 mcip = mcip->mci_client_next) {
5552 5787 if (!MCIP_DATAPATH_SETUP(mcip))
5553 5788 continue;
5554 5789 if (lstate->mi_lenable) {
5555 5790 if (!(mcip->mci_state_flags & MCIS_DESC_LOGGED)) {
5556 5791 ninfo = mac_write_link_desc(mcip);
5557 5792 if (ninfo == NULL) {
5558 5793 /*
5559 5794 * We can't terminate it if this is the last
5560 5795 * walk, else there might be some links with
5561 5796				 * MCIS_DESC_LOGGED set, which means
5562 5797 * their description won't be logged the next
5563 5798 * time logging is started (similarly for the
5564 5799 * flows within such links). We can continue
5565 5800 * without walking the flow table (i.e. to
5566 5801 * set fe_desc_logged to false) because we
5567 5802 * won't have written any flow stuff for this
5568 5803 * link as we haven't logged the link itself.
5569 5804 */
5570 5805 i_mac_perim_exit(mip);
5571 5806 if (lstate->mi_last)
5572 5807 return (0);
5573 5808 else
5574 5809 return (-1);
5575 5810 }
5576 5811 mcip->mci_state_flags |= MCIS_DESC_LOGGED;
5577 5812 list_insert_tail(lstate->mi_list, ninfo);
5578 5813 }
5579 5814 }
5580 5815
5581 5816 ninfo = mac_write_link_stats(mcip);
5582 5817 if (ninfo == NULL && !lstate->mi_last) {
5583 5818 i_mac_perim_exit(mip);
5584 5819 return (-1);
5585 5820 }
5586 5821 list_insert_tail(lstate->mi_list, ninfo);
5587 5822
5588 5823 if (lstate->mi_last)
5589 5824 mcip->mci_state_flags &= ~MCIS_DESC_LOGGED;
5590 5825
5591 5826 if (lstate->mi_fenable) {
5592 5827 if (mcip->mci_subflow_tab != NULL) {
5593 5828 (void) mac_flow_walk_nolock(
5594 5829 mcip->mci_subflow_tab, mac_log_flowinfo,
5595 5830 lstate);
5596 5831 }
5597 5832 }
5598 5833 }
5599 5834 i_mac_perim_exit(mip);
5600 5835 return (0);
5601 5836 }
5602 5837
5603 5838 /*
5604 5839 * modhash walker function to add a mac_impl_t to a list
5605 5840 */
5606 5841 /*ARGSUSED*/
5607 5842 static uint_t
5608 5843 i_mac_impl_list_walker(mod_hash_key_t key, mod_hash_val_t *val, void *arg)
5609 5844 {
5610 5845 list_t *list = (list_t *)arg;
5611 5846 mac_impl_t *mip = (mac_impl_t *)val;
5612 5847
5613 5848 if ((mip->mi_state_flags & MIS_DISABLED) == 0) {
5614 5849 list_insert_tail(list, mip);
5615 5850 mip->mi_ref++;
5616 5851 }
5617 5852
5618 5853 return (MH_WALK_CONTINUE);
5619 5854 }
5620 5855
5621 5856 void
5622 5857 i_mac_log_info(list_t *net_log_list, i_mac_log_state_t *lstate)
5623 5858 {
5624 5859 list_t mac_impl_list;
5625 5860 mac_impl_t *mip;
5626 5861 netinfo_t *ninfo;
5627 5862
5628 5863 /* Create list of mac_impls */
5629 5864 ASSERT(RW_LOCK_HELD(&i_mac_impl_lock));
5630 5865 list_create(&mac_impl_list, sizeof (mac_impl_t), offsetof(mac_impl_t,
5631 5866 mi_node));
5632 5867 mod_hash_walk(i_mac_impl_hash, i_mac_impl_list_walker, &mac_impl_list);
5633 5868 rw_exit(&i_mac_impl_lock);
5634 5869
5635 5870 /* Create log entries for each mac_impl */
5636 5871 for (mip = list_head(&mac_impl_list); mip != NULL;
5637 5872 mip = list_next(&mac_impl_list, mip)) {
5638 5873 if (i_mac_impl_log(mip, lstate) != 0)
5639 5874 continue;
5640 5875 }
5641 5876
5642 5877 /* Remove elements and destroy list of mac_impls */
5643 5878 rw_enter(&i_mac_impl_lock, RW_WRITER);
5644 5879 while ((mip = list_remove_tail(&mac_impl_list)) != NULL) {
5645 5880 mip->mi_ref--;
5646 5881 }
5647 5882 rw_exit(&i_mac_impl_lock);
5648 5883 list_destroy(&mac_impl_list);
5649 5884
5650 5885 /*
5651 5886 * Write log entries to files outside of locks, free associated
5652 5887 * structures, and remove entries from the list.
5653 5888 */
5654 5889 while ((ninfo = list_head(net_log_list)) != NULL) {
5655 5890 (void) exacct_commit_netinfo(ninfo->ni_record, ninfo->ni_type);
5656 5891 list_remove(net_log_list, ninfo);
5657 5892 kmem_free(ninfo->ni_record, ninfo->ni_size);
5658 5893 kmem_free(ninfo, sizeof (*ninfo));
5659 5894 }
5660 5895 list_destroy(net_log_list);
5661 5896 }
5662 5897
5663 5898 /*
5664 5899 * The timer thread that runs every mac_logging_interval seconds and logs
5665 5900 * link and/or flow information.
5666 5901 */
5667 5902 /* ARGSUSED */
5668 5903 void
5669 5904 mac_log_linkinfo(void *arg)
5670 5905 {
5671 5906 i_mac_log_state_t lstate;
5672 5907 list_t net_log_list;
5673 5908
5674 5909 list_create(&net_log_list, sizeof (netinfo_t),
5675 5910 offsetof(netinfo_t, ni_link));
5676 5911
5677 5912 rw_enter(&i_mac_impl_lock, RW_READER);
5678 5913 if (!mac_flow_log_enable && !mac_link_log_enable) {
5679 5914 rw_exit(&i_mac_impl_lock);
5680 5915 return;
5681 5916 }
5682 5917 lstate.mi_fenable = mac_flow_log_enable;
5683 5918 lstate.mi_lenable = mac_link_log_enable;
5684 5919 lstate.mi_last = B_FALSE;
5685 5920 lstate.mi_list = &net_log_list;
5686 5921
5687 5922 /* Write log entries for each mac_impl in the list */
5688 5923 i_mac_log_info(&net_log_list, &lstate);
5689 5924
5690 5925 if (mac_flow_log_enable || mac_link_log_enable) {
5691 5926 mac_logging_timer = timeout(mac_log_linkinfo, NULL,
5692 5927 SEC_TO_TICK(mac_logging_interval));
5693 5928 }
5694 5929 }
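mac_log_linkinfo() re-arms itself with timeout(9F) as long as either logging type is enabled; mac_stop_logusage() below clears the enables and untimeout()s the pending callout. A minimal kernel-side sketch of this self-rearming pattern (placeholder identifiers, same DDI calls, locking elided):

#include <sys/types.h>
#include <sys/time.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

static timeout_id_t my_timer;
static boolean_t my_log_enable;

static void
my_log_cycle(void *arg)
{
	if (!my_log_enable)
		return;			/* stopped; do not re-arm */

	/* ... emit one round of log records ... */

	my_timer = timeout(my_log_cycle, NULL, SEC_TO_TICK(20));
}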
5695 5930
5696 5931 typedef struct i_mac_fastpath_state_s {
5697 5932 boolean_t mf_disable;
5698 5933 int mf_err;
5699 5934 } i_mac_fastpath_state_t;
5700 5935
5701 5936 /* modhash walker function to enable or disable fastpath */
5702 5937 /*ARGSUSED*/
5703 5938 static uint_t
5704 5939 i_mac_fastpath_walker(mod_hash_key_t key, mod_hash_val_t *val,
5705 5940 void *arg)
5706 5941 {
5707 5942 i_mac_fastpath_state_t *state = arg;
5708 5943 mac_handle_t mh = (mac_handle_t)val;
5709 5944
5710 5945 if (state->mf_disable)
5711 5946 state->mf_err = mac_fastpath_disable(mh);
5712 5947 else
5713 5948 mac_fastpath_enable(mh);
5714 5949
5715 5950 return (state->mf_err == 0 ? MH_WALK_CONTINUE : MH_WALK_TERMINATE);
5716 5951 }
5717 5952
5718 5953 /*
5719 5954 * Start the logging timer.
5720 5955 */
5721 5956 int
5722 5957 mac_start_logusage(mac_logtype_t type, uint_t interval)
5723 5958 {
5724 5959 i_mac_fastpath_state_t dstate = {B_TRUE, 0};
5725 5960 i_mac_fastpath_state_t estate = {B_FALSE, 0};
5726 5961 int err;
5727 5962
5728 5963 rw_enter(&i_mac_impl_lock, RW_WRITER);
5729 5964 switch (type) {
5730 5965 case MAC_LOGTYPE_FLOW:
5731 5966 if (mac_flow_log_enable) {
5732 5967 rw_exit(&i_mac_impl_lock);
5733 5968 return (0);
5734 5969 }
5735 5970 /* FALLTHRU */
5736 5971 case MAC_LOGTYPE_LINK:
5737 5972 if (mac_link_log_enable) {
5738 5973 rw_exit(&i_mac_impl_lock);
5739 5974 return (0);
5740 5975 }
5741 5976 break;
5742 5977 default:
5743 5978 ASSERT(0);
5744 5979 }
5745 5980
5746 5981 /* Disable fastpath */
5747 5982 mod_hash_walk(i_mac_impl_hash, i_mac_fastpath_walker, &dstate);
5748 5983 if ((err = dstate.mf_err) != 0) {
5749 5984 /* Reenable fastpath */
5750 5985 mod_hash_walk(i_mac_impl_hash, i_mac_fastpath_walker, &estate);
5751 5986 rw_exit(&i_mac_impl_lock);
5752 5987 return (err);
5753 5988 }
5754 5989
5755 5990 switch (type) {
5756 5991 case MAC_LOGTYPE_FLOW:
5757 5992 mac_flow_log_enable = B_TRUE;
5758 5993 /* FALLTHRU */
5759 5994 case MAC_LOGTYPE_LINK:
5760 5995 mac_link_log_enable = B_TRUE;
5761 5996 break;
5762 5997 }
5763 5998
5764 5999 mac_logging_interval = interval;
5765 6000 rw_exit(&i_mac_impl_lock);
5766 6001 mac_log_linkinfo(NULL);
5767 6002 return (0);
5768 6003 }
5769 6004
5770 6005 /*
5771 6006 * Stop the logging timer if both link and flow logging are turned off.
5772 6007 */
5773 6008 void
5774 6009 mac_stop_logusage(mac_logtype_t type)
5775 6010 {
5776 6011 i_mac_log_state_t lstate;
5777 6012 i_mac_fastpath_state_t estate = {B_FALSE, 0};
5778 6013 list_t net_log_list;
5779 6014
5780 6015 list_create(&net_log_list, sizeof (netinfo_t),
5781 6016 offsetof(netinfo_t, ni_link));
5782 6017
5783 6018 rw_enter(&i_mac_impl_lock, RW_WRITER);
5784 6019
5785 6020 lstate.mi_fenable = mac_flow_log_enable;
5786 6021 lstate.mi_lenable = mac_link_log_enable;
5787 6022 lstate.mi_list = &net_log_list;
5788 6023
5789 6024 /* Last walk */
5790 6025 lstate.mi_last = B_TRUE;
5791 6026
5792 6027 switch (type) {
5793 6028 case MAC_LOGTYPE_FLOW:
5794 6029 if (lstate.mi_fenable) {
5795 6030 ASSERT(mac_link_log_enable);
5796 6031 mac_flow_log_enable = B_FALSE;
5797 6032 mac_link_log_enable = B_FALSE;
5798 6033 break;
5799 6034 }
5800 6035 /* FALLTHRU */
5801 6036 case MAC_LOGTYPE_LINK:
5802 6037 if (!lstate.mi_lenable || mac_flow_log_enable) {
5803 6038 rw_exit(&i_mac_impl_lock);
5804 6039 return;
5805 6040 }
5806 6041 mac_link_log_enable = B_FALSE;
5807 6042 break;
5808 6043 default:
5809 6044 ASSERT(0);
5810 6045 }
5811 6046
5812 6047 /* Reenable fastpath */
5813 6048 mod_hash_walk(i_mac_impl_hash, i_mac_fastpath_walker, &estate);
5814 6049
5815 6050 (void) untimeout(mac_logging_timer);
5816 - mac_logging_timer = 0;
6051 + mac_logging_timer = NULL;
5817 6052
5818 6053 /* Write log entries for each mac_impl in the list */
5819 6054 i_mac_log_info(&net_log_list, &lstate);
5820 6055 }
5821 6056
5822 6057 /*
5823 6058 * Walk the rx and tx SRS/SRs for a flow and update the priority value.
5824 6059 */
5825 6060 void
5826 6061 mac_flow_update_priority(mac_client_impl_t *mcip, flow_entry_t *flent)
5827 6062 {
5828 6063 pri_t pri;
5829 6064 int count;
5830 6065 mac_soft_ring_set_t *mac_srs;
5831 6066
5832 6067 if (flent->fe_rx_srs_cnt <= 0)
5833 6068 return;
5834 6069
5835 6070 if (((mac_soft_ring_set_t *)flent->fe_rx_srs[0])->srs_type ==
5836 6071 SRST_FLOW) {
5837 6072 pri = FLOW_PRIORITY(mcip->mci_min_pri,
5838 6073 mcip->mci_max_pri,
5839 6074 flent->fe_resource_props.mrp_priority);
5840 6075 } else {
5841 6076 pri = mcip->mci_max_pri;
5842 6077 }
5843 6078
5844 6079 for (count = 0; count < flent->fe_rx_srs_cnt; count++) {
5845 6080 mac_srs = flent->fe_rx_srs[count];
5846 6081 mac_update_srs_priority(mac_srs, pri);
5847 6082 }
5848 6083 /*
5849 6084 * If we have a Tx SRS, we need to modify all the threads associated
5850 6085 * with it.
5851 6086 */
5852 6087 if (flent->fe_tx_srs != NULL)
5853 6088 mac_update_srs_priority(flent->fe_tx_srs, pri);
5854 6089 }
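In other words: a flow SRS derives its thread priority from the flow's configured priority bounded by the client's [mci_min_pri, mci_max_pri] range (via FLOW_PRIORITY), while a client SRS simply runs at the client's maximum. A hedged illustration; the clamp below is an assumption about FLOW_PRIORITY's behavior, not its actual definition:

#include <sys/types.h>

static pri_t
pick_srs_pri(boolean_t is_flow_srs, pri_t min, pri_t max, pri_t cfg)
{
	if (!is_flow_srs)
		return (max);		/* client SRS: client max */
	if (cfg < min)			/* assumed clamp, see above */
		return (min);
	return (cfg > max ? max : cfg);
}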
5855 6090
5856 6091 /*
5857 6092 * RX and TX rings are reserved according to different semantics depending
5858 6093 * on the requests from the MAC clients and type of rings:
5859 6094 *
5860 6095 * On the Tx side, by default we reserve individual rings, independently from
5861 6096 * the groups.
5862 6097 *
5863 6098 * On the Rx side, the reservation is at the granularity of the group
5864 6099 * of rings, and used for v12n level 1 only. It has a special case for the
5865 6100 * primary client.
5866 6101 *
5867 6102 * If a share is allocated to a MAC client, we allocate a TX group and an
5868 6103 * RX group to the client, and assign TX rings and RX rings to these
5869 6104 * groups according to information gathered from the driver through
5870 6105 * the share capability.
5871 6106 *
5872 6107	 * The foreseeable evolution of Rx rings will handle v12n level 2 and higher
5873 6108 * to allocate individual rings out of a group and program the hw classifier
5874 6109 * based on IP address or higher level criteria.
5875 6110 */
5876 6111
5877 6112 /*
5878 6113 * mac_reserve_tx_ring()
5879 6114	 * Reserve an unused ring by marking it with the MR_INUSE state.
5880 6115 * As reserved, the ring is ready to function.
5881 6116 *
5882 6117 * Notes for Hybrid I/O:
5883 6118 *
5884 6119 * If a specific ring is needed, it is specified through the desired_ring
5885 6120 * argument. Otherwise that argument is set to NULL.
5886 6121	 * If the desired ring was previously allocated to another client, this
5887 6122 * function swaps it with a new ring from the group of unassigned rings.
5888 6123 */
5889 6124 mac_ring_t *
5890 6125 mac_reserve_tx_ring(mac_impl_t *mip, mac_ring_t *desired_ring)
5891 6126 {
5892 6127 mac_group_t *group;
5893 6128 mac_grp_client_t *mgcp;
5894 6129 mac_client_impl_t *mcip;
5895 6130 mac_soft_ring_set_t *srs;
5896 6131
5897 6132 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
5898 6133
5899 6134 /*
5900 6135 * Find an available ring and start it before changing its status.
5901 6136 * The unassigned rings are at the end of the mi_tx_groups
5902 6137 * array.
5903 6138 */
5904 6139 group = MAC_DEFAULT_TX_GROUP(mip);
5905 6140
5906 6141 /* Can't take the default ring out of the default group */
5907 6142 ASSERT(desired_ring != (mac_ring_t *)mip->mi_default_tx_ring);
5908 6143
5909 6144 if (desired_ring->mr_state == MR_FREE) {
5910 6145 ASSERT(MAC_GROUP_NO_CLIENT(group));
5911 6146 if (mac_start_ring(desired_ring) != 0)
5912 6147 return (NULL);
5913 6148 return (desired_ring);
5914 6149 }
5915 6150 /*
5916 6151	 * There are clients using this ring, so move them away
5917 6152	 * from it.
5918 6153 */
5919 6154 for (mgcp = group->mrg_clients; mgcp != NULL; mgcp = mgcp->mgc_next) {
5920 6155 mcip = mgcp->mgc_client;
5921 6156 mac_tx_client_quiesce((mac_client_handle_t)mcip);
5922 6157 srs = MCIP_TX_SRS(mcip);
5923 6158 ASSERT(mac_tx_srs_ring_present(srs, desired_ring));
5924 6159 mac_tx_invoke_callbacks(mcip,
5925 6160 (mac_tx_cookie_t)mac_tx_srs_get_soft_ring(srs,
5926 6161 desired_ring));
5927 6162 mac_tx_srs_del_ring(srs, desired_ring);
5928 6163 mac_tx_client_restart((mac_client_handle_t)mcip);
5929 6164 }
5930 6165 return (desired_ring);
5931 6166 }
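Every path above that touches a client's Tx SRS brackets the change with mac_tx_client_quiesce()/mac_tx_client_restart(); the same bracket appears again in i_mac_group_rem_ring() and mac_reclaim_ring_from_grp(). Reduced to its shape (hypothetical wrapper, real quiesce/restart entry points):

static void
with_tx_quiesced(mac_client_handle_t mch, void (*modify)(void *),
    void *arg)
{
	mac_tx_client_quiesce(mch);	/* stop and drain transmits */
	modify(arg);			/* e.g. mac_tx_srs_del_ring() */
	mac_tx_client_restart(mch);	/* resume transmits */
}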
5932 6167
5933 6168 /*
5934 - * For a reserved group with multiple clients, return the primary client.
6169 + * For a non-default group with multiple clients, return the primary client.
5935 6170 */
5936 6171 static mac_client_impl_t *
5937 6172 mac_get_grp_primary(mac_group_t *grp)
5938 6173 {
5939 6174 mac_grp_client_t *mgcp = grp->mrg_clients;
5940 6175 mac_client_impl_t *mcip;
5941 6176
5942 6177 while (mgcp != NULL) {
5943 6178 mcip = mgcp->mgc_client;
5944 6179 if (mcip->mci_flent->fe_type & FLOW_PRIMARY_MAC)
5945 6180 return (mcip);
5946 6181 mgcp = mgcp->mgc_next;
5947 6182 }
5948 6183 return (NULL);
5949 6184 }
5950 6185
5951 6186 /*
5952 6187 * Hybrid I/O specifies the ring that should be given to a share.
5953 6188 * If the ring is already used by clients, then we need to release
5954 6189 * the ring back to the default group so that we can give it to
5955 6190 * the share. This means the clients using this ring now get a
5956 6191 * replacement ring. If there aren't any replacement rings, this
5957 6192 * function returns a failure.
5958 6193 */
5959 6194 static int
5960 6195 mac_reclaim_ring_from_grp(mac_impl_t *mip, mac_ring_type_t ring_type,
5961 6196 mac_ring_t *ring, mac_ring_t **rings, int nrings)
5962 6197 {
5963 6198 mac_group_t *group = (mac_group_t *)ring->mr_gh;
5964 6199 mac_resource_props_t *mrp;
5965 6200 mac_client_impl_t *mcip;
5966 6201 mac_group_t *defgrp;
5967 6202 mac_ring_t *tring;
5968 6203 mac_group_t *tgrp;
5969 6204 int i;
5970 6205 int j;
5971 6206
5972 6207 mcip = MAC_GROUP_ONLY_CLIENT(group);
5973 6208 if (mcip == NULL)
5974 6209 mcip = mac_get_grp_primary(group);
5975 6210 ASSERT(mcip != NULL);
5976 6211 ASSERT(mcip->mci_share == 0);
5977 6212
5978 6213 mrp = MCIP_RESOURCE_PROPS(mcip);
5979 6214 if (ring_type == MAC_RING_TYPE_RX) {
5980 6215 defgrp = mip->mi_rx_donor_grp;
5981 6216 if ((mrp->mrp_mask & MRP_RX_RINGS) == 0) {
5982 6217 /* Need to put this mac client in the default group */
5983 6218 if (mac_rx_switch_group(mcip, group, defgrp) != 0)
5984 6219 return (ENOSPC);
5985 6220 } else {
5986 6221 /*
5987 6222 * Switch this ring with some other ring from
5988 6223 * the default group.
5989 6224 */
5990 6225 for (tring = defgrp->mrg_rings; tring != NULL;
5991 6226 tring = tring->mr_next) {
5992 6227 if (tring->mr_index == 0)
5993 6228 continue;
5994 6229 for (j = 0; j < nrings; j++) {
5995 6230 if (rings[j] == tring)
5996 6231 break;
5997 6232 }
5998 6233 if (j >= nrings)
5999 6234 break;
6000 6235 }
6001 6236 if (tring == NULL)
6002 6237 return (ENOSPC);
6003 6238 if (mac_group_mov_ring(mip, group, tring) != 0)
6004 6239 return (ENOSPC);
6005 6240 if (mac_group_mov_ring(mip, defgrp, ring) != 0) {
6006 6241 (void) mac_group_mov_ring(mip, defgrp, tring);
6007 6242 return (ENOSPC);
6008 6243 }
6009 6244 }
6010 6245 ASSERT(ring->mr_gh == (mac_group_handle_t)defgrp);
6011 6246 return (0);
6012 6247 }
6013 6248
6014 6249 defgrp = MAC_DEFAULT_TX_GROUP(mip);
6015 6250 if (ring == (mac_ring_t *)mip->mi_default_tx_ring) {
6016 6251 /*
6017 6252 * See if we can get a spare ring to replace the default
6018 6253 * ring.
6019 6254 */
6020 6255 if (defgrp->mrg_cur_count == 1) {
6021 6256 /*
6022 6257 * Need to get a ring from another client, see if
6023 6258 * there are any clients that can be moved to
6024 6259 * the default group, thereby freeing some rings.
6025 6260 */
6026 6261 for (i = 0; i < mip->mi_tx_group_count; i++) {
6027 6262 tgrp = &mip->mi_tx_groups[i];
6028 6263 if (tgrp->mrg_state ==
6029 6264 MAC_GROUP_STATE_REGISTERED) {
6030 6265 continue;
6031 6266 }
6032 6267 mcip = MAC_GROUP_ONLY_CLIENT(tgrp);
6033 6268 if (mcip == NULL)
6034 6269 mcip = mac_get_grp_primary(tgrp);
6035 6270 ASSERT(mcip != NULL);
6036 6271 mrp = MCIP_RESOURCE_PROPS(mcip);
6037 6272 if ((mrp->mrp_mask & MRP_TX_RINGS) == 0) {
6038 6273 ASSERT(tgrp->mrg_cur_count == 1);
6039 6274 /*
6040 6275 * If this ring is part of the
6041 6276 * rings asked by the share we cannot
6042 6277 * use it as the default ring.
6043 6278 */
6044 6279 for (j = 0; j < nrings; j++) {
6045 6280 if (rings[j] == tgrp->mrg_rings)
6046 6281 break;
6047 6282 }
6048 6283 if (j < nrings)
6049 6284 continue;
6050 6285 mac_tx_client_quiesce(
6051 6286 (mac_client_handle_t)mcip);
6052 6287 mac_tx_switch_group(mcip, tgrp,
6053 6288 defgrp);
6054 6289 mac_tx_client_restart(
6055 6290 (mac_client_handle_t)mcip);
6056 6291 break;
6057 6292 }
6058 6293 }
6059 6294 /*
6060 6295 * All the rings are reserved, can't give up the
6061 6296 * default ring.
6062 6297 */
6063 6298 if (defgrp->mrg_cur_count <= 1)
6064 6299 return (ENOSPC);
6065 6300 }
6066 6301 /*
6067 6302 * Swap the default ring with another.
6068 6303 */
6069 6304 for (tring = defgrp->mrg_rings; tring != NULL;
6070 6305 tring = tring->mr_next) {
6071 6306 /*
6072 6307 * If this ring is part of the rings asked by the
6073 6308 * share we cannot use it as the default ring.
6074 6309 */
6075 6310 for (j = 0; j < nrings; j++) {
6076 6311 if (rings[j] == tring)
6077 6312 break;
6078 6313 }
6079 6314 if (j >= nrings)
6080 6315 break;
6081 6316 }
6082 6317 ASSERT(tring != NULL);
6083 6318 mip->mi_default_tx_ring = (mac_ring_handle_t)tring;
6084 6319 return (0);
6085 6320 }
6086 6321 /*
6087 6322 * The Tx ring is with a group reserved by a MAC client. See if
6088 6323 * we can swap it.
6089 6324 */
6090 6325 ASSERT(group->mrg_state == MAC_GROUP_STATE_RESERVED);
6091 6326 mcip = MAC_GROUP_ONLY_CLIENT(group);
6092 6327 if (mcip == NULL)
6093 6328 mcip = mac_get_grp_primary(group);
6094 6329 ASSERT(mcip != NULL);
6095 6330 mrp = MCIP_RESOURCE_PROPS(mcip);
6096 6331 mac_tx_client_quiesce((mac_client_handle_t)mcip);
6097 6332 if ((mrp->mrp_mask & MRP_TX_RINGS) == 0) {
6098 6333 ASSERT(group->mrg_cur_count == 1);
6099 6334 /* Put this mac client in the default group */
6100 6335 mac_tx_switch_group(mcip, group, defgrp);
6101 6336 } else {
6102 6337 /*
6103 6338 * Switch this ring with some other ring from
6104 6339 * the default group.
6105 6340 */
6106 6341 for (tring = defgrp->mrg_rings; tring != NULL;
6107 6342 tring = tring->mr_next) {
6108 6343 if (tring == (mac_ring_t *)mip->mi_default_tx_ring)
6109 6344 continue;
6110 6345 /*
6111 6346 * If this ring is part of the rings asked by the
6112 6347 * share we cannot use it for swapping.
6113 6348 */
6114 6349 for (j = 0; j < nrings; j++) {
6115 6350 if (rings[j] == tring)
6116 6351 break;
6117 6352 }
6118 6353 if (j >= nrings)
6119 6354 break;
6120 6355 }
6121 6356 if (tring == NULL) {
6122 6357 mac_tx_client_restart((mac_client_handle_t)mcip);
6123 6358 return (ENOSPC);
6124 6359 }
6125 6360 if (mac_group_mov_ring(mip, group, tring) != 0) {
6126 6361 mac_tx_client_restart((mac_client_handle_t)mcip);
6127 6362 return (ENOSPC);
6128 6363 }
6129 6364 if (mac_group_mov_ring(mip, defgrp, ring) != 0) {
6130 6365 (void) mac_group_mov_ring(mip, defgrp, tring);
6131 6366 mac_tx_client_restart((mac_client_handle_t)mcip);
6132 6367 return (ENOSPC);
6133 6368 }
6134 6369 }
6135 6370 mac_tx_client_restart((mac_client_handle_t)mcip);
6136 6371 ASSERT(ring->mr_gh == (mac_group_handle_t)defgrp);
6137 6372 return (0);
6138 6373 }
6139 6374
6140 6375 /*
6141 6376 * Populate a zero-ring group with rings. If the share is non-NULL,
6142 6377 * the rings are chosen according to that share.
6143 6378 * Invoked after allocating a new RX or TX group through
6144 6379 * mac_reserve_rx_group() or mac_reserve_tx_group(), respectively.
6145 6380 * Returns zero on success, an errno otherwise.
6146 6381 */
6147 6382 int
6148 6383 i_mac_group_allocate_rings(mac_impl_t *mip, mac_ring_type_t ring_type,
6149 6384 mac_group_t *src_group, mac_group_t *new_group, mac_share_handle_t share,
6150 6385 uint32_t ringcnt)
6151 6386 {
6152 6387 mac_ring_t **rings, *ring;
6153 6388 uint_t nrings;
6154 6389 int rv = 0, i = 0, j;
6155 6390
6156 6391 ASSERT((ring_type == MAC_RING_TYPE_RX &&
6157 6392 mip->mi_rx_group_type == MAC_GROUP_TYPE_DYNAMIC) ||
6158 6393 (ring_type == MAC_RING_TYPE_TX &&
6159 6394 mip->mi_tx_group_type == MAC_GROUP_TYPE_DYNAMIC));
6160 6395
6161 6396 /*
6162 6397 * First find the rings to allocate to the group.
6163 6398 */
6164 6399 if (share != 0) {
6165 6400 /* get rings through ms_squery() */
6166 6401 mip->mi_share_capab.ms_squery(share, ring_type, NULL, &nrings);
6167 6402 ASSERT(nrings != 0);
6168 6403 rings = kmem_alloc(nrings * sizeof (mac_ring_handle_t),
6169 6404 KM_SLEEP);
6170 6405 mip->mi_share_capab.ms_squery(share, ring_type,
6171 6406 (mac_ring_handle_t *)rings, &nrings);
6172 6407 for (i = 0; i < nrings; i++) {
6173 6408 /*
6174 6409 * If we have given this ring to a non-default
6175 6410 * group, we need to check if we can get this
6176 6411 * ring.
6177 6412 */
6178 6413 ring = rings[i];
6179 6414 if (ring->mr_gh != (mac_group_handle_t)src_group ||
6180 6415 ring == (mac_ring_t *)mip->mi_default_tx_ring) {
6181 6416 if (mac_reclaim_ring_from_grp(mip, ring_type,
6182 6417 ring, rings, nrings) != 0) {
6183 6418 rv = ENOSPC;
6184 6419 goto bail;
6185 6420 }
6186 6421 }
6187 6422 }
6188 6423 } else {
6189 6424 /*
6190 6425 * Pick one ring from default group.
6191 6426 *
6192 6427 * For now pick the second ring, which requires the first ring
6193 6428 * at index 0 to stay in the default group, since it is the
6194 6429 * ring that carries the multicast traffic.
6195 6430 * We need a better way for a driver to indicate this,
6196 6431 * for example a per-ring flag.
6197 6432 */
6198 6433 rings = kmem_alloc(ringcnt * sizeof (mac_ring_handle_t),
6199 6434 KM_SLEEP);
6200 6435 for (ring = src_group->mrg_rings; ring != NULL;
6201 6436 ring = ring->mr_next) {
6202 6437 if (ring_type == MAC_RING_TYPE_RX &&
6203 6438 ring->mr_index == 0) {
6204 6439 continue;
6205 6440 }
6206 6441 if (ring_type == MAC_RING_TYPE_TX &&
6207 6442 ring == (mac_ring_t *)mip->mi_default_tx_ring) {
6208 6443 continue;
6209 6444 }
6210 6445 rings[i++] = ring;
6211 6446 if (i == ringcnt)
6212 6447 break;
6213 6448 }
6214 6449 ASSERT(ring != NULL);
6215 6450 nrings = i;
6216 6451 /* Not as many rings as required */
6217 6452 if (nrings != ringcnt) {
6218 6453 rv = ENOSPC;
6219 6454 goto bail;
6220 6455 }
6221 6456 }
6222 6457
6223 6458 switch (ring_type) {
6224 6459 case MAC_RING_TYPE_RX:
6225 6460 if (src_group->mrg_cur_count - nrings < 1) {
6226 6461 /* we ran out of rings */
6227 6462 rv = ENOSPC;
6228 6463 goto bail;
6229 6464 }
6230 6465
6231 6466 /* move receive rings to new group */
6232 6467 for (i = 0; i < nrings; i++) {
6233 6468 rv = mac_group_mov_ring(mip, new_group, rings[i]);
6234 6469 if (rv != 0) {
6235 6470 /* move rings back on failure */
6236 6471 for (j = 0; j < i; j++) {
6237 6472 (void) mac_group_mov_ring(mip,
6238 6473 src_group, rings[j]);
6239 6474 }
6240 6475 goto bail;
6241 6476 }
6242 6477 }
6243 6478 break;
6244 6479
6245 6480 case MAC_RING_TYPE_TX: {
6246 6481 mac_ring_t *tmp_ring;
6247 6482
6248 6483 /* move the TX rings to the new group */
6249 6484 for (i = 0; i < nrings; i++) {
6250 6485 /* get the desired ring */
6251 6486 tmp_ring = mac_reserve_tx_ring(mip, rings[i]);
6252 6487 if (tmp_ring == NULL) {
6253 6488 rv = ENOSPC;
6254 6489 goto bail;
6255 6490 }
6256 6491 ASSERT(tmp_ring == rings[i]);
6257 6492 rv = mac_group_mov_ring(mip, new_group, rings[i]);
6258 6493 if (rv != 0) {
6259 6494 /* cleanup on failure */
6260 6495 for (j = 0; j < i; j++) {
6261 6496 (void) mac_group_mov_ring(mip,
6262 6497 MAC_DEFAULT_TX_GROUP(mip),
6263 6498 rings[j]);
6264 6499 }
6265 6500 goto bail;
6266 6501 }
6267 6502 }
6268 6503 break;
6269 6504 }
6270 6505 }
6271 6506
6272 6507 /* add group to share */
6273 6508 if (share != 0)
6274 6509 mip->mi_share_capab.ms_sadd(share, new_group->mrg_driver);
6275 6510
6276 6511 bail:
6277 6512 /* free temporary array of rings */
6278 6513 kmem_free(rings, nrings * sizeof (mac_ring_handle_t));
6279 6514
6280 6515 return (rv);
6281 6516 }
6282 6517
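i_mac_group_allocate_rings() is always paired with mac_start_group()/mac_stop_group() by its callers. A condensed sketch of the caller pattern, abridged from mac_reserve_rx_group() later in this review (DTrace probes omitted):

/* Populate the freshly started group from the donor group. */
if (nrings <= donor_grp_rcnt) {
	err = i_mac_group_allocate_rings(mip, MAC_RING_TYPE_RX,
	    donorgrp, grp, share, nrings);
	if (err == 0)
		break;			/* grp now owns the rings */
}
mac_stop_group(grp);			/* couldn't populate; keep looking */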
6283 6518 void
6284 6519 mac_group_add_client(mac_group_t *grp, mac_client_impl_t *mcip)
6285 6520 {
6286 6521 mac_grp_client_t *mgcp;
6287 6522
6288 6523 for (mgcp = grp->mrg_clients; mgcp != NULL; mgcp = mgcp->mgc_next) {
6289 6524 if (mgcp->mgc_client == mcip)
6290 6525 break;
6291 6526 }
6292 6527
6293 - VERIFY(mgcp == NULL);
6528 + ASSERT(mgcp == NULL);
6294 6529
6295 6530 mgcp = kmem_zalloc(sizeof (mac_grp_client_t), KM_SLEEP);
6296 6531 mgcp->mgc_client = mcip;
6297 6532 mgcp->mgc_next = grp->mrg_clients;
6298 6533 grp->mrg_clients = mgcp;
6299 -
6300 6534 }
6301 6535
6302 6536 void
6303 6537 mac_group_remove_client(mac_group_t *grp, mac_client_impl_t *mcip)
6304 6538 {
6305 6539 mac_grp_client_t *mgcp, **pprev;
6306 6540
6307 6541 for (pprev = &grp->mrg_clients, mgcp = *pprev; mgcp != NULL;
6308 6542 pprev = &mgcp->mgc_next, mgcp = *pprev) {
6309 6543 if (mgcp->mgc_client == mcip)
6310 6544 break;
6311 6545 }
6312 6546
6313 6547 ASSERT(mgcp != NULL);
6314 6548
6315 6549 *pprev = mgcp->mgc_next;
6316 6550 kmem_free(mgcp, sizeof (mac_grp_client_t));
6317 6551 }
6318 6552
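mac_group_add_client() pushes a new entry on the head of the group's singly-linked client list, and mac_group_remove_client() unlinks with a pointer-to-pointer walk, so the head and interior cases need no special-casing. A standalone model of the same idiom, with a hypothetical node type:

#include <stdio.h>
#include <stdlib.h>

typedef struct node {
	int		val;
	struct node	*next;
} node_t;

static void
push(node_t **headp, int val)
{
	node_t *n = malloc(sizeof (node_t));

	n->val = val;
	n->next = *headp;	/* insert at the head, like mgc_next */
	*headp = n;
}

static void
remove_val(node_t **headp, int val)
{
	node_t **pprev, *n;

	for (pprev = headp; (n = *pprev) != NULL; pprev = &n->next) {
		if (n->val == val)
			break;
	}
	if (n != NULL) {
		*pprev = n->next;	/* unlinks head or interior alike */
		free(n);
	}
}

int
main(void)
{
	node_t *head = NULL;

	push(&head, 1);
	push(&head, 2);
	remove_val(&head, 2);		/* removes the head node */
	printf("head is now %d\n", head->val);
	free(head);
	return (0);
}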
6319 6553 /*
6320 - * mac_reserve_rx_group()
6321 - *
6554 + * Return true if any client on this group explicitly asked for HW
6555 + * rings (of type mask) or has a bound share.
6556 + */
6557 +static boolean_t
6558 +i_mac_clients_hw(mac_group_t *grp, uint32_t mask)
6559 +{
6560 + mac_grp_client_t *mgcip;
6561 + mac_client_impl_t *mcip;
6562 + mac_resource_props_t *mrp;
6563 +
6564 + for (mgcip = grp->mrg_clients; mgcip != NULL; mgcip = mgcip->mgc_next) {
6565 + mcip = mgcip->mgc_client;
6566 + mrp = MCIP_RESOURCE_PROPS(mcip);
6567 + if (mcip->mci_share != 0 || (mrp->mrp_mask & mask) != 0)
6568 + return (B_TRUE);
6569 + }
6570 +
6571 + return (B_FALSE);
6572 +}
6573 +
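This helper replaces the earlier pattern of inspecting a single client (MAC_GROUP_ONLY_CLIENT() with a fallback to the group primary); it answers for every client on the group. A condensed sketch of how the candidate-group test in mac_reserve_rx_group() below consumes it:

/*
 * The group stays an eviction candidate only if no client on it
 * explicitly asked for HW rings or holds a share.
 */
if (!i_mac_clients_hw(grp, MRP_RX_RINGS) &&
    (unspec || grp->mrg_cur_count + donor_grp_rcnt >= need_rings))
	candidate_grp = grp;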
6574 +/*
6322 6575 * Finds an available group and exclusively reserves it for a client.
6323 6576 * The group is chosen to suit the flow's resource controls (bandwidth and
6324 6577 * fanout requirements) and the address type.
6325 6578 * If the requestor is the primary MAC then return the group with the
6326 6579 * largest number of rings, otherwise the default ring when available.
6327 6580 */
6328 6581 mac_group_t *
6329 6582 mac_reserve_rx_group(mac_client_impl_t *mcip, uint8_t *mac_addr, boolean_t move)
6330 6583 {
6331 6584 mac_share_handle_t share = mcip->mci_share;
6332 6585 mac_impl_t *mip = mcip->mci_mip;
6333 6586 mac_group_t *grp = NULL;
6334 6587 int i;
6335 6588 int err = 0;
6336 6589 mac_address_t *map;
6337 6590 mac_resource_props_t *mrp = MCIP_RESOURCE_PROPS(mcip);
6338 6591 int nrings;
6339 6592 int donor_grp_rcnt;
6340 6593 boolean_t need_exclgrp = B_FALSE;
6341 6594 int need_rings = 0;
6342 6595 mac_group_t *candidate_grp = NULL;
6343 6596 mac_client_impl_t *gclient;
6344 - mac_resource_props_t *gmrp;
6345 6597 mac_group_t *donorgrp = NULL;
6346 6598 boolean_t rxhw = mrp->mrp_mask & MRP_RX_RINGS;
6347 6599 boolean_t unspec = mrp->mrp_mask & MRP_RXRINGS_UNSPEC;
6348 6600 boolean_t isprimary;
6349 6601
6350 6602 ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
6351 6603
6352 6604 isprimary = mcip->mci_flent->fe_type & FLOW_PRIMARY_MAC;
6353 6605
6354 6606 /*
6355 - * Check if a group already has this mac address (case of VLANs)
6607 + * Check if a group already has this MAC address (case of VLANs)
6356 6608 * unless we are moving this MAC client from one group to another.
6357 6609 */
6358 6610 if (!move && (map = mac_find_macaddr(mip, mac_addr)) != NULL) {
6359 6611 if (map->ma_group != NULL)
6360 6612 return (map->ma_group);
6361 6613 }
6614 +
6362 6615 if (mip->mi_rx_groups == NULL || mip->mi_rx_group_count == 0)
6363 6616 return (NULL);
6617 +
6364 6618 /*
6365 - * If exclusive open, return NULL which will enable the
6366 - * caller to use the default group.
6619 + * If this client is requesting exclusive MAC access then
6620 + * return NULL to ensure the client uses the default group.
6367 6621 */
6368 6622 if (mcip->mci_state_flags & MCIS_EXCLUSIVE)
6369 6623 return (NULL);
6370 6624
6371 6625 /* For dynamic groups default unspecified to 1 */
6372 6626 if (rxhw && unspec &&
6373 6627 mip->mi_rx_group_type == MAC_GROUP_TYPE_DYNAMIC) {
6374 6628 mrp->mrp_nrxrings = 1;
6375 6629 }
6630 +
6376 6631 /*
6377 6632 * For static grouping we allow only specifying rings=0 and
6378 6633 * unspecified
6379 6634 */
6380 6635 if (rxhw && mrp->mrp_nrxrings > 0 &&
6381 6636 mip->mi_rx_group_type == MAC_GROUP_TYPE_STATIC) {
6382 6637 return (NULL);
6383 6638 }
6639 +
6384 6640 if (rxhw) {
6385 6641 /*
6386 6642 * We have explicitly asked for a group (with nrxrings,
6387 6643 * if unspec).
6388 6644 */
6389 6645 if (unspec || mrp->mrp_nrxrings > 0) {
6390 6646 need_exclgrp = B_TRUE;
6391 6647 need_rings = mrp->mrp_nrxrings;
6392 6648 } else if (mrp->mrp_nrxrings == 0) {
6393 6649 /*
6394 6650 * We have asked for a software group.
6395 6651 */
6396 6652 return (NULL);
6397 6653 }
6398 6654 } else if (isprimary && mip->mi_nactiveclients == 1 &&
6399 6655 mip->mi_rx_group_type == MAC_GROUP_TYPE_DYNAMIC) {
6400 6656 /*
6401 6657 * If the primary is the only active client on this
6402 6658 * mip and we have not asked for any rings, we give
6403 6659 * it the default group so that the primary gets to
6404 6660 * use all the rings.
6405 6661 */
6406 6662 return (NULL);
6407 6663 }
6408 6664
6409 6665 /* The group that can donate rings */
6410 6666 donorgrp = mip->mi_rx_donor_grp;
6411 6667
6412 6668 /*
6413 6669 * The number of rings that the default group can donate.
6414 6670 * We need to leave at least one ring.
6415 6671 */
6416 6672 donor_grp_rcnt = donorgrp->mrg_cur_count - 1;
6417 6673
6418 6674 /*
6419 6675 * Try to exclusively reserve a RX group.
6420 6676 *
6421 6677 * For flows requiring HW_DEFAULT_RING (unicast flow of the primary
6422 6678 * client), try to reserve a non-default RX group and give
6423 6679 * it all the rings from the donor group, except the default ring.
6424 6680 *
6425 6681 * For flows requiring HW_RING (unicast flow of other clients), try
6426 6682 * to reserve non-default RX group with the specified number of
6427 6683 * rings, if available.
6428 6684 *
6429 6685 * For flows that have not asked for software or hardware ring,
6430 6686 * try to reserve a non-default group with 1 ring, if available.
6431 6687 */
6432 6688 for (i = 1; i < mip->mi_rx_group_count; i++) {
6433 6689 grp = &mip->mi_rx_groups[i];
6434 6690
6435 6691 DTRACE_PROBE3(rx__group__trying, char *, mip->mi_name,
6436 6692 int, grp->mrg_index, mac_group_state_t, grp->mrg_state);
6437 6693
6438 6694 /*
6439 6695 * Check if this group could be a candidate group for
6440 6696 * eviction if we need a group for this MAC client,
6441 6697 * but there aren't any. A candidate group is one
6442 6698 * that didn't ask for an exclusive group, but got
6443 6699 * one and it has enough rings (combined with what
6444 6700 * the donor group can donate) for the new MAC
6445 - * client
6701 + * client.
6446 6702 */
6447 6703 if (grp->mrg_state >= MAC_GROUP_STATE_RESERVED) {
6448 6704 /*
6449 - * If the primary/donor group is not the default
6450 - * group, don't bother looking for a candidate group.
6451 - * If we don't have enough rings we will check
6452 - * if the primary group can be vacated.
6705 + * If the donor group is not the default
6706 + * group, don't bother looking for a candidate
6707 + * group. If we don't have enough rings we
6708 + * will check if the primary group can be
6709 + * vacated.
6453 6710 */
6454 6711 if (candidate_grp == NULL &&
6455 6712 donorgrp == MAC_DEFAULT_RX_GROUP(mip)) {
6456 - ASSERT(!MAC_GROUP_NO_CLIENT(grp));
6457 - gclient = MAC_GROUP_ONLY_CLIENT(grp);
6458 - if (gclient == NULL)
6459 - gclient = mac_get_grp_primary(grp);
6460 - ASSERT(gclient != NULL);
6461 - gmrp = MCIP_RESOURCE_PROPS(gclient);
6462 - if (gclient->mci_share == 0 &&
6463 - (gmrp->mrp_mask & MRP_RX_RINGS) == 0 &&
6713 + if (!i_mac_clients_hw(grp, MRP_RX_RINGS) &&
6464 6714 (unspec ||
6465 6715 (grp->mrg_cur_count + donor_grp_rcnt >=
6466 6716 need_rings))) {
6467 6717 candidate_grp = grp;
6468 6718 }
6469 6719 }
6470 6720 continue;
6471 6721 }
6472 6722 /*
6473 6723 * This group could already be SHARED by other multicast
6474 6724 * flows on this client. In that case, the group would
6475 6725 * be shared and has already been started.
6476 6726 */
6477 6727 ASSERT(grp->mrg_state != MAC_GROUP_STATE_UNINIT);
6478 6728
6479 6729 if ((grp->mrg_state == MAC_GROUP_STATE_REGISTERED) &&
6480 6730 (mac_start_group(grp) != 0)) {
6481 6731 continue;
6482 6732 }
6483 6733
6484 6734 if (mip->mi_rx_group_type != MAC_GROUP_TYPE_DYNAMIC)
6485 6735 break;
6486 6736 ASSERT(grp->mrg_cur_count == 0);
6487 6737
6488 6738 /*
6489 6739 * Populate the group. Rings should be taken
6490 6740 * from the donor group.
6491 6741 */
6492 6742 nrings = rxhw ? need_rings : isprimary ? donor_grp_rcnt: 1;
6493 6743
6494 6744 /*
6495 6745 * If the donor group can't donate, let's just walk and
6496 6746 * see if someone can vacate a group, so that we have
6497 6747 * enough rings for this, unless we already have
6498 6748 * identified a candidate group.
6499 6749 */
6500 6750 if (nrings <= donor_grp_rcnt) {
6501 6751 err = i_mac_group_allocate_rings(mip, MAC_RING_TYPE_RX,
6502 6752 donorgrp, grp, share, nrings);
6503 6753 if (err == 0) {
6504 6754 /*
6505 6755 * For a share i_mac_group_allocate_rings gets
6506 6756 * the rings from the driver, let's populate
6507 6757 * the property for the client now.
6508 6758 */
6509 6759 if (share != 0) {
6510 6760 mac_client_set_rings(
6511 6761 (mac_client_handle_t)mcip,
6512 6762 grp->mrg_cur_count, -1);
6513 6763 }
6514 6764 if (mac_is_primary_client(mcip) && !rxhw)
6515 6765 mip->mi_rx_donor_grp = grp;
6516 6766 break;
6517 6767 }
6518 6768 }
6519 6769
6520 6770 DTRACE_PROBE3(rx__group__reserve__alloc__rings, char *,
6521 6771 mip->mi_name, int, grp->mrg_index, int, err);
6522 6772
6523 6773 /*
6524 6774 * It's a dynamic group but the grouping operation
6525 6775 * failed.
6526 6776 */
6527 6777 mac_stop_group(grp);
6528 6778 }
6779 +
6529 6780 /* We didn't find an exclusive group for this MAC client */
6530 6781 if (i >= mip->mi_rx_group_count) {
6531 6782
6532 6783 if (!need_exclgrp)
6533 6784 return (NULL);
6534 6785
6535 6786 /*
6536 - * If we found a candidate group then we switch the
6537 - * MAC client from the candidate_group to the default
6538 - * group and give the group to this MAC client. If
6539 - * we didn't find a candidate_group, check if the
6540 - * primary is in its own group and if it can make way
6541 - * for this MAC client.
6787 + * If we found a candidate group then move the
6788 + * existing MAC client from the candidate_group to the
6789 + * default group and give the candidate_group to the
6790 + * new MAC client. If we didn't find a candidate
6791 + * group, then check if the primary is in its own
6792 + * group and if it can make way for this MAC client.
6542 6793 */
6543 6794 if (candidate_grp == NULL &&
6544 6795 donorgrp != MAC_DEFAULT_RX_GROUP(mip) &&
6545 6796 donorgrp->mrg_cur_count >= need_rings) {
6546 6797 candidate_grp = donorgrp;
6547 6798 }
6548 6799 if (candidate_grp != NULL) {
6549 6800 boolean_t prim_grp = B_FALSE;
6550 6801
6551 6802 /*
6552 - * Switch the MAC client from the candidate group
6553 - * to the default group.. If this group was the
6554 - * donor group, then after the switch we need
6555 - * to update the donor group too.
6803 + * Switch the existing MAC client from the
6804 + * candidate group to the default group. If
6805 + * the candidate group is the donor group,
6806 + * then after the switch we need to update the
6807 + * donor group too.
6556 6808 */
6557 6809 grp = candidate_grp;
6558 - gclient = MAC_GROUP_ONLY_CLIENT(grp);
6559 - if (gclient == NULL)
6560 - gclient = mac_get_grp_primary(grp);
6810 + gclient = grp->mrg_clients->mgc_client;
6811 + VERIFY3P(gclient, !=, NULL);
6561 6812 if (grp == mip->mi_rx_donor_grp)
6562 6813 prim_grp = B_TRUE;
6563 6814 if (mac_rx_switch_group(gclient, grp,
6564 6815 MAC_DEFAULT_RX_GROUP(mip)) != 0) {
6565 6816 return (NULL);
6566 6817 }
6567 6818 if (prim_grp) {
6568 6819 mip->mi_rx_donor_grp =
6569 6820 MAC_DEFAULT_RX_GROUP(mip);
6570 6821 donorgrp = MAC_DEFAULT_RX_GROUP(mip);
6571 6822 }
6572 6823
6573 -
6574 6824 /*
6575 6825 * Now give this group with the required rings
6576 6826 * to this MAC client.
6577 6827 */
6578 6828 ASSERT(grp->mrg_state == MAC_GROUP_STATE_REGISTERED);
6579 6829 if (mac_start_group(grp) != 0)
6580 6830 return (NULL);
6581 6831
6582 6832 if (mip->mi_rx_group_type != MAC_GROUP_TYPE_DYNAMIC)
6583 6833 return (grp);
6584 6834
6585 6835 donor_grp_rcnt = donorgrp->mrg_cur_count - 1;
6586 6836 ASSERT(grp->mrg_cur_count == 0);
6587 6837 ASSERT(donor_grp_rcnt >= need_rings);
6588 6838 err = i_mac_group_allocate_rings(mip, MAC_RING_TYPE_RX,
6589 6839 donorgrp, grp, share, need_rings);
6590 6840 if (err == 0) {
6591 6841 /*
6592 6842 * For a share i_mac_group_allocate_rings gets
6593 6843 * the rings from the driver, let's populate
6594 6844 * the property for the client now.
6595 6845 */
6596 6846 if (share != 0) {
6597 6847 mac_client_set_rings(
6598 6848 (mac_client_handle_t)mcip,
6599 6849 grp->mrg_cur_count, -1);
6600 6850 }
6601 6851 DTRACE_PROBE2(rx__group__reserved,
6602 6852 char *, mip->mi_name, int, grp->mrg_index);
6603 6853 return (grp);
6604 6854 }
6605 6855 DTRACE_PROBE3(rx__group__reserve__alloc__rings, char *,
6606 6856 mip->mi_name, int, grp->mrg_index, int, err);
6607 6857 mac_stop_group(grp);
6608 6858 }
6609 6859 return (NULL);
6610 6860 }
6611 6861 ASSERT(grp != NULL);
6612 6862
6613 6863 DTRACE_PROBE2(rx__group__reserved,
6614 6864 char *, mip->mi_name, int, grp->mrg_index);
6615 6865 return (grp);
6616 6866 }
6617 6867
6618 6868 /*
6619 6869 * mac_rx_release_group()
6620 6870 *
6621 - * This is called when there are no clients left for the group.
6622 - * The group is stopped and marked MAC_GROUP_STATE_REGISTERED,
6623 - * and if it is a non default group, the shares are removed and
6624 - * all rings are assigned back to default group.
6871 + * Release the group when it has no remaining clients. The group is
6872 + * stopped, its shares are removed, and all rings are assigned back
6873 + * to the default group. This should never be called against the
6874 + * default group.
6625 6875 */
6626 6876 void
6627 6877 mac_release_rx_group(mac_client_impl_t *mcip, mac_group_t *group)
6628 6878 {
6629 6879 mac_impl_t *mip = mcip->mci_mip;
6630 6880 mac_ring_t *ring;
6631 6881
6632 6882 ASSERT(group != MAC_DEFAULT_RX_GROUP(mip));
6883 + ASSERT(MAC_GROUP_NO_CLIENT(group) == B_TRUE);
6633 6884
6634 6885 if (mip->mi_rx_donor_grp == group)
6635 6886 mip->mi_rx_donor_grp = MAC_DEFAULT_RX_GROUP(mip);
6636 6887
6637 6888 /*
6638 6889 * This is the case where there are no clients left. Any
6639 6890 * SRS etc. on this group have also been quiesced.
6640 6891 */
6641 6892 for (ring = group->mrg_rings; ring != NULL; ring = ring->mr_next) {
6642 6893 if (ring->mr_classify_type == MAC_HW_CLASSIFIER) {
6643 6894 ASSERT(group->mrg_state == MAC_GROUP_STATE_RESERVED);
6644 6895 /*
6645 6896 * Remove the SRS associated with the HW ring.
6646 6897 * As a result, polling will be disabled.
6647 6898 */
6648 6899 ring->mr_srs = NULL;
6649 6900 }
6650 6901 ASSERT(group->mrg_state < MAC_GROUP_STATE_RESERVED ||
6651 6902 ring->mr_state == MR_INUSE);
6652 6903 if (ring->mr_state == MR_INUSE) {
6653 6904 mac_stop_ring(ring);
6654 6905 ring->mr_flag = 0;
6655 6906 }
6656 6907 }
6657 6908
6658 6909 /* remove group from share */
6659 6910 if (mcip->mci_share != 0) {
6660 6911 mip->mi_share_capab.ms_sremove(mcip->mci_share,
6661 6912 group->mrg_driver);
6662 6913 }
6663 6914
6664 6915 if (mip->mi_rx_group_type == MAC_GROUP_TYPE_DYNAMIC) {
6665 6916 mac_ring_t *ring;
6666 6917
6667 6918 /*
6668 6919 * Rings were dynamically allocated to group.
6669 6920 * Move rings back to default group.
6670 6921 */
6671 6922 while ((ring = group->mrg_rings) != NULL) {
6672 6923 (void) mac_group_mov_ring(mip, mip->mi_rx_donor_grp,
6673 6924 ring);
6674 6925 }
6675 6926 }
6676 6927 mac_stop_group(group);
6677 6928 /*
6678 6929 * Possible improvement: See if we can assign the group just released
6679 6930 * to another client of the mip.
6680 6931 */
6681 6932 }
6682 6933
6683 6934 /*
6684 - * When we move the primary's mac address between groups, we need to also
6685 - * take all the clients sharing the same mac address along with it (VLANs)
6686 - * We remove the mac address for such clients from the group after quiescing
6687 - * them. When we add the mac address we restart the client. Note that
6688 - * the primary's mac address is removed from the group after all the
6689 - * other clients sharing the address are removed. Similarly, the primary's
6690 - * mac address is added before all the other client's mac address are
6691 - * added. While grp is the group where the clients reside, tgrp is
6692 - * the group where the addresses have to be added.
6935 + * Move the MAC address from fgrp to tgrp.
6693 6936 */
6694 -static void
6695 -mac_rx_move_macaddr_prim(mac_client_impl_t *mcip, mac_group_t *grp,
6696 - mac_group_t *tgrp, uint8_t *maddr, boolean_t add)
6697 -{
6698 - mac_impl_t *mip = mcip->mci_mip;
6699 - mac_grp_client_t *mgcp = grp->mrg_clients;
6700 - mac_client_impl_t *gmcip;
6701 - boolean_t prim;
6702 -
6703 - prim = (mcip->mci_state_flags & MCIS_UNICAST_HW) != 0;
6704 -
6705 - /*
6706 - * If the clients are in a non-default group, we just have to
6707 - * walk the group's client list. If it is in the default group
6708 - * (which will be shared by other clients as well, we need to
6709 - * check if the unicast address matches mcip's unicast.
6710 - */
6711 - while (mgcp != NULL) {
6712 - gmcip = mgcp->mgc_client;
6713 - if (gmcip != mcip &&
6714 - (grp != MAC_DEFAULT_RX_GROUP(mip) ||
6715 - mcip->mci_unicast == gmcip->mci_unicast)) {
6716 - if (!add) {
6717 - mac_rx_client_quiesce(
6718 - (mac_client_handle_t)gmcip);
6719 - (void) mac_remove_macaddr(mcip->mci_unicast);
6720 - } else {
6721 - (void) mac_add_macaddr(mip, tgrp, maddr, prim);
6722 - mac_rx_client_restart(
6723 - (mac_client_handle_t)gmcip);
6724 - }
6725 - }
6726 - mgcp = mgcp->mgc_next;
6727 - }
6728 -}
6729 -
6730 -
6731 -/*
6732 - * Move the MAC address from fgrp to tgrp. If this is the primary client,
6733 - * we need to take any VLANs etc. together too.
6734 - */
6735 6937 static int
6736 6938 mac_rx_move_macaddr(mac_client_impl_t *mcip, mac_group_t *fgrp,
6737 6939 mac_group_t *tgrp)
6738 6940 {
6739 6941 mac_impl_t *mip = mcip->mci_mip;
6740 6942 uint8_t maddr[MAXMACADDRLEN];
6741 6943 int err = 0;
6742 - boolean_t prim;
6743 - boolean_t multiclnt = B_FALSE;
6944 + uint16_t vid;
6945 + mac_unicast_impl_t *muip;
6946 + boolean_t use_hw;
6744 6947
6745 6948 mac_rx_client_quiesce((mac_client_handle_t)mcip);
6746 - ASSERT(mcip->mci_unicast != NULL);
6949 + VERIFY3P(mcip->mci_unicast, !=, NULL);
6747 6950 bcopy(mcip->mci_unicast->ma_addr, maddr, mcip->mci_unicast->ma_len);
6748 6951
6749 - prim = (mcip->mci_state_flags & MCIS_UNICAST_HW) != 0;
6750 - if (mcip->mci_unicast->ma_nusers > 1) {
6751 - mac_rx_move_macaddr_prim(mcip, fgrp, NULL, maddr, B_FALSE);
6752 - multiclnt = B_TRUE;
6753 - }
6754 - ASSERT(mcip->mci_unicast->ma_nusers == 1);
6755 - err = mac_remove_macaddr(mcip->mci_unicast);
6952 + /*
6953 + * Does the client require MAC address hardware classification?
6954 + */
6955 + use_hw = (mcip->mci_state_flags & MCIS_UNICAST_HW) != 0;
6956 + vid = i_mac_flow_vid(mcip->mci_flent);
6957 +
6958 + /*
6959 + * You can never move an address that is shared by multiple
6960 + * clients. mac_datapath_setup() ensures that clients sharing
6961 + * an address are placed on the default group. This guarantees
6962 + * that a non-default group will only ever have one client and
6963 + * thus make full use of HW filters.
6964 + */
6965 + if (mac_check_macaddr_shared(mcip->mci_unicast))
6966 + return (EINVAL);
6967 +
6968 + err = mac_remove_macaddr_vlan(mcip->mci_unicast, vid);
6969 +
6756 6970 if (err != 0) {
6757 6971 mac_rx_client_restart((mac_client_handle_t)mcip);
6758 - if (multiclnt) {
6759 - mac_rx_move_macaddr_prim(mcip, fgrp, fgrp, maddr,
6760 - B_TRUE);
6761 - }
6762 6972 return (err);
6763 6973 }
6974 +
6764 6975 /*
6765 - * Program the H/W Classifier first, if this fails we need
6766 - * not proceed with the other stuff.
6976 + * If this isn't the primary MAC address then the
6977 + * mac_address_t has been freed by the last call to
6978 + * mac_remove_macaddr_vlan(). In any case, NULL the reference
6979 + * to avoid a dangling pointer.
6767 6980 */
6768 - if ((err = mac_add_macaddr(mip, tgrp, maddr, prim)) != 0) {
6981 + mcip->mci_unicast = NULL;
6982 +
6983 + /*
6984 + * We also have to NULL all the mui_map references -- sun4v
6985 + * strikes again!
6986 + */
6987 + rw_enter(&mcip->mci_rw_lock, RW_WRITER);
6988 + for (muip = mcip->mci_unicast_list; muip != NULL; muip = muip->mui_next)
6989 + muip->mui_map = NULL;
6990 + rw_exit(&mcip->mci_rw_lock);
6991 +
6992 + /*
6993 + * Program the H/W Classifier first; if this fails we need not
6994 + * proceed with the other stuff.
6995 + */
6996 + if ((err = mac_add_macaddr_vlan(mip, tgrp, maddr, vid, use_hw)) != 0) {
6997 + int err2;
6998 +
6769 6999 /* Revert back the H/W Classifier */
6770 - if ((err = mac_add_macaddr(mip, fgrp, maddr, prim)) != 0) {
6771 - /*
6772 - * This should not fail now since it worked earlier,
6773 - * should we panic?
6774 - */
6775 - cmn_err(CE_WARN,
6776 - "mac_rx_switch_group: switching %p back"
6777 - " to group %p failed!!", (void *)mcip,
6778 - (void *)fgrp);
7000 + err2 = mac_add_macaddr_vlan(mip, fgrp, maddr, vid, use_hw);
7001 +
7002 + if (err2 != 0) {
7003 + cmn_err(CE_WARN, "Failed to revert HW classification"
7004 + " on MAC %s, for client %s: %d.", mip->mi_name,
7005 + mcip->mci_name, err2);
6779 7006 }
7007 +
6780 7008 mac_rx_client_restart((mac_client_handle_t)mcip);
6781 - if (multiclnt) {
6782 - mac_rx_move_macaddr_prim(mcip, fgrp, fgrp, maddr,
6783 - B_TRUE);
6784 - }
6785 7009 return (err);
6786 7010 }
7011 +
7012 + /*
7013 + * Get a reference to the new mac_address_t and update the
7014 + * client's reference. Then restart the client and add the
7015 + * other clients of this MAC addr (if they exist).
7016 + */
6787 7017 mcip->mci_unicast = mac_find_macaddr(mip, maddr);
7018 + rw_enter(&mcip->mci_rw_lock, RW_WRITER);
7019 + for (muip = mcip->mci_unicast_list; muip != NULL; muip = muip->mui_next)
7020 + muip->mui_map = mcip->mci_unicast;
7021 + rw_exit(&mcip->mci_rw_lock);
6788 7022 mac_rx_client_restart((mac_client_handle_t)mcip);
6789 - if (multiclnt)
6790 - mac_rx_move_macaddr_prim(mcip, fgrp, tgrp, maddr, B_TRUE);
6791 - return (err);
7023 + return (0);
6792 7024 }
6793 7025
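Before the shared mac_address_t is torn down, every mui_map back-reference must be NULLed, and once the address is re-added on the target group they are re-pointed. A standalone userland model of that NULL-then-repoint pattern (hypothetical types, locking omitted):

#include <stdio.h>
#include <stdlib.h>

typedef struct addr {
	int	slot;
} addr_t;

#define	NREFS	3

int
main(void)
{
	addr_t *oldaddr = malloc(sizeof (addr_t)), *newaddr;
	addr_t *refs[NREFS];
	int i;

	oldaddr->slot = 1;
	for (i = 0; i < NREFS; i++)
		refs[i] = oldaddr;

	/* Removal: NULL every back-reference before the object dies. */
	for (i = 0; i < NREFS; i++)
		refs[i] = NULL;
	free(oldaddr);

	/* Re-add on the new group, then re-point the references. */
	newaddr = malloc(sizeof (addr_t));
	newaddr->slot = 2;
	for (i = 0; i < NREFS; i++)
		refs[i] = newaddr;

	printf("references now on slot %d\n", refs[0]->slot);
	free(newaddr);
	return (0);
}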
6794 7026 /*
6795 7027 * Switch the MAC client from one group to another. This means we need
6796 7028 * to remove the MAC address from the group, remove the MAC client,
6797 7029 * teardown the SRSs and revert the group state. Then, we add the client
6798 7030 * to the destination group, set the SRSs, and add the MAC address to the
6799 7031 * group.
6800 7032 */
6801 7033 int
6802 7034 mac_rx_switch_group(mac_client_impl_t *mcip, mac_group_t *fgrp,
6803 7035 mac_group_t *tgrp)
6804 7036 {
6805 7037 int err;
6806 7038 mac_group_state_t next_state;
6807 7039 mac_client_impl_t *group_only_mcip;
6808 7040 mac_client_impl_t *gmcip;
6809 7041 mac_impl_t *mip = mcip->mci_mip;
6810 7042 mac_grp_client_t *mgcp;
6811 7043
6812 - ASSERT(fgrp == mcip->mci_flent->fe_rx_ring_group);
7044 + VERIFY3P(fgrp, ==, mcip->mci_flent->fe_rx_ring_group);
6813 7045
6814 7046 if ((err = mac_rx_move_macaddr(mcip, fgrp, tgrp)) != 0)
6815 7047 return (err);
6816 7048
6817 7049 /*
6818 - * The group might be reserved, but SRSs may not be set up, e.g.
6819 - * primary and its vlans using a reserved group.
7050 + * If the group is marked as reserved and in use by a single
7051 + * client, then there is an SRS to teardown.
6820 7052 */
6821 7053 if (fgrp->mrg_state == MAC_GROUP_STATE_RESERVED &&
6822 7054 MAC_GROUP_ONLY_CLIENT(fgrp) != NULL) {
6823 7055 mac_rx_srs_group_teardown(mcip->mci_flent, B_TRUE);
6824 7056 }
7057 +
7058 + /*
7059 + * If we are moving the client from a non-default group, then
7060 + * we know that any additional clients on this group share the
7061 + * same MAC address. Since we moved the MAC address filter, we
7062 + * need to move these clients too.
7063 + *
7064 + * If we are moving the client from the default group and its
7065 + * MAC address has VLAN clients, then we must move those
7066 + * clients as well.
7067 + *
7068 + * In both cases the idea is the same: we moved the MAC
7069 + * address filter to the tgrp, so we must move all clients
7070 + * using that MAC address to tgrp as well.
7071 + */
6825 7072 if (fgrp != MAC_DEFAULT_RX_GROUP(mip)) {
6826 7073 mgcp = fgrp->mrg_clients;
6827 7074 while (mgcp != NULL) {
6828 7075 gmcip = mgcp->mgc_client;
6829 7076 mgcp = mgcp->mgc_next;
6830 7077 mac_group_remove_client(fgrp, gmcip);
6831 7078 mac_group_add_client(tgrp, gmcip);
6832 7079 gmcip->mci_flent->fe_rx_ring_group = tgrp;
6833 7080 }
6834 7081 mac_release_rx_group(mcip, fgrp);
6835 - ASSERT(MAC_GROUP_NO_CLIENT(fgrp));
7082 + VERIFY3B(MAC_GROUP_NO_CLIENT(fgrp), ==, B_TRUE);
6836 7083 mac_set_group_state(fgrp, MAC_GROUP_STATE_REGISTERED);
6837 7084 } else {
6838 7085 mac_group_remove_client(fgrp, mcip);
6839 7086 mac_group_add_client(tgrp, mcip);
6840 7087 mcip->mci_flent->fe_rx_ring_group = tgrp;
7088 +
6841 7089 /*
6842 7090 * If there are other clients (VLANs) sharing this address
6843 - * we should be here only for the primary.
7091 + * then move them too.
6844 7092 */
6845 - if (mcip->mci_unicast->ma_nusers > 1) {
7093 + if (mac_check_macaddr_shared(mcip->mci_unicast)) {
6846 7094 /*
6847 7095 * We need to move all the clients that are using
6848 - * this h/w address.
7096 + * this MAC address.
6849 7097 */
6850 7098 mgcp = fgrp->mrg_clients;
6851 7099 while (mgcp != NULL) {
6852 7100 gmcip = mgcp->mgc_client;
6853 7101 mgcp = mgcp->mgc_next;
6854 7102 if (mcip->mci_unicast == gmcip->mci_unicast) {
6855 7103 mac_group_remove_client(fgrp, gmcip);
6856 7104 mac_group_add_client(tgrp, gmcip);
6857 7105 gmcip->mci_flent->fe_rx_ring_group =
6858 7106 tgrp;
6859 7107 }
6860 7108 }
6861 7109 }
7110 +
6862 7111 /*
6863 - * The default group will still take the multicast,
6864 - * broadcast traffic etc., so it won't go to
7112 + * The default group still handles multicast and
7113 + * broadcast traffic; it won't transition to
6865 7114 * MAC_GROUP_STATE_REGISTERED.
6866 7115 */
6867 7116 if (fgrp->mrg_state == MAC_GROUP_STATE_RESERVED)
6868 7117 mac_rx_group_unmark(fgrp, MR_CONDEMNED);
6869 7118 mac_set_group_state(fgrp, MAC_GROUP_STATE_SHARED);
6870 7119 }
7120 +
6871 7121 next_state = mac_group_next_state(tgrp, &group_only_mcip,
6872 7122 MAC_DEFAULT_RX_GROUP(mip), B_TRUE);
6873 7123 mac_set_group_state(tgrp, next_state);
7124 +
6874 7125 /*
6875 - * If the destination group is reserved, setup the SRSs etc.
7126 + * If the destination group is reserved, then setup the SRSes.
7127 + * Otherwise make sure to use SW classification.
6876 7128 */
6877 7129 if (tgrp->mrg_state == MAC_GROUP_STATE_RESERVED) {
6878 7130 mac_rx_srs_group_setup(mcip, mcip->mci_flent, SRST_LINK);
6879 7131 mac_fanout_setup(mcip, mcip->mci_flent,
6880 7132 MCIP_RESOURCE_PROPS(mcip), mac_rx_deliver, mcip, NULL,
6881 7133 NULL);
6882 7134 mac_rx_group_unmark(tgrp, MR_INCIPIENT);
6883 7135 } else {
6884 7136 mac_rx_switch_grp_to_sw(tgrp);
6885 7137 }
7138 +
6886 7139 return (0);
6887 7140 }
6888 7141
6889 7142 /*
6890 7143 * Reserves a TX group for the specified share. Invoked by mac_tx_srs_setup()
6891 7144 * when a share was allocated to the client.
6892 7145 */
6893 7146 mac_group_t *
6894 7147 mac_reserve_tx_group(mac_client_impl_t *mcip, boolean_t move)
6895 7148 {
6896 7149 mac_impl_t *mip = mcip->mci_mip;
6897 7150 mac_group_t *grp = NULL;
6898 7151 int rv;
6899 7152 int i;
6900 7153 int err;
6901 7154 mac_group_t *defgrp;
6902 7155 mac_share_handle_t share = mcip->mci_share;
6903 7156 mac_resource_props_t *mrp = MCIP_RESOURCE_PROPS(mcip);
6904 7157 int nrings;
6905 7158 int defnrings;
6906 7159 boolean_t need_exclgrp = B_FALSE;
6907 7160 int need_rings = 0;
6908 7161 mac_group_t *candidate_grp = NULL;
6909 7162 mac_client_impl_t *gclient;
6910 7163 mac_resource_props_t *gmrp;
6911 7164 boolean_t txhw = mrp->mrp_mask & MRP_TX_RINGS;
6912 7165 boolean_t unspec = mrp->mrp_mask & MRP_TXRINGS_UNSPEC;
6913 7166 boolean_t isprimary;
6914 7167
6915 7168 isprimary = mcip->mci_flent->fe_type & FLOW_PRIMARY_MAC;
7169 +
6916 7170 /*
6917 7171 * When we come here for a VLAN on the primary (dladm create-vlan),
6918 7172 * we need to pair it with the primary (to keep it consistent
6919 7173 * with the RX side). So, we check if the primary is already assigned
6920 7174 * to a group and return the group if so. The other way is also
6921 7175 * true, i.e. the VLAN is already created and now we are plumbing
6922 7176 * the primary.
6923 7177 */
6924 7178 if (!move && isprimary) {
6925 7179 for (gclient = mip->mi_clients_list; gclient != NULL;
6926 7180 gclient = gclient->mci_client_next) {
6927 7181 if (gclient->mci_flent->fe_type & FLOW_PRIMARY_MAC &&
6928 7182 gclient->mci_flent->fe_tx_ring_group != NULL) {
6929 7183 return (gclient->mci_flent->fe_tx_ring_group);
6930 7184 }
6931 7185 }
6932 7186 }
6933 7187
6934 7188 if (mip->mi_tx_groups == NULL || mip->mi_tx_group_count == 0)
6935 7189 return (NULL);
6936 7190
6937 7191 /* For dynamic groups, default unspec to 1 */
6938 7192 if (txhw && unspec &&
6939 7193 mip->mi_tx_group_type == MAC_GROUP_TYPE_DYNAMIC) {
6940 7194 mrp->mrp_ntxrings = 1;
6941 7195 }
6942 7196 /*
6943 7197 * For static grouping we allow only specifying rings=0 and
6944 7198 * unspecified
6945 7199 */
6946 7200 if (txhw && mrp->mrp_ntxrings > 0 &&
6947 7201 mip->mi_tx_group_type == MAC_GROUP_TYPE_STATIC) {
6948 7202 return (NULL);
6949 7203 }
6950 7204
6951 7205 if (txhw) {
6952 7206 /*
6953 7207 * We have explicitly asked for a group (with ntxrings,
6954 7208 * if unspec).
6955 7209 */
6956 7210 if (unspec || mrp->mrp_ntxrings > 0) {
6957 7211 need_exclgrp = B_TRUE;
6958 7212 need_rings = mrp->mrp_ntxrings;
6959 7213 } else if (mrp->mrp_ntxrings == 0) {
6960 7214 /*
6961 7215 * We have asked for a software group.
6962 7216 */
6963 7217 return (NULL);
6964 7218 }
6965 7219 }
6966 7220 defgrp = MAC_DEFAULT_TX_GROUP(mip);
6967 7221 /*
6968 7222 * The number of rings that the default group can donate.
6969 7223 * We need to leave at least one ring - the default ring - in
6970 7224 * this group.
6971 7225 */
6972 7226 defnrings = defgrp->mrg_cur_count - 1;
6973 7227
6974 7228 /*
6975 7229 * Primary gets default group unless explicitly told not
6976 7230 * to (i.e. rings > 0).
6977 7231 */
6978 7232 if (isprimary && !need_exclgrp)
6979 7233 return (NULL);
6980 7234
6981 7235 nrings = (mrp->mrp_mask & MRP_TX_RINGS) != 0 ? mrp->mrp_ntxrings : 1;
6982 7236 for (i = 0; i < mip->mi_tx_group_count; i++) {
6983 7237 grp = &mip->mi_tx_groups[i];
6984 7238 if ((grp->mrg_state == MAC_GROUP_STATE_RESERVED) ||
6985 7239 (grp->mrg_state == MAC_GROUP_STATE_UNINIT)) {
6986 7240 /*
6987 7241 * Select a candidate for replacement if we don't
6988 7242 * get an exclusive group. A candidate group is one
6989 7243 * that didn't ask for an exclusive group, but got
6990 7244 * one and it has enough rings (combined with what
6991 7245 * the default group can donate) for the new MAC
6992 7246 * client.
6993 7247 */
6994 7248 if (grp->mrg_state == MAC_GROUP_STATE_RESERVED &&
6995 7249 candidate_grp == NULL) {
6996 7250 gclient = MAC_GROUP_ONLY_CLIENT(grp);
6997 - if (gclient == NULL)
6998 - gclient = mac_get_grp_primary(grp);
7251 + VERIFY3P(gclient, !=, NULL);
6999 7252 gmrp = MCIP_RESOURCE_PROPS(gclient);
7000 7253 if (gclient->mci_share == 0 &&
7001 7254 (gmrp->mrp_mask & MRP_TX_RINGS) == 0 &&
7002 7255 (unspec ||
7003 7256 (grp->mrg_cur_count + defnrings) >=
7004 7257 need_rings)) {
7005 7258 candidate_grp = grp;
7006 7259 }
7007 7260 }
7008 7261 continue;
7009 7262 }
7010 7263 /*
7011 7264 * If the default can't donate let's just walk and
7012 7265 * see if someone can vacate a group, so that we have
7013 7266 * enough rings for this.
7014 7267 */
7015 7268 if (mip->mi_tx_group_type != MAC_GROUP_TYPE_DYNAMIC ||
7016 7269 nrings <= defnrings) {
7017 7270 if (grp->mrg_state == MAC_GROUP_STATE_REGISTERED) {
7018 7271 rv = mac_start_group(grp);
7019 7272 ASSERT(rv == 0);
7020 7273 }
7021 7274 break;
7022 7275 }
7023 7276 }
7024 7277
↓ open down ↓ |
16 lines elided |
↑ open up ↑ |
7025 7278 /* The default group */
7026 7279 if (i >= mip->mi_tx_group_count) {
7027 7280 /*
7028 7281 * If we need an exclusive group and have identified a
7029 7282 * candidate group we switch the MAC client from the
7030 7283 * candidate group to the default group and give the
7031 7284 * candidate group to this client.
7032 7285 */
7033 7286 if (need_exclgrp && candidate_grp != NULL) {
7034 7287 /*
7035 - * Switch the MAC client from the candidate group
7036 - * to the default group.
7288 + * Switch the MAC client from the candidate
7289 + * group to the default group. We know the
7290 + * candidate_grp came from a reserved group
7291 + * and thus only has one client.
7037 7292 */
7038 7293 grp = candidate_grp;
7039 7294 gclient = MAC_GROUP_ONLY_CLIENT(grp);
7040 - if (gclient == NULL)
7041 - gclient = mac_get_grp_primary(grp);
7295 + VERIFY3P(gclient, !=, NULL);
7042 7296 mac_tx_client_quiesce((mac_client_handle_t)gclient);
7043 7297 mac_tx_switch_group(gclient, grp, defgrp);
7044 7298 mac_tx_client_restart((mac_client_handle_t)gclient);
7045 7299
7046 7300 /*
7047 7301 * Give the candidate group with the specified number
7048 7302 * of rings to this MAC client.
7049 7303 */
7050 7304 ASSERT(grp->mrg_state == MAC_GROUP_STATE_REGISTERED);
7051 7305 rv = mac_start_group(grp);
7052 7306 ASSERT(rv == 0);
7053 7307
7054 7308 if (mip->mi_tx_group_type != MAC_GROUP_TYPE_DYNAMIC)
7055 7309 return (grp);
7056 7310
7057 7311 ASSERT(grp->mrg_cur_count == 0);
7058 7312 ASSERT(defgrp->mrg_cur_count > need_rings);
7059 7313
7060 7314 err = i_mac_group_allocate_rings(mip, MAC_RING_TYPE_TX,
7061 7315 defgrp, grp, share, need_rings);
7062 7316 if (err == 0) {
7063 7317 /*
7064 7318 * For a share i_mac_group_allocate_rings gets
7065 7319 * the rings from the driver, let's populate
7066 7320 * the property for the client now.
7067 7321 */
7068 7322 if (share != 0) {
7069 7323 mac_client_set_rings(
7070 7324 (mac_client_handle_t)mcip, -1,
7071 7325 grp->mrg_cur_count);
7072 7326 }
7073 7327 mip->mi_tx_group_free--;
7074 7328 return (grp);
7075 7329 }
7076 7330 DTRACE_PROBE3(tx__group__reserve__alloc__rings, char *,
7077 7331 mip->mi_name, int, grp->mrg_index, int, err);
7078 7332 mac_stop_group(grp);
7079 7333 }
7080 7334 return (NULL);
7081 7335 }
7082 7336 /*
7083 7337 * We got an exclusive group, but it is not dynamic.
7084 7338 */
7085 7339 if (mip->mi_tx_group_type != MAC_GROUP_TYPE_DYNAMIC) {
7086 7340 mip->mi_tx_group_free--;
7087 7341 return (grp);
7088 7342 }
7089 7343
7090 7344 rv = i_mac_group_allocate_rings(mip, MAC_RING_TYPE_TX, defgrp, grp,
7091 7345 share, nrings);
7092 7346 if (rv != 0) {
7093 7347 DTRACE_PROBE3(tx__group__reserve__alloc__rings,
7094 7348 char *, mip->mi_name, int, grp->mrg_index, int, rv);
7095 7349 mac_stop_group(grp);
7096 7350 return (NULL);
7097 7351 }
7098 7352 /*
7099 7353 * For a share i_mac_group_allocate_rings gets the rings from the
7100 7354 * driver, let's populate the property for the client now.
7101 7355 */
7102 7356 if (share != 0) {
7103 7357 mac_client_set_rings((mac_client_handle_t)mcip, -1,
7104 7358 grp->mrg_cur_count);
7105 7359 }
7106 7360 mip->mi_tx_group_free--;
7107 7361 return (grp);
7108 7362 }
7109 7363
7110 7364 void
7111 7365 mac_release_tx_group(mac_client_impl_t *mcip, mac_group_t *grp)
7112 7366 {
7113 7367 mac_impl_t *mip = mcip->mci_mip;
7114 7368 mac_share_handle_t share = mcip->mci_share;
7115 7369 mac_ring_t *ring;
7116 7370 mac_soft_ring_set_t *srs = MCIP_TX_SRS(mcip);
7117 7371 mac_group_t *defgrp;
7118 7372
7119 7373 defgrp = MAC_DEFAULT_TX_GROUP(mip);
7120 7374 if (srs != NULL) {
7121 7375 if (srs->srs_soft_ring_count > 0) {
7122 7376 for (ring = grp->mrg_rings; ring != NULL;
7123 7377 ring = ring->mr_next) {
7124 7378 ASSERT(mac_tx_srs_ring_present(srs, ring));
7125 7379 mac_tx_invoke_callbacks(mcip,
7126 7380 (mac_tx_cookie_t)
7127 7381 mac_tx_srs_get_soft_ring(srs, ring));
7128 7382 mac_tx_srs_del_ring(srs, ring);
7129 7383 }
7130 7384 } else {
7131 7385 ASSERT(srs->srs_tx.st_arg2 != NULL);
7132 7386 srs->srs_tx.st_arg2 = NULL;
7133 7387 mac_srs_stat_delete(srs);
7134 7388 }
7135 7389 }
7136 7390 if (share != 0)
7137 7391 mip->mi_share_capab.ms_sremove(share, grp->mrg_driver);
7138 7392
7139 7393 /* move the ring back to the pool */
7140 7394 if (mip->mi_tx_group_type == MAC_GROUP_TYPE_DYNAMIC) {
7141 7395 while ((ring = grp->mrg_rings) != NULL)
7142 7396 (void) mac_group_mov_ring(mip, defgrp, ring);
7143 7397 }
7144 7398 mac_stop_group(grp);
7145 7399 mip->mi_tx_group_free++;
7146 7400 }
7147 7401
7148 7402 /*
7149 7403 * Disassociate a MAC client from a group, i.e go through the rings in the
7150 7404 * group and delete all the soft rings tied to them.
7151 7405 */
7152 7406 static void
7153 7407 mac_tx_dismantle_soft_rings(mac_group_t *fgrp, flow_entry_t *flent)
7154 7408 {
7155 7409 mac_client_impl_t *mcip = flent->fe_mcip;
7156 7410 mac_soft_ring_set_t *tx_srs;
7157 7411 mac_srs_tx_t *tx;
7158 7412 mac_ring_t *ring;
7159 7413
7160 7414 tx_srs = flent->fe_tx_srs;
7161 7415 tx = &tx_srs->srs_tx;
7162 7416
7163 7417 /* Single-ring case: we haven't created any soft rings */
7164 7418 if (tx->st_mode == SRS_TX_BW || tx->st_mode == SRS_TX_SERIALIZE ||
7165 7419 tx->st_mode == SRS_TX_DEFAULT) {
7166 7420 tx->st_arg2 = NULL;
7167 7421 mac_srs_stat_delete(tx_srs);
7168 7422 /* Fanout case, where we have to dismantle the soft rings */
7169 7423 } else {
7170 7424 for (ring = fgrp->mrg_rings; ring != NULL;
7171 7425 ring = ring->mr_next) {
7172 7426 ASSERT(mac_tx_srs_ring_present(tx_srs, ring));
7173 7427 mac_tx_invoke_callbacks(mcip,
7174 7428 (mac_tx_cookie_t)mac_tx_srs_get_soft_ring(tx_srs,
7175 7429 ring));
7176 7430 mac_tx_srs_del_ring(tx_srs, ring);
7177 7431 }
7178 7432 ASSERT(tx->st_arg2 == NULL);
7179 7433 }
7180 7434 }
7181 7435
7182 7436 /*
7183 7437 * Switch the MAC client from one group to another. This means we need
7184 7438 * to remove the MAC client, tear down the SRSs and revert the group state.
7185 7439 * Then, we add the client to the destination group, set the SRSs etc.
7186 7440 */
7187 7441 void
7188 7442 mac_tx_switch_group(mac_client_impl_t *mcip, mac_group_t *fgrp,
7189 7443 mac_group_t *tgrp)
7190 7444 {
7191 7445 mac_client_impl_t *group_only_mcip;
7192 7446 mac_impl_t *mip = mcip->mci_mip;
7193 7447 flow_entry_t *flent = mcip->mci_flent;
7194 7448 mac_group_t *defgrp;
7195 7449 mac_grp_client_t *mgcp;
7196 7450 mac_client_impl_t *gmcip;
7197 7451 flow_entry_t *gflent;
7198 7452
7199 7453 defgrp = MAC_DEFAULT_TX_GROUP(mip);
7200 7454 ASSERT(fgrp == flent->fe_tx_ring_group);
7201 7455
7202 7456 if (fgrp == defgrp) {
7203 7457 /*
7204 7458 * If this is the primary we need to find any VLANs on
7205 7459 * the primary and move them too.
7206 7460 */
7207 7461 mac_group_remove_client(fgrp, mcip);
7208 7462 mac_tx_dismantle_soft_rings(fgrp, flent);
7209 - if (mcip->mci_unicast->ma_nusers > 1) {
7463 + if (mac_check_macaddr_shared(mcip->mci_unicast)) {
7210 7464 mgcp = fgrp->mrg_clients;
7211 7465 while (mgcp != NULL) {
7212 7466 gmcip = mgcp->mgc_client;
7213 7467 mgcp = mgcp->mgc_next;
7214 7468 if (mcip->mci_unicast != gmcip->mci_unicast)
7215 7469 continue;
7216 7470 mac_tx_client_quiesce(
7217 7471 (mac_client_handle_t)gmcip);
7218 7472
7219 7473 gflent = gmcip->mci_flent;
7220 7474 mac_group_remove_client(fgrp, gmcip);
7221 7475 mac_tx_dismantle_soft_rings(fgrp, gflent);
7222 7476
7223 7477 mac_group_add_client(tgrp, gmcip);
7224 7478 gflent->fe_tx_ring_group = tgrp;
7225 7479 /* We could directly set this to SHARED */
7226 7480 tgrp->mrg_state = mac_group_next_state(tgrp,
7227 7481 &group_only_mcip, defgrp, B_FALSE);
7228 7482
7229 7483 mac_tx_srs_group_setup(gmcip, gflent,
7230 7484 SRST_LINK);
7231 7485 mac_fanout_setup(gmcip, gflent,
7232 7486 MCIP_RESOURCE_PROPS(gmcip), mac_rx_deliver,
7233 7487 gmcip, NULL, NULL);
7234 7488
7235 7489 mac_tx_client_restart(
7236 7490 (mac_client_handle_t)gmcip);
7237 7491 }
7238 7492 }
7239 7493 if (MAC_GROUP_NO_CLIENT(fgrp)) {
7240 7494 mac_ring_t *ring;
7241 7495 int cnt;
7242 7496 int ringcnt;
7243 7497
7244 7498 fgrp->mrg_state = MAC_GROUP_STATE_REGISTERED;
7245 7499 /*
7246 7500 * Additionally, we also need to stop all
7247 7501 * the rings in the default group, except
7248 7502 * the default ring. The reason being
7249 7503 * this group won't be released since it is
7250 7504 * the default group, so the rings won't
7251 7505 * be stopped otherwise.
7252 7506 */
7253 7507 ringcnt = fgrp->mrg_cur_count;
7254 7508 ring = fgrp->mrg_rings;
7255 7509 for (cnt = 0; cnt < ringcnt; cnt++) {
7256 7510 if (ring->mr_state == MR_INUSE &&
7257 7511 ring !=
7258 7512 (mac_ring_t *)mip->mi_default_tx_ring) {
7259 7513 mac_stop_ring(ring);
7260 7514 ring->mr_flag = 0;
7261 7515 }
7262 7516 ring = ring->mr_next;
7263 7517 }
7264 7518 } else if (MAC_GROUP_ONLY_CLIENT(fgrp) != NULL) {
7265 7519 fgrp->mrg_state = MAC_GROUP_STATE_RESERVED;
7266 7520 } else {
7267 7521 ASSERT(fgrp->mrg_state == MAC_GROUP_STATE_SHARED);
7268 7522 }
7269 7523 } else {
7270 7524 /*
7271 7525 * We could have VLANs sharing the non-default group with
7272 7526 * the primary.
7273 7527 */
7274 7528 mgcp = fgrp->mrg_clients;
7275 7529 while (mgcp != NULL) {
7276 7530 gmcip = mgcp->mgc_client;
7277 7531 mgcp = mgcp->mgc_next;
7278 7532 if (gmcip == mcip)
7279 7533 continue;
7280 7534 mac_tx_client_quiesce((mac_client_handle_t)gmcip);
7281 7535 gflent = gmcip->mci_flent;
7282 7536
7283 7537 mac_group_remove_client(fgrp, gmcip);
7284 7538 mac_tx_dismantle_soft_rings(fgrp, gflent);
7285 7539
7286 7540 mac_group_add_client(tgrp, gmcip);
7287 7541 gflent->fe_tx_ring_group = tgrp;
7288 7542 /* We could directly set this to SHARED */
7289 7543 tgrp->mrg_state = mac_group_next_state(tgrp,
7290 7544 &group_only_mcip, defgrp, B_FALSE);
7291 7545 mac_tx_srs_group_setup(gmcip, gflent, SRST_LINK);
7292 7546 mac_fanout_setup(gmcip, gflent,
7293 7547 MCIP_RESOURCE_PROPS(gmcip), mac_rx_deliver,
7294 7548 gmcip, NULL, NULL);
7295 7549
7296 7550 mac_tx_client_restart((mac_client_handle_t)gmcip);
7297 7551 }
7298 7552 mac_group_remove_client(fgrp, mcip);
7299 7553 mac_release_tx_group(mcip, fgrp);
7300 7554 fgrp->mrg_state = MAC_GROUP_STATE_REGISTERED;
7301 7555 }
7302 7556
7303 7557 /* Add it to the tgroup */
7304 7558 mac_group_add_client(tgrp, mcip);
7305 7559 flent->fe_tx_ring_group = tgrp;
7306 7560 tgrp->mrg_state = mac_group_next_state(tgrp, &group_only_mcip,
7307 7561 defgrp, B_FALSE);
7308 7562
7309 7563 mac_tx_srs_group_setup(mcip, flent, SRST_LINK);
7310 7564 mac_fanout_setup(mcip, flent, MCIP_RESOURCE_PROPS(mcip),
7311 7565 mac_rx_deliver, mcip, NULL, NULL);
7312 7566 }
7313 7567
7314 7568 /*
7315 7569 * This is a 1-time control path activity initiated by the client (IP).
7316 7570 * The mac perimeter protects against other simultaneous control activities,
7317 7571 * for example an ioctl that attempts to change the degree of fanout and
7318 7572 * increase or decrease the number of softrings associated with this Tx SRS.
7319 7573 */
7320 7574 static mac_tx_notify_cb_t *
7321 7575 mac_client_tx_notify_add(mac_client_impl_t *mcip,
7322 7576 mac_tx_notify_t notify, void *arg)
7323 7577 {
7324 7578 mac_cb_info_t *mcbi;
7325 7579 mac_tx_notify_cb_t *mtnfp;
7326 7580
7327 7581 ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));
7328 7582
7329 7583 mtnfp = kmem_zalloc(sizeof (mac_tx_notify_cb_t), KM_SLEEP);
7330 7584 mtnfp->mtnf_fn = notify;
7331 7585 mtnfp->mtnf_arg = arg;
7332 7586 mtnfp->mtnf_link.mcb_objp = mtnfp;
7333 7587 mtnfp->mtnf_link.mcb_objsize = sizeof (mac_tx_notify_cb_t);
7334 7588 mtnfp->mtnf_link.mcb_flags = MCB_TX_NOTIFY_CB_T;
7335 7589
7336 7590 mcbi = &mcip->mci_tx_notify_cb_info;
7337 7591 mutex_enter(mcbi->mcbi_lockp);
7338 7592 mac_callback_add(mcbi, &mcip->mci_tx_notify_cb_list, &mtnfp->mtnf_link);
7339 7593 mutex_exit(mcbi->mcbi_lockp);
7340 7594 return (mtnfp);
7341 7595 }
7342 7596
7343 7597 static void
7344 7598 mac_client_tx_notify_remove(mac_client_impl_t *mcip, mac_tx_notify_cb_t *mtnfp)
7345 7599 {
7346 7600 mac_cb_info_t *mcbi;
7347 7601 mac_cb_t **cblist;
7348 7602
7349 7603 ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));
7350 7604
7351 7605 if (!mac_callback_find(&mcip->mci_tx_notify_cb_info,
7352 7606 &mcip->mci_tx_notify_cb_list, &mtnfp->mtnf_link)) {
7353 7607 cmn_err(CE_WARN,
7354 7608 "mac_client_tx_notify_remove: callback not "
7355 7609 "found, mcip 0x%p mtnfp 0x%p", (void *)mcip, (void *)mtnfp);
7356 7610 return;
7357 7611 }
7358 7612
7359 7613 mcbi = &mcip->mci_tx_notify_cb_info;
7360 7614 cblist = &mcip->mci_tx_notify_cb_list;
7361 7615 mutex_enter(mcbi->mcbi_lockp);
7362 7616 if (mac_callback_remove(mcbi, cblist, &mtnfp->mtnf_link))
7363 7617 kmem_free(mtnfp, sizeof (mac_tx_notify_cb_t));
7364 7618 else
7365 7619 mac_callback_remove_wait(&mcip->mci_tx_notify_cb_info);
7366 7620 mutex_exit(mcbi->mcbi_lockp);
7367 7621 }
7368 7622
7369 7623 /*
7370 7624 * mac_client_tx_notify():
7371 7625 * call to add and remove flow control callback routine.
7372 7626 */
7373 7627 mac_tx_notify_handle_t
7374 7628 mac_client_tx_notify(mac_client_handle_t mch, mac_tx_notify_t callb_func,
7375 7629 void *ptr)
7376 7630 {
7377 7631 mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
7378 7632 mac_tx_notify_cb_t *mtnfp = NULL;
7379 7633
7380 7634 i_mac_perim_enter(mcip->mci_mip);
7381 7635
7382 7636 if (callb_func != NULL) {
7383 7637 /* Add a notify callback */
7384 7638 mtnfp = mac_client_tx_notify_add(mcip, callb_func, ptr);
7385 7639 } else {
7386 7640 mac_client_tx_notify_remove(mcip, (mac_tx_notify_cb_t *)ptr);
7387 7641 }
7388 7642 i_mac_perim_exit(mcip->mci_mip);
7389 7643
7390 7644 return ((mac_tx_notify_handle_t)mtnfp);
7391 7645 }
7392 7646
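A hedged usage sketch of the add/remove contract above: a non-NULL function registers a callback and returns its handle, while a NULL function removes the callback whose handle is passed through ptr. The client names are hypothetical, and the mac_tx_notify_t signature of (void *arg, mac_tx_cookie_t cookie) is assumed:

/* Hypothetical client callback; names are illustrative only. */
static void
xyz_tx_notify(void *arg, mac_tx_cookie_t cookie)
{
	/* resume whatever transmitter was blocked on this cookie */
}

mac_tx_notify_handle_t h;

/* Register for Tx flow-control wakeups... */
h = mac_client_tx_notify(mch, xyz_tx_notify, xyz_softc);

/* ...and later remove the callback by its handle. */
(void) mac_client_tx_notify(mch, NULL, (void *)h);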
7393 7647 void
7394 7648 mac_bridge_vectors(mac_bridge_tx_t txf, mac_bridge_rx_t rxf,
7395 7649 mac_bridge_ref_t reff, mac_bridge_ls_t lsf)
7396 7650 {
7397 7651 mac_bridge_tx_cb = txf;
7398 7652 mac_bridge_rx_cb = rxf;
7399 7653 mac_bridge_ref_cb = reff;
7400 7654 mac_bridge_ls_cb = lsf;
7401 7655 }
7402 7656
7403 7657 int
7404 7658 mac_bridge_set(mac_handle_t mh, mac_handle_t link)
7405 7659 {
7406 7660 mac_impl_t *mip = (mac_impl_t *)mh;
7407 7661 int retv;
7408 7662
7409 7663 mutex_enter(&mip->mi_bridge_lock);
7410 7664 if (mip->mi_bridge_link == NULL) {
7411 7665 mip->mi_bridge_link = link;
7412 7666 retv = 0;
7413 7667 } else {
7414 7668 retv = EBUSY;
7415 7669 }
7416 7670 mutex_exit(&mip->mi_bridge_lock);
7417 7671 if (retv == 0) {
7418 7672 mac_poll_state_change(mh, B_FALSE);
7419 7673 mac_capab_update(mh);
7420 7674 }
7421 7675 return (retv);
7422 7676 }
7423 7677
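A hedged caller sketch of the contract above: only one bridge link can be attached at a time, so a losing racer sees EBUSY, and a successful set must later be balanced by mac_bridge_clear() with the same link (variable names hypothetical):

int err;

if ((err = mac_bridge_set(mh, blink)) != 0)
	return (err);		/* EBUSY: a bridge link is already set */

/* Bridging active: polling disabled, capabilities re-advertised. */

mac_bridge_clear(mh, blink);	/* must pass the link that was set */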
7424 7678 /*
7425 7679 * Disable bridging on the indicated link.
7426 7680 */
7427 7681 void
7428 7682 mac_bridge_clear(mac_handle_t mh, mac_handle_t link)
7429 7683 {
7430 7684 mac_impl_t *mip = (mac_impl_t *)mh;
7431 7685
7432 7686 mutex_enter(&mip->mi_bridge_lock);
7433 7687 ASSERT(mip->mi_bridge_link == link);
7434 7688 mip->mi_bridge_link = NULL;
7435 7689 mutex_exit(&mip->mi_bridge_lock);
7436 7690 mac_poll_state_change(mh, B_TRUE);
7437 7691 mac_capab_update(mh);
7438 7692 }
7439 7693
7440 7694 void
7441 7695 mac_no_active(mac_handle_t mh)
7442 7696 {
7443 7697 mac_impl_t *mip = (mac_impl_t *)mh;
7444 7698
7445 7699 i_mac_perim_enter(mip);
7446 7700 mip->mi_state_flags |= MIS_NO_ACTIVE;
7447 7701 i_mac_perim_exit(mip);
7448 7702 }
7449 7703
7450 7704 /*
7451 7705 * Walk the primary VLAN clients whenever the primary's rings property
7452 7706 * changes and update the mac_resource_props_t for the VLAN's client.
7453 7707 * We need to do this since we don't support setting these properties
7454 7708 * on the primary's VLAN clients, but the VLAN clients have to
7455 - * follow the primary w.r.t the rings property;
7709 + * follow the primary w.r.t the rings property.
7456 7710 */
7457 7711 void
7458 7712 mac_set_prim_vlan_rings(mac_impl_t *mip, mac_resource_props_t *mrp)
7459 7713 {
7460 7714 mac_client_impl_t *vmcip;
7461 7715 mac_resource_props_t *vmrp;
7462 7716
7463 7717 for (vmcip = mip->mi_clients_list; vmcip != NULL;
7464 7718 vmcip = vmcip->mci_client_next) {
7465 7719 if (!(vmcip->mci_flent->fe_type & FLOW_PRIMARY_MAC) ||
7466 7720 mac_client_vid((mac_client_handle_t)vmcip) ==
7467 7721 VLAN_ID_NONE) {
7468 7722 continue;
7469 7723 }
7470 7724 vmrp = MCIP_RESOURCE_PROPS(vmcip);
7471 7725
7472 7726 vmrp->mrp_nrxrings = mrp->mrp_nrxrings;
7473 7727 if (mrp->mrp_mask & MRP_RX_RINGS)
7474 7728 vmrp->mrp_mask |= MRP_RX_RINGS;
7475 7729 else if (vmrp->mrp_mask & MRP_RX_RINGS)
7476 7730 vmrp->mrp_mask &= ~MRP_RX_RINGS;
7477 7731
7478 7732 vmrp->mrp_ntxrings = mrp->mrp_ntxrings;
7479 7733 if (mrp->mrp_mask & MRP_TX_RINGS)
7480 7734 vmrp->mrp_mask |= MRP_TX_RINGS;
7481 7735 else if (vmrp->mrp_mask & MRP_TX_RINGS)
7482 7736 vmrp->mrp_mask &= ~MRP_TX_RINGS;
7483 7737
7484 7738 if (mrp->mrp_mask & MRP_RXRINGS_UNSPEC)
7485 7739 vmrp->mrp_mask |= MRP_RXRINGS_UNSPEC;
7486 7740 else
7487 7741 vmrp->mrp_mask &= ~MRP_RXRINGS_UNSPEC;
7488 7742
7489 7743 if (mrp->mrp_mask & MRP_TXRINGS_UNSPEC)
7490 7744 vmrp->mrp_mask |= MRP_TXRINGS_UNSPEC;
7491 7745 else
7492 7746 vmrp->mrp_mask &= ~MRP_TXRINGS_UNSPEC;
7493 7747 }
7494 7748 }
7495 7749
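Each rings property above is propagated with the same idiom: copy the value unconditionally, then make the VLAN client's mask bit track the primary's. A standalone model of that mirroring, with hypothetical flag and structure names:

#include <stdio.h>
#include <stdint.h>

#define	F_RX_RINGS	0x1

typedef struct props {
	uint32_t	mask;
	uint32_t	nrxrings;
} props_t;

static void
mirror_rx_rings(const props_t *src, props_t *dst)
{
	dst->nrxrings = src->nrxrings;
	if (src->mask & F_RX_RINGS)	/* mask bit tracks the source */
		dst->mask |= F_RX_RINGS;
	else
		dst->mask &= ~F_RX_RINGS;
}

int
main(void)
{
	props_t prim = { F_RX_RINGS, 4 };
	props_t vlan = { 0, 0 };

	mirror_rx_rings(&prim, &vlan);
	printf("vlan: mask=%x nrxrings=%u\n", vlan.mask, vlan.nrxrings);
	return (0);
}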
7496 7750 /*
7497 7751 * We are adding or removing ring(s) from a group. The source for taking
7498 7752 * rings is the default group. The destination for giving rings back is
7499 7753 * the default group.
7500 7754 */
7501 7755 int
7502 7756 mac_group_ring_modify(mac_client_impl_t *mcip, mac_group_t *group,
7503 7757 mac_group_t *defgrp)
7504 7758 {
7505 7759 mac_resource_props_t *mrp = MCIP_RESOURCE_PROPS(mcip);
7506 7760 uint_t modify;
7507 7761 int count;
7508 7762 mac_ring_t *ring;
7509 7763 mac_ring_t *next;
7510 7764 mac_impl_t *mip = mcip->mci_mip;
7511 7765 mac_ring_t **rings;
7512 7766 uint_t ringcnt;
7513 7767 int i = 0;
7514 7768 boolean_t rx_group = group->mrg_type == MAC_RING_TYPE_RX;
7515 7769 int start;
7516 7770 int end;
7517 7771 mac_group_t *tgrp;
7518 7772 int j;
7519 7773 int rv = 0;
7520 7774
7521 7775 /*
7522 7776 * If we are asked for just a group, we give 1 ring, else
7523 7777 * the specified number of rings.
7524 7778 */
7525 7779 if (rx_group) {
7526 7780 ringcnt = (mrp->mrp_mask & MRP_RXRINGS_UNSPEC) ? 1:
7527 7781 mrp->mrp_nrxrings;
7528 7782 } else {
7529 7783 ringcnt = (mrp->mrp_mask & MRP_TXRINGS_UNSPEC) ? 1:
7530 7784 mrp->mrp_ntxrings;
7531 7785 }
7532 7786
7533 7787 /* don't allow modifying rings for a share for now. */
7534 7788 ASSERT(mcip->mci_share == 0);
7535 7789
7536 7790 if (ringcnt == group->mrg_cur_count)
7537 7791 return (0);
7538 7792
7539 7793 if (group->mrg_cur_count > ringcnt) {
7540 7794 modify = group->mrg_cur_count - ringcnt;
7541 7795 if (rx_group) {
7542 7796 if (mip->mi_rx_donor_grp == group) {
7543 7797 ASSERT(mac_is_primary_client(mcip));
7544 7798 mip->mi_rx_donor_grp = defgrp;
7545 7799 } else {
7546 7800 defgrp = mip->mi_rx_donor_grp;
7547 7801 }
7548 7802 }
7549 7803 ring = group->mrg_rings;
7550 7804 rings = kmem_alloc(modify * sizeof (mac_ring_handle_t),
7551 7805 KM_SLEEP);
7552 7806 j = 0;
7553 7807 for (count = 0; count < modify; count++) {
7554 7808 next = ring->mr_next;
7555 7809 rv = mac_group_mov_ring(mip, defgrp, ring);
7556 7810 if (rv != 0) {
7557 7811 /* cleanup on failure */
7558 7812 for (j = 0; j < count; j++) {
7559 7813 (void) mac_group_mov_ring(mip, group,
7560 7814 rings[j]);
7561 7815 }
7562 7816 break;
7563 7817 }
7564 7818 rings[j++] = ring;
7565 7819 ring = next;
7566 7820 }
7567 7821 kmem_free(rings, modify * sizeof (mac_ring_handle_t));
7568 7822 return (rv);
7569 7823 }
7570 7824 if (ringcnt >= MAX_RINGS_PER_GROUP)
7571 7825 return (EINVAL);
7572 7826
7573 7827 modify = ringcnt - group->mrg_cur_count;
7574 7828
7575 7829 if (rx_group) {
7576 7830 if (group != mip->mi_rx_donor_grp)
7577 7831 defgrp = mip->mi_rx_donor_grp;
7578 7832 else
7579 7833 /*
7580 7834 * This is the donor group with all the remaining
7581 7835 * rings. The default group now gets to be the donor.
7582 7836 */
7583 7837 mip->mi_rx_donor_grp = defgrp;
7584 7838 start = 1;
7585 7839 end = mip->mi_rx_group_count;
7586 7840 } else {
7587 7841 start = 0;
7588 7842 end = mip->mi_tx_group_count - 1;
7589 7843 }
7590 7844 /*
7591 7845 * If the default doesn't have any rings, let's see if we can
7592 7846 * take rings given to an h/w client that doesn't need it.
7593 7847 * For now, we just see if there is any one client that can donate
7594 7848 * all the required rings.
7595 7849 */
7596 7850 if (defgrp->mrg_cur_count < (modify + 1)) {
7597 7851 for (i = start; i < end; i++) {
7598 7852 if (rx_group) {
7599 7853 tgrp = &mip->mi_rx_groups[i];
7600 7854 if (tgrp == group || tgrp->mrg_state <
7601 7855 MAC_GROUP_STATE_RESERVED) {
7602 7856 continue;
7603 7857 }
7604 - mcip = MAC_GROUP_ONLY_CLIENT(tgrp);
7605 - if (mcip == NULL)
7606 - mcip = mac_get_grp_primary(tgrp);
7607 - ASSERT(mcip != NULL);
7608 - mrp = MCIP_RESOURCE_PROPS(mcip);
7609 - if ((mrp->mrp_mask & MRP_RX_RINGS) != 0)
7858 + if (i_mac_clients_hw(tgrp, MRP_RX_RINGS))
7610 7859 continue;
7860 + mcip = tgrp->mrg_clients->mgc_client;
7861 + VERIFY3P(mcip, !=, NULL);
7611 7862 if ((tgrp->mrg_cur_count +
7612 7863 defgrp->mrg_cur_count) < (modify + 1)) {
7613 7864 continue;
7614 7865 }
7615 7866 if (mac_rx_switch_group(mcip, tgrp,
7616 7867 defgrp) != 0) {
7617 7868 return (ENOSPC);
7618 7869 }
7619 7870 } else {
7620 7871 tgrp = &mip->mi_tx_groups[i];
7621 7872 if (tgrp == group || tgrp->mrg_state <
7622 7873 MAC_GROUP_STATE_RESERVED) {
7623 7874 continue;
7624 7875 }
7625 - mcip = MAC_GROUP_ONLY_CLIENT(tgrp);
7626 - if (mcip == NULL)
7627 - mcip = mac_get_grp_primary(tgrp);
7628 - mrp = MCIP_RESOURCE_PROPS(mcip);
7629 - if ((mrp->mrp_mask & MRP_TX_RINGS) != 0)
7876 + if (i_mac_clients_hw(tgrp, MRP_TX_RINGS))
7630 7877 continue;
7878 + mcip = tgrp->mrg_clients->mgc_client;
7879 + VERIFY3P(mcip, !=, NULL);
7631 7880 if ((tgrp->mrg_cur_count +
7632 7881 defgrp->mrg_cur_count) < (modify + 1)) {
7633 7882 continue;
7634 7883 }
7635 7884 /* OK, we can switch this to s/w */
7636 7885 mac_tx_client_quiesce(
7637 7886 (mac_client_handle_t)mcip);
7638 7887 mac_tx_switch_group(mcip, tgrp, defgrp);
7639 7888 mac_tx_client_restart(
7640 7889 (mac_client_handle_t)mcip);
7641 7890 }
7642 7891 }
7643 7892 if (defgrp->mrg_cur_count < (modify + 1))
7644 7893 return (ENOSPC);
7645 7894 }
7646 7895 if ((rv = i_mac_group_allocate_rings(mip, group->mrg_type, defgrp,
7647 7896 group, mcip->mci_share, modify)) != 0) {
7648 7897 return (rv);
7649 7898 }
7650 7899 return (0);
7651 7900 }
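The shrink path above records each ring it moves into the default group so that a mid-loop failure can be unwound by moving the already-transferred rings back. A minimal userland sketch of that move-then-rollback pattern (the item array and move_item() helper are hypothetical stand-ins, not the MAC APIs):

    #include <stdio.h>

    /* Hypothetical mover: returns 0 on success, nonzero on failure. */
    static int
    move_item(int item, int from, int to)
    {
            printf("move %d: group %d -> group %d\n", item, from, to);
            return (0);
    }

    /*
     * Move 'count' items from 'from' to 'to'.  On failure, move the
     * items already transferred back, mirroring the cleanup loop in
     * mac_group_ring_modify().
     */
    static int
    move_n_with_rollback(int *items, int count, int from, int to)
    {
            int n, rv = 0;

            for (n = 0; n < count; n++) {
                    if ((rv = move_item(items[n], from, to)) != 0) {
                            while (--n >= 0)
                                    (void) move_item(items[n], to, from);
                            break;
                    }
            }
            return (rv);
    }

    int
    main(void)
    {
            int rings[3] = { 0, 1, 2 };
            return (move_n_with_rollback(rings, 3, 1, 0));
    }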
7652 7901
7653 7902 /*
7654 7903 * Given the poolname in mac_resource_props, find the cpupart
7655 7904 * that is associated with this pool. The cpupart will be used
7656 7905 * later for finding the cpus to be bound to the networking threads.
7657 7906 *
7658 7907 * use_default is set B_TRUE if pools are enabled and pool_default
7659 7908 * is returned. This avoids a 2nd lookup to set the poolname
7660 7909 * for pool-effective.
7661 7910 *
7662 7911 * returns:
7663 7912 *
7664 7913 * NULL - pools are disabled or if the 'cpus' property is set.
7665 7914 * cpupart of pool_default - pools are enabled and the pool
7666 7915 * is not available or poolname is blank
7667 7916 * cpupart of named pool - pools are enabled and the pool
7668 7917 * is available.
7669 7918 */
7670 7919 cpupart_t *
7671 7920 mac_pset_find(mac_resource_props_t *mrp, boolean_t *use_default)
7672 7921 {
7673 7922 pool_t *pool;
7674 7923 cpupart_t *cpupart;
7675 7924
7676 7925 *use_default = B_FALSE;
7677 7926
7678 7927 /* CPUs property is set */
7679 7928 if (mrp->mrp_mask & MRP_CPUS)
7680 7929 return (NULL);
7681 7930
7682 7931 ASSERT(pool_lock_held());
7683 7932
7684 7933 /* Pools are disabled, no pset */
7685 7934 if (pool_state == POOL_DISABLED)
7686 7935 return (NULL);
7687 7936
7688 7937 /* Pools property is set */
7689 7938 if (mrp->mrp_mask & MRP_POOL) {
7690 7939 if ((pool = pool_lookup_pool_by_name(mrp->mrp_pool)) == NULL) {
7691 7940 /* Pool not found */
7692 7941 DTRACE_PROBE1(mac_pset_find_no_pool, char *,
7693 7942 mrp->mrp_pool);
7694 7943 *use_default = B_TRUE;
7695 7944 pool = pool_default;
7696 7945 }
7697 7946 /* Pools property is not set */
7698 7947 } else {
7699 7948 *use_default = B_TRUE;
7700 7949 pool = pool_default;
7701 7950 }
7702 7951
7703 7952 /* Find the CPU pset that corresponds to the pool */
7704 7953 mutex_enter(&cpu_lock);
7705 7954 if ((cpupart = cpupart_find(pool->pool_pset->pset_id)) == NULL) {
7706 7955 DTRACE_PROBE1(mac_find_pset_no_pset, psetid_t,
7707 7956 pool->pool_pset->pset_id);
7708 7957 }
7709 7958 mutex_exit(&cpu_lock);
7710 7959
7711 7960 return (cpupart);
7712 7961 }
7713 7962
7714 7963 void
7715 7964 mac_set_pool_effective(boolean_t use_default, cpupart_t *cpupart,
7716 7965 mac_resource_props_t *mrp, mac_resource_props_t *emrp)
7717 7966 {
7718 7967 ASSERT(pool_lock_held());
7719 7968
7720 7969 if (cpupart != NULL) {
7721 7970 emrp->mrp_mask |= MRP_POOL;
7722 7971 if (use_default) {
7723 7972 (void) strcpy(emrp->mrp_pool,
7724 7973 "pool_default");
7725 7974 } else {
7726 7975 ASSERT(strlen(mrp->mrp_pool) != 0);
7727 7976 (void) strcpy(emrp->mrp_pool,
7728 7977 mrp->mrp_pool);
7729 7978 }
7730 7979 } else {
7731 7980 emrp->mrp_mask &= ~MRP_POOL;
7732 7981 bzero(emrp->mrp_pool, MAXPATHLEN);
7733 7982 }
7734 7983 }
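Taken together, mac_pset_find() and mac_set_pool_effective() give callers a lookup-then-record pattern under pool_lock(), as mac_pool_link_update() below shows. The lookup contract itself reduces to a small decision function; this is a toy userland model of the rules in the block comment above (names and booleans are hypothetical simplifications):

    #include <stdbool.h>
    #include <stdio.h>

    /*
     * Toy model of the mac_pset_find() contract:
     *   'cpus' property set        -> no pset (NULL)
     *   pools disabled             -> no pset (NULL)
     *   pool unset or not found    -> pool_default, use_default = true
     *   named pool found           -> that pool
     */
    static const char *
    pick_pool(bool cpus_set, bool pools_enabled, const char *name,
        bool name_exists, bool *use_default)
    {
            *use_default = false;
            if (cpus_set || !pools_enabled)
                    return (NULL);
            if (name == NULL || !name_exists) {
                    *use_default = true;
                    return ("pool_default");
            }
            return (name);
    }

    int
    main(void)
    {
            bool ud;
            const char *p = pick_pool(false, true, "pool_web", false, &ud);
            printf("%s use_default=%d\n", p, ud);   /* pool_default 1 */
            return (0);
    }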
7735 7984
7736 7985 struct mac_pool_arg {
7737 7986 char mpa_poolname[MAXPATHLEN];
7738 7987 pool_event_t mpa_what;
7739 7988 };
7740 7989
7741 7990 /*ARGSUSED*/
7742 7991 static uint_t
7743 7992 mac_pool_link_update(mod_hash_key_t key, mod_hash_val_t *val, void *arg)
7744 7993 {
7745 7994 struct mac_pool_arg *mpa = arg;
7746 7995 mac_impl_t *mip = (mac_impl_t *)val;
7747 7996 mac_client_impl_t *mcip;
7748 7997 mac_resource_props_t *mrp, *emrp;
7749 7998 boolean_t pool_update = B_FALSE;
7750 7999 boolean_t pool_clear = B_FALSE;
7751 8000 boolean_t use_default = B_FALSE;
7752 8001 cpupart_t *cpupart = NULL;
7753 8002
7754 8003 mrp = kmem_zalloc(sizeof (*mrp), KM_SLEEP);
7755 8004 i_mac_perim_enter(mip);
7756 8005 for (mcip = mip->mi_clients_list; mcip != NULL;
7757 8006 mcip = mcip->mci_client_next) {
7758 8007 pool_update = B_FALSE;
7759 8008 pool_clear = B_FALSE;
7760 8009 use_default = B_FALSE;
7761 8010 mac_client_get_resources((mac_client_handle_t)mcip, mrp);
7762 8011 emrp = MCIP_EFFECTIVE_PROPS(mcip);
7763 8012
7764 8013 /*
7765 8014 * When pools are enabled
7766 8015 */
7767 8016 if ((mpa->mpa_what == POOL_E_ENABLE) &&
7768 8017 ((mrp->mrp_mask & MRP_CPUS) == 0)) {
7769 8018 mrp->mrp_mask |= MRP_POOL;
7770 8019 pool_update = B_TRUE;
7771 8020 }
7772 8021
7773 8022 /*
7774 8023 * When pools are disabled
7775 8024 */
7776 8025 if ((mpa->mpa_what == POOL_E_DISABLE) &&
7777 8026 ((mrp->mrp_mask & MRP_CPUS) == 0)) {
7778 8027 mrp->mrp_mask |= MRP_POOL;
7779 8028 pool_clear = B_TRUE;
7780 8029 }
7781 8030
7782 8031 /*
7783 8032 * Look for links with the pool property set and the poolname
7784 8033 * matching the one which is changing.
7785 8034 */
7786 8035 if (strcmp(mrp->mrp_pool, mpa->mpa_poolname) == 0) {
7787 8036 /*
7788 8037 * The pool associated with the link has changed.
7789 8038 */
7790 8039 if (mpa->mpa_what == POOL_E_CHANGE) {
7791 8040 mrp->mrp_mask |= MRP_POOL;
7792 8041 pool_update = B_TRUE;
7793 8042 }
7794 8043 }
7795 8044
7796 8045 /*
7797 8046 * This link is associated with pool_default and
7798 8047 * pool_default has changed.
7799 8048 */
7800 8049 if ((mpa->mpa_what == POOL_E_CHANGE) &&
7801 8050 (strcmp(emrp->mrp_pool, "pool_default") == 0) &&
7802 8051 (strcmp(mpa->mpa_poolname, "pool_default") == 0)) {
7803 8052 mrp->mrp_mask |= MRP_POOL;
7804 8053 pool_update = B_TRUE;
7805 8054 }
7806 8055
7807 8056 /*
7808 8057 * Get new list of cpus for the pool, bind network
7809 8058 * threads to new list of cpus and update resources.
7810 8059 */
7811 8060 if (pool_update) {
7812 8061 if (MCIP_DATAPATH_SETUP(mcip)) {
7813 8062 pool_lock();
7814 8063 cpupart = mac_pset_find(mrp, &use_default);
7815 8064 mac_fanout_setup(mcip, mcip->mci_flent, mrp,
7816 8065 mac_rx_deliver, mcip, NULL, cpupart);
7817 8066 mac_set_pool_effective(use_default, cpupart,
7818 8067 mrp, emrp);
7819 8068 pool_unlock();
7820 8069 }
7821 8070 mac_update_resources(mrp, MCIP_RESOURCE_PROPS(mcip),
7822 8071 B_FALSE);
7823 8072 }
7824 8073
7825 8074 /*
7826 8075 * Clear the effective pool and bind network threads
7827 8076 * to any available CPU.
7828 8077 */
7829 8078 if (pool_clear) {
7830 8079 if (MCIP_DATAPATH_SETUP(mcip)) {
7831 8080 emrp->mrp_mask &= ~MRP_POOL;
7832 8081 bzero(emrp->mrp_pool, MAXPATHLEN);
7833 8082 mac_fanout_setup(mcip, mcip->mci_flent, mrp,
7834 8083 mac_rx_deliver, mcip, NULL, NULL);
7835 8084 }
7836 8085 mac_update_resources(mrp, MCIP_RESOURCE_PROPS(mcip),
7837 8086 B_FALSE);
7838 8087 }
7839 8088 }
7840 8089 i_mac_perim_exit(mip);
7841 8090 kmem_free(mrp, sizeof (*mrp));
7842 8091 return (MH_WALK_CONTINUE);
7843 8092 }
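The walker above boils down to a per-client decision: pool enable and disable events rebind or clear unconditionally (unless the 'cpus' property is set), while a change event only matters to links whose pool, or whose effective default pool, is the one that changed. A toy model of that decision (event names and the +1/-1 encoding are hypothetical):

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    typedef enum { EV_ENABLE, EV_DISABLE, EV_CHANGE } pev_t;

    /*
     * Returns +1 to rebind the link's threads to the pool's cpus,
     * -1 to clear the effective pool, 0 for no action.
     */
    static int
    pool_action(pev_t ev, bool cpus_set, const char *link_pool,
        const char *eff_pool, const char *ev_pool)
    {
            if (ev == EV_ENABLE)
                    return (cpus_set ? 0 : 1);
            if (ev == EV_DISABLE)
                    return (cpus_set ? 0 : -1);
            /* EV_CHANGE: the named pool, or the default we ride on. */
            if (strcmp(link_pool, ev_pool) == 0)
                    return (1);
            if (strcmp(eff_pool, "pool_default") == 0 &&
                strcmp(ev_pool, "pool_default") == 0)
                    return (1);
            return (0);
    }

    int
    main(void)
    {
            printf("%d\n", pool_action(EV_CHANGE, false, "pool_web",
                "", "pool_web"));       /* 1: rebind */
            return (0);
    }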
7844 8093
7845 8094 static void
7846 8095 mac_pool_update(void *arg)
7847 8096 {
7848 8097 mod_hash_walk(i_mac_impl_hash, mac_pool_link_update, arg);
7849 8098 kmem_free(arg, sizeof (struct mac_pool_arg));
7850 8099 }
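mac_pool_update() drives the per-link work through mod_hash_walk(), which invokes mac_pool_link_update() once per registered mac_impl_t until the callback stops the walk. A toy userland analog of that walk-with-callback shape (the walker and its table are stand-ins, not the mod_hash API):

    #include <stdio.h>

    #define WALK_CONTINUE   0       /* like MH_WALK_CONTINUE */
    #define WALK_TERMINATE  1

    typedef int (*walk_cb_t)(int key, void *val, void *arg);

    /* Visit every entry until the callback asks to stop. */
    static void
    walk(int *keys, void **vals, int n, walk_cb_t cb, void *arg)
    {
            for (int i = 0; i < n; i++) {
                    if (cb(keys[i], vals[i], arg) != WALK_CONTINUE)
                            break;
            }
    }

    static int
    print_entry(int key, void *val, void *arg)
    {
            (void) arg;
            printf("key %d -> %s\n", key, (const char *)val);
            return (WALK_CONTINUE);
    }

    int
    main(void)
    {
            int keys[2] = { 1, 2 };
            void *vals[2] = { "mip-a", "mip-b" };
            walk(keys, vals, 2, print_entry, NULL);
            return (0);
    }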
7851 8100
7852 8101 /*
7853 8102 * Callback function to be executed when a noteworthy pool event
7854 8103 * takes place.
7855 8104 */
7856 8105 /* ARGSUSED */
7857 8106 static void
7858 8107 mac_pool_event_cb(pool_event_t what, poolid_t id, void *arg)
7859 8108 {
7860 8109 pool_t *pool;
7861 8110 char *poolname = NULL;
7862 8111 struct mac_pool_arg *mpa;
7863 8112
7864 8113 pool_lock();
7865 8114 mpa = kmem_zalloc(sizeof (struct mac_pool_arg), KM_SLEEP);
7866 8115
7867 8116 switch (what) {
7868 8117 case POOL_E_ENABLE:
7869 8118 case POOL_E_DISABLE:
7870 8119 break;
7871 8120
7872 8121 case POOL_E_CHANGE:
7873 8122 pool = pool_lookup_pool_by_id(id);
7874 8123 if (pool == NULL) {
7875 8124 kmem_free(mpa, sizeof (struct mac_pool_arg));
7876 8125 pool_unlock();
7877 8126 return;
7878 8127 }
7879 8128 pool_get_name(pool, &poolname);
7880 8129 (void) strlcpy(mpa->mpa_poolname, poolname,
7881 8130 sizeof (mpa->mpa_poolname));
7882 8131 break;
7883 8132
7884 8133 default:
7885 8134 kmem_free(mpa, sizeof (struct mac_pool_arg));
7886 8135 pool_unlock();
7887 8136 return;
7888 8137 }
7889 8138 pool_unlock();
7890 8139
7891 8140 mpa->mpa_what = what;
7892 8141
7893 8142 mac_pool_update(mpa);
7894 8143 }
7895 8144
7896 8145 /*
7897 8146 * Set effective rings property. This could be called from datapath_setup/
7898 8147 * datapath_teardown or set-linkprop.
7899 8148 * If the group is reserved we just go ahead and set the effective rings.
7900 - * Additionally, for TX this could mean the default group has lost/gained
8149 + * Additionally, for TX this could mean the default group has lost/gained
7901 8150 * some rings, so if the default group is reserved, we need to adjust the
7902 8151 * effective rings for the default group clients. For RX, if we are working
7903 - * with the non-default group, we just need * to reset the effective props
8152 + * with the non-default group, we just need to reset the effective props
7904 8153 * for the default group clients.
7905 8154 */
7906 8155 void
7907 8156 mac_set_rings_effective(mac_client_impl_t *mcip)
7908 8157 {
7909 8158 mac_impl_t *mip = mcip->mci_mip;
7910 8159 mac_group_t *grp;
7911 8160 mac_group_t *defgrp;
7912 8161 flow_entry_t *flent = mcip->mci_flent;
7913 8162 mac_resource_props_t *emrp = MCIP_EFFECTIVE_PROPS(mcip);
7914 8163 mac_grp_client_t *mgcp;
7915 8164 mac_client_impl_t *gmcip;
7916 8165
7917 8166 grp = flent->fe_rx_ring_group;
7918 8167 if (grp != NULL) {
7919 8168 defgrp = MAC_DEFAULT_RX_GROUP(mip);
7920 8169 /*
7921 8170 * If we have reserved a group, set the effective rings
7922 8171 * to the ring count in the group.
7923 8172 */
7924 8173 if (grp->mrg_state == MAC_GROUP_STATE_RESERVED) {
7925 8174 emrp->mrp_mask |= MRP_RX_RINGS;
7926 8175 emrp->mrp_nrxrings = grp->mrg_cur_count;
7927 8176 }
7928 8177
7929 8178 /*
7930 8179 * We go through the clients in the shared group and
7931 8180 * reset the effective properties. It is possible this
7932 8181 * might have already been done for some client (i.e.
7933 8182 * if some client is being moved to a group that is
7934 8183 * already shared). The case where the default group is
7935 8184 * RESERVED is taken care of above (note in the RX side if
7936 8185 * there is a non-default group, the default group is always
7937 8186 * SHARED).
7938 8187 */
7939 8188 if (grp != defgrp || grp->mrg_state == MAC_GROUP_STATE_SHARED) {
7940 8189 if (grp->mrg_state == MAC_GROUP_STATE_SHARED)
7941 8190 mgcp = grp->mrg_clients;
7942 8191 else
7943 8192 mgcp = defgrp->mrg_clients;
7944 8193 while (mgcp != NULL) {
7945 8194 gmcip = mgcp->mgc_client;
7946 8195 emrp = MCIP_EFFECTIVE_PROPS(gmcip);
7947 8196 if (emrp->mrp_mask & MRP_RX_RINGS) {
7948 8197 emrp->mrp_mask &= ~MRP_RX_RINGS;
7949 8198 emrp->mrp_nrxrings = 0;
7950 8199 }
7951 8200 mgcp = mgcp->mgc_next;
7952 8201 }
7953 8202 }
7954 8203 }
7955 8204
7956 8205 /* Now the TX side */
7957 8206 grp = flent->fe_tx_ring_group;
7958 8207 if (grp != NULL) {
7959 8208 defgrp = MAC_DEFAULT_TX_GROUP(mip);
7960 8209
7961 8210 if (grp->mrg_state == MAC_GROUP_STATE_RESERVED) {
7962 8211 emrp->mrp_mask |= MRP_TX_RINGS;
7963 8212 emrp->mrp_ntxrings = grp->mrg_cur_count;
7964 8213 } else if (grp->mrg_state == MAC_GROUP_STATE_SHARED) {
7965 8214 mgcp = grp->mrg_clients;
7966 8215 while (mgcp != NULL) {
7967 8216 gmcip = mgcp->mgc_client;
7968 8217 emrp = MCIP_EFFECTIVE_PROPS(gmcip);
7969 8218 if (emrp->mrp_mask & MRP_TX_RINGS) {
7970 8219 emrp->mrp_mask &= ~MRP_TX_RINGS;
7971 8220 emrp->mrp_ntxrings = 0;
7972 8221 }
7973 8222 mgcp = mgcp->mgc_next;
7974 8223 }
7975 8224 }
7976 8225
7977 8226 /*
7978 8227 * If the group is not the default group and the default
7979 8228 * group is reserved, the ring count in the default group
7980 8229 * might have changed, update it.
7981 8230 */
7982 8231 if (grp != defgrp &&
7983 8232 defgrp->mrg_state == MAC_GROUP_STATE_RESERVED) {
7984 8233 gmcip = MAC_GROUP_ONLY_CLIENT(defgrp);
7985 8234 emrp = MCIP_EFFECTIVE_PROPS(gmcip);
7986 8235 emrp->mrp_ntxrings = defgrp->mrg_cur_count;
7987 8236 }
7988 8237 }
7989 8238 emrp = MCIP_EFFECTIVE_PROPS(mcip);
7990 8239 }
7991 8240
7992 8241 /*
7993 8242 * Check if the primary is in the default group. If so, see if we
7994 8243  * can give it an exclusive group now that another client is
7995 8244  * being configured. We take the primary out of the default group
7996 8245  * because the multicast/broadcast packets for all the clients
7997 8246  * will land in the default ring in the default group, which means
7998 8247  * any client in the default group, even if it is the only one in
7999 8248 * the group, will lose exclusive access to the rings, hence
8000 8249 * polling.
8001 8250 */
8002 8251 mac_client_impl_t *
8003 8252 mac_check_primary_relocation(mac_client_impl_t *mcip, boolean_t rxhw)
8004 8253 {
8005 8254 mac_impl_t *mip = mcip->mci_mip;
8006 8255 mac_group_t *defgrp = MAC_DEFAULT_RX_GROUP(mip);
8007 8256 flow_entry_t *flent = mcip->mci_flent;
8008 8257 mac_resource_props_t *mrp = MCIP_RESOURCE_PROPS(mcip);
8009 8258 uint8_t *mac_addr;
8010 8259 mac_group_t *ngrp;
8011 8260
8012 8261 /*
8013 8262 	 * If the primary is not in the default group, or if it is
8014 8263 	 * explicitly configured to be in the default group, or if it
8015 8264 	 * has set the RX rings property, return.
8016 8265 */
8017 8266 if (flent->fe_rx_ring_group != defgrp || mrp->mrp_mask & MRP_RX_RINGS)
8018 8267 return (NULL);
8019 8268
8020 8269 /*
8021 8270 * If the new client needs an exclusive group and we
8022 8271 * don't have another for the primary, return.
8023 8272 */
8024 8273 if (rxhw && mip->mi_rxhwclnt_avail < 2)
8025 8274 return (NULL);
8026 8275
8027 8276 mac_addr = flent->fe_flow_desc.fd_dst_mac;
8028 8277 /*
8029 8278 * We call this when we are setting up the datapath for
8030 8279 * the first non-primary.
8031 8280 */
8032 8281 ASSERT(mip->mi_nactiveclients == 2);
8282 +
8033 8283 /*
8034 8284 * OK, now we have the primary that needs to be relocated.
8035 8285 */
8036 8286 ngrp = mac_reserve_rx_group(mcip, mac_addr, B_TRUE);
8037 8287 if (ngrp == NULL)
8038 8288 return (NULL);
8039 8289 if (mac_rx_switch_group(mcip, defgrp, ngrp) != 0) {
8040 8290 mac_stop_group(ngrp);
8041 8291 return (NULL);
8042 8292 }
8043 8293 return (mcip);
8044 8294 }
8045 8295
8046 8296 void
8047 8297 mac_transceiver_init(mac_impl_t *mip)
8048 8298 {
8049 8299 if (mac_capab_get((mac_handle_t)mip, MAC_CAPAB_TRANSCEIVER,
8050 8300 &mip->mi_transceiver)) {
8051 8301 /*
8052 8302 		 * If the driver set a flag that we don't know about, warn
8053 8303 		 * about it and ignore this capability.
8054 8304 */
8055 8305 if (mip->mi_transceiver.mct_flags != 0) {
8056 8306 dev_err(mip->mi_dip, CE_WARN, "driver set transceiver "
8057 8307 "flags to invalid value: 0x%x, ignoring "
8058 8308 "capability", mip->mi_transceiver.mct_flags);
8059 8309 bzero(&mip->mi_transceiver,
8060 8310 sizeof (mac_capab_transceiver_t));
8061 8311 }
8062 8312 } else {
8063 8313 bzero(&mip->mi_transceiver,
8064 8314 sizeof (mac_capab_transceiver_t));
8065 8315 }
8066 8316 }
8067 8317
8068 8318 int
8069 8319 mac_transceiver_count(mac_handle_t mh, uint_t *countp)
8070 8320 {
8071 8321 mac_impl_t *mip = (mac_impl_t *)mh;
8072 8322
8073 8323 ASSERT(MAC_PERIM_HELD(mh));
8074 8324
8075 8325 if (mip->mi_transceiver.mct_ntransceivers == 0)
8076 8326 return (ENOTSUP);
8077 8327
8078 8328 *countp = mip->mi_transceiver.mct_ntransceivers;
8079 8329 return (0);
8080 8330 }
8081 8331
8082 8332 int
8083 8333 mac_transceiver_info(mac_handle_t mh, uint_t tranid, boolean_t *present,
8084 8334 boolean_t *usable)
8085 8335 {
8086 8336 int ret;
8087 8337 mac_transceiver_info_t info;
8088 8338
8089 8339 mac_impl_t *mip = (mac_impl_t *)mh;
8090 8340
8091 8341 ASSERT(MAC_PERIM_HELD(mh));
8092 8342
8093 8343 if (mip->mi_transceiver.mct_info == NULL ||
8094 8344 mip->mi_transceiver.mct_ntransceivers == 0)
8095 8345 return (ENOTSUP);
8096 8346
8097 8347 if (tranid >= mip->mi_transceiver.mct_ntransceivers)
8098 8348 return (EINVAL);
8099 8349
8100 8350 bzero(&info, sizeof (mac_transceiver_info_t));
8101 8351 if ((ret = mip->mi_transceiver.mct_info(mip->mi_driver, tranid,
8102 8352 &info)) != 0) {
8103 8353 return (ret);
8104 8354 }
8105 8355
8106 8356 *present = info.mti_present;
8107 8357 *usable = info.mti_usable;
8108 8358 return (0);
8109 8359 }
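Consumers are expected to size their loop with mac_transceiver_count() and then probe each id in turn, tolerating per-id failures. A toy model of the validation ladder in mac_transceiver_info() (the structure and driver callback here are hypothetical):

    #include <errno.h>
    #include <stdio.h>

    /* Toy capability: a driver callback plus a transceiver count. */
    typedef struct {
            int     (*info)(unsigned id, int *present, int *usable);
            unsigned ntransceivers;
    } txr_cap_t;

    /*
     * Mirror of the ladder above: no capability -> ENOTSUP,
     * id out of range -> EINVAL, otherwise ask the driver.
     */
    static int
    txr_info(const txr_cap_t *cap, unsigned id, int *present, int *usable)
    {
            if (cap->info == NULL || cap->ntransceivers == 0)
                    return (ENOTSUP);
            if (id >= cap->ntransceivers)
                    return (EINVAL);
            return (cap->info(id, present, usable));
    }

    static int
    fake_info(unsigned id, int *present, int *usable)
    {
            (void) id;
            *present = 1;
            *usable = 0;    /* present but, say, unsupported media */
            return (0);
    }

    int
    main(void)
    {
            txr_cap_t cap = { fake_info, 2 };
            int p, u;
            printf("id 0 -> %d\n", txr_info(&cap, 0, &p, &u));  /* 0 */
            printf("id 9 -> %d\n", txr_info(&cap, 9, &p, &u));  /* EINVAL */
            return (0);
    }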
8110 8360
8111 8361 int
8112 8362 mac_transceiver_read(mac_handle_t mh, uint_t tranid, uint_t page, void *buf,
8113 8363 size_t nbytes, off_t offset, size_t *nread)
8114 8364 {
8115 8365 int ret;
8116 8366 size_t nr;
8117 8367 mac_impl_t *mip = (mac_impl_t *)mh;
8118 8368
8119 8369 ASSERT(MAC_PERIM_HELD(mh));
8120 8370
8121 8371 if (mip->mi_transceiver.mct_read == NULL)
8122 8372 return (ENOTSUP);
8123 8373
8124 8374 if (tranid >= mip->mi_transceiver.mct_ntransceivers)
8125 8375 return (EINVAL);
8126 8376
8127 8377 /*
8128 8378 * All supported pages today are 256 bytes wide. Make sure offset +
8129 8379 * nbytes never exceeds that.
8130 8380 */
8131 8381 if (offset < 0 || offset >= 256 || nbytes > 256 ||
8132 8382 offset + nbytes > 256)
8133 8383 return (EINVAL);
8134 8384
8135 8385 if (nread == NULL)
8136 8386 nread = &nr;
8137 8387 ret = mip->mi_transceiver.mct_read(mip->mi_driver, tranid, page, buf,
8138 8388 nbytes, offset, nread);
8139 8389 if (ret == 0 && *nread > nbytes) {
8140 8390 dev_err(mip->mi_dip, CE_PANIC, "driver wrote %lu bytes into "
8141 8391 "%lu byte sized buffer, possible memory corruption",
8142 8392 *nread, nbytes);
8143 8393 }
8144 8394
8145 8395 return (ret);
8146 8396 }
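The offset checks above keep every read inside a single 256-byte page; offset and nbytes are each validated individually before their sum is, so the sum itself cannot overflow. A toy model of just that bounds check:

    #include <errno.h>
    #include <stdio.h>

    #define PAGE_SIZE       256     /* all supported pages today */

    /* Reject any read that would run past the 256-byte page. */
    static int
    check_read(long offset, unsigned long nbytes)
    {
            if (offset < 0 || offset >= PAGE_SIZE || nbytes > PAGE_SIZE ||
                (unsigned long)offset + nbytes > PAGE_SIZE)
                    return (EINVAL);
            return (0);
    }

    int
    main(void)
    {
            printf("%d\n", check_read(0, 256));     /* 0: whole page */
            printf("%d\n", check_read(128, 129));   /* EINVAL: past end */
            return (0);
    }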
8147 8397
8148 8398 void
8149 8399 mac_led_init(mac_impl_t *mip)
8150 8400 {
8151 8401 mip->mi_led_modes = MAC_LED_DEFAULT;
8152 8402
8153 8403 if (!mac_capab_get((mac_handle_t)mip, MAC_CAPAB_LED, &mip->mi_led)) {
8154 8404 bzero(&mip->mi_led, sizeof (mac_capab_led_t));
8155 8405 return;
8156 8406 }
8157 8407
8158 8408 if (mip->mi_led.mcl_flags != 0) {
8159 8409 dev_err(mip->mi_dip, CE_WARN, "driver set led capability "
8160 8410 "flags to invalid value: 0x%x, ignoring "
8161 8411 "capability", mip->mi_transceiver.mct_flags);
8162 8412 bzero(&mip->mi_led, sizeof (mac_capab_led_t));
8163 8413 return;
8164 8414 }
8165 8415
8166 8416 if ((mip->mi_led.mcl_modes & ~MAC_LED_ALL) != 0) {
8167 8417 dev_err(mip->mi_dip, CE_WARN, "driver set led capability "
8168 8418 "supported modes to invalid value: 0x%x, ignoring "
8169 8419 "capability", mip->mi_transceiver.mct_flags);
8170 8420 bzero(&mip->mi_led, sizeof (mac_capab_led_t));
8171 8421 return;
8172 8422 }
8173 8423 }
8174 8424
8175 8425 int
8176 8426 mac_led_get(mac_handle_t mh, mac_led_mode_t *supported, mac_led_mode_t *active)
8177 8427 {
8178 8428 mac_impl_t *mip = (mac_impl_t *)mh;
8179 8429
8180 8430 ASSERT(MAC_PERIM_HELD(mh));
8181 8431
8182 8432 if (mip->mi_led.mcl_set == NULL)
8183 8433 return (ENOTSUP);
8184 8434
8185 8435 *supported = mip->mi_led.mcl_modes;
8186 8436 *active = mip->mi_led_modes;
8187 8437
8188 8438 return (0);
8189 8439 }
8190 8440
8191 8441 /*
8192 8442 * Update and multiplex the various LED requests. We only ever send one LED to
8193 8443 * the underlying driver at a time. As such, we end up multiplexing all
8194 8444 * requested states and picking one to send down to the driver.
8195 8445 */
8196 8446 int
8197 8447 mac_led_set(mac_handle_t mh, mac_led_mode_t desired)
8198 8448 {
8199 8449 int ret;
8200 8450 mac_led_mode_t driver;
8201 8451
8202 8452 mac_impl_t *mip = (mac_impl_t *)mh;
8203 8453
8204 8454 ASSERT(MAC_PERIM_HELD(mh));
8205 8455
8206 8456 /*
8207 8457 * If we've been passed a desired value of zero, that indicates that
8208 8458 	 * If we've been passed a desired value of zero, that indicates that
8209 8459 	 * we're resetting the LED to its default state, MAC_LED_DEFAULT.
8210 8460 */
8211 8461 if (desired == 0)
8212 8462 desired = MAC_LED_DEFAULT;
8213 8463
8214 8464 if (mip->mi_led.mcl_set == NULL)
8215 8465 return (ENOTSUP);
8216 8466
8217 8467 /*
8218 8468 * Catch both values that we don't know about and those that the driver
8219 8469 * doesn't support.
8220 8470 */
8221 8471 if ((desired & ~MAC_LED_ALL) != 0)
8222 8472 return (EINVAL);
8223 8473
8224 8474 if ((desired & ~mip->mi_led.mcl_modes) != 0)
8225 8475 return (ENOTSUP);
8226 8476
8227 8477 /*
8228 8478 * If we have the same value, then there is nothing to do.
8229 8479 */
8230 8480 if (desired == mip->mi_led_modes)
8231 8481 return (0);
8232 8482
8233 8483 /*
8234 8484 * Based on the desired value, determine what to send to the driver. We
8235 8485 * only will send a single bit to the driver at any given time. IDENT
8236 8486 * takes priority over OFF or ON. We also let OFF take priority over the
8237 8487 * rest.
8238 8488 */
8239 8489 if (desired & MAC_LED_IDENT) {
8240 8490 driver = MAC_LED_IDENT;
8241 8491 } else if (desired & MAC_LED_OFF) {
8242 8492 driver = MAC_LED_OFF;
8243 8493 } else if (desired & MAC_LED_ON) {
8244 8494 driver = MAC_LED_ON;
8245 8495 } else {
8246 8496 driver = MAC_LED_DEFAULT;
8247 8497 }
8248 8498
8249 8499 if ((ret = mip->mi_led.mcl_set(mip->mi_driver, driver, 0)) == 0) {
8250 8500 mip->mi_led_modes = desired;
8251 8501 }
8252 8502
8253 8503 return (ret);
8254 8504 }