In Linux 2.5 kernels (and later), USB device drivers have additional control
over how DMA may be used to perform I/O operations. The APIs are detailed
in the kernel USB programming guide (kerneldoc, from the source code).


API OVERVIEW

The big picture is that USB drivers can continue to ignore most DMA issues,
though they still must provide DMA-ready buffers (see
Documentation/DMA-API-HOWTO.txt). That's how they've worked through
the 2.4 (and earlier) kernels.

OR: they can now be DMA-aware.

- New calls enable DMA-aware drivers, letting them allocate dma buffers and
  manage dma mappings for existing dma-ready buffers (see below).

- URBs have an additional "transfer_dma" field, as well as a transfer_flags
  bit indicating whether it's valid. (Control requests also have
  "setup_dma", but drivers must not use it.)

- "usbcore" will map the buffer to obtain this DMA address, unless a
  DMA-aware driver did so first and set URB_NO_TRANSFER_DMA_MAP. HCDs
  don't manage dma mappings for URBs.

- There's a new "generic DMA API", parts of which are usable by USB device
  drivers. Never use dma_set_mask() on any USB interface or device; that
  would potentially break all devices sharing that bus.
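
The division of labor described in the list above can be modeled in a few
lines of userspace C. This is only an illustrative sketch: struct model_urb,
MODEL_NO_TRANSFER_DMA_MAP (including its value), and core_must_map() are
hypothetical stand-ins invented for the example, not the kernel's own
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in the kernel headers. */
typedef uint64_t model_dma_addr_t;
#define MODEL_NO_TRANSFER_DMA_MAP 0x0004u	/* value assumed for the sketch */

struct model_urb {
	void *transfer_buffer;		/* CPU address of the data */
	model_dma_addr_t transfer_dma;	/* DMA address, valid only if flag set */
	unsigned int transfer_flags;
};

/*
 * Models the rule from the text: the core maps the buffer unless a
 * DMA-aware driver already did so and set the "no map" flag.
 */
static bool core_must_map(const struct model_urb *urb)
{
	assert(urb != NULL);
	return !(urb->transfer_flags & MODEL_NO_TRANSFER_DMA_MAP);
}
```

A driver that ignores DMA leaves the flag clear, so the model reports that
the core has to do the mapping; a DMA-aware driver fills in transfer_dma,
sets the flag, and the core leaves the buffer alone.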


ELIMINATING COPIES

It's good to avoid making CPUs copy data needlessly. The costs can add up,
and effects like cache-thrashing can impose subtle penalties.

- If you're doing lots of small data transfers from the same buffer all
  the time, that can really burn up resources on systems which use an
  IOMMU to manage the DMA mappings. It can cost MUCH more to set up and
  tear down the IOMMU mappings with each request than to perform the I/O!

  For those specific cases, USB has primitives to allocate less expensive
  memory. They work like kmalloc and kfree versions that give you the right
  kind of addresses to store in urb->transfer_buffer and urb->transfer_dma.
  You'd also set URB_NO_TRANSFER_DMA_MAP in urb->transfer_flags:

        void *usb_alloc_coherent (struct usb_device *dev, size_t size,
                gfp_t mem_flags, dma_addr_t *dma);

        void usb_free_coherent (struct usb_device *dev, size_t size,
                void *addr, dma_addr_t dma);

  Most drivers should *NOT* be using these primitives; they don't need
  to use this type of memory ("dma-coherent"), and memory returned from
  kmalloc() will work just fine.

  The memory buffer returned is "dma-coherent"; sometimes you might need to
  force a consistent memory access ordering by using memory barriers. It's
  not using a streaming DMA mapping, so it's good for small transfers on
  systems where the I/O would otherwise thrash an IOMMU mapping. (See
  Documentation/DMA-API-HOWTO.txt for definitions of "coherent" and
  "streaming" DMA mappings.)

  Asking for 1/Nth of a page (as well as asking for N pages) is reasonably
  space-efficient.

  On most systems the memory returned will be uncached, because the
  semantics of dma-coherent memory require either bypassing CPU caches
  or using cache hardware with bus-snooping support. While x86 hardware
  has such bus-snooping, many other systems use software to flush cache
  lines to prevent DMA conflicts.

- Devices on some EHCI controllers could handle DMA to/from high memory.

  Unfortunately, the current Linux DMA infrastructure doesn't have a sane
  way to expose these capabilities ... and in any case, HIGHMEM is mostly a
  design wart specific to x86_32. So your best bet is to ensure you never
  pass a highmem buffer into a USB driver. That's easy; it's the default
  behavior. Just don't override it, e.g. with NETIF_F_HIGHDMA.

  This may force your callers to do some bounce buffering, copying from
  high memory to "normal" DMA memory. If you can come up with a good way
  to fix this issue (for x86_32 machines with over 1 GByte of memory),
  feel free to submit patches.
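
Returning to the coherent-buffer primitives from the first item above, the
calling pattern (allocate, record both addresses in the urb, set
URB_NO_TRANSFER_DMA_MAP, and later free with the same size and addresses)
might look roughly like the sketch below. It is written as userspace C so it
can be compiled on its own, which means everything kernel-side is an
assumption: the trimmed-down struct urb and struct usb_device, the flag
value, and the stub_alloc_coherent()/stub_free_coherent() helpers (plain
malloc/free stand-ins that also omit the allocation-flags argument) are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins so the calling pattern compiles outside the kernel. */
typedef uintptr_t dma_addr_t;
struct usb_device { int unused; };
#define URB_NO_TRANSFER_DMA_MAP 0x0004u	/* value assumed for the sketch */

struct urb {
	void *transfer_buffer;
	dma_addr_t transfer_dma;
	unsigned int transfer_flags;
	size_t transfer_buffer_length;
};

/* Stub: the real primitive returns dma-coherent memory, not malloc memory. */
static void *stub_alloc_coherent(struct usb_device *dev, size_t size,
				 dma_addr_t *dma)
{
	void *cpu = malloc(size);

	(void)dev;
	*dma = (dma_addr_t)cpu;	/* pretend DMA address == CPU address */
	return cpu;
}

static void stub_free_coherent(struct usb_device *dev, size_t size,
			       void *addr, dma_addr_t dma)
{
	(void)dev; (void)size; (void)dma;
	free(addr);
}

/* Attach a premapped buffer to an urb; returns 0 on success, -1 on failure. */
static int urb_use_coherent(struct usb_device *dev, struct urb *urb, size_t len)
{
	assert(urb != NULL && len > 0);
	urb->transfer_buffer = stub_alloc_coherent(dev, len, &urb->transfer_dma);
	if (!urb->transfer_buffer)
		return -1;
	urb->transfer_buffer_length = len;
	urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;	/* core must not remap */
	return 0;
}

/* Release the buffer with the same size and addresses used to allocate it. */
static void urb_drop_coherent(struct usb_device *dev, struct urb *urb)
{
	stub_free_coherent(dev, urb->transfer_buffer_length,
			   urb->transfer_buffer, urb->transfer_dma);
	urb->transfer_buffer = NULL;
	urb->transfer_dma = 0;
	urb->transfer_flags &= ~URB_NO_TRANSFER_DMA_MAP;
}
```

The point of the pattern is that the buffer is set up once and re-used across
many small transfers, so no per-request mapping work is needed. As the text
notes, most drivers should not need it; kmalloc() memory is usually fine.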


WORKING WITH EXISTING BUFFERS

Existing buffers aren't usable for DMA without first being mapped into the
DMA address space of the device. However, most buffers passed to your
driver can safely be used with such DMA mapping. (See the first section
of Documentation/DMA-API-HOWTO.txt, titled "What memory is DMA-able?")

- When you're using scatterlists, you can map everything at once. On some
  systems, this kicks in an IOMMU and turns the scatterlists into single
  DMA transactions:

        int usb_buffer_map_sg (struct usb_device *dev, unsigned pipe,
                struct scatterlist *sg, int nents);

        void usb_buffer_dmasync_sg (struct usb_device *dev, unsigned pipe,
                struct scatterlist *sg, int n_hw_ents);

        void usb_buffer_unmap_sg (struct usb_device *dev, unsigned pipe,
                struct scatterlist *sg, int n_hw_ents);

  It's probably easier to use the new usb_sg_*() calls, which do the DMA
  mapping and apply other tweaks to make scatterlist I/O fast.

- Some drivers may prefer a model in which they map large buffers once and
  synchronize them for safe re-use. (If there's no re-use, then let
  usbcore do the map/unmap.) Large periodic transfers make good examples
  here, since it's cheaper to just synchronize the buffer than to unmap it
  each time an urb completes and then re-map it during resubmission.

  These calls all work with initialized urbs: urb->dev, urb->pipe,
  urb->transfer_buffer, and urb->transfer_buffer_length must all be
  valid when these calls are used (urb->setup_packet must be valid too
  if urb is a control request):

        struct urb *usb_buffer_map (struct urb *urb);

        void usb_buffer_dmasync (struct urb *urb);

        void usb_buffer_unmap (struct urb *urb);

  The calls manage urb->transfer_dma for you, and set URB_NO_TRANSFER_DMA_MAP
  so that usbcore won't map or unmap the buffer. They cannot be used for
  setup_packet buffers in control requests.

Note that several of those interfaces are currently commented out, since
they don't have current users. See the source code. Other than the dmasync
calls (where the underlying DMA primitives have changed), most of them can
easily be commented back in if you want to use them.
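
To illustrate why the map-once-and-synchronize model suits large periodic
transfers, the following userspace sketch simply counts the DMA operations
each strategy would issue. It does not call any kernel interface; struct
dma_ops_count, remap_each_time(), and map_once_and_sync() are hypothetical
names made up for this example, and the counts say nothing about the real
cost of each operation on a given platform.

```c
#include <assert.h>

/* Hypothetical counters standing in for real DMA mapping work. */
struct dma_ops_count {
	unsigned int maps;
	unsigned int syncs;
	unsigned int unmaps;
};

/* Strategy 1: the buffer is mapped and unmapped around every submission. */
static struct dma_ops_count remap_each_time(unsigned int completions)
{
	struct dma_ops_count c = { 0, 0, 0 };
	unsigned int i;

	for (i = 0; i < completions; i++) {
		c.maps++;	/* map on submit */
		c.unmaps++;	/* unmap on completion */
	}
	return c;
}

/* Strategy 2: map once, synchronize per completion, unmap at teardown. */
static struct dma_ops_count map_once_and_sync(unsigned int completions)
{
	struct dma_ops_count c = { 0, 0, 0 };
	unsigned int i;

	assert(completions > 0);	/* mapping is pointless with no transfers */
	c.maps = 1;
	for (i = 0; i < completions; i++)
		c.syncs++;	/* only makes the buffer safe to re-use */
	c.unmaps = 1;
	return c;
}
```

For 1000 completions the first strategy performs 1000 maps and 1000 unmaps,
while the second performs one map, 1000 synchronizations, and one unmap. The
second wins whenever a synchronization is cheaper than an unmap/re-map pair,
which is the situation the text describes.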