Commit | Line | Data |
---|---|---|
7df20f2d SD |
1 | The Symmetric Communication Interface (SCIF, pronounced "skiff") is a low-level |
2 | communications API across PCIe, currently implemented for MIC. At present, | |
3 | SCIF provides inter-node communication within a single host platform, where a | |
4 | node is a MIC coprocessor or Xeon-based host. SCIF abstracts the details of | |
5 | communicating over the PCIe bus while providing an API that is symmetric | |
6 | across all the nodes in the PCIe network. An important design objective for SCIF | |
7 | is to deliver the maximum possible performance given the communication | |
8 | abilities of the hardware. SCIF has been used to implement an offload compiler | |
9 | runtime and OFED support for MPI implementations for MIC coprocessors. | |
10 | ||
11 | ==== SCIF API Components ==== | |
12 | The SCIF API has the following parts: | |
13 | 1. Connection establishment using a client-server model | |
14 | 2. Byte stream messaging intended for short messages | |
15 | 3. Node enumeration to determine online nodes | |
16 | 4. Poll semantics for detection of incoming connections and messages | |
17 | 5. Memory registration to pin down pages | |
18 | 6. Remote memory mapping for low-latency CPU accesses via mmap | |
19 | 7. Remote DMA (RDMA) for high-bandwidth DMA transfers | |
20 | 8. Fence APIs for RDMA synchronization | |
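Item 4 above (poll semantics) can be illustrated with a small analogy. The sketch below does not use SCIF at all: it uses the standard POSIX poll() call on a local socketpair, on the assumption that readiness detection on a SCIF endpoint behaves in a broadly similar way. The helper names readable_now and poll_demo are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if fd has data ready to read right now, 0 if not, -1 on error. */
static int readable_now(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, 0);   /* timeout 0: check and return immediately */

    if (rc < 0)
        return -1;
    return (rc == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Checks readiness of one end of a socketpair before and after a
 * one-byte message is queued from the other end. POSIX analogy only. */
static int poll_demo(int *before, int *after)
{
    int sv[2];
    char byte = 'x';
    int rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    *before = readable_now(sv[0]);        /* nothing has been sent yet */
    if (write(sv[1], &byte, 1) == 1) {
        *after = readable_now(sv[0]);     /* one byte is now pending */
        rc = 0;
    }

    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Calling poll_demo(&before, &after) is expected to report the endpoint as not readable before the write and readable after it, which is the pattern a process would use to detect an incoming message without blocking.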
21 | ||
22 | SCIF exposes the notion of a connection which can be used by peer processes on | |
23 | nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A | |
24 | process in a SCIF node initiates a SCIF connection to a peer process on a | |
25 | different node via a SCIF "endpoint". SCIF endpoints support messaging APIs | |
26 | which are similar to connection-oriented socket APIs. Connected SCIF endpoints | |
27 | can also register local memory, after which data can be transferred using | |
28 | DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and | |
29 | kernel mode clients, which are functionally equivalent. | |
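Since the messaging APIs are described as similar to connection-oriented socket APIs, the following sketch shows the familiar socket sequence they resemble. It is an analogy built on POSIX TCP sockets over the loopback interface, not SCIF code; the comments that map each step to a SCIF call are an assumption based on the call names in this document, and loopback_roundtrip is a hypothetical helper name.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sends msg from a client socket to a server socket over loopback TCP and
 * copies what the server received into out (NUL-terminated).
 * Returns the number of bytes received, or -1 on failure. */
static int loopback_roundtrip(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int lfd, cfd, sfd = -1, n = -1;

    lfd = socket(AF_INET, SOCK_STREAM, 0);    /* roughly: scif_open() */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0 || outlen == 0)
        goto done;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                        /* let the kernel pick a port */

    /* Server side: bind and listen, then read back the chosen port. */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0)
        goto done;

    /* Client connects; server accepts the pending connection. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        goto done;

    /* Short message over the byte stream. */
    if (send(cfd, msg, strlen(msg), 0) != (ssize_t)strlen(msg))
        goto done;
    n = (int)recv(sfd, out, outlen - 1, 0);
    if (n >= 0)
        out[n] = '\0';

done:
    if (sfd >= 0)
        close(sfd);
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return n;
}
```

The open, bind, listen, connect, accept, send and receive steps follow the same order as the SCIF calls with matching names; the memory registration and RDMA steps have no counterpart in this socket analogy.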
30 | ||
31 | ==== SCIF Performance for MIC ==== | |
32 | A DMA bandwidth comparison between the TCP (over Ethernet over PCIe) stack and | |
33 | SCIF shows the performance advantages of SCIF for HPC applications and runtimes. | |
34 | ||
35 | [Figure: Comparison of TCP and SCIF based BW] | |
36 | Y axis: Throughput (GB/sec), 0 to 8. X axis: Transfer Size (KBytes), 1 to | |
37 | 100000. Series: PCIe Bandwidth (flat at about 7 GB/sec), SCIF (from just above | |
38 | 1 GB/sec up to a plateau between 6 and 7 GB/sec), TCP (below 1 GB/sec). | |
59 | ||
60 | SCIF allows memory sharing via mmap(..) between processes on different PCIe | |
61 | nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap | |
62 | latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs. | |
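The idea of sharing memory through mmap can be sketched with a purely local analogy. The code below does not touch SCIF or PCIe: it maps an anonymous shared region with the standard POSIX mmap() call and lets a child process store an 8 byte value that the parent then loads directly, with no message passing. The helper name shared_store_load is hypothetical, and the assumption is only that a remote SCIF mapping is used in the same load/store style.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A child process stores value into a shared mapping; the parent reads it
 * back through the same mapping. Returns 0 on success, -1 on failure. */
static int shared_store_load(uint64_t value, uint64_t *seen)
{
    uint64_t *win;
    pid_t pid;
    int status;
    int rc = -1;

    /* One 8 byte "window" visible to both processes after fork(). */
    win = mmap(NULL, sizeof(*win), PROT_READ | PROT_WRITE,
               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (win == MAP_FAILED)
        return -1;
    *win = 0;

    pid = fork();
    if (pid == 0) {
        *win = value;          /* child: plain CPU store into the window */
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid) {
        *seen = *win;          /* parent: plain CPU load from the window */
        rc = 0;
    }

    munmap(win, sizeof(*win));
    return rc;
}
```

Because the transfer is an ordinary store and load on mapped memory, there is no system call on the data path, which is the property the latency figure above relies on.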
63 | ||
64 | SCIF has a user space library which is a thin IOCTL wrapper providing a user | |
65 | space API similar to the kernel API in scif.h. The SCIF user space library | |
66 | is distributed at https://software.intel.com/en-us/mic-developer | |
67 | ||
68 | The following pseudocode shows how two applications on two PCIe nodes would | |
69 | typically use the SCIF API: | |
70 | ||
71 | Process A (on node A)               Process B (on node B) | |
72 | ||
73 | /* get online node information */ | |
74 | scif_get_node_ids(..)               scif_get_node_ids(..) | |
75 | scif_open(..)                       scif_open(..) | |
76 | scif_bind(..)                       scif_bind(..) | |
77 | scif_listen(..) | |
78 | scif_accept(..)                     scif_connect(..) | |
79 | /* SCIF connection established */ | |
80 | ||
81 | /* Send and receive short messages */ | |
82 | scif_send(..)/scif_recv(..)         scif_send(..)/scif_recv(..) | |
83 | ||
84 | /* Register memory */ | |
85 | scif_register(..)                   scif_register(..) | |
86 | ||
87 | /* RDMA */ | |
88 | scif_readfrom(..)/scif_writeto(..)  scif_readfrom(..)/scif_writeto(..) | |
89 | ||
90 | /* Fence DMAs */ | |
91 | scif_fence_signal(..)               scif_fence_signal(..) | |
92 | ||
93 | mmap(..)                            mmap(..) | |
94 | ||
95 | /* Access remote registered memory */ | |
96 | ||
97 | /* Close the endpoints */ | |
98 | scif_close(..)                      scif_close(..) |