@@ -37,6 +37,11 @@ This guide is incomplete. If something feels missing, check the bcc and kernel s
3737 - [1. bpf_trace_printk()](#1-bpf_trace_printk)
3838 - [2. BPF_PERF_OUTPUT](#2-bpf_perf_output)
3939 - [3. perf_submit()](#3-perf_submit)
40+ - [4. BPF_RINGBUF_OUTPUT](#4-bpf_ringbuf_output)
41+ - [5. ringbuf_output()](#5-ringbuf_output)
42+ - [6. ringbuf_reserve()](#6-ringbuf_reserve)
43+ - [7. ringbuf_submit()](#7-ringbuf_submit)
44+ - [8. ringbuf_discard()](#8-ringbuf_discard)
4045 - [Maps](#maps)
4146 - [1. BPF_TABLE](#1-bpf_table)
4247 - [2. BPF_HASH](#2-bpf_hash)
@@ -81,6 +86,8 @@ This guide is incomplete. If something feels missing, check the bcc and kernel s
8186 - [2. trace_fields()](#2-trace_fields)
8287 - [Output](#output)
8388 - [1. perf_buffer_poll()](#1-perf_buffer_poll)
89+ - [2. ring_buffer_poll()](#2-ring_buffer_poll)
90+ - [3. ring_buffer_consume()](#3-ring_buffer_consume)
8491 - [Maps](#maps)
8592 - [1. get_table()](#1-get_table)
8693 - [2. open_perf_buffer()](#2-open_perf_buffer)
@@ -89,6 +96,7 @@ This guide is incomplete. If something feels missing, check the bcc and kernel s
8996 - [5. clear()](#5-clear)
9097 - [6. print_log2_hist()](#6-print_log2_hist)
9198 - [7. print_linear_hist()](#7-print_linear_hist)
99+ - [8. open_ring_buffer()](#8-open_ring_buffer)
92100 - [Helpers](#helpers)
93101 - [1. ksym()](#1-ksym)
94102 - [2. ksymname()](#2-ksymname)
@@ -647,6 +655,131 @@ Examples in situ:
647655[search /examples](https://github.com/iovisor/bcc/search?q=perf_submit+path%3Aexamples&type=Code),
648656[search /tools](https://github.com/iovisor/bcc/search?q=perf_submit+path%3Atools&type=Code)
649657
658+ ### 4. BPF_RINGBUF_OUTPUT
659+
660+ Syntax: ```BPF_RINGBUF_OUTPUT(name, page_cnt)```
661+
662+ Creates a BPF table for pushing out custom event data to user space via a ringbuf ring buffer.
663+ ```BPF_RINGBUF_OUTPUT``` has several advantages over ```BPF_PERF_OUTPUT```, summarized as follows:
664+
665+ - Buffer is shared across all CPUs, meaning no per-CPU allocation
666+ - Supports two APIs for BPF programs
667+   - ```map.ringbuf_output()``` works like ```map.perf_submit()``` (covered in [ringbuf_output](#5-ringbuf_output))
668+   - ```map.ringbuf_reserve()```/```map.ringbuf_submit()```/```map.ringbuf_discard()```
669+     split the process of reserving buffer space and submitting events into two steps
670+     (covered in [ringbuf_reserve](#6-ringbuf_reserve), [ringbuf_submit](#7-ringbuf_submit), [ringbuf_discard](#8-ringbuf_discard))
671+ - BPF APIs do not require access to a CPU ctx argument
672+ - Superior performance and latency in userspace thanks to a shared ring buffer manager
673+ - Supports two ways of consuming data in userspace (```ring_buffer_poll()``` and ```ring_buffer_consume()```)
674+
675+ Starting in Linux 5.8, this should be the preferred method for pushing per-event data to user space.
676+
677+ Example of both APIs:
678+
679+ ```C
680+ struct data_t {
681+     u32 pid;
682+     u64 ts;
683+     char comm[TASK_COMM_LEN];
684+ };
685+
686+ // Creates a ringbuf called events with 8 pages of space, shared across all CPUs
687+ BPF_RINGBUF_OUTPUT(events, 8);
688+
689+ int first_api_example(struct pt_regs *ctx) {
690+     struct data_t data = {};
691+
692+     data.pid = bpf_get_current_pid_tgid();
693+     data.ts = bpf_ktime_get_ns();
694+     bpf_get_current_comm(&data.comm, sizeof(data.comm));
695+
696+     events.ringbuf_output(&data, sizeof(data), 0 /* flags */);
697+
698+     return 0;
699+ }
700+
701+ int second_api_example(struct pt_regs *ctx) {
702+     struct data_t *data = events.ringbuf_reserve(sizeof(struct data_t));
703+     if (!data) { // Failed to reserve space
704+         return 1;
705+     }
706+
707+     data->pid = bpf_get_current_pid_tgid();
708+     data->ts = bpf_ktime_get_ns();
709+     bpf_get_current_comm(&data->comm, sizeof(data->comm));
710+
711+     events.ringbuf_submit(data, 0 /* flags */);
712+
713+     return 0;
714+ }
715+ ```
716+
717+ The output table is named ```events```. Data is either pushed in a single step via ```events.ringbuf_output()```, or allocated via ```events.ringbuf_reserve()``` and then pushed via ```events.ringbuf_submit()```.
718+
719+ Examples in situ: <!-- TODO -->
720+ [search /examples](https://github.com/iovisor/bcc/search?q=BPF_RINGBUF_OUTPUT+path%3Aexamples&type=Code)
721+
722+ ### 5. ringbuf_output()
723+
724+ Syntax: ``` int ringbuf_output((void *)data, u64 data_size, u64 flags) ```
725+
726+ Return: 0 on success, a negative error code on failure
727+
728+ Flags:
729+ - ```BPF_RB_NO_WAKEUP```: Do not send notification of new data availability
730+ - ```BPF_RB_FORCE_WAKEUP```: Send notification of new data availability unconditionally
731+
732+ A method of the BPF_RINGBUF_OUTPUT table, for submitting custom event data to user space. This method works like ```perf_submit()```,
733+ although it does not require a ctx argument.
734+
735+ Examples in situ: <!-- TODO -->
736+ [search /examples](https://github.com/iovisor/bcc/search?q=ringbuf_output+path%3Aexamples&type=Code)
737+
738+ ### 6. ringbuf_reserve()
739+
740+ Syntax: ``` void* ringbuf_reserve(u64 data_size) ```
741+
742+ Return: Pointer to data struct on success, NULL on failure
743+
744+ A method of the BPF_RINGBUF_OUTPUT table, for reserving space in the ring buffer and simultaneously
745+ allocating a data struct for output. Each successful call must be paired with either ```ringbuf_submit()``` or ```ringbuf_discard()```.
746+
747+ Examples in situ: <!-- TODO -->
748+ [search /examples](https://github.com/iovisor/bcc/search?q=ringbuf_reserve+path%3Aexamples&type=Code)
749+
750+ ### 7. ringbuf_submit()
751+
752+ Syntax: ``` void ringbuf_submit((void *)data, u64 flags) ```
753+
754+ Return: Nothing, always succeeds
755+
756+ Flags:
757+ - ```BPF_RB_NO_WAKEUP```: Do not send notification of new data availability
758+ - ```BPF_RB_FORCE_WAKEUP```: Send notification of new data availability unconditionally
759+
760+ A method of the BPF_RINGBUF_OUTPUT table, for submitting custom event data to user space. Must be preceded by a call to
761+ ```ringbuf_reserve()``` to reserve space for the data.
762+
763+ Examples in situ: <!-- TODO -->
764+ [search /examples](https://github.com/iovisor/bcc/search?q=ringbuf_submit+path%3Aexamples&type=Code)
765+
766+ ### 8. ringbuf_discard()
767+
768+ Syntax: ``` void ringbuf_discard((void *)data, u64 flags) ```
769+
770+ Return: Nothing, always succeeds
771+
772+ Flags:
773+ - ```BPF_RB_NO_WAKEUP```: Do not send notification of new data availability
774+ - ```BPF_RB_FORCE_WAKEUP```: Send notification of new data availability unconditionally
775+
776+ A method of the BPF_RINGBUF_OUTPUT table, for discarding custom event data; userspace
777+ ignores the data associated with the discarded event. Must be preceded by a call to
778+ ```ringbuf_reserve()``` to reserve space for the data.
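
The pairing of reserve with submit or discard can be pictured with a toy, pure-Python model. This is only a sketch: `ToyRingBuf`, its methods, and its byte accounting are hypothetical names invented for illustration, not bcc or kernel APIs, and the model ignores record headers, wakeup flags, and concurrency.

```python
from collections import deque

BUSY, SUBMITTED, DISCARDED = "busy", "submitted", "discarded"


class ToyRingBuf:
    """Toy stand-in for a ringbuf; hypothetical, not a bcc or kernel API."""

    def __init__(self, capacity):
        self.capacity = capacity      # total bytes available
        self.used = 0                 # bytes held by unconsumed records
        self.records = deque()        # records in reservation order

    def reserve(self, size):
        # Like ringbuf_reserve(): hand out space, or None when full.
        if self.used + size > self.capacity:
            return None
        record = {"size": size, "state": BUSY, "payload": None}
        self.records.append(record)
        self.used += size
        return record

    def submit(self, record):
        # Like ringbuf_submit(): make the record visible to the consumer.
        record["state"] = SUBMITTED

    def discard(self, record):
        # Like ringbuf_discard(): the consumer will skip this record.
        record["state"] = DISCARDED

    def consume(self, callback):
        # Deliver records in order, stopping at the first one still busy.
        delivered = 0
        while self.records and self.records[0]["state"] != BUSY:
            record = self.records.popleft()
            self.used -= record["size"]
            if record["state"] == SUBMITTED:
                callback(record["payload"])
                delivered += 1
        return delivered


rb = ToyRingBuf(capacity=64)
first = rb.reserve(32)
second = rb.reserve(32)
third = rb.reserve(32)          # buffer is full, so this is None

first["payload"] = "event A"    # fill in the reserved record, then submit
rb.submit(first)
rb.discard(second)              # space is released, nothing is delivered

seen = []
delivered = rb.consume(seen.append)
print(third, seen, delivered, rb.used)
```

In this model a reservation that is never submitted or discarded would block every later record from being consumed, which is one way to see why each successful reserve needs a matching submit or discard.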
779+
780+ Examples in situ: <!-- TODO -->
781+ [search /examples](https://github.com/iovisor/bcc/search?q=ringbuf_discard+path%3Aexamples&type=Code)
782+
650783## Maps
651784
652785Maps are BPF data stores, and are the basis for higher level object types including tables, hashes, and histograms.
@@ -1451,6 +1584,55 @@ Examples in situ:
14511584[search /examples](https://github.com/iovisor/bcc/search?q=perf_buffer_poll+path%3Aexamples+language%3Apython&type=Code),
14521585[search /tools](https://github.com/iovisor/bcc/search?q=perf_buffer_poll+path%3Atools+language%3Apython&type=Code)
14531586
1587+ ### 2. ring_buffer_poll()
1588+
1589+ Syntax: ``` BPF.ring_buffer_poll(timeout=T) ```
1590+
1591+ This polls from all open ringbuf ring buffers, calling the callback function that was provided when calling open_ring_buffer for each entry.
1592+
1593+ The timeout parameter is optional and measured in milliseconds. In its absence, polling continues until
1594+ there is no more data or the callback returns a negative value.
1595+
1596+ Example:
1597+
1598+ ```Python
1599+ # loop with callback to print_event
1600+ b["events"].open_ring_buffer(print_event)
1601+ while 1:
1602+     try:
1603+         b.ring_buffer_poll(30)
1604+     except KeyboardInterrupt:
1605+         exit()
1606+ ```
1607+
1608+ Examples in situ:
1609+ [search /examples](https://github.com/iovisor/bcc/search?q=ring_buffer_poll+path%3Aexamples+language%3Apython&type=Code)
1610+
1611+ ### 3. ring_buffer_consume()
1612+
1613+ Syntax: ``` BPF.ring_buffer_consume() ```
1614+
1615+ This consumes from all open ringbuf ring buffers, calling the callback function that was provided when calling open_ring_buffer for each entry.
1616+
1617+ Unlike ```ring_buffer_poll```, this method **does not poll for data** before attempting to consume.
1618+ This reduces latency at the expense of higher CPU consumption. If you are unsure which to use,
1619+ use ```ring_buffer_poll```.
1620+
1621+ Example:
1622+
1623+ ```Python
1624+ # loop with callback to print_event
1625+ b["events"].open_ring_buffer(print_event)
1626+ while 1:
1627+     try:
1628+         b.ring_buffer_consume()
1629+     except KeyboardInterrupt:
1630+         exit()
1631+ ```
1632+
1633+ Examples in situ:
1634+ [search /examples](https://github.com/iovisor/bcc/search?q=ring_buffer_consume+path%3Aexamples+language%3Apython&type=Code)
1635+
14541636## Maps
14551637
14561638Maps are BPF data stores, and are used in bcc to implement a table, and then higher level objects on top of tables, including hashes and histograms.
@@ -1694,6 +1876,68 @@ Examples in situ:
16941876[search /examples](https://github.com/iovisor/bcc/search?q=print_linear_hist+path%3Aexamples+language%3Apython&type=Code),
16951877[search /tools](https://github.com/iovisor/bcc/search?q=print_linear_hist+path%3Atools+language%3Apython&type=Code)
16961878
1879+ ### 8. open_ring_buffer()
1880+
1881+ Syntax: ``` table.open_ring_buffer(callback, ctx=None) ```
1882+
1883+ This operates on a table as defined in BPF as BPF_RINGBUF_OUTPUT(), and associates the callback Python function ``` callback ``` to be called when data is available in the ringbuf ring buffer. This is part of the new (Linux 5.8+) recommended mechanism for transferring per-event data from kernel to user space. Unlike perf buffers, ringbuf sizes are specified within the BPF program, as part of the ``` BPF_RINGBUF_OUTPUT ``` macro. If the callback is not processing data fast enough, some submitted data may be lost. In this case, the events should be polled more frequently and/or the size of the ring buffer should be increased.
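
As a rough guide to choosing the page count, the sketch below estimates how many events of a given size fit in the buffer. It assumes a 4096-byte page (the real value depends on the architecture), an 8-byte per-record header, and 8-byte rounding of record sizes; these figures are assumptions about the kernel ringbuf layout and should be checked against your kernel. `ringbuf_capacity` is a hypothetical helper, not part of bcc.

```python
def ringbuf_capacity(page_cnt, event_size, page_size=4096, header_size=8):
    """Estimate how many events fit in a ringbuf of page_cnt pages.

    Hypothetical helper for illustration; page_size and header_size are
    assumptions, not values read from the running kernel.
    """
    total_bytes = page_cnt * page_size
    # Each record is assumed to carry a header and be rounded up to 8 bytes.
    record_size = (header_size + event_size + 7) // 8 * 8
    return total_bytes // record_size


# A 32-byte event, e.g. u32 pid (+4 bytes padding), u64 ts, char comm[16]
events_8_pages = ringbuf_capacity(page_cnt=8, event_size=32)
events_64_pages = ringbuf_capacity(page_cnt=64, event_size=32)
print(events_8_pages, events_64_pages)  # 819 6553
```

Under these assumptions, 8 pages hold only a few hundred small events, so bursty workloads may need a larger buffer. The kernel is understood to require the ring size to be a power-of-two number of pages, so sizes are usually increased by doubling.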
1884+
1885+ Example:
1886+
1887+ ```Python
1888+ # process event
1889+ def print_event(ctx, data, size):
1890+     event = ct.cast(data, ct.POINTER(Data)).contents
1891+     [...]
1892+
1893+ # loop with callback to print_event
1894+ b["events"].open_ring_buffer(print_event)
1895+ while 1:
1896+     try:
1897+         b.ring_buffer_poll()
1898+     except KeyboardInterrupt:
1899+         exit()
1900+ ```
1901+
1902+ Note that the data structure transferred will need to be declared in C in the BPF program. For example:
1903+
1904+ ```C
1905+ // define output data structure in C
1906+ struct data_t {
1907+     u32 pid;
1908+     u64 ts;
1909+     char comm[TASK_COMM_LEN];
1910+ };
1911+ BPF_RINGBUF_OUTPUT(events, 8);
1912+ [...]
1913+ ```
1914+
1915+ In Python, you can either let bcc generate the data structure from C declaration automatically (recommended):
1916+
1917+ ```Python
1918+ def print_event(ctx, data, size):
1919+     event = b["events"].event(data)
1920+     [...]
1921+ ```
1922+
1923+ or define it manually:
1924+
1925+ ```Python
1926+ # define output data structure in Python
1927+ TASK_COMM_LEN = 16    # linux/sched.h
1928+ class Data(ct.Structure):
1929+     _fields_ = [("pid", ct.c_uint32),  # u32 pid in the C declaration
1930+                 ("ts", ct.c_ulonglong),
1931+                 ("comm", ct.c_char * TASK_COMM_LEN)]
1932+
1933+ def print_event(ctx, data, size):
1934+     event = ct.cast(data, ct.POINTER(Data)).contents
1935+     [...]
1936+ ```
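
When the structure is defined manually, the ctypes field types have to match the C declaration, including any padding the compiler inserts. The following sketch needs only the standard library: it builds a raw record by hand and decodes it into a ctypes structure. The sample values, the use of `struct.pack` to fake a kernel record, and `from_buffer_copy` as a stand-in for the `ct.cast` call in a real callback are all assumptions for illustration. The 4 bytes of padding after `pid` assume a typical 64-bit ABI.

```python
import ctypes as ct
import struct

TASK_COMM_LEN = 16  # linux/sched.h


class Data(ct.Structure):
    # Mirrors struct data_t: u32 pid; u64 ts; char comm[TASK_COMM_LEN];
    _fields_ = [("pid", ct.c_uint32),
                ("ts", ct.c_uint64),
                ("comm", ct.c_char * TASK_COMM_LEN)]


# Fake record in native byte order: pid, 4 padding bytes, ts, comm.
raw = struct.pack("=I4xQ16s", 1234, 5678, b"bash")

# Decode a copy of the bytes into the structure.
event = Data.from_buffer_copy(raw)
print(ct.sizeof(Data), Data.ts.offset, event.pid, event.ts, event.comm)
```

If `ct.sizeof(Data)` differs from `sizeof(struct data_t)` in the BPF program, the Python definition does not match the C one and fields will be read from the wrong offsets.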
1937+
1938+ Examples in situ:
1939+ [search /examples](https://github.com/iovisor/bcc/search?q=open_ring_buffer+path%3Aexamples+language%3Apython&type=Code)
1940+
16971941## Helpers
16981942
16991943Some helper methods provided by bcc. Note that since we're in Python, we can import any Python library and their methods, including, for example, the libraries: argparse, collections, ctypes, datetime, re, socket, struct, subprocess, sys, and time.